Search results for: data mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7524

Search results for: data mining

6804 Making Data Structures and Algorithms more Understandable by Programming Sudoku the Human Way

Authors: Roelien Goede

Abstract:

Data Structures and Algorithms is a module in most Computer Science or Information Technology curricula. It is one of the modules most students identify as being difficult. This paper demonstrates how programming a solution for Sudoku can make abstract concepts more concrete. The paper relates concepts of a typical Data Structures and Algorithms module to a step by step solution for Sudoku in a human type as opposed to a computer oriented solution.

Keywords: Data Structures, Algorithms, Sudoku, ObjectOriented Programming, Programming Teaching, Education.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3073
6803 Construction Of Decentralized Lifetime Maximizing Tree for Data Aggregation in Wireless Sensor Networks

Authors: Deepali Virmani , Satbir Jain

Abstract:

To meet the demands of wireless sensor networks (WSNs) where data are usually aggregated at a single source prior to transmitting to any distant user, there is a need to establish a tree structure inside any given event region. In this paper , a novel technique to create one such tree is proposed .This tree preserves the energy and maximizes the lifetime of event sources while they are constantly transmitting for data aggregation. The term Decentralized Lifetime Maximizing Tree (DLMT) is used to denote this tree. DLMT features in nodes with higher energy tend to be chosen as data aggregating parents so that the time to detect the first broken tree link can be extended and less energy is involved in tree maintenance. By constructing the tree in such a way, the protocol is able to reduce the frequency of tree reconstruction, minimize the amount of data loss ,minimize the delay during data collection and preserves the energy.

Keywords: branch energy, decentralized, energy level , lifetime, tree energy, wireless sensor networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1472
6802 Effects of Data Correlation in a Sparse-View Compressive Sensing Based Image Reconstruction

Authors: Sajid Abbas, Joon Pyo Hong, Jung-Ryun Lee, Seungryong Cho

Abstract:

Computed tomography and laminography are heavily investigated in a compressive sensing based image reconstruction framework to reduce the dose to the patients as well as to the radiosensitive devices such as multilayer microelectronic circuit boards. Nowadays researchers are actively working on optimizing the compressive sensing based iterative image reconstruction algorithm to obtain better quality images. However, the effects of the sampled data’s properties on reconstructed the image’s quality, particularly in an insufficient sampled data conditions have not been explored in computed laminography. In this paper, we investigated the effects of two data properties i.e. sampling density and data incoherence on the reconstructed image obtained by conventional computed laminography and a recently proposed method called spherical sinusoidal scanning scheme. We have found that in a compressive sensing based image reconstruction framework, the image quality mainly depends upon the data incoherence when the data is uniformly sampled.

Keywords: Computed tomography, Computed laminography, Compressive sending, Low-dose.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1655
6801 Petrology Investigation of Apatite Minerals in the Esfordi Mine, Yazd, Iran

Authors: Haleh Rezaei Zanjirabadi, Fatemeh Saberi, Bahman Rahimzadeh, Fariborz Masoudi, Mohammad Rahgosha

Abstract:

In this study, apatite minerals from the iron-phosphate deposit of Yazd have been investigated within the microcontinent zone of Iran in the Zagros structural zone. The geological units in the Esfordi area belong to the pre-Cambrian to lower-Cambrian age, consisting of a succession of carbonate rocks (dolomite), shale, tuff, sandstone, and volcanic rocks. In addition to the mentioned sedimentary and volcanic rocks, the granitoid mass of Bahabad, which is the largest intrusive mass in the region, has intruded into the eastern part of this series and has caused its metamorphism and alteration. After collecting the available data, various samples of Esfordi’s apatite were prepared, and their mineralogy and crystallography were investigated using laboratory methods such as petrographic microscopy, Raman spectroscopy, EDS (Energy Dispersive Spectroscopy), and Scanning Electron Microscopy (SEM). In non-destructive Raman spectroscopy, the molecular structure of apatite minerals was revealed in four distinct spectral ranges. Initially, the spectra of phosphate and aluminum bonds with O2HO, OH, were observed, followed by the identification of Cl, OH, Al, Na, Ca and hydroxyl units depending on the type of apatite mineral family. In SEM analysis, based on various shapes and different phases of apatites, their constituent major elements were identified through EDS, indicating that the samples from the Esfordi mining area exhibit a dense and coherent texture with smooth surfaces. Based on the elemental analysis results by EDS, the apatites in the Esfordi area are classified into the calcic apatite group.

Keywords: Petrology, apatite, Esfordi, EDS, SEM, Scanning Electron Microscopy, Raman spectroscopy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 95
6800 Real-Time Data Stream Partitioning over a Sliding Window in Real-Time Spatial Big Data

Authors: Sana Hamdi, Emna Bouazizi, Sami Faiz

Abstract:

In recent years, real-time spatial applications, like location-aware services and traffic monitoring, have become more and more important. Such applications result dynamic environments where data as well as queries are continuously moving. As a result, there is a tremendous amount of real-time spatial data generated every day. The growth of the data volume seems to outspeed the advance of our computing infrastructure. For instance, in real-time spatial Big Data, users expect to receive the results of each query within a short time period without holding in account the load of the system. But with a huge amount of real-time spatial data generated, the system performance degrades rapidly especially in overload situations. To solve this problem, we propose the use of data partitioning as an optimization technique. Traditional horizontal and vertical partitioning can increase the performance of the system and simplify data management. But they remain insufficient for real-time spatial Big data; they can’t deal with real-time and stream queries efficiently. Thus, in this paper, we propose a novel data partitioning approach for real-time spatial Big data named VPA-RTSBD (Vertical Partitioning Approach for Real-Time Spatial Big data). This contribution is an implementation of the Matching algorithm for traditional vertical partitioning. We find, firstly, the optimal attribute sequence by the use of Matching algorithm. Then, we propose a new cost model used for database partitioning, for keeping the data amount of each partition more balanced limit and for providing a parallel execution guarantees for the most frequent queries. VPA-RTSBD aims to obtain a real-time partitioning scheme and deals with stream data. It improves the performance of query execution by maximizing the degree of parallel execution. This affects QoS (Quality Of Service) improvement in real-time spatial Big Data especially with a huge volume of stream data. The performance of our contribution is evaluated via simulation experiments. The results show that the proposed algorithm is both efficient and scalable, and that it outperforms comparable algorithms.

Keywords: Real-Time Spatial Big Data, Quality Of Service, Vertical partitioning, Horizontal partitioning, Matching algorithm, Hamming distance, Stream query.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1034
6799 The Impact of the General Data Protection Regulation on Human Resources Management in Schools

Authors: Alexandra Aslanidou

Abstract:

The General Data Protection Regulation (GDPR), concerning the protection of natural persons within the European Union with regard to the processing of personal data and on the free movement of such data, became applicable in the European Union (EU) on 25 May 2018 and transformed the way personal data were being treated under the Data Protection Directive (DPD) regime, generating sweeping organizational changes to both public sector and business. A social practice that is considerably influenced in the way of its day-to-day operations is Human Resource (HR) management, for which the importance of GDPR cannot be underestimated. That is because HR processes personal data coming in all shapes and sizes from many different systems and sources. The significance of the proper functioning of an HR department, specifically in human-centered, service-oriented environments such as the education field, is decisive due to the fact that HR operations in schools, conducted effectively, determine the quality of the provided services and consequently have a considerable impact on the success of the educational system. The purpose of this paper is to analyze the decisive role that GDPR plays in HR departments that operate in schools and in order to practically evaluate the aftermath of the Regulation during the first months of its applicability; a comparative use cases analysis in five highly dynamic schools, across three EU Member States, was attempted.

Keywords: General data protection regulation, human resource management, educational system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 709
6798 Issue Reorganization Using the Measure of Relevance

Authors: William Wong Xiu Shun, Yoonjin Hyun, Mingyu Kim, Seongi Choi, Namgyu Kim

Abstract:

The need to extract R&D keywords from issues and use them to retrieve R&D information is increasing rapidly. However, it is difficult to identify related issues or distinguish them. Although the similarity between issues cannot be identified, with an R&D lexicon, issues that always share the same R&D keywords can be determined. In detail, the R&D keywords that are associated with a particular issue imply the key technology elements that are needed to solve a particular issue. Furthermore, the relationship among issues that share the same R&D keywords can be shown in a more systematic way by clustering them according to keywords. Thus, sharing R&D results and reusing R&D technology can be facilitated. Indirectly, redundant investment in R&D can be reduced as the relevant R&D information can be shared among corresponding issues and the reusability of related R&D can be improved. Therefore, a methodology to cluster issues from the perspective of common R&D keywords is proposed to satisfy these demands.

Keywords: Clustering, Social Network Analysis, Text Mining, Topic Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2022
6797 A Data Warehouse System to Help Assist Breast Cancer Screening in Diagnosis, Education and Research

Authors: Souâd Demigha

Abstract:

Early detection of breast cancer is considered as a major public health issue. Breast cancer screening is not generalized to the entire population due to a lack of resources, staff and appropriate tools. Systematic screening can result in a volume of data which can not be managed by present computer architecture, either in terms of storage capabilities or in terms of exploitation tools. We propose in this paper to design and develop a data warehouse system in radiology-senology (DWRS). The aim of such a system is on one hand, to support this important volume of information providing from multiple sources of data and images and for the other hand, to help assist breast cancer screening in diagnosis, education and research.

Keywords: Breast cancer screening, data warehouse, diagnosis, education, research.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1698
6796 Data Security in a DApp Twitter Alike on Web 3.0 With Blockchain Based Technology

Authors: Vishal Awasthi, Tanya Soni, Vigya Awasthi, Swati Singh, Shivali Verma

Abstract:

There is a growing demand for a network that grants a high level of data security and confidentiality. For this reason, the semantic web was introduced, which allows data to be shared and reused across applications while safeguarding users privacy and user’s will grab back control of their data. The earlier Web 1.0 and Web 2.0 versions were built on client-server architecture, in  which there was the risk of data theft and unconsented sale of user data. A decentralized version, Known as Web 3.0, that is mostly built on blockchain technology was interjected to resolve these issues. The recent research focuses on blockchain technology, deals with privacy, security, transparency, and innovation of decentralized applications (DApps), e.g. a Twitter Clone, Whatsapp clone. In this paper the Twitter Alike built on the Ethereum blockchain will replace traditional techniques with improved latency, throughput, and data ownership. The central principle of this DApp is smart contract implemented using Solidity which is an object- oriented and highlevel language. Consequently, this will provide a better Quality Services, high data security, and integrity for both present and future internet technologies.

Keywords: Blockchain, DApps, Ethereum, Semantic Web, Smart Contract, Solidity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 260
6795 Improvement of Data Transfer over Simple Object Access Protocol (SOAP)

Authors: Khaled Ahmed Kadouh, Kamal Ali Albashiri

Abstract:

This paper presents a designed algorithm involves improvement of transferring data over Simple Object Access Protocol (SOAP). The aim of this work is to establish whether using SOAP in exchanging XML messages has any added advantages or not. The results showed that XML messages without SOAP take longer time and consume more memory, especially with binary data.

Keywords: JAX-WS, SMTP, SOAP, Web service, XML.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2109
6794 Numerical Simulations of Flood and Inundation in Jobaru River Basin Using Laser Profiler Data

Authors: Hiroto Nakashima, Toshihiro Morita, Koichiro Ohgushi

Abstract:

Laser Profiler (LP) data from aerial laser surveys have been increasingly used as topographical inputs to numerical simulations of flooding and inundation in river basins. LP data has great potential for reproducing topography, but its effective usage has not yet been fully established. In this study, flooding and inundation are simulated numerically using LP data for the Jobaru River basin of Japan’s Saga Plain. The analysis shows that the topography is reproduced satisfactorily in the computational domain with urban and agricultural areas requiring different grid sizes. A 2-D numerical simulation shows that flood flow behavior changes as grid size is varied.

Keywords: LP data, numerical simulation, topological analysis, mesh size.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1513
6793 Channels Splitting Strategy for Optical Local Area Networks of Passive Star Topology

Authors: Peristera Baziana

Abstract:

In this paper, we present a network configuration for a WDM LANs of passive star topology that assume that the set of data WDM channels is split into two separate sets of channels, with different access rights over them. Especially, a synchronous transmission WDMA access algorithm is adopted in order to increase the probability of successful transmission over the data channels and consequently to reduce the probability of data packets transmission cancellation in order to avoid the data channels collisions. Thus, a control pre-transmission access scheme is followed over a separate control channel. An analytical Markovian model is studied and the average throughput is mathematically derived. The performance is studied for several numbers of data channels and various values of control phase duration.

Keywords: Access algorithm, channels division, collisions avoidance, wavelength division multiplexing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 994
6792 Data-Reusing Adaptive Filtering Algorithms with Adaptive Error Constraint

Authors: Young-Seok Choi

Abstract:

We present a family of data-reusing and affine projection algorithms. For identification of a noisy linear finite impulse response channel, a partial knowledge of a channel, especially noise, can be used to improve the performance of the adaptive filter. Motivated by this fact, the proposed scheme incorporates an estimate of a knowledge of noise. A constraint, called the adaptive noise constraint, estimates an unknown information of noise. By imposing this constraint on a cost function of data-reusing and affine projection algorithms, a cost function based on the adaptive noise constraint and Lagrange multiplier is defined. Minimizing the new cost function leads to the adaptive noise constrained (ANC) data-reusing and affine projection algorithms. Experimental results comparing the proposed schemes to standard data-reusing and affine projection algorithms clearly indicate their superior performance.

Keywords: Data-reusing, affine projection algorithm, error constraint, system identification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1596
6791 Balancing Strategies for Parallel Content-based Data Retrieval Algorithms in a k-tree Structured Database

Authors: Radu Dobrescu, Matei Dobrescu, Daniela Hossu

Abstract:

The paper proposes a unified model for multimedia data retrieval which includes data representatives, content representatives, index structure, and search algorithms. The multimedia data are defined as k-dimensional signals indexed in a multidimensional k-tree structure. The benefits of using the k-tree unified model were demonstrated by running the data retrieval application on a six networked nodes test bed cluster. The tests were performed with two retrieval algorithms, one that allows parallel searching using a single feature, the second that performs a weighted cascade search for multiple features querying. The experiments show a significant reduction of retrieval time while maintaining the quality of results.

Keywords: balancing strategies, multimedia databases, parallelprocessing, retrieval algorithms

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1408
6790 Handling Mobility using Virtual Grid in Static Wireless Sensor Networks

Authors: T.P. Sharma

Abstract:

Querying a data source and routing data towards sink becomes a serious challenge in static wireless sensor networks if sink and/or data source are mobile. Many a times the event to be observed either moves or spreads across wide area making maintenance of continuous path between source and sink a challenge. Also, sink can move while query is being issued or data is on its way towards sink. In this paper, we extend our already proposed Grid Based Data Dissemination (GBDD) scheme which is a virtual grid based topology management scheme restricting impact of movement of sink(s) and event(s) to some specific cells of a grid. This obviates the need for frequent path modifications and hence maintains continuous flow of data while minimizing the network energy consumptions. Simulation experiments show significant improvements in network energy savings and average packet delay for a packet to reach at sink.

Keywords: Mobility in WSNs, virtual grid, GBDD, clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1534
6789 Ozone Decomposition over Silver-Loaded Perlite

Authors: Krassimir Genov, Vladimir Georgiev, Todor Batakliev, Dipak K. Sarker

Abstract:

The Bulgarian natural expanded mineral obtained from Bentonite AD perlite (A deposit of "The Broken Mountain" for perlite mining, near by the village of Vodenicharsko, in the municipality of Djebel), was loaded with silver (as ion form - Ag+ 2 and 5 wt% by the incipient wetness impregnation method), and as atomic silver - Ag0 using Tollen-s reagent (silver mirror reaction). Some physicochemical characterization of the samples are provided via: DC arc-AES, XRD, DR-IR and UV-VIS. The aim of this work was to obtain and test the silver-loaded catalyst for ozone decomposition. So the samples loaded with atomic silver show ca. 80% conversion of ozone 20 minutes after the reaction start. Then conversion decreases to ca. 20 % but stay stable during the prolongation of time.

Keywords: aluminum-silicates, Ag/perlite expanded glass, ozone decomposition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2244
6788 Experimental Modal Analysis and Model Validation of Antenna Structures

Authors: B.R. Potgieter, G. Venter

Abstract:

Numerical design optimization is a powerful tool that can be used by engineers during any stage of the design process. There are many different applications for structural optimization. A specific application that will be discussed in the following paper is experimental data matching. Data obtained through tests on a physical structure will be matched with data from a numerical model of that same structure. The data of interest will be the dynamic characteristics of an antenna structure focusing on the mode shapes and modal frequencies. The structure used was a scaled and simplified model of the Karoo Array Telescope-7 (KAT-7) antenna structure. This kind of data matching is a complex and difficult task. This paper discusses how optimization can assist an engineer during the process of correlating a finite element model with vibration test data.

Keywords: Finite Element Model (FEM), Karoo Array Telescope(KAT-7), modal frequencies, mode shapes, optimization, shape optimization, size optimization, vibration tests

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1825
6787 Compressed Suffix Arrays to Self-Indexes Based on Partitioned Elias-Fano

Authors: Guo Wenyu, Qu Youli

Abstract:

A practical and simple self-indexing data structure, Partitioned Elias-Fano (PEF) - Compressed Suffix Arrays (CSA), is built in linear time for the CSA based on PEF indexes. Moreover, the PEF-CSA is compared with two classical compressed indexing methods, Ferragina and Manzini implementation (FMI) and Sad-CSA on different type and size files in Pizza & Chili. The PEF-CSA performs better on the existing data in terms of the compression ratio, count, and locates time except for the evenly distributed data such as proteins data. The observations of the experiments are that the distribution of the φ is more important than the alphabet size on the compression ratio. Unevenly distributed data φ makes better compression effect, and the larger the size of the hit counts, the longer the count and locate time.

Keywords: Compressed suffix array, self-indexing, partitioned Elias-Fano, PEF-CSA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1060
6786 A Decision Matrix for the Evaluation of Triplestores for Use in a Virtual Research Environment

Authors: Tristan O’Neill, Trina Myers, Jarrod Trevathan

Abstract:

The Tropical Data Hub (TDH) is a virtual research environment that provides researchers with an e-research infrastructure to congregate significant tropical data sets for data reuse, integration, searching, and correlation. However, researchers often require data and metadata synthesis across disciplines for cross-domain analyses and knowledge discovery. A triplestore offers a semantic layer to achieve a more intelligent method of search to support the synthesis requirements by automating latent linkages in the data and metadata. Presently, the benchmarks to aid the decision of which triplestore is best suited for use in an application environment like the TDH are limited to performance. This paper describes a new evaluation tool developed to analyze both features and performance. The tool comprises a weighted decision matrix to evaluate the interoperability, functionality, performance, and support availability of a range of integrated and native triplestores to rank them according to requirements of the TDH.

Keywords: Virtual research environment, Semantic Web, performance analysis, tropical data hub.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1680
6785 Real-time Performance Study of EPA Periodic Data Transmission

Authors: Liu Ning, Zhong Chongquan, Teng Hongfei

Abstract:

EPA (Ethernet for Plant Automation) resolves the nondeterministic problem of standard Ethernet and accomplishes real-time communication by means of micro-segment topology and deterministic scheduling mechanism. This paper studies the real-time performance of EPA periodic data transmission from theoretical and experimental perspective. By analyzing information transmission characteristics and EPA deterministic scheduling mechanism, 5 indicators including delivery time, time synchronization accuracy, data-sending time offset accuracy, utilization percentage of configured timeslice and non-RTE bandwidth that can be used to specify the real-time performance of EPA periodic data transmission are presented and investigated. On this basis, the test principles and test methods of the indicators are respectively studied and some formulas for real-time performance of EPA system are derived. Furthermore, an experiment platform is developed to test the indicators of EPA periodic data transmission in a micro-segment. According to the analysis and the experiment, the methods to improve the real-time performance of EPA periodic data transmission including optimizing network structure, studying self-adaptive adjustment method of timeslice and providing data-sending time offset accuracy for configuration are proposed.

Keywords: EPA system, Industrial Ethernet, Periodic data, Real-time performance

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1453
6784 Data Quality Enhancement with String Length Distribution

Authors: Qi Xiu, Hiromu Hota, Yohsuke Ishii, Takuya Oda

Abstract:

Recently, collectable manufacturing data are rapidly increasing. On the other hand, mega recall is getting serious as a social problem. Under such circumstances, there are increasing needs for preventing mega recalls by defect analysis such as root cause analysis and abnormal detection utilizing manufacturing data. However, the time to classify strings in manufacturing data by traditional method is too long to meet requirement of quick defect analysis. Therefore, we present String Length Distribution Classification method (SLDC) to correctly classify strings in a short time. This method learns character features, especially string length distribution from Product ID, Machine ID in BOM and asset list. By applying the proposal to strings in actual manufacturing data, we verified that the classification time of strings can be reduced by 80%. As a result, it can be estimated that the requirement of quick defect analysis can be fulfilled.

Keywords: Data quality, feature selection, probability distribution, string classification, string length.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1309
6783 Author Profiling: Prediction of Learners’ Gender on a MOOC Platform Based on Learners’ Comments

Authors: Tahani Aljohani, Jialin Yu, Alexandra. I. Cristea

Abstract:

The more an educational system knows about a learner, the more personalised interaction it can provide, which leads to better learning. However, asking a learner directly is potentially disruptive, and often ignored by learners. Especially in the booming realm of MOOC Massive Online Learning platforms, only a very low percentage of users disclose demographic information about themselves. Thus, in this paper, we aim to predict learners’ demographic characteristics, by proposing an approach using linguistically motivated Deep Learning Architectures for Learner Profiling, particularly targeting gender prediction on a FutureLearn MOOC platform. Additionally, we tackle here the difficult problem of predicting the gender of learners based on their comments only – which are often available across MOOCs. The most common current approaches to text classification use the Long Short-Term Memory (LSTM) model, considering sentences as sequences. However, human language also has structures. In this research, rather than considering sentences as plain sequences, we hypothesise that higher semantic - and syntactic level sentence processing based on linguistics will render a richer representation. We thus evaluate, the traditional LSTM versus other bleeding edge models, which take into account syntactic structure, such as tree-structured LSTM, Stack-augmented Parser-Interpreter Neural Network (SPINN) and the Structure-Aware Tag Augmented model (SATA). Additionally, we explore using different word-level encoding functions. We have implemented these methods on Our MOOC dataset, which is the most performant one comparing with a public dataset on sentiment analysis that is further used as a cross-examining for the models' results.

Keywords: Deep learning, data mining, gender predication, MOOCs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1335
6782 Efficient and Extensible Data Processing Framework in Ubiquitious Sensor Networks

Authors: Junghoon Lee, Gyung-Leen Park, Ho-Young Kwak, Cheol Min Kim

Abstract:

This paper presents the design and implements the prototype of an intelligent data processing framework in ubiquitous sensor networks. Much focus is put on how to handle the sensor data stream as well as the interoperability between the low-level sensor data and application clients. Our framework first addresses systematic middleware which mitigates the interaction between the application layer and low-level sensors, for the sake of analyzing a great volume of sensor data by filtering and integrating to create value-added context information. Then, an agent-based architecture is proposed for real-time data distribution to efficiently forward a specific event to the appropriate application registered in the directory service via the open interface. The prototype implementation demonstrates that our framework can host a sophisticated application on the ubiquitous sensor network and it can autonomously evolve to new middleware, taking advantages of promising technologies such as software agents, XML, cloud computing, and the like.

Keywords: sensor network, intelligent farm, middleware, event detection

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1338
6781 Eliciting and Confirming Data, Information, Knowledge and Wisdom in a Specialist Health Care Setting: The WICKED Method

Authors: S. Impey, D. Berry, S. Furtado, M. Galvin, L. Grogan, O. Hardiman, L. Hederman, M. Heverin, V. Wade, L. Douris, D. O'Sullivan, G. Stephens

Abstract:

Healthcare is a knowledge-rich environment. This knowledge, while valuable, is not always accessible outside the borders of individual clinics. This research aims to address part of this problem (at a study site) by constructing a maximal data set (knowledge artefact) for motor neurone disease (MND). This data set is proposed as an initial knowledge base for a concurrent project to develop an MND patient data platform. It represents the domain knowledge at the study site for the duration of the research (12 months). A knowledge elicitation method was also developed from the lessons learned during this process - the WICKED method. WICKED is an anagram of the words: eliciting and confirming data, information, knowledge, wisdom. But it is also a reference to the concept of wicked problems, which are complex and challenging, as is eliciting expert knowledge. The method was evaluated at a second site, and benefits and limitations were noted. Benefits include that the method provided a systematic way to manage data, information, knowledge and wisdom (DIKW) from various sources, including healthcare specialists and existing data sets. Limitations surrounded the time required and how the data set produced only represents DIKW known during the research period. Future work is underway to address these limitations.

Keywords: Healthcare, knowledge acquisition, maximal data sets, action design science.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 460
6780 Temporally Coherent 3D Animation Reconstruction from RGB-D Video Data

Authors: Salam Khalifa, Naveed Ahmed

Abstract:

We present a new method to reconstruct a temporally coherent 3D animation from single or multi-view RGB-D video data using unbiased feature point sampling. Given RGB-D video data, in form of a 3D point cloud sequence, our method first extracts feature points using both color and depth information. In the subsequent steps, these feature points are used to match two 3D point clouds in consecutive frames independent of their resolution. Our new motion vectors based dynamic alignement method then fully reconstruct a spatio-temporally coherent 3D animation. We perform extensive quantitative validation using novel error functions to analyze the results. We show that despite the limiting factors of temporal and spatial noise associated to RGB-D data, it is possible to extract temporal coherence to faithfully reconstruct a temporally coherent 3D animation from RGB-D video data.

Keywords: 3D video, 3D animation, RGB-D video, Temporally Coherent 3D Animation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2054
6779 Application of Multi-Dimensional Principal Component Analysis to Medical Data

Authors: Naoki Yamamoto, Jun Murakami, Chiharu Okuma, Yutaro Shigeto, Satoko Saito, Takashi Izumi, Nozomi Hayashida

Abstract:

Multi-dimensional principal component analysis (PCA) is the extension of the PCA, which is used widely as the dimensionality reduction technique in multivariate data analysis, to handle multi-dimensional data. To calculate the PCA the singular value decomposition (SVD) is commonly employed by the reason of its numerical stability. The multi-dimensional PCA can be calculated by using the higher-order SVD (HOSVD), which is proposed by Lathauwer et al., similarly with the case of ordinary PCA. In this paper, we apply the multi-dimensional PCA to the multi-dimensional medical data including the functional independence measure (FIM) score, and describe the results of experimental analysis.

Keywords: multi-dimensional principal component analysis, higher-order SVD (HOSVD), functional independence measure (FIM), medical data, tensor decomposition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2478
6778 Methodology of Restoration Research in Czech Republic

Authors: M. Rehor, V. Ondracek

Abstract:

Restoration research has become important on principle recently in Czech Republic. The reason is simple. More than 70 % of mined brown coal comes from the North Bohemian Basin these days. Open cast brown coal mining has lead to large damage on the landscape. Reclamation of phytotoxic areas is one of the serious problems in the North Bohemian Basin. It mainly concerns the areas with the occurrence of overburden rocks from the coal bed enriched with coal. The presented paper includes the characteristics of the important phytotoxic areas and the methodology of their reclamation. The results are documented with the long term monitoring of physical, mineralogical, chemical and pedological parameters of rocks in the testing areas.

Keywords: Brown coal, dump, methodology, restoration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1531
6777 Procedure Model for Data-Driven Decision Support Regarding the Integration of Renewable Energies into Industrial Energy Management

Authors: M. Graus, K. Westhoff, X. Xu

Abstract:

The climate change causes a change in all aspects of society. While the expansion of renewable energies proceeds, industry could not be convinced based on general studies about the potential of demand side management to reinforce smart grid considerations in their operational business. In this article, a procedure model for a case-specific data-driven decision support for industrial energy management based on a holistic data analytics approach is presented. The model is executed on the example of the strategic decision problem, to integrate the aspect of renewable energies into industrial energy management. This question is induced due to considerations of changing the electricity contract model from a standard rate to volatile energy prices corresponding to the energy spot market which is increasingly more affected by renewable energies. The procedure model corresponds to a data analytics process consisting on a data model, analysis, simulation and optimization step. This procedure will help to quantify the potentials of sustainable production concepts based on the data from a factory. The model is validated with data from a printer in analogy to a simple production machine. The overall goal is to establish smart grid principles for industry via the transformation from knowledge-driven to data-driven decisions within manufacturing companies.

Keywords: Data analytics, green production, industrial energy management, optimization, renewable energies, simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1704
6776 Dynamic Data Partition Algorithm for a Parallel H.264 Encoder

Authors: Juntae Kim, Jaeyoung Park, Kyoungkun Lee, Jong Tae Kim

Abstract:

The H.264/AVC standard is a highly efficient video codec providing high-quality videos at low bit-rates. As employing advanced techniques, the computational complexity has been increased. The complexity brings about the major problem in the implementation of a real-time encoder and decoder. Parallelism is the one of approaches which can be implemented by multi-core system. We analyze macroblock-level parallelism which ensures the same bit rate with high concurrency of processors. In order to reduce the encoding time, dynamic data partition based on macroblock region is proposed. The data partition has the advantages in load balancing and data communication overhead. Using the data partition, the encoder obtains more than 3.59x speed-up on a four-processor system. This work can be applied to other multimedia processing applications.

Keywords: H.264/AVC, video coding, thread-level parallelism, OpenMP, multimedia

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1780
6775 XML Schema Automatic Matching Solution

Authors: Huynh Quyet Thang, Vo Sy Nam

Abstract:

Schema matching plays a key role in many different applications, such as schema integration, data integration, data warehousing, data transformation, E-commerce, peer-to-peer data management, ontology matching and integration, semantic Web, semantic query processing, etc. Manual matching is expensive and error-prone, so it is therefore important to develop techniques to automate the schema matching process. In this paper, we present a solution for XML schema automated matching problem which produces semantic mappings between corresponding schema elements of given source and target schemas. This solution contributed in solving more comprehensively and efficiently XML schema automated matching problem. Our solution based on combining linguistic similarity, data type compatibility and structural similarity of XML schema elements. After describing our solution, we present experimental results that demonstrate the effectiveness of this approach.

Keywords: XML Schema, Schema Matching, SemanticMatching, Automatic XML Schema Matching.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1811