Search results for: generic data warehousing
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 7461

Search results for: generic data warehousing

7191 Incremental Algorithm to Cluster the Categorical Data with Frequency Based Similarity Measure

Authors: S.Aranganayagi, K.Thangavel

Abstract:

Clustering categorical data is more complicated than the numerical clustering because of its special properties. Scalability and memory constraint is the challenging problem in clustering large data set. This paper presents an incremental algorithm to cluster the categorical data. Frequencies of attribute values contribute much in clustering similar categorical objects. In this paper we propose new similarity measures based on the frequencies of attribute values and its cardinalities. The proposed measures and the algorithm are experimented with the data sets from UCI data repository. Results prove that the proposed method generates better clusters than the existing one.

Keywords: Clustering, Categorical, Incremental, Frequency, Domain

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1783
7190 Multimedia Data Fusion for Event Detection in Twitter by Using Dempster-Shafer Evidence Theory

Authors: Samar M. Alqhtani, Suhuai Luo, Brian Regan

Abstract:

Data fusion technology can be the best way to extract useful information from multiple sources of data. It has been widely applied in various applications. This paper presents a data fusion approach in multimedia data for event detection in twitter by using Dempster-Shafer evidence theory. The methodology applies a mining algorithm to detect the event. There are two types of data in the fusion. The first is features extracted from text by using the bag-ofwords method which is calculated using the term frequency-inverse document frequency (TF-IDF). The second is the visual features extracted by applying scale-invariant feature transform (SIFT). The Dempster - Shafer theory of evidence is applied in order to fuse the information from these two sources. Our experiments have indicated that comparing to the approaches using individual data source, the proposed data fusion approach can increase the prediction accuracy for event detection. The experimental result showed that the proposed method achieved a high accuracy of 0.97, comparing with 0.93 with texts only, and 0.86 with images only.

Keywords: Data fusion, Dempster-Shafer theory, data mining, event detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1751
7189 Towards Model-Driven Communications

Authors: Antonio Natali, Ambra Molesini

Abstract:

In modern distributed software systems, the issue of communication among composing parts represents a critical point, but the idea of extending conventional programming languages with general purpose communication constructs seems difficult to realize. As a consequence, there is a (growing) gap between the abstraction level required by distributed applications and the concepts provided by platforms that enable communication. This work intends to discuss how the Model Driven Software Development approach can be considered as a mature technology to generate in automatic way the schematic part of applications related to communication, by providing at the same time high level specialized languages useful in all the phases of software production. To achieve the goal, a stack of languages (meta-meta¬models) has been introduced in order to describe – at different levels of abstraction – the collaborative behavior of generic entities in terms of communication actions related to a taxonomy of messages. Finally, the generation of platforms for communication is viewed as a form of specification of language semantics, that provides executable models of applications together with model-checking supports and effective runtime environments.

Keywords: Interactions, specific languages, meta-models, model driven development.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1810
7188 Model Order Reduction for Frequency Response and Effect of Order of Method for Matching Condition

Authors: Aref Ghafouri, Mohammad Javad Mollakazemi, Farhad Asadi

Abstract:

In this paper, model order reduction method is used for approximation in linear and nonlinearity aspects in some experimental data. This method can be used for obtaining offline reduced model for approximation of experimental data and can produce and follow the data and order of system and also it can match to experimental data in some frequency ratios. In this study, the method is compared in different experimental data and influence of choosing of order of the model reduction for obtaining the best and sufficient matching condition for following the data is investigated in format of imaginary and reality part of the frequency response curve and finally the effect and important parameter of number of order reduction in nonlinear experimental data is explained further.

Keywords: Frequency response, Order of model reduction, frequency matching condition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2013
7187 Building a Scalable Telemetry Based Multiclass Predictive Maintenance Model in R

Authors: Jaya Mathew

Abstract:

Many organizations are faced with the challenge of how to analyze and build Machine Learning models using their sensitive telemetry data. In this paper, we discuss how users can leverage the power of R without having to move their big data around as well as a cloud based solution for organizations willing to host their data in the cloud. By using ScaleR technology to benefit from parallelization and remote computing or R Services on premise or in the cloud, users can leverage the power of R at scale without having to move their data around.

Keywords: Predictive maintenance, machine learning, big data, cloud, on premise SQL, R.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1871
7186 Big Data Strategy for Telco: Network Transformation

Authors: F. Amin, S. Feizi

Abstract:

Big data has the potential to improve the quality of services; enable infrastructure that businesses depend on to adapt continually and efficiently; improve the performance of employees; help organizations better understand customers; and reduce liability risks. Analytics and marketing models of fixed and mobile operators are falling short in combating churn and declining revenue per user. Big Data presents new method to reverse the way and improve profitability. The benefits of Big Data and next-generation network, however, are more exorbitant than improved customer relationship management. Next generation of networks are in a prime position to monetize rich supplies of customer information—while being mindful of legal and privacy issues. As data assets are transformed into new revenue streams will become integral to high performance.

Keywords: Big Data, Next Generation Networks, Network Transformation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2469
7185 Voltage Sag Characteristics during Symmetrical and Asymmetrical Faults

Authors: Ioannis Binas, Marios Moschakis

Abstract:

Electrical faults in transmission and distribution networks can have great impact on the electrical equipment used. Fault effects depend on the characteristics of the fault as well as the network itself. It is important to anticipate the network’s behavior during faults when planning a new equipment installation, as well as troubleshooting. Moreover, working backwards, we could be able to estimate the characteristics of the fault when checking the perceived effects. Different transformer winding connections dominantly used in the Greek power transfer and distribution networks and the effects of 1-phase to neutral, phase-to-phase, 2-phases to neutral and 3-phase faults on different locations of the network were simulated in order to present voltage sag characteristics. The study was performed on a generic network with three steps down transformers on two voltage level buses (one 150 kV/20 kV transformer and two 20 kV/0.4 kV). We found that during faults, there are significant changes both on voltage magnitudes and on phase angles. The simulations and short-circuit analysis were performed using the PSCAD simulation package. This paper presents voltage characteristics calculated for the simulated network, with different approaches on the transformer winding connections during symmetrical and asymmetrical faults on various locations.

Keywords: Phase angle shift, power quality, transformer winding connections, voltage sag propagation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 752
7184 Using Perspective Schemata to Model the ETL Process

Authors: Valeria M. Pequeno, Joao Carlos G. M. Pires

Abstract:

Data Warehouses (DWs) are repositories which contain the unified history of an enterprise for decision support. The data must be Extracted from information sources, Transformed and integrated to be Loaded (ETL) into the DW, using ETL tools. These tools focus on data movement, where the models are only used as a means to this aim. Under a conceptual viewpoint, the authors want to innovate the ETL process in two ways: 1) to make clear compatibility between models in a declarative fashion, using correspondence assertions and 2) to identify the instances of different sources that represent the same entity in the real-world. This paper presents the overview of the proposed framework to model the ETL process, which is based on the use of a reference model and perspective schemata. This approach provides the designer with a better understanding of the semantic associated with the ETL process.

Keywords: conceptual data model, correspondence assertions, data warehouse, data integration, ETL process, object relational database.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1462
7183 Collaborative Education Practice in a Data Structure E-Learning Course

Authors: Gang Chen, Ruimin Shen

Abstract:

This paper presented a collaborative education model, which consists four parts: collaborative teaching, collaborative working, collaborative training and interaction. Supported by an e-learning platform, collaborative education was practiced in a data structure e-learning course. Data collected shows that most of students accept collaborative education. This paper goes one step attempting to determine which aspects appear to be most important or helpful in collaborative education.

Keywords: Collaborative work, education, data structures.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1637
7182 An Algebra for Protein Structure Data

Authors: Yanchao Wang, Rajshekhar Sunderraman

Abstract:

This paper presents an algebraic approach to optimize queries in domain-specific database management system for protein structure data. The approach involves the introduction of several protein structure specific algebraic operators to query the complex data stored in an object-oriented database system. The Protein Algebra provides an extensible set of high-level Genomic Data Types and Protein Data Types along with a comprehensive collection of appropriate genomic and protein functions. The paper also presents a query translator that converts high-level query specifications in algebra into low-level query specifications in Protein-QL, a query language designed to query protein structure data. The query transformation process uses a Protein Ontology that serves the purpose of a dictionary.

Keywords: Domain-Specific Data Management, Protein Algebra, Protein Ontology, Protein Structure Data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1489
7181 Meta-Learning for Hierarchical Classification and Applications in Bioinformatics

Authors: Fabio Fabris, Alex A. Freitas

Abstract:

Hierarchical classification is a special type of classification task where the class labels are organised into a hierarchy, with more generic class labels being ancestors of more specific ones. Meta-learning for classification-algorithm recommendation consists of recommending to the user a classification algorithm, from a pool of candidate algorithms, for a dataset, based on the past performance of the candidate algorithms in other datasets. Meta-learning is normally used in conventional, non-hierarchical classification. By contrast, this paper proposes a meta-learning approach for more challenging task of hierarchical classification, and evaluates it in a large number of bioinformatics datasets. Hierarchical classification is especially relevant for bioinformatics problems, as protein and gene functions tend to be organised into a hierarchy of class labels. This work proposes meta-learning approach for recommending the best hierarchical classification algorithm to a hierarchical classification dataset. This work’s contributions are: 1) proposing an algorithm for splitting hierarchical datasets into new datasets to increase the number of meta-instances, 2) proposing meta-features for hierarchical classification, and 3) interpreting decision-tree meta-models for hierarchical classification algorithm recommendation.

Keywords: Algorithm recommendation, meta-learning, bioinformatics, hierarchical classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1294
7180 Learning through Shared Procedures -A Case of Using Technology to Bridge the Gap between Theory and Practice in Officer Education

Authors: O. Boe, S-T. Kristiansen, R. Wold

Abstract:

In this article we explore how computer assisted exercises may allow for bridging the traditional gap between theory and practice in professional education. To educate officers able to master the complexity of the battlefield the Norwegian Military Academy needs to develop a learning environment that allows for creating viable connections between the educational environment and the field of practice. In response to this challenge we explore the conditions necessary to make computer assisted training systems (CATS) a useful tool to create structural similarities between an educational context and the field of military practice. Although, CATS may facilitate work procedures close to real life situations, this case do demonstrate how professional competence also must build on viable learning theories and environments. This paper explores the conditions that allow for using simulators to facilitate professional competence from within an educational setting. We develop a generic didactic model that ascribes learning to participation in iterative cycles of action and reflection. The development of this model is motivated by the need to develop an interdisciplinary professional education rooted in the pattern of military practice.

Keywords: Development in higher education, experiential learning, professional education, simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1150
7179 A Combined Cipher Text Policy Attribute-Based Encryption and Timed-Release Encryption Method for Securing Medical Data in Cloud

Authors: G. Shruthi, Purohit Shrinivasacharya

Abstract:

The biggest problem in cloud is securing an outsourcing data. A cloud environment cannot be considered to be trusted. It becomes more challenging when outsourced data sources are managed by multiple outsourcers with different access rights. Several methods have been proposed to protect data confidentiality against the cloud service provider to support fine-grained data access control. We propose a method with combined Cipher Text Policy Attribute-based Encryption (CP-ABE) and Timed-release encryption (TRE) secure method to control medical data storage in public cloud.

Keywords: Attribute, encryption, security, trapdoor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 691
7178 Data Mining Classification Methods Applied in Drug Design

Authors: Mária Stachová, Lukáš Sobíšek

Abstract:

Data mining incorporates a group of statistical methods used to analyze a set of information, or a data set. It operates with models and algorithms, which are powerful tools with the great potential. They can help people to understand the patterns in certain chunk of information so it is obvious that the data mining tools have a wide area of applications. For example in the theoretical chemistry data mining tools can be used to predict moleculeproperties or improve computer-assisted drug design. Classification analysis is one of the major data mining methodologies. The aim of thecontribution is to create a classification model, which would be able to deal with a huge data set with high accuracy. For this purpose logistic regression, Bayesian logistic regression and random forest models were built using R software. TheBayesian logistic regression in Latent GOLD software was created as well. These classification methods belong to supervised learning methods. It was necessary to reduce data matrix dimension before construct models and thus the factor analysis (FA) was used. Those models were applied to predict the biological activity of molecules, potential new drug candidates.

Keywords: data mining, classification, drug design, QSAR

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2798
7177 EPR Hiding in Medical Images for Telemedicine

Authors: K. A. Navas, S. Archana Thampy, M. Sasikumar

Abstract:

Medical image data hiding has strict constrains such as high imperceptibility, high capacity and high robustness. Achieving these three requirements simultaneously is highly cumbersome. Some works have been reported in the literature on data hiding, watermarking and stegnography which are suitable for telemedicine applications. None is reliable in all aspects. Electronic Patient Report (EPR) data hiding for telemedicine demand it blind and reversible. This paper proposes a novel approach to blind reversible data hiding based on integer wavelet transform. Experimental results shows that this scheme outperforms the prior arts in terms of zero BER (Bit Error Rate), higher PSNR (Peak Signal to Noise Ratio), and large EPR data embedding capacity with WPSNR (Weighted Peak Signal to Noise Ratio) around 53 dB, compared with the existing reversible data hiding schemes.

Keywords: Biomedical imaging, Data security, Datacommunication, Teleconferencing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2696
7176 A Robust Method for Encrypted Data Hiding Technique Based on Neighborhood Pixels Information

Authors: Ali Shariq Imran, M. Younus Javed, Naveed Sarfraz Khattak

Abstract:

This paper presents a novel method for data hiding based on neighborhood pixels information to calculate the number of bits that can be used for substitution and modified Least Significant Bits technique for data embedding. The modified solution is independent of the nature of the data to be hidden and gives correct results along with un-noticeable image degradation. The technique, to find the number of bits that can be used for data hiding, uses the green component of the image as it is less sensitive to human eye and thus it is totally impossible for human eye to predict whether the image is encrypted or not. The application further encrypts the data using a custom designed algorithm before embedding bits into image for further security. The overall process consists of three main modules namely embedding, encryption and extraction cm.

Keywords: Data hiding, image processing, information security, stagonography.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2301
7175 Unsupervised Outlier Detection in Streaming Data Using Weighted Clustering

Authors: Yogita, Durga Toshniwal

Abstract:

Outlier detection in streaming data is very challenging because streaming data cannot be scanned multiple times and also new concepts may keep evolving. Irrelevant attributes can be termed as noisy attributes and such attributes further magnify the challenge of working with data streams. In this paper, we propose an unsupervised outlier detection scheme for streaming data. This scheme is based on clustering as clustering is an unsupervised data mining task and it does not require labeled data, both density based and partitioning clustering are combined for outlier detection. In this scheme partitioning clustering is also used to assign weights to attributes depending upon their respective relevance and weights are adaptive. Weighted attributes are helpful to reduce or remove the effect of noisy attributes. Keeping in view the challenges of streaming data, the proposed scheme is incremental and adaptive to concept evolution. Experimental results on synthetic and real world data sets show that our proposed approach outperforms other existing approach (CORM) in terms of outlier detection rate, false alarm rate, and increasing percentages of outliers.

Keywords: Concept Evolution, Irrelevant Attributes, Streaming Data, Unsupervised Outlier Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2586
7174 The Effect of Measurement Distribution on System Identification and Detection of Behavior of Nonlinearities of Data

Authors: Mohammad Javad Mollakazemi, Farhad Asadi, Aref Ghafouri

Abstract:

In this paper, we considered and applied parametric modeling for some experimental data of dynamical system. In this study, we investigated the different distribution of output measurement from some dynamical systems. Also, with variance processing in experimental data we obtained the region of nonlinearity in experimental data and then identification of output section is applied in different situation and data distribution. Finally, the effect of the spanning the measurement such as variance to identification and limitation of this approach is explained.

Keywords: Gaussian process, Nonlinearity distribution, Particle filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1665
7173 Exponentially Weighted Simultaneous Estimation of Several Quantiles

Authors: Valeriy Naumov, Olli Martikainen

Abstract:

In this paper we propose new method for simultaneous generating multiple quantiles corresponding to given probability levels from data streams and massive data sets. This method provides a basis for development of single-pass low-storage quantile estimation algorithms, which differ in complexity, storage requirement and accuracy. We demonstrate that such algorithms may perform well even for heavy-tailed data.

Keywords: Quantile estimation, data stream, heavy-taileddistribution, tail index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1484
7172 CyberSecurity Malaysia: Towards Becoming a National Certification Body for Information Security Management Systems Internal Auditors

Authors: M. S. Razana, Z. W. Shafiuddin

Abstract:

Internal auditing is one of the most important activities for organizations that implement information security management systems (ISMS). The purpose of internal audits is to ensure the ISMS implementation is in accordance to the ISO/IEC 27001 standard and the organization’s own requirements for its ISMS. Competent internal auditors are the main element that contributes to the effectiveness of internal auditing activities. To realize this need, CyberSecurity Malaysia is now in the process of becoming a certification body that certifies ISMS internal auditors. The certification scheme will assess the competence of internal auditors in generic knowledge and skills in management systems, and also in ISMS-specific knowledge and skills. The certification assessment is based on the ISO/IEC 19011 Guidelines for auditing management systems, ISO/IEC 27007 Guidelines for information security management systems auditing and ISO/IEC 27001 Information security management systems requirements. The certification scheme complies with the ISO/IEC 17024 General requirements for bodies operating certification systems of persons. Candidates who pass the exam will be certified as an ISMS Internal Auditor, whose competency will be evaluated every three years.

Keywords: ISMS internal audit, ISMS internal auditor, ISO/IEC 17024, Competence, Certification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1777
7171 Enhanced Data Access Control of Cooperative Environment used for DMU Based Design

Authors: Wei Lifan, Zhang Huaiyu, Yang Yunbin, Li Jia

Abstract:

Through the analysis of the process digital design based on digital mockup, the fact indicates that a distributed cooperative supporting environment is the foundation conditions to adopt design approach based on DMU. Data access authorization is concerned firstly because the value and sensitivity of the data for the enterprise. The access control for administrators is often rather weak other than business user. So authors established an enhanced system to avoid the administrators accessing the engineering data by potential approach and without authorization. Thus the data security is improved.

Keywords: access control, DMU, PLM, virtual prototype.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1429
7170 Pattern Recognition Using Feature Based Die-Map Clusteringin the Semiconductor Manufacturing Process

Authors: Seung Hwan Park, Cheng-Sool Park, Jun Seok Kim, Youngji Yoo, Daewoong An, Jun-Geol Baek

Abstract:

Depending on the big data analysis becomes important, yield prediction using data from the semiconductor process is essential. In general, yield prediction and analysis of the causes of the failure are closely related. The purpose of this study is to analyze pattern affects the final test results using a die map based clustering. Many researches have been conducted using die data from the semiconductor test process. However, analysis has limitation as the test data is less directly related to the final test results. Therefore, this study proposes a framework for analysis through clustering using more detailed data than existing die data. This study consists of three phases. In the first phase, die map is created through fail bit data in each sub-area of die. In the second phase, clustering using map data is performed. And the third stage is to find patterns that affect final test result. Finally, the proposed three steps are applied to actual industrial data and experimental results showed the potential field application.

Keywords: Die-Map Clustering, Feature Extraction, Pattern Recognition, Semiconductor Manufacturing Process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3096
7169 Laboratory Experimentation for Supporting Collaborative Working in Engineering Education over the Internet

Authors: S. Odeh, E. Abdelghani

Abstract:

Collaborative working environments for distance education can be considered as a more generic form of contemporary remote labs. At present, the majority of existing real laboratories are not constructed to allow the involved participants to collaborate in real time. To make this revolutionary learning environment possible we must allow the different users to carry out an experiment simultaneously. In recent times, multi-user environments are successfully applied in many applications such as air traffic control systems, team-oriented military systems, chat-text tools, multi-player games etc. Thus, understanding the ideas and techniques behind these systems could be of great importance in the contribution of ideas to our e-learning environment for collaborative working. In this investigation, collaborative working environments from theoretical and practical perspectives are considered in order to build an effective collaborative real laboratory, which allows two students or more to conduct remote experiments at the same time as a team. In order to achieve this goal, we have implemented distributed system architecture, enabling students to obtain an automated help by either a human tutor or a rule-based e-tutor.

Keywords: Collaboration environment, e-tutor, multi-user environments, socio-technical system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1435
7168 Speed Characteristics of Mixed Traffic Flow on Urban Arterials

Authors: Ashish Dhamaniya, Satish Chandra

Abstract:

Speed and traffic volume data are collected on different sections of four lane and six lane roads in three metropolitan cities in India. Speed data are analyzed to fit the statistical distribution to individual vehicle speed data and all vehicles speed data. It is noted that speed data of individual vehicle generally follows a normal distribution but speed data of all vehicle combined at a section of urban road may or may not follow the normal distribution depending upon the composition of traffic stream. A new term Speed Spread Ratio (SSR) is introduced in this paper which is the ratio of difference in 85th and 50th percentile speed to the difference in 50th and 15th percentile speed. If SSR is unity then speed data are truly normally distributed. It is noted that on six lane urban roads, speed data follow a normal distribution only when SSR is in the range of 0.86 – 1.11. The range of SSR is validated on four lane roads also.

Keywords: Normal distribution, percentile speed, speed spread ratio, traffic volume.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4194
7167 A Comparative Study between Discrete Wavelet Transform and Maximal Overlap Discrete Wavelet Transform for Testing Stationarity

Authors: Amel Abdoullah Ahmed Dghais, Mohd Tahir Ismail

Abstract:

In this paper the core objective is to apply discrete wavelet transform and maximal overlap discrete wavelet transform functions namely Haar, Daubechies2, Symmlet4, Coiflet2 and discrete approximation of the Meyer wavelets in non stationary financial time series data from Dow Jones index (DJIA30) of US stock market. The data consists of 2048 daily data of closing index from December 17, 2004 to October 23, 2012. Unit root test affirms that the data is non stationary in the level. A comparison between the results to transform non stationary data to stationary data using aforesaid transforms is given which clearly shows that the decomposition stock market index by discrete wavelet transform is better than maximal overlap discrete wavelet transform for original data.

Keywords: Discrete wavelet transform, maximal overlap discrete wavelet transform, stationarity, autocorrelation function.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4671
7166 Comparative Study of Transformed and Concealed Data in Experimental Designs and Analyses

Authors: K. Chinda, P. Luangpaiboon

Abstract:

This paper presents the comparative study of coded data methods for finding the benefit of concealing the natural data which is the mercantile secret. Influential parameters of the number of replicates (rep), treatment effects (τ) and standard deviation (σ) against the efficiency of each transformation method are investigated. The experimental data are generated via computer simulations under the specified condition of the process with the completely randomized design (CRD). Three ways of data transformation consist of Box-Cox, arcsine and logit methods. The difference values of F statistic between coded data and natural data (Fc-Fn) and hypothesis testing results were determined. The experimental results indicate that the Box-Cox results are significantly different from natural data in cases of smaller levels of replicates and seem to be improper when the parameter of minus lambda has been assigned. On the other hand, arcsine and logit transformations are more robust and obviously, provide more precise numerical results. In addition, the alternate ways to select the lambda in the power transformation are also offered to achieve much more appropriate outcomes.

Keywords: Experimental Designs, Box-Cox, Arcsine, Logit Transformations.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1578
7165 Design of a Low Cost Motion Data Acquisition Setup for Mechatronic Systems

Authors: Barış Can Yalçın

Abstract:

Motion sensors have been commonly used as a valuable component in mechatronic systems, however, many mechatronic designs and applications that need motion sensors cost enormous amount of money, especially high-tech systems. Design of a software for communication protocol between data acquisition card and motion sensor is another issue that has to be solved. This study presents how to design a low cost motion data acquisition setup consisting of MPU 6050 motion sensor (gyro and accelerometer in 3 axes) and Arduino Mega2560 microcontroller. Design parameters are calibration of the sensor, identification and communication between sensor and data acquisition card, interpretation of data collected by the sensor.

Keywords: Calibration of sensors, data acquisition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4292
7164 Conceptual Multidimensional Model

Authors: Manpreet Singh, Parvinder Singh, Suman

Abstract:

The data is available in abundance in any business organization. It includes the records for finance, maintenance, inventory, progress reports etc. As the time progresses, the data keep on accumulating and the challenge is to extract the information from this data bank. Knowledge discovery from these large and complex databases is the key problem of this era. Data mining and machine learning techniques are needed which can scale to the size of the problems and can be customized to the application of business. For the development of accurate and required information for particular problem, business analyst needs to develop multidimensional models which give the reliable information so that they can take right decision for particular problem. If the multidimensional model does not possess the advance features, the accuracy cannot be expected. The present work involves the development of a Multidimensional data model incorporating advance features. The criterion of computation is based on the data precision and to include slowly change time dimension. The final results are displayed in graphical form.

Keywords: Multidimensional, data precision.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1408
7163 Real Time Approach for Data Placement in Wireless Sensor Networks

Authors: Sanjeev Gupta, Mayank Dave

Abstract:

The issue of real-time and reliable report delivery is extremely important for taking effective decision in a real world mission critical Wireless Sensor Network (WSN) based application. The sensor data behaves differently in many ways from the data in traditional databases. WSNs need a mechanism to register, process queries, and disseminate data. In this paper we propose an architectural framework for data placement and management. We propose a reliable and real time approach for data placement and achieving data integrity using self organized sensor clusters. Instead of storing information in individual cluster heads as suggested in some protocols, in our architecture we suggest storing of information of all clusters within a cell in the corresponding base station. For data dissemination and action in the wireless sensor network we propose to use Action and Relay Stations (ARS). To reduce average energy dissipation of sensor nodes, the data is sent to the nearest ARS rather than base station. We have designed our architecture in such a way so as to achieve greater energy savings, enhanced availability and reliability.

Keywords: Cluster head, data reliability, real time communication, wireless sensor networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1770
7162 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in studies of populations are treated by statistical methods, although they are robust, they are not sufficient in themselves to harness the potential wealth of data. For that you used in step two learning algorithms: the Decision Trees and Support Vector Machine (SVM). These supervised classification methods are used to make the diagnosis of thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: A classifier, Algorithms decision tree, knowledge extraction, Support Vector Machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1826