Search results for: Stated Preference Data.

7396 Secure Data Aggregation Using Clusters in Sensor Networks

Authors: Prakash G L, Thejaswini M, S H Manjula, K R Venugopal, L M Patnaik

Abstract:

Wireless sensor network can be applied to both abominable and military environments. A primary goal in the design of wireless sensor networks is lifetime maximization, constrained by the energy capacity of batteries. One well-known method to reduce energy consumption in such networks is data aggregation. Providing efcient data aggregation while preserving data privacy is a challenging problem in wireless sensor networks research. In this paper, we present privacy-preserving data aggregation scheme for additive aggregation functions. The Cluster-based Private Data Aggregation (CPDA)leverages clustering protocol and algebraic properties of polynomials. It has the advantage of incurring less communication overhead. The goal of our work is to bridge the gap between collaborative data collection by wireless sensor networks and data privacy. We present simulation results of our schemes and compare their performance to a typical data aggregation scheme TAG, where no data privacy protection is provided. Results show the efficacy and efficiency of our schemes.

Keywords: Aggregation, Clustering, Query Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1734

7395 A New Protocol for Concealed Data Aggregation in Wireless Sensor Networks

Authors: M. Abbasi Dezfouli, S. Mazraeh, M. H. Yektaie

Abstract:

Wireless sensor networks (WSN) consists of many sensor nodes that are placed on unattended environments such as military sites in order to collect important information. Implementing a secure protocol that can prevent forwarding forged data and modifying content of aggregated data and has low delay and overhead of communication, computing and storage is very important. This paper presents a new protocol for concealed data aggregation (CDA). In this protocol, the network is divided to virtual cells, nodes within each cell produce a shared key to send and receive of concealed data with each other. Considering to data aggregation in each cell is locally and implementing a secure authentication mechanism, data aggregation delay is very low and producing false data in the network by malicious nodes is not possible. To evaluate the performance of our proposed protocol, we have presented computational models that show the performance and low overhead in our protocol.

Keywords: Wireless Sensor Networks, Security, Concealed Data Aggregation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1735

7394 IMDC: An Image-Mapped Data Clustering Technique for Large Datasets

Authors: Faruq A. Al-Omari, Nabeel I. Al-Fayoumi

Abstract:

In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First the dataset is mapped into a binary image plane. The synthesized image is then processed utilizing efficient image processing techniques to cluster the data in the dataset. Henceforth, the algorithm avoids exhaustive search to identify clusters. The algorithm considers only a small set of the data that contains critical boundary information sufficient to identify contained clusters. Compared to available data clustering techniques, the proposed algorithm produces similar quality results and outperforms them in execution time and storage requirements.

Keywords: Data clustering, Data mining, Image-mapping, Pattern discovery, Predictive analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1500

7393 The New Method of Concealed Data Aggregation in Wireless Sensor: A Case Study

Authors: M. Abbasi Dezfouli, S. Mazraeh, M. H. Yektaie

Abstract:

Wireless sensor networks (WSN) consists of many sensor nodes that are placed on unattended environments such as military sites in order to collect important information. Implementing a secure protocol that can prevent forwarding forged data and modifying content of aggregated data and has low delay and overhead of communication, computing and storage is very important. This paper presents a new protocol for concealed data aggregation (CDA). In this protocol, the network is divided to virtual cells, nodes within each cell produce a shared key to send and receive of concealed data with each other. Considering to data aggregation in each cell is locally and implementing a secure authentication mechanism, data aggregation delay is very low and producing false data in the network by malicious nodes is not possible. To evaluate the performance of our proposed protocol, we have presented computational models that show the performance and low overhead in our protocol.

Keywords: Wireless Sensor Networks, Security, Concealed Data Aggregation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1768

7392 Engagement Strategies for Stakeholder Management in New Technology Development in the Fertilizer Industry – A Conceptual Framework

Authors: Ammar Redza Ahmad Rizal, Shahrina Md Nordin, Mohd Shamsuri Saad, Kamariah Ismail

Abstract:

Communication is becoming a significant tool to engage stakeholders since half of the century ago. In the recent years, there has been rapid growth of new technology developments. In tandem with such developments, there has been growing emphasis in communication strategies and management especially in determining the level of influence and management strategies among the said stakeholders on particular field. This paper presents a research conceptual framework focusing on stakeholder theories, communication and management strategies to be implied on the engagement of stakeholders of new technology developments of fertilizer industry in Malaysia. Framework espoused in this paper will provide insights into the various stakeholder theories and engagement strategies from different principal necessary for a successful introduction of new technology development in the above stated industry. The proposed framework has theoretical significance in filling the gap of the body of knowledge in the implementation of communication strategies in Malaysian fertilizer industry.

Keywords: Communication Strategies, Fertilizer Industry, New Technology Development, Stakeholders Management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2872

7391 Peakwise Smoothing of Data Models using Wavelets

Authors: D Sudheer Reddy, N Gopal Reddy, P V Radhadevi, J Saibaba, Geeta Varadan

Abstract:

Smoothing or filtering of data is first preprocessing step for noise suppression in many applications involving data analysis. Moving average is the most popular method of smoothing the data, generalization of this led to the development of Savitzky-Golay filter. Many window smoothing methods were developed by convolving the data with different window functions for different applications; most widely used window functions are Gaussian or Kaiser. Function approximation of the data by polynomial regression or Fourier expansion or wavelet expansion also gives a smoothed data. Wavelets also smooth the data to great extent by thresholding the wavelet coefficients. Almost all smoothing methods destroys the peaks and flatten them when the support of the window is increased. In certain applications it is desirable to retain peaks while smoothing the data as much as possible. In this paper we present a methodology called as peak-wise smoothing that will smooth the data to any desired level without losing the major peak features.

Keywords: smoothing, moving average, peakwise smoothing, spatialdensity models, planar shape models, wavelets.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1750

7390 A New Precautionary Method for Measurement and Improvement the Data Quality

Authors: Seyed Mohammad Hossein Moossavizadeh, Mehran Mohsenzadeh, Nasrin Arshadi

Abstract:

the data quality is a kind of complex and unstructured concept, which is concerned by information systems managers. The reason of this attention is the high amount of Expenses for maintenance and cleaning of the inefficient data. Such a data more than its expenses of lack of quality, cause wrong statistics, analysis and decisions in organizations. Therefor the managers intend to improve the quality of their information systems' data. One of the basic subjects of quality improvement is the evaluation of the amount of it. In this paper, we present a precautionary method, which with its application the data of information systems would have a better quality. Our method would cover different dimensions of data quality; therefor it has necessary integrity. The presented method has tested on three dimensions of accuracy, value-added and believability and the results confirm the improvement and integrity of this method.

Keywords: Data quality, precaution, information system, measurement, improvement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1468

7389 An Efficient Data Mining Approach on Compressed Transactions

Authors: Jia-Yu Dai, Don-Lin Yang, Jungpin Wu, Ming-Chuan Hung

Abstract:

In an era of knowledge explosion, the growth of data increases rapidly day by day. Since data storage is a limited resource, how to reduce the data space in the process becomes a challenge issue. Data compression provides a good solution which can lower the required space. Data mining has many useful applications in recent years because it can help users discover interesting knowledge in large databases. However, existing compression algorithms are not appropriate for data mining. In [1, 2], two different approaches were proposed to compress databases and then perform the data mining process. However, they all lack the ability to decompress the data to their original state and improve the data mining performance. In this research a new approach called Mining Merged Transactions with the Quantification Table (M2TQT) was proposed to solve these problems. M2TQT uses the relationship of transactions to merge related transactions and builds a quantification table to prune the candidate itemsets which are impossible to become frequent in order to improve the performance of mining association rules. The experiments show that M2TQT performs better than existing approaches.

Keywords: Association rule, data mining, merged transaction, quantification table.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1960

7388 A Critical Review on the Development of a Theoretical Framework for Managing Environmental Impacts of Construction Project

Authors: Sami Mustafa M. E. Ahmed, Noor Amila Wan Abdullah Zawawi, Zulkipli B. Ghazali

Abstract:

Construction industry is considered as one of the main contributor of natural resources depletion, responsible for high level pollution and it is one of the attributes that pose climate changes and other environmental threats. A lot of efforts had and have been done to reduce and control these impacts. Project Environmental Management (PEM) includes the processes required to ensure that the impacts of the project execution to the surrounding environment will remain within the limits stated in legal permits. The main aim of most of researches conducted managing Environmental Impacts (EI) is to protect earth planet from pollution. Those researches are presenting four major environmental elements; Environmental Management Systems (EMS), Environmental Design (ED), Environmental Planning (EP) and Environmental Impacts Assessments (EIA). Although everything has been said about environmental management for construction projects, but almost everything remains to be said and therefore to be explored or rediscovered because incontestably, almost everything remains to be done. This paper aimed at reviewing some of what has been said about PEM. Also one of its objectives is to explore and rediscover the whole view of managing the EI problems by proposing a framework that based on the relation between these environmental researches.

Keywords: Environmental planning, sustainable design, EIA and EMS.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2419

7387 Weigh-in-Motion Data Analysis Software for Developing Traffic Data for Mechanistic Empirical Pavement Design

Authors: M. A. Hasan, M. R. Islam, R. A. Tarefder

Abstract:

Currently, there are few user friendly Weigh-in- Motion (WIM) data analysis softwares available which can produce traffic input data for the recently developed AASHTOWare pavement Mechanistic-Empirical (ME) design software. However, these softwares have only rudimentary Quality Control (QC) processes. Therefore, they cannot properly deal with erroneous WIM data. As the pavement performance is highly sensible to the quality of WIM data, it is highly recommended to use more refined QC process on raw WIM data to get a good result. This study develops a userfriendly software, which can produce traffic input for the ME design software. This software takes the raw data (Class and Weight data) collected from the WIM station and processes it with a sophisticated QC procedure. Traffic data such as traffic volume, traffic distribution, axle load spectra, etc. can be obtained from this software; which can directly be used in the ME design software.

Keywords: Weigh-in-motion, software, axle load spectra, traffic distribution, AASHTOWare.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1896

7386 Material Handling Equipment Selection using Hybrid Monte Carlo Simulation and Analytic Hierarchy Process

Authors: Amer M. Momani, Abdulaziz A. Ahmed

Abstract:

The many feasible alternatives and conflicting objectives make equipment selection in materials handling a complicated task. This paper presents utilizing Monte Carlo (MC) simulation combined with the Analytic Hierarchy Process (AHP) to evaluate and select the most appropriate Material Handling Equipment (MHE). The proposed hybrid model was built on the base of material handling equation to identify main and sub criteria critical to MHE selection. The criteria illustrate the properties of the material to be moved, characteristics of the move, and the means by which the materials will be moved. The use of MC simulation beside the AHP is very powerful where it allows the decision maker to represent his/her possible preference judgments as random variables. This will reduce the uncertainty of single point judgment at conventional AHP, and provide more confidence in the decision problem results. A small business pharmaceutical company is used as an example to illustrate the development and application of the proposed model.

Keywords: Analytic Hierarchy Process (AHP), Materialhandling equipment selection, Monte Carlo simulation, Multi-criteriadecision making

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3138

7385 A Stereo Image Processing System for Visually Impaired

Authors: G. Balakrishnan, G. Sainarayanan, R. Nagarajan, Sazali Yaacob

Abstract:

This paper presents a review on vision aided systems and proposes an approach for visual rehabilitation using stereo vision technology. The proposed system utilizes stereo vision, image processing methodology and a sonification procedure to support blind navigation. The developed system includes a wearable computer, stereo cameras as vision sensor and stereo earphones, all moulded in a helmet. The image of the scene infront of visually handicapped is captured by the vision sensors. The captured images are processed to enhance the important features in the scene in front, for navigation assistance. The image processing is designed as model of human vision by identifying the obstacles and their depth information. The processed image is mapped on to musical stereo sound for the blind-s understanding of the scene infront. The developed method has been tested in the indoor and outdoor environments and the proposed image processing methodology is found to be effective for object identification.

Keywords: Blind navigation, stereo vision, image processing, object preference, music tones.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4115

7384 Human Growth Curve Estimation through a Combination of Longitudinal and Cross-sectional Data

Authors: Sedigheh Mirzaei S., Debasis Sengupta

Abstract:

Parametric models have been quite popular for studying human growth, particularly in relation to biological parameters such as peak size velocity and age at peak size velocity. Longitudinal data are generally considered to be vital for fittinga parametric model to individual-specific data, and for studying the distribution of these biological parameters in a human population. However, cross-sectional data are easier to obtain than longitudinal data. In this paper, we present a method of combining longitudinal and cross-sectional data for the purpose of estimating the distribution of the biological parameters. We demonstrate, through simulations in the special case ofthePreece Baines model, how estimates based on longitudinal data can be improved upon by harnessing the information contained in cross-sectional data.We study the extent of improvement for different mixes of the two types of data, and finally illustrate the use of the method through data collected by the Indian Statistical Institute.

Keywords: Preece-Baines growth model, MCMC method, Mixed effect model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2139

7383 Semantic Support for Hypothesis-Based Research from Smart Environment Monitoring and Analysis Technologies

Authors: T. S. Myers, J. Trevathan

Abstract:

Improvements in the data fusion and data analysis phase of research are imperative due to the exponential growth of sensed data. Currently, there are developments in the Semantic Sensor Web community to explore efficient methods for reuse, correlation and integration of web-based data sets and live data streams. This paper describes the integration of remotely sensed data with web-available static data for use in observational hypothesis testing and the analysis phase of research. The Semantic Reef system combines semantic technologies (e.g., well-defined ontologies and logic systems) with scientific workflows to enable hypothesis-based research. A framework is presented for how the data fusion concepts from the Semantic Reef architecture map to the Smart Environment Monitoring and Analysis Technologies (SEMAT) intelligent sensor network initiative. The data collected via SEMAT and the inferred knowledge from the Semantic Reef system are ingested to the Tropical Data Hub for data discovery, reuse, curation and publication.

Keywords: Information architecture, Semantic technologies Sensor networks, Ontologies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1715

7382 Data Migration between Document-Oriented and Relational Databases

Authors: Bogdan Walek, Cyril Klimes

Abstract:

Current tools for data migration between documentoriented and relational databases have several disadvantages. We propose a new approach for data migration between documentoriented and relational databases. During data migration the relational schema of the target (relational database) is automatically created from collection of XML documents. Proposed approach is verified on data migration between document-oriented database IBM Lotus/ Notes Domino and relational database implemented in relational database management system (RDBMS) MySQL.

Keywords: data migration, database, document-oriented database, XML, relational schema

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3525

7381 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using the DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We aim at using gene data that was initially used for autism detection to find if and how accurate is this data for identification applications. Mainly our goal is to find if our data preprocessing technique yields data useful as a biometric identification tool. We experiment with using the nearest neighbor classifier to identify subjects. Results show that optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and standard deviation of 1. The classification rate is close to optimal at higher noise standard deviation reaching 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN).

Keywords: Biometrics, identity verification, genetic data, k-nearest neighbor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1120

7380 Aircraft Selection Problem Using Decision Uncertainty Distance in Fuzzy Multiple Criteria Decision Making Analysis

Authors: C. Ardil

Abstract:

Aircraft have different capabilities and specifications according to the required strategic goals and objectives in operations. With various types on the market with different aircraft characteristics, it becomes difficult to select a suitable aircraft for certain operations and requirements. The entropy weighting method (EWM) is a useful, highly consistent, and reliable method for obtaining the weights of the criteria and is worth integrating with the decision uncertainty distance (DUD) method, which is more applicable and requires less computation than other methods. An illustrative example is presented to demonstrate the validity and usability of the proposed methodology. Comparing the ranking results matches the distance-based approach, which is the technique for order preference by similarity to ideal solution (TOPSIS) method, which shows the robustness of the entropy DUD hybrid method. Validity analysis shows that the proposed hybrid multiple criteria decision-making analysis (MCDMA) methodology is quantitatively stable and reliable.

Keywords: aircraft selection, decision uncertainty distance (DUD), multiple criteria decision making analysis, MCDMA, TOPSIS

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 542

7379 Users’ Information Disclosure Determinants in Social Networking Sites: A Systematic Literature Review

Authors: Wajdan Al Malwi, Karen Renaud, Lewis Mackenzie

Abstract:

The privacy paradox describes a phenomenon whereby there is no connection between stated privacy concerns and privacy behaviours. We need to understand the underlying reasons for this paradox if we are to help users to preserve their privacy more effectively. In particular, the Social Networking System (SNS) domain offers a rich area of investigation due to the risks of unwise information disclosure decisions. Our study thus aims to untangle the complicated nature and underlying mechanisms of online privacy-related decisions in SNSs. In this paper, we report on the findings of a Systematic Literature Review (SLR) that revealed a number of factors that are likely to influence online privacy decisions. Our deductive analysis approach was informed by Communicative Privacy Management (CPM) theory. We uncovered a lack of clarity around privacy attitudes and their link to behaviours, which makes it challenging to design privacy-protecting SNS platforms and to craft legislation to ensure that users’ privacy is preserved.

Keywords: Privacy paradox, self-disclosure, privacy attitude, privacy behaviour, social networking sites.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 621

7378 Power Saving System in Green Data Center

Authors: Joon-young Jung, Dong-oh Kang, Chang-seok Bae

Abstract:

Power consumption is rapidly increased in data centers because the number of data center is increased and more the scale of data center become larger. Therefore, it is one of key research items to reduce power consumption in data center. The peak power of a typical server is around 250 watts. When a server is idle, it continues to use around 60% of the power consumed when in use, though vendors are putting effort into reducing this “idle" power load. Servers tend to work at only around a 5% to 20% utilization rate, partly because of response time concerns. An average of 10% of servers in their data centers was unused. In those reason, we propose dynamic power management system to reduce power consumption in green data center. Experiment result shows that about 55% power consumption is reduced at idle time.

Keywords: Data Center, Green IT, Management Server, Power Saving.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1628

7377 Fuzzy Numbers and MCDM Methods for Portfolio Optimization

Authors: Thi T. Nguyen, Lee N. Gordon-Brown

Abstract:

A new deployment of the multiple criteria decision making (MCDM) techniques: the Simple Additive Weighting (SAW), and the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) for portfolio allocation, is demonstrated in this paper. Rather than exclusive reference to mean and variance as in the traditional mean-variance method, the criteria used in this demonstration are the first four moments of the portfolio distribution. Each asset is evaluated based on its marginal impacts to portfolio higher moments that are characterized by trapezoidal fuzzy numbers. Then centroid-based defuzzification is applied to convert fuzzy numbers to the crisp numbers by which SAW and TOPSIS can be deployed. Experimental results suggest the similar efficiency of these MCDM approaches to selecting dominant assets for an optimal portfolio under higher moments. The proposed approaches allow investors flexibly adjust their risk preferences regarding higher moments via different schemes adapting to various (from conservative to risky) kinds of investors. The other significant advantage is that, compared to the mean-variance analysis, the portfolio weights obtained by SAW and TOPSIS are consistently well-diversified.

Keywords: Fuzzy numbers, SAW, TOPSIS, portfolio optimization, higher moments, risk management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2143

7376 Spatial Econometric Approaches for Count Data: An Overview and New Directions

Authors: Paula Simões, Isabel Natário

Abstract:

This paper reviews a number of theoretical aspects for implementing an explicit spatial perspective in econometrics for modelling non-continuous data, in general, and count data, in particular. It provides an overview of the several spatial econometric approaches that are available to model data that are collected with reference to location in space, from the classical spatial econometrics approaches to the recent developments on spatial econometrics to model count data, in a Bayesian hierarchical setting. Considerable attention is paid to the inferential framework, necessary for structural consistent spatial econometric count models, incorporating spatial lag autocorrelation, to the corresponding estimation and testing procedures for different assumptions, to the constrains and implications embedded in the various specifications in the literature. This review combines insights from the classical spatial econometrics literature as well as from hierarchical modeling and analysis of spatial data, in order to look for new possible directions on the processing of count data, in a spatial hierarchical Bayesian econometric context.

Keywords: Spatial data analysis, spatial econometrics, Bayesian hierarchical models, count data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2704

7375 Using Analytical Hierarchy Process and TOPSIS Approaches in Designing a Finite Element Analysis Automation Program

Authors: Ming Wen, Nasim Nezamoddini

Abstract:

Sophisticated numerical simulations like finite element analysis (FEA) involve a complicated process from model setup to post-processing tasks that require replication of time-consuming steps. Utilizing FEA automation program simplifies the complexity of the involved steps while minimizing human errors in analysis set up, calculations, and results processing. One of the main challenges in designing FEA automation programs is to identify user requirements and link them to possible design alternatives. This paper presents a decision-making framework to design a Python based FEA automation program for modal analysis, frequency response analysis, and random vibration fatigue (RVF) analysis procedures. Analytical hierarchy process (AHP) and technique for order preference by similarity to ideal solution (TOPSIS) are applied to evaluate design alternatives considering the feedback received from experts and program users.

Keywords: FEA, random vibration fatigue, process automation, AHP, TOPSIS, multiple-criteria decision-making, MCDM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 532

7374 MATLAB-Based Graphical User Interface (GUI) for Data Mining as a Tool for Environment Management

Authors: M. Awawdeh, A. Fedi

Abstract:

The application of data mining to environmental monitoring has become crucial for a number of tasks related to emergency management. Over recent years, many tools have been developed for decision support system (DSS) for emergency management. In this article a graphical user interface (GUI) for environmental monitoring system is presented. This interface allows accomplishing (i) data collection and observation and (ii) extraction for data mining. This tool may be the basis for future development along the line of the open source software paradigm.

Keywords: Data Mining, Environmental data, Mathematical Models, Matlab Graphical User Interface.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4741

7373 Principal Component Analysis using Singular Value Decomposition of Microarray Data

Authors: Dong Hoon Lim

Abstract:

A series of microarray experiments produces observations of differential expression for thousands of genes across multiple conditions. Principal component analysis(PCA) has been widely used in multivariate data analysis to reduce the dimensionality of the data in order to simplify subsequent analysis and allow for summarization of the data in a parsimonious manner. PCA, which can be implemented via a singular value decomposition(SVD), is useful for analysis of microarray data. For application of PCA using SVD we use the DNA microarray data for the small round blue cell tumors(SRBCT) of childhood by Khan et al.(2001). To decide the number of components which account for sufficient amount of information we draw scree plot. Biplot, a graphic display associated with PCA, reveals important features that exhibit relationship between variables and also the relationship of variables with observations.

Keywords: Principal component analysis, singular value decomposition, microarray data, SRBCT

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3250

7372 Clustering Mixed Data Using Non-normal Regression Tree for Process Monitoring

Authors: Youngji Yoo, Cheong-Sool Park, Jun Seok Kim, Young-Hak Lee, Sung-Shick Kim, Jun-Geol Baek

Abstract:

In the semiconductor manufacturing process, large amounts of data are collected from various sensors of multiple facilities. The collected data from sensors have several different characteristics due to variables such as types of products, former processes and recipes. In general, Statistical Quality Control (SQC) methods assume the normality of the data to detect out-of-control states of processes. Although the collected data have different characteristics, using the data as inputs of SQC will increase variations of data, require wide control limits, and decrease performance to detect outof- control. Therefore, it is necessary to separate similar data groups from mixed data for more accurate process control. In the paper, we propose a regression tree using split algorithm based on Pearson distribution to handle non-normal distribution in parametric method. The regression tree finds similar properties of data from different variables. The experiments using real semiconductor manufacturing process data show improved performance in fault detecting ability.

Keywords: Semiconductor, non-normal mixed process data, clustering, Statistical Quality Control (SQC), regression tree, Pearson distribution system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1780

7371 Speech Data Compression using Vector Quantization

Authors: H. B. Kekre, Tanuja K. Sarode

Abstract:

Mostly transforms are used for speech data compressions which are lossy algorithms. Such algorithms are tolerable for speech data compression since the loss in quality is not perceived by the human ear. However the vector quantization (VQ) has a potential to give more data compression maintaining the same quality. In this paper we propose speech data compression algorithm using vector quantization technique. We have used VQ algorithms LBG, KPE and FCG. The results table shows computational complexity of these three algorithms. Here we have introduced a new performance parameter Average Fractional Change in Speech Sample (AFCSS). Our FCG algorithm gives far better performance considering mean absolute error, AFCSS and complexity as compared to others.

Keywords: Vector Quantization, Data Compression, Encoding, , Speech coding.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2404

7370 Ontology and CDSS Based Intelligent Health Data Management in Health Care Server

Authors: Eun-Jung Ko, Hyung-Jik Lee, Jeun-Woo Lee

Abstract:

In ubiqutious healthcare environment, user's health data are transfered to the remote healthcare server by the user's wearable system or mobile phone. These collected user's health data should be managed and analyzed in the healthcare server, so that care giver or user can monitor user's physiological state. In this paper, we designed and developed the intelligent Healthcare Server to manage the user's health data using CDSS and ontology. Our system can analyze user's health data semantically using CDSS and ontology, and report the result of user's physiological raw data to the user and care giver.

Keywords: u-healthcare, CDSS, healthcare server, health data, ontology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2235

7369 A Genetic Algorithm for Clustering on Image Data

Authors: Qin Ding, Jim Gasvoda

Abstract:

Clustering is the process of subdividing an input data set into a desired number of subgroups so that members of the same subgroup are similar and members of different subgroups have diverse properties. Many heuristic algorithms have been applied to the clustering problem, which is known to be NP Hard. Genetic algorithms have been used in a wide variety of fields to perform clustering, however, the technique normally has a long running time in terms of input set size. This paper proposes an efficient genetic algorithm for clustering on very large data sets, especially on image data sets. The genetic algorithm uses the most time efficient techniques along with preprocessing of the input data set. We test our algorithm on both artificial and real image data sets, both of which are of large size. The experimental results show that our algorithm outperforms the k-means algorithm in terms of running time as well as the quality of the clustering.

Keywords: Clustering, data mining, genetic algorithm, image data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2053

7368 Ergonomics and Its Applicability in the Design Process in Egypt Challenges and Prospects

Authors: Mohamed Moheyeldin Mahmoud

Abstract:

Egypt suffers from a severe shortage of data and charts concerning the physical dimensions, measurements, qualities and consumer behavior. The shortage of needed information and appropriate methods has forced the Egyptian designer to use any other foreign standard when designing a product for the Egyptian consumer which has led to many problems. The urgently needed database concerning the physical specifications, measurements of the Egyptian consumers, as well as the need to support the Ergonomics given courses in many colleges and institutes with the latest technologies, is stated as the research problem. Descriptive analytical method relying on the compiling, comparing and analyzing of information and facts in order to get acceptable perceptions, ideas and considerations is the used methodology by the researcher. The research concludes that: 1. Good interaction relationship between users and products shows the success of that product. 2. An integration linkage between the most prominent fields of science specially Ergonomics, Interaction Design and Ethnography should be encouraged to provide an ultimately updated database concerning the nature, specifications and environment of the Egyptian consumer, in order to achieve a higher benefit for both user and product. 3. Chinese economic policy based on the study of market requirements long before any market activities should be emulated. 4. Using Ethnography supports the design activities creating new products or updating existent ones through measuring the compatibility of products with their environment and user expectations, While contracting a joint cooperation between military colleges, sports education institutes from one side, and design institutes from the other side to provide an ultimately updated (annually updated) database concerning some specifications about students of both sexes applying in those institutes (height, weight, etc.) to provide the Industrial designer with the needed information when creating a new product or updating an existing one concerning that category is recommended by the researcher.

Keywords: Adapt ergonomics, ethnography, interaction design.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 809

7367 A Holistic Framework for Unifying Data Security and Management in Modern Enterprises

Authors: Ashly Joseph

Abstract:

Modern businesses struggle significantly to secure and manage their data properly as the volume and complexity of their data both expand exponentially. Through the use of a multi-layered defense strategy, a centralized management platform, and cutting-edge technologies like AI, this research paper presents a comprehensive framework to integrate data security and management. The constraints of current data protection and management strategies, technological advancements, and the evolving threat landscape are all examined in this article. It suggests best practices for putting into practice integrated data security and governance models, placing an emphasis on ongoing adaptation. The advantages mentioned include a strengthened security posture, simpler procedures, lower costs, and reduced complexity. Additionally, issues including skill shortages, antiquated systems, and cultural obstacles are examined. Security executives and Chief Information Security Officers are given practical advice on how to evaluate, plan, and put into place strong data-centric security and management capabilities. The goal of the paper is to provide a thorough study of the data security and management landscape and to arm contemporary businesses with the knowledge they need to be proactive in protecting their data assets.

Keywords: Data security, security management, cloud computing, cybersecurity, data governance, security architecture, data management.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 270