Search results for: panel data analysis.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 13622

13292 Twitter Sentiment Analysis during the Lockdown on New Zealand

Authors: Smah Doeban Almotiri

Abstract:

Sentiment analysis is one of the most common fields of natural language processing (NLP). The feeling implied in a text can be successfully mined for various events using sentiment analysis. Twitter is viewed as a reliable data source for sentiment analysis studies, since people used social media to receive and exchange different types of data on a broad scale during the COVID-19 epidemic. Processing such data may aid in making critical decisions on how to keep the situation under control. The aim of this research is to look at how sentiment differed in a single geographic region during lockdown at two different times. A total of 1162 tweets related to the COVID-19 pandemic lockdown were analyzed, collected using the keyword hashtags (lockdown, COVID-19). The first sample of tweets was from March 23, 2020, until April 23, 2020, and the second sample, for the following year, was from March 1, 2021, until April 4, 2021. Natural language processing, a form of artificial intelligence, was used to calculate the sentiment value of all of the tweets with the AFINN lexicon sentiment analysis method. The findings revealed that sentiment at both times during the region's lockdown was positive in the samples of this study, which are specific to the geographical area of New Zealand. This research suggests applying a machine learning sentiment method such as Crystal Feel and extending the sample size by using more tweets over a longer period of time.

Keywords: sentiment analysis, Twitter analysis, lockdown, Covid-19, AFINN, NodeJS
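
As a rough illustration of the lexicon-based scoring the abstract describes, the sketch below sums per-word valence scores for each tweet. The miniature lexicon, the sample tweets, and the names `score_tweet` and `label` are hypothetical; the real AFINN list assigns integer valence scores between -5 and +5, and the study's own NodeJS implementation is not shown in the source.

```python
import re

# Hypothetical miniature lexicon in the AFINN style (word -> integer valence).
TOY_LEXICON = {"good": 3, "happy": 3, "safe": 1, "bad": -3, "worried": -3, "stuck": -2}

def score_tweet(text, lexicon=TOY_LEXICON):
    """Sum the valence of every lexicon word found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(w, 0) for w in words)

def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Hypothetical sample tweets, not taken from the study's data set.
tweets = [
    "Feeling safe and happy at home during lockdown",
    "Stuck inside again, worried about work",
]
scores = [score_tweet(t) for t in tweets]
labels = [label(s) for s in scores]
```

Averaging such scores over each sampling period would give one number per period to compare, which is one plausible reading of how the two lockdown samples were contrasted.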

Downloads: 585
13291 Interrelationships between Physicochemical Water Pollution Indicators: A Case Study of River Pandu

Authors: Sunita Verma, Divya Tiwari, Ajay Verma

Abstract:

Water samples were collected from the river Pandu at six stations where human and animal activities were high. Composite samples were analyzed for dissolved oxygen (DO), biochemical oxygen demand (BOD), chemical oxygen demand (COD), and pH during the dry and wet seasons as well as the harmattan period. All data points were used to establish relationships between the parameters; the data were also subjected to statistical analysis and expressed as mean ± standard error of the mean (SEM) at a significance level of p < 0.05. Regression analysis was carried out to establish relationships, if any, between the studied parameters, and scatter plots were obtained for DO/BOD, COD/DO, BOD/COD, COD/pH, BOD/pH and DO/pH. The correlation observed between these parameters was high to moderate, with R² ranging from 0.68 to 0.15.

Keywords: BOD, DO, COD, pH, Regression analysis.
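
To make the reported R² values concrete, the following minimal sketch fits a least-squares line to one pair of parameters and computes the coefficient of determination. The DO and BOD readings are invented for illustration and are not the river Pandu measurements; the function names are likewise assumptions.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(x, y):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    slope, intercept = linear_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical readings in mg/L at six stations (illustrative only).
do_values = [7.2, 6.5, 5.8, 5.1, 4.4, 3.9]
bod_values = [2.1, 3.0, 3.4, 4.6, 5.2, 6.1]

slope, intercept = linear_fit(do_values, bod_values)
r2 = r_squared(do_values, bod_values)
```

The same two functions could be applied to each of the six parameter pairs listed in the abstract to produce a table of R² values.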

Downloads: 2131
13290 A Review on the Comparison of EU Countries Based on Research and Development Efficiencies

Authors: Yeliz Ekinci, Raife Merve Ön

Abstract:

Nowadays, technological progress is one of the most important components of economic growth, and the efficiency of R&D activities is particularly essential for countries. This study is an attempt to analyze the R&D efficiencies of EU countries. The indicators related to R&D efficiency must be determined in advance in order to use data envelopment analysis (DEA). For this reason, a list of input and output indicators is derived from the literature review. Considering data availability, a final list is given for numerical analysis in future research.

Keywords: Data envelopment analysis, economic growth, EU Countries, R&D efficiency.

Downloads: 2043
13289 Data Projects for “Social Good”: Challenges and Opportunities

Authors: Mikel Niño, Roberto V. Zicari, Todor Ivanov, Kim Hee, Naveed Mushtaq, Marten Rosselli, Concha Sánchez-Ocaña, Karsten Tolle, José Miguel Blanco, Arantza Illarramendi, Jörg Besier, Harry Underwood

Abstract:

One of the application fields for data analysis techniques and technologies gaining momentum is the area of social good or “common good”, covering cases related to humanitarian crises, global health care, or ecology and environmental issues, among others. The promotion of data-driven projects in this field aims at increasing the efficacy and efficiency of social initiatives, improving the way these actions help humanity in general and people in need in particular. This application field, however, poses its own barriers and challenges when developing data-driven projects, lagging behind in comparison with other scenarios. These challenges derive from aspects such as the scope and scale of the social issue to solve, cultural and political barriers, the skills of main stakeholders and the technological resources available, the motivation to be engaged in such projects, or the ethical and legal issues related to sensitive data. This paper analyzes the application of data projects in the field of social good, reviewing its current state and noteworthy initiatives, and presenting a framework covering the key aspects to analyze in such projects. The goal is to provide guidelines to understand the main challenges and opportunities for this type of data project, as well as identifying the main differential issues compared to “classical” data projects in general. A case study is presented on the initial steps and stakeholder analysis of a data project for the inclusion of refugees in the city of Frankfurt, Germany, in order to empirically confront the framework with a real example.

Keywords: Data-Driven projects, humanitarian operations, personal and sensitive data, social good, stakeholders analysis.

Downloads: 1795
13288 Sentiment Analysis: Popularity of Candidates for the President of the United States

Authors: Radek Malinský, Ivan Jelínek

Abstract:

This article deals with the popularity of candidates for the president of the United States of America. Popularity is assessed according to public comments on Web 2.0. Social networking, blogging and online forums (collectively Web 2.0) are the easiest way for common Internet users to share their personal opinions, thoughts, and ideas with the entire world. However, the diversity of web content, the variety of technologies, and differences in website structure make Web 2.0 a network of heterogeneous data in which things are difficult for common users to find. The introductory part of the article describes a methodology for gathering and processing data from Web 2.0. The next part focuses on the evaluation and content analysis of the obtained information about presidential candidates.

Keywords: Sentiment Analysis, Web 2.0, Webometrics.

Downloads: 3233
13287 Using RASCAL and ALOHA Codes to Establish an Analysis Methodology for Hydrogen Fluoride Evaluation

Authors: J. R. Wang, Y. Chiang, W. S. Hsu, H. C. Chen, S. H. Chen, J. H. Yang, S. W. Chen, C. Shih

Abstract:

In this study, the RASCAL and ALOHA codes are used to establish an analysis methodology for hydrogen fluoride (HF) evaluation. There are three main steps in this study. First, the UF6 data were collected. Second, one postulated case was analyzed using RASCAL and the UF6 data. This postulated case assumes that a fire occurs and UF6 is released from a building. Third, the RASCAL results for HF mass were used as the input data for ALOHA. Two postulated cases of HF were analyzed using the ALOHA code and the results of RASCAL. These postulated cases assume that a fire occurs and HF is released without rain (Case 1) or with rain (Case 2). According to the ALOHA analysis results, the HF concentration in Case 2 is smaller than in Case 1. The results can serve as a reference for preparing emergency plans for the release of HF.

Keywords: RASCAL, ALOHA, UF6, hydrogen fluoride.

Downloads: 774
13286 Exploring Performance-Based Music Attributes for Stylometric Analysis

Authors: Abdellghani Bellaachia, Edward Jimenez

Abstract:

Music Information Retrieval (MIR) and modern data mining techniques are applied to identify style markers in MIDI music for stylometric analysis and author attribution. Over 100 attributes are extracted from a library of 2830 songs and then mined using supervised learning techniques. Two attributes are identified that provide high information gain. These attributes are then used as style markers to predict authorship. Using these style markers, the authors are able to correctly distinguish songs written by the Beatles from those that were not, with a precision and accuracy of over 98 per cent. The identification of these style markers, as well as the architecture for this research, provides a foundation for future research in musical stylometry.

Keywords: Music Information Retrieval, Music Data Mining, Stylometry.

Downloads: 1680
13285 Mining Multicity Urban Data for Sustainable Population Relocation

Authors: Xu Du, Aparna S. Varde

Abstract:

In this research, we propose to conduct diagnostic and predictive analysis about the key factors and consequences of urban population relocation. To achieve this goal, urban simulation models extract the urban development trends as land use change patterns from a variety of data sources. The results are treated as part of urban big data with other information such as population change and economic conditions. Multiple data mining methods are deployed on this data to analyze nonlinear relationships between parameters. The result determines the driving force of population relocation with respect to urban sprawl and urban sustainability and their related parameters. This work sets the stage for developing a comprehensive urban simulation model for catering to specific questions by targeted users. It contributes towards achieving sustainability as a whole.

Keywords: Data Mining, Environmental Modeling, Sustainability, Urban Planning.

Downloads: 1783
13284 TRACE/FRAPTRAN Analysis of Kuosheng Nuclear Power Plant Dry-Storage System

Authors: J. R. Wang, Y. Chiang, W. Y. Li, H. T. Lin, H. C. Chen, C. Shih, S. W. Chen

Abstract:

The dry-storage systems of nuclear power plants (NPPs) in Taiwan have become one of the major safety concerns. This study consists of two steps. The first step is the verification of TRACE using VSC-17 experimental data. The TRACE results were similar to the VSC-17 data, which indicates that TRACE has respectable accuracy in the simulation and analysis of dry-storage systems. The next step is the application of TRACE to the dry-storage system of Kuosheng NPP (BWR/6). Kuosheng NPP is the second BWR NPP of Taiwan Power Company. To address the storage of spent fuel, Taiwan Power Company developed a new dry-storage system for Kuosheng NPP. In this step, a dry-storage system model of Kuosheng NPP was established in TRACE. Then, a steady-state simulation of this model was performed and the TRACE results were compared with the Kuosheng NPP data. Finally, this model was used to perform the safety analysis of the Kuosheng NPP dry-storage system. In addition, FRAPTRAN was used to calculate the transient performance of the fuel rods.

Keywords: BWR, TRACE, FRAPTRAN, Dry-Storage.

Downloads: 2082
13283 Intellectual Capital and Competitive Advantage: An Analysis of the Biotechnology Industry

Authors: Campisi Domenico, Costa Roberta

Abstract:

Intellectual capital measurement is a central aspect of knowledge management. The measurement and evaluation of intangible assets play a key role in allowing effective management of these assets as sources of competitiveness. For these reasons, managers and practitioners need conceptual and analytical tools that take into account the unique characteristics and economic significance of Intellectual Capital. Following this lead, we propose an efficiency and productivity analysis of Intellectual Capital as a determinant of a company's competitive advantage. The analysis is carried out by means of Data Envelopment Analysis (DEA) and the Malmquist Productivity Index (MPI). These techniques identify Best Practice companies that have achieved competitive advantage by implementing successful Intellectual Capital management strategies, and offer inefficient companies development paths by means of benchmarking. The proposed methodology is applied to the Biotechnology industry in the period 2007-2010.

Keywords: Data Envelopment Analysis, Innovation, Intangible assets, Intellectual Capital, Malmquist Productivity Index.
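
For orientation, the Malmquist Productivity Index is commonly written in terms of distance functions estimated by DEA. The form below is the standard textbook definition and its usual decomposition into efficiency change and technical change; the abstract does not state which orientation or variant the authors use, so the notation ($D^{t}$ for the distance function relative to the period-$t$ frontier, $x$ for inputs, $y$ for outputs) is an assumption.

```latex
\[
M\left(x^{t+1}, y^{t+1}, x^{t}, y^{t}\right)
  = \sqrt{\frac{D^{t}\left(x^{t+1}, y^{t+1}\right)}{D^{t}\left(x^{t}, y^{t}\right)}
          \cdot
          \frac{D^{t+1}\left(x^{t+1}, y^{t+1}\right)}{D^{t+1}\left(x^{t}, y^{t}\right)}}
\]
\[
M = \underbrace{\frac{D^{t+1}\left(x^{t+1}, y^{t+1}\right)}{D^{t}\left(x^{t}, y^{t}\right)}}_{\text{efficiency change}}
    \times
    \underbrace{\sqrt{\frac{D^{t}\left(x^{t+1}, y^{t+1}\right)}{D^{t+1}\left(x^{t+1}, y^{t+1}\right)}
          \cdot
          \frac{D^{t}\left(x^{t}, y^{t}\right)}{D^{t+1}\left(x^{t}, y^{t}\right)}}}_{\text{technical change}}
\]
```

Under the output-oriented convention, a value above 1 indicates productivity growth between the two periods and a value below 1 indicates decline.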

Downloads: 1924
13282 An Anomaly Detection Approach to Detect Unexpected Faults in Recordings from Test Drives

Authors: Andreas Theissler, Ian Dear

Abstract:

In the automotive industry, test drives are conducted during the development of new vehicle models or as part of quality assurance of series-production vehicles. The communication on the in-vehicle network, data from external sensors, or internal data from the electronic control units is recorded by automotive data loggers during the test drives. The recordings are used for fault analysis. Since the resulting data volume is tremendous, manually analysing each recording in great detail is not feasible. This paper proposes using machine learning to support domain experts by sparing them from examining irrelevant data and instead pointing them to the relevant parts of the recordings. The underlying idea is to learn the normal behaviour from available recordings, i.e. a training set, and then to autonomously detect unexpected deviations and report them as anomalies. The one-class support vector machine “support vector data description” is utilised to calculate distances of feature vectors. SVDDSUBSEQ is proposed as a novel approach that allows subsequences in multivariate time series data to be classified. The approach allows unexpected faults to be detected without modelling effort, as is shown with experimental results on recordings from test drives.

Keywords: Anomaly detection, fault detection, test drive analysis, machine learning.

Downloads: 2477
13281 Analysis of Sequence Moves in Successful Chess Openings Using Data Mining with Association Rules

Authors: R.M.Rani

Abstract:

Chess is an indoor game that improves human confidence, concentration, planning skills and knowledge. The main objective of this paper is to help chess players improve their chess openings using data mining techniques. Budding chess players usually practice by analyzing various existing openings. When they analyze and correlate thousands of openings, the task becomes tedious and complex. This paper analyzes the best lines of the Blackmar-Diemer Gambit (BDG), which opens with White D4..., using data mining analysis. The analysis is carried out on a collection of winning games by applying association rules. The first step is to assign a variable to each different sequence of moves. In the second step, sequence association rules are generated to calculate the support and confidence factors, which help to find the best subsequences of chess moves that may lead to a winning position.

Keywords: Blackmar-Diemer Gambit (BDG), Confidence, Sequence Association Rules, Support.
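
The support and confidence factors mentioned in the abstract can be sketched for move sequences as follows. This is a simplified illustration under stated assumptions: games are represented as plain lists of moves, a rule is "this opening prefix is followed by this next move", and the four games below are invented. The paper's own variable encoding of sequences is not described in the source.

```python
def support(games, prefix):
    """Fraction of games whose opening moves start with `prefix`."""
    k = len(prefix)
    hits = sum(1 for g in games if g[:k] == prefix)
    return hits / len(games)

def confidence(games, prefix, next_move):
    """Share of the games starting with `prefix` that continue with `next_move`."""
    base = support(games, prefix)
    if base == 0:
        return 0.0
    return support(games, prefix + [next_move]) / base

# Hypothetical collection of winning games (move lists are illustrative only).
games = [
    ["d4", "d5", "e4", "dxe4", "Nc3", "Nf6", "f3"],
    ["d4", "d5", "e4", "dxe4", "Nc3", "Nf6", "f3"],
    ["d4", "d5", "e4", "dxe4", "Nc3", "e5"],
    ["d4", "d5", "e4", "e6"],
]

gambit_accepted = ["d4", "d5", "e4", "dxe4"]
sup = support(games, gambit_accepted)
conf = confidence(games, gambit_accepted + ["Nc3"], "Nf6")
```

Ranking candidate continuations by confidence, subject to a minimum support threshold, is the usual way such rules are filtered.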

Downloads: 3093
13280 The Effect of Nylon and Kevlar Stitching on the Mode I Fracture of Carbon/Epoxy Composites

Authors: Nisrin R. Abdelal, Steven L. Donaldson

Abstract:

Composite materials are widely used in the aviation industry due to their superior properties; however, they are susceptible to delamination. Through-thickness stitching is one of the techniques used to alleviate delamination. Kevlar is one of the most common stitching materials; however, it is expensive and presents stitching fabrication challenges. Therefore, this study compares the performance of Kevlar with an inexpensive and easy-to-use nylon fiber in stitching to alleviate delamination. Three laminates of unidirectional carbon fiber-epoxy composites were manufactured using the vacuum-assisted resin transfer molding process. One panel was stitched with Kevlar, one with nylon, and one was left unstitched. Mode I interlaminar fracture tests were carried out on specimens from the three composite laminates, and the results were compared. Fractographic analysis using optical and scanning electron microscopes was conducted to reveal the differences between stitching with Kevlar and nylon on the internal microstructure of the composite with respect to the interlaminar fracture toughness values.

Keywords: Carbon, delamination, Kevlar, mode I, nylon, stitching.

Downloads: 1222
13279 Turbine Trip without Bypass Analysis of Kuosheng Nuclear Power Plant Using TRACE Coupling with FRAPTRAN

Authors: J. R. Wang, H. T. Lin, H. C. Chang, W. K. Lin, W. Y. Li, C. Shih

Abstract:

This analysis of Kuosheng nuclear power plant (NPP) was performed mainly with TRACE, assisted by FRAPTRAN and FRAPCON. SNAP v2.2.1 and TRACE v5.0p3 are used to develop the Kuosheng NPP SPU TRACE model, which can simulate the turbine trip without bypass transient. From the TRACE analysis, important parameters such as dome pressure, coolant temperature and pressure can be determined. By comparing these parameters with the criteria formulated by the United States Nuclear Regulatory Commission (U.S. NRC), we can determine whether the Kuosheng nuclear power plant fails in the accident analysis. However, the status of the fuel rods cannot be determined from the TRACE data. With the information from TRACE and the burn-up analysis obtained from FRAPCON, FRAPTRAN analyzes the fuel rods in this transient in more detail. In addition, through the SNAP interface, the results can be presented as an animation. The animation merges the TRACE and FRAPTRAN data so that readers can understand them more easily. In this research, TRACE showed that the maximum dome pressure of the reactor reaches 8.32 MPa, which is lower than the acceptance limit of 9.58 MPa. Furthermore, FRAPTRAN reveals that the maximum strain is about 0.00165, which is below the criterion of 0.01. In addition, the cladding enthalpy is 52.44 cal/g, which is lower than the 170 cal/g specified by the USNRC NUREG-0800 Standard Review Plan.

Keywords: Turbine trip without bypass, Kuosheng NPP, TRACE, FRAPTRAN, SNAP animation.

Downloads: 2486
13278 Reducing SAGE Data Using Genetic Algorithms

Authors: Cheng-Hong Yang, Tsung-Mu Shih, Li-Yeh Chuang

Abstract:

Serial Analysis of Gene Expression (SAGE) is a powerful quantification technique for generating cell or tissue gene expression data. The gene expression profile of a cell or tissue in several different states is difficult for biologists to analyze because of the large number of genes typically involved. However, feature selection in machine learning can successfully reduce this problem. The method reduces the features (genes) in specific SAGE data and retains only relevant genes. In this study, we used a genetic algorithm to implement feature selection, and evaluated the classification accuracy of the selected features with the K-nearest neighbor method. In order to validate the proposed method, we used two SAGE data sets for testing. The results of this study conclusively prove that the number of features of the original SAGE data set can be significantly reduced and higher classification accuracy can be achieved.

Keywords: Serial Analysis of Gene Expression, Feature selection, Genetic Algorithm, K-nearest neighbor method.

Downloads: 1610
13277 Morphology of Parts of the Middle Benue Trough of Nigeria from Spectral Analysis of Aeromagnetic Data (Akiri Sheet 232 and Lafia Sheet 231)

Authors: B. S. Jatau, Nandom Abu

Abstract:

Structural interpretation of aeromagnetic data and Landsat imagery over the Middle Benue Trough was carried out to determine the depth to basement, delineate the basement morphology and relief, and identify the structural features within the basin. The aeromagnetic and Landsat data were subjected to various image and data enhancement and transformation routines. Results of the study revealed lineaments trending in the N-S, NE-SW, NW-SE and E-W directions, with the NE-SW trends being dominant. The depths to basement within the trough were established to be 1.8, 0.3 and 0.8 km, as shown by the spectral analysis plot. The Source Parameter Imaging (SPI) plot showed the central-south/eastern portion of the study area to be deeper than the western-south-west portion. The basement morphology of the trough was interpreted as having parallel sets of micro-basins which could be considered as grabens and horsts, in agreement with the general features interpreted by early workers.

Keywords: Morphology, Middle Benue Trough, Spectral Analysis, Source Parameter Imaging.

Downloads: 4066
13276 Straight Line Defect Detection with Feed Forward Neural Network

Authors: S. Liangwongsan, A. Oonsivilai

Abstract:

Nowadays, the hard disk is one of the most popular storage components. In the hard disk industry, a hard disk drive must pass through various complex processes and test systems. Failures occur at each step. To reduce waste from these failures, we must find their root cause. Conventional data analysis methods are not effective enough to analyze the large volume of data. In this paper, we propose the Hough method for straight line detection, which helps to detect straight line defect patterns that occur in hard disk drives. The proposed method will help to increase the speed and accuracy of failure analysis.

Keywords: Hough Transform, Failure Analysis, Media, Hard Disk Drive
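
The Hough method named in the abstract can be sketched generically: each defect point votes for every line that passes through it, parameterised as rho = x·cos(theta) + y·sin(theta), and the most-voted (theta, rho) cell marks the dominant straight line. The code below is a minimal pure-Python version under assumed inputs (a list of defect coordinates invented for illustration); it is not the authors' pipeline, and the function names are hypothetical.

```python
import math
from collections import Counter

def hough_votes(points, n_theta=180, rho_step=1.0):
    """Accumulate votes in (theta index, rho bin) space for all points."""
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(t, round(rho / rho_step))] += 1
    return votes

def strongest_line(points, n_theta=180, rho_step=1.0):
    """Return (theta in degrees, rho, vote count) of the most-voted line."""
    votes = hough_votes(points, n_theta, rho_step)
    (t, rho_bin), count = votes.most_common(1)[0]
    return math.degrees(math.pi * t / n_theta), rho_bin * rho_step, count

# Hypothetical defect map: ten defects on the vertical line x = 5, plus two stray defects.
points = [(5, y) for y in range(0, 100, 10)] + [(20, 3), (33, 47)]
result = strongest_line(points)
```

Here theta = 0 with rho = 5 corresponds to the vertical line x = 5, and the vote count shows how many defects lie on it. A production version would typically threshold the accumulator to report several lines instead of only the strongest one.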

Downloads: 2094
13275 Database Compression for Intelligent On-board Vehicle Controllers

Authors: Ágoston Winkler, Sándor Juhász, Zoltán Benedek

Abstract:

The vehicle fleet of public transportation companies is often equipped with intelligent on-board passenger information systems. A frequently used but time and labor-intensive way for keeping the on-board controllers up-to-date is the manual update using different memory cards (e.g. flash cards) or portable computers. This paper describes a compression algorithm that enables data transmission using low bandwidth wireless radio networks (e.g. GPRS) by minimizing the amount of data traffic. In typical cases it reaches a compression rate of an order of magnitude better than that of the general purpose compressors. Compressed data can be easily expanded by the low-performance controllers, too.

Keywords: Data analysis, data compression, differential encoding, run-length encoding, vehicle control.
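
The keywords name differential encoding and run-length encoding; how the paper combines them is not described in the abstract. The sketch below shows one plausible combination on an invented timetable: store differences between consecutive values, then collapse runs of equal differences. The data and function names are assumptions for illustration only.

```python
def delta_encode(values):
    """Keep the first value, then store each value as the difference from its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out

def rle_encode(values):
    """Collapse consecutive repeats into (value, run length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]

def rle_decode(runs):
    """Expand (value, run length) pairs back into a flat list."""
    return [v for v, n in runs for _ in range(n)]

# Hypothetical departure times in minutes after midnight.
departures = [360, 375, 390, 405, 420, 450, 480]
compressed = rle_encode(delta_encode(departures))
restored = delta_decode(rle_decode(compressed))
```

Both decoders need only a running sum and a loop, which fits the abstract's remark that low-performance controllers can expand the data easily.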

Downloads: 1567
13274 Data and Control Flow Analysis of VDMµ Specifications

Authors: Mubina Nazmeen, Iram Rubab

Abstract:

Formal specification languages are widely used for system specification and testing. Highly critical systems such as real-time systems, avionics, and medical systems are represented using formal specification languages. Testing based on formal specifications is mostly performed using black box approaches, and thus tests only the set of inputs and outputs of the system. A formal specification language such as VDMµ can be used for white box testing, as it provides constructs comparable to those of any other high-level programming language. In this work, we perform data and control flow analysis of VDMµ class specifications. The proposed work is discussed with an example of SavingAccount.

Keywords: VDM-SL, VDMµ, data flow graph, control flow graph, testing, formal specification.

Downloads: 4377
13273 Using Data Mining for Learning and Clustering FCM

Authors: Somayeh Alizadeh, Mehdi Ghazanfari, Mohammad Fathian

Abstract:

Fuzzy Cognitive Maps (FCMs) have successfully been applied in numerous domains to show relations between essential components. Some FCMs contain many interrelated nodes, and more nodes mean more complex system behavior and analysis. In this paper, a novel learning method is used to construct FCMs from historical data, and a new method based on data mining and the DEMATEL method is defined to reduce the number of nodes. This method clusters the nodes in an FCM based on their cause and effect behaviors.

Keywords: Clustering, Data Mining, Fuzzy Cognitive Map (FCM), Learning.

Downloads: 2016
13272 Health Assessment of Electronic Products using Mahalanobis Distance and Projection Pursuit Analysis

Authors: Sachin Kumar, Vasilis Sotiris, Michael Pecht

Abstract:

With increasing complexity in electronic systems, there is a need for system-level anomaly detection and fault isolation. Anomaly detection based on vector similarity to a training set is used in this paper through two approaches: one that preserves the original information, Mahalanobis Distance (MD), and another that compresses the data into its principal components, Projection Pursuit Analysis. These methods have been used to detect deviations in system performance from normal operation and for critical parameter isolation in multivariate environments. The study evaluates the detection capability of each approach on a set of test data with known faults against a baseline set of data representative of “healthy” systems.

Keywords: Mahalanobis distance, Principal components, Projection pursuit, Health assessment, Anomaly.
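
The Mahalanobis distance approach can be illustrated with a two-parameter example: estimate the mean and covariance from a healthy baseline, then measure how far a new observation lies from that baseline in units that account for correlation. The baseline readings and test points below are invented, and the 2x2 closed-form inverse is a simplification chosen to keep the sketch dependency-free.

```python
import math

def mean_and_cov(samples):
    """Sample mean and covariance terms (sxx, sxy, syy) of 2-D observations."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis(point, mean, cov):
    """sqrt((p - mean)^T * inverse(cov) * (p - mean)) for the 2-D case."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Closed-form 2x2 inverse: (1/det) * [[syy, -sxy], [-sxy, sxx]]
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(q, 0.0))

# Hypothetical healthy baseline: two positively correlated performance parameters.
baseline = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
mean, cov = mean_and_cov(baseline)

# Two test points at the same Euclidean distance from the mean.
d_along = mahalanobis((4.5, 4.5), mean, cov)    # follows the baseline correlation
d_against = mahalanobis((4.5, 0.5), mean, cov)  # violates the baseline correlation
```

The two test points are equally far from the mean in Euclidean terms, but the one that breaks the baseline correlation receives the larger Mahalanobis distance, which is what makes the measure useful for flagging anomalies.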

Downloads: 1681
13271 A Safety Analysis Method for Multi-Agent Systems

Authors: Ching Louis Liu, Edmund Kazmierczak, Tim Miller

Abstract:

Safety analysis for multi-agent systems is complicated by the potentially nonlinear interactions between agents. This paper proposes a method for analyzing the safety of multi-agent systems by explicitly focusing on interactions and the accident data of systems that are similar in structure and function to the system being analyzed. The method creates a Bayesian network using the accident data from similar systems. A feature of our method is that the events in accident data are labeled with HAZOP guide words. Our method uses an ontology to abstract away from the details of a multi-agent implementation. Using the ontology, our method then constructs an “Interaction Map,” a graphical representation of the patterns of interactions between agents and other artifacts. Interaction maps combined with statistical data from accidents and the HAZOP classifications of events can be converted into a Bayesian network. Bayesian networks allow designers to explore “what if” scenarios and make design trade-offs that maintain safety. We show how to use the Bayesian networks and the interaction maps to improve multi-agent system designs.

Keywords: Multi-agent system, safety analysis, safety model.

Downloads: 1087
13270 GA Based Optimal Feature Extraction Method for Functional Data Classification

Authors: Jun Wan, Zehua Chen, Yingwu Chen, Zhidong Bai

Abstract:

Classification is an interesting problem in functional data analysis (FDA), because many science and application problems end up as classification problems, such as recognition, prediction, control, decision making, management, etc. Because of the high dimension and high correlation in functional data (FD), a key problem is to extract features from FD while keeping its global characteristics, which strongly affects classification efficiency and precision. In this paper, a novel automatic method that combines a Genetic Algorithm (GA) with a classification algorithm to extract classification features is proposed. In this method, the optimal features and classification model are approached step by step via evolutionary learning. Theoretical analysis and experimental tests show that this method improves classification efficiency, precision and robustness while using fewer features, and that the dimension of the extracted classification features can be controlled.

Keywords: Classification, functional data, feature extraction, genetic algorithm, wavelet.

Downloads: 1555
13269 Simultaneous Clustering and Feature Selection Method for Gene Expression Data

Authors: T. Chandrasekhar, K. Thangavel, E. N. Sathishkumar

Abstract:

Microarrays have made it possible to simultaneously monitor the expression profiles of thousands of genes under various experimental conditions. They are used to identify the co-expressed genes in specific cells or tissues that are actively used to make proteins. This method is used to analyze gene expression, an important task in bioinformatics research. Cluster analysis of gene expression data has proved to be a useful tool for identifying co-expressed genes and biologically relevant groupings of genes and samples. In this work, the K-Means algorithm has been applied to cluster gene expression data. Further, the rough set based Quick reduct algorithm has been applied to each cluster in order to select the most similar genes having high correlation. Then the ACV measure is used to evaluate the refined clusters, and classification is used to evaluate the proposed method. The method could identify compact clusters, with the feature selection method used to select the genes.

Keywords: Clustering, Feature selection, Gene expression data, Quick reduct.

Downloads: 1967
13268 Using RASCAL Code to Analyze the Postulated UF6 Fire Accident

Authors: J. R. Wang, Y. Chiang, W. S. Hsu, S. H. Chen, J. H. Yang, S. W. Chen, C. Shih, Y. F. Chang, Y. H. Huang, B. R. Shen

Abstract:

In this research, the RASCAL code was used to simulate and analyze the postulated UF6 fire accident which may occur in the Institute of Nuclear Energy Research (INER). There are four main steps in this research. In the first step, the UF6 data of INER were collected. In the second step, the RASCAL analysis methodology and model were established using these data. Third, this RASCAL model was used to simulate and analyze the postulated UF6 fire accident. Three cases were simulated and analyzed in this step. Finally, the RASCAL analysis results were compared with the hazard levels of the chemicals. According to the comparison of the three cases, Case 3 poses the greatest danger to human health.

Keywords: RASCAL, UF6, Safety, Hydrogen fluoride.

Downloads: 871
13267 Multi-labeled Data Expressed by a Set of Labels

Authors: Tetsuya Furukawa, Masahiro Kuzunishi

Abstract:

Collected data must be organized to be utilized efficiently, and hierarchical classification of data is an efficient approach to organizing data. When data is classified into multiple categories or annotated with a set of labels, users request multi-labeled data by giving a set of labels. There are several interpretations of the data expressed by a set of labels. This paper discusses which data is expressed by a set of labels by introducing orders for sets of labels, and shows that there are four types of orders, which are characterized by whether the labels of expressed data include every label of the given set of labels within the range of the set. Two desirable properties of the orders are also discussed: data is also expressed by the higher set of labels, and different sets of labels express different data.

Keywords: Classification Hierarchies, Multi-labeled Data, Multiple Classification, Orders of Sets of Labels

Downloads 1304
13266 Effects of Video Games and Online Chat on Mathematics Performance in High School: An Approach of Multivariate Data Analysis

Authors: Lina Wu, Wenyi Lu, Ye Li

Abstract:

Taking boys who are heavy video game players and girls who are avid online chatters as emblematic of current adolescent culture, this data analysis project verifies the displacement effect on deteriorating mathematics performance. To evaluate correlation and regression coefficients between playing video games or chatting online and mathematics performance, compared with other factors, we use multivariate analysis techniques and take gender differences into account. We find that the most important reason for the negative sign of the displacement effect on mathematics performance is students’ poor academic background. The statistical analysis methods in this project could be applied to study internet users’ academic performance from high school through college.
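
As a rough illustration of the correlation and regression coefficients this abstract refers to, the sketch below computes a Pearson correlation and a one-predictor least-squares line. It is a simplified bivariate stand-in for the multivariate analysis the authors describe; the function name `pearson_and_line` and the toy numbers (weekly gaming hours against a mathematics score) are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_and_line(x, y):
    """Return (r, slope, intercept) for paired observations x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / sqrt(sxx * syy)            # Pearson correlation coefficient
    slope = sxy / sxx                    # least-squares regression coefficient
    intercept = mean_y - slope * mean_x  # fitted line passes through the means
    return r, slope, intercept


# Hypothetical toy data, for illustration only.
gaming_hours = [0, 1, 2, 3, 4]
math_scores = [90, 85, 80, 70, 75]
r, slope, intercept = pearson_and_line(gaming_hours, math_scores)
print(round(r, 3), round(slope, 3), round(intercept, 3))
```

A negative `r` and `slope` in such a fit would correspond to the "negative sign" of the displacement effect; controlling for other factors, as the abstract describes, would require a multiple regression rather than this single-predictor fit.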

Keywords: Correlation coefficients, displacement effect, gender difference, multivariate analysis technique, regression coefficients.

Downloads 2170
13265 Discovering Complex Regularities by Adaptive Self Organizing Classification

Authors: A. Faro, D. Giordano, F. Maiorana

Abstract:

Data mining uses a variety of techniques, each of which is useful for some particular task. It is important to have a deep understanding of each technique and be able to perform sophisticated analysis. In this article we describe a tool built to simulate a variation of the Kohonen network to perform unsupervised clustering and support the entire data mining process up to results visualization. A graphical representation helps the user to find a strategy to optimize classification by adding, moving or deleting a neuron in order to change the number of classes. The tool is also able to automatically suggest a strategy for optimizing the number of classes. The tool is used to classify macroeconomic data that report the most developed countries' imports and exports. It is possible to classify the countries based on their economic behaviour and use an ad hoc tool to characterize the commercial behaviour of a country in a selected class from the analysis of positive and negative features that contribute to class formation.
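
For readers unfamiliar with Kohonen networks, the following minimal sketch shows one training step of a standard one-dimensional self-organizing map: find the best-matching unit, then pull it and its neighbours toward the input. It illustrates the textbook update only, not the variation or the neuron add/move/delete strategy described in this abstract; the name `som_step`, the hard-window neighbourhood, and the parameter values are assumptions.

```python
from math import dist


def som_step(weights, x, learning_rate=0.5, radius=1):
    """Apply one Kohonen update to a 1-D chain of neurons, in place.

    weights: list of weight vectors, one per neuron (order = map position).
    x: a single input vector.
    Returns the index of the best-matching unit (BMU).
    """
    # The BMU is the neuron whose weight vector is closest to the input.
    bmu = min(range(len(weights)), key=lambda i: dist(weights[i], x))
    for i, w in enumerate(weights):
        # Hard-window neighbourhood (an assumption): neurons within
        # `radius` map positions of the BMU move toward the input.
        if abs(i - bmu) <= radius:
            weights[i] = [wi + learning_rate * (xi - wi) for wi, xi in zip(w, x)]
    return bmu


# Hypothetical toy map with three neurons in a 2-D input space.
weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
bmu = som_step(weights, [2.0, 2.0])
print(bmu, weights)
```

In practice the learning rate and radius are decayed over many passes through the data, and each neuron ends up representing one class.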

Keywords: Unsupervised classification, Kohonen networks, macroeconomics, Visual data mining, cluster interpretation.

Downloads 1563
13264 Hybrid Approach for Country’s Performance Evaluation

Authors: C. Slim

Abstract:

This paper presents an integrated model, which hybridizes data envelopment analysis (DEA) and support vector machine (SVM), to classify countries according to their efficiency and performance. This model takes into account aspects of multi-dimensional indicators, decision-making hierarchy and relativity of measurement. Starting from a set of performance indicators that is as exhaustive as possible, a process of successive aggregations has been developed to attain an overall evaluation of a country’s competitiveness.

Keywords: Artificial neural networks, support vector machine, data envelopment analysis, aggregations, indicators of performance.

Downloads 1061
13263 Evaluating Performance of an Anomaly Detection Module with Artificial Neural Network Implementation

Authors: Edward Guillén, Jhordany Rodriguez, Rafael Páez

Abstract:

Anomaly detection techniques have focused on two main components: the first is data extraction and selection, and the second is the analysis performed on the obtained data. The goal of this paper is to analyze the influence that each of these components has on system performance by evaluating detection over network scenarios with different setups. The independent variables are as follows: the number of system inputs, the way the inputs are codified and the complexity of the analysis techniques. For the analysis, several artificial neural network approaches are implemented with different numbers of layers. The obtained results show the influence that each of these variables has on system performance.

Keywords: Network Intrusion Detection, Machine learning, Artificial Neural Network.

Downloads 2078