Search results for: data analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 13352

Search results for: data analysis

13172 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2418
13171 The Comparison of Anchor and Star Schema from a Query Performance Perspective

Authors: Radek Němec

Abstract:

Today's business environment requires that companies have access to highly relevant information in a matter of seconds. Modern Business Intelligence tools rely on data structured mostly in traditional dimensional database schemas, typically represented by star schemas. Dimensional modeling is already recognized as a leading industry standard in the field of data warehousing although several drawbacks and pitfalls were reported. This paper focuses on the analysis of another data warehouse modeling technique - the anchor modeling, and its characteristics in context with the standardized dimensional modeling technique from a query performance perspective. The results of the analysis show information about performance of queries executed on database schemas structured according to principles of each database modeling technique.

Keywords: Data warehousing, anchor modeling, star schema, anchor schema, query performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3270
13170 Twitter Sentiment Analysis during the Lockdown on New Zealand

Authors: Smah Doeban Almotiri

Abstract:

One of the most common fields of natural language processing (NLP) is sentimental analysis. The inferred feeling in the text can be successfully mined for various events using sentiment analysis. Twitter is viewed as a reliable data point for sentimental analytics studies since people are using social media to receive and exchange different types of data on a broad scale during the COVID-19 epidemic. The processing of such data may aid in making critical decisions on how to keep the situation under control. The aim of this research is to look at how sentimental states differed in a single geographic region during the lockdown at two different times.1162 tweets were analyzed related to the COVID-19 pandemic lockdown using keywords hashtags (lockdown, COVID-19) for the first sample tweets were from March 23, 2020, until April 23, 2020, and the second sample for the following year was from March 1, 2021, until April 4, 2021. Natural language processing (NLP), which is a form of Artificial intelligent was used for this research to calculate the sentiment value of all of the tweets by using AFINN Lexicon sentiment analysis method. The findings revealed that the sentimental condition in both different times during the region's lockdown was positive in the samples of this study, which are unique to the specific geographical area of New Zealand. This research suggests applied machine learning sentimental method such as Crystal Feel and extended the size of the sample tweet by using multiple tweets over a longer period of time.

Keywords: sentiment analysis, Twitter analysis, lockdown, Covid-19, AFINN, NodeJS

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 459
13169 Development of Energy Benchmarks Using Mandatory Energy and Emissions Reporting Data: Ontario Post-Secondary Residences

Authors: C. Xavier Mendieta, J. J McArthur

Abstract:

Governments are playing an increasingly active role in reducing carbon emissions, and a key strategy has been the introduction of mandatory energy disclosure policies. These policies have resulted in a significant amount of publicly available data, providing researchers with a unique opportunity to develop location-specific energy and carbon emission benchmarks from this data set, which can then be used to develop building archetypes and used to inform urban energy models. This study presents the development of such a benchmark using the public reporting data. The data from Ontario’s Ministry of Energy for Post-Secondary Educational Institutions are being used to develop a series of building archetype dynamic building loads and energy benchmarks to fill a gap in the currently available building database. This paper presents the development of a benchmark for college and university residences within ASHRAE climate zone 6 areas in Ontario using the mandatory disclosure energy and greenhouse gas emissions data. The methodology presented includes data cleaning, statistical analysis, and benchmark development, and lessons learned from this investigation are presented and discussed to inform the development of future energy benchmarks from this larger data set. The key findings from this initial benchmarking study are: (1) the importance of careful data screening and outlier identification to develop a valid dataset; (2) the key features used to develop a model of the data are building age, size, and occupancy schedules and these can be used to estimate energy consumption; and (3) policy changes affecting the primary energy generation significantly affected greenhouse gas emissions, and consideration of these factors was critical to evaluate the validity of the reported data.

Keywords: Building archetypes, data analysis, energy benchmarks, GHG emissions.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 972
13168 Interrelationships between Physicochemical Water Pollution Indicators: A Case Study of River Pandu

Authors: Sunita Verma , Divya Tiwari, Ajay Verma

Abstract:

Water samples were collected from river Pandu at six stations where human and animal activities were high. Composite samples were analyzed for dissolved oxygen (DO), biochemical oxygen demand (BOD), chemical oxygen demand (COD) , pH values during dry and wet seasons as well as the harmattan period. The total data points were used to establish relationships between the parameters and data were also subjected to statistical analysis and expressed as mean ± standard error of mean (SEM) at a level of significance of p<0.05. Regression analysis was carried out to establish relationships if any between studied parameters and relationships in form of scatter plots were obtained between DO/BOD, COD/DO, BOD/COD, COD/pH, BOD/pH and DO/pH. The high to moderate correlation coefficient observed, R2 ranged from 0.68 to 0.15 between these parameters.

Keywords: BOD, DO, COD, pH, Regression analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2081
13167 Generic Data Warehousing for Consumer Electronics Retail Industry

Authors: S. Habte, K. Ouazzane, P. Patel, S. Patel

Abstract:

The dynamic and highly competitive nature of the consumer electronics retail industry means that businesses in this industry are experiencing different decision making challenges in relation to pricing, inventory control, consumer satisfaction and product offerings. To overcome the challenges facing retailers and create opportunities, we propose a generic data warehousing solution which can be applied to a wide range of consumer electronics retailers with a minimum configuration. The solution includes a dimensional data model, a template SQL script, a high level architectural descriptions, ETL tool developed using C#, a set of APIs, and data access tools. It has been successfully applied by ASK Outlets Ltd UK resulting in improved productivity and enhanced sales growth.

Keywords: Consumer electronics retail, dimensional data model, data analysis, generic data warehousing, reporting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1332
13166 A Proposal for U-City (Smart City) Service Method Using Real-Time Digital Map

Authors: SangWon Han, MuWook Pyeon, Sujung Moon, DaeKyo Seo

Abstract:

Recently, technologies based on three-dimensional (3D) space information are being developed and quality of life is improving as a result. Research on real-time digital map (RDM) is being conducted now to provide 3D space information. RDM is a service that creates and supplies 3D space information in real time based on location/shape detection. Research subjects on RDM include the construction of 3D space information with matching image data, complementing the weaknesses of image acquisition using multi-source data, and data collection methods using big data. Using RDM will be effective for space analysis using 3D space information in a U-City and for other space information utilization technologies.

Keywords: RDM, multi-source data, big data, U-City.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 760
13165 K-Means for Spherical Clusters with Large Variance in Sizes

Authors: A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, M. A. Ramadan

Abstract:

Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, this algorithm is suitable for spherical shaped clusters of similar sizes and densities. The quality of the resulting clusters decreases when the data set contains spherical shaped with large variance in sizes. In this paper, we introduce a competent procedure to overcome this problem. The proposed method is based on shifting the center of the large cluster toward the small cluster, and recomputing the membership of small cluster points, the experimental results reveal that the proposed algorithm produces satisfactory results.

Keywords: K-Means, Data Clustering, Cluster Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3239
13164 Analysis of Data Gathering Schemes for Layered Sensor Networks with Multihop Polling

Authors: Bhed Bahadur Bista, Danda B. Rawat

Abstract:

In this paper, we investigate multihop polling and data gathering schemes in layered sensor networks in order to extend the life time of the networks. A network consists of three layers. The lowest layer contains sensors. The middle layer contains so called super nodes with higher computational power, energy supply and longer transmission range than sensor nodes. The top layer contains a sink node. A node in each layer controls a number of nodes in lower layer by polling mechanism to gather data. We will present four types of data gathering schemes: intermediate nodes do not queue data packet, queue single packet, queue multiple packets and aggregate data, to see which data gathering scheme is more energy efficient for multihop polling in layered sensor networks.

Keywords: layered sensor network, polling, data gatheringschemes.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1520
13163 A Review on the Comparison of EU Countries Based on Research and Development Efficiencies

Authors: Yeliz Ekinci, Raife Merve Ön

Abstract:

Nowadays, technological progress is one of the most important components of economic growth and the efficiency of R&D activities is particularly essential for countries. This study is an attempt to analyze the R&D efficiencies of EU countries. The indicators related to R&D efficiencies should be determined in advance in order to use DEA. For this reason a list of input and output indicators are derived from the literature review. Considering the data availability, a final list is given for the numerical analysis for future research.

Keywords: Data envelopment analysis, economic growth, EU Countries, R&D efficiency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2006
13162 Sentiment Analysis: Popularity of Candidates for the President of the United States

Authors: Radek Malinský, Ivan Jelínek

Abstract:

This article deals with the popularity of candidates for the president of the United States of America. The popularity is assessed according to public comments on the Web 2.0. Social networking, blogging and online forums (collectively Web 2.0) are for common Internet users the easiest way to share their personal opinions, thoughts, and ideas with the entire world. However, the web content diversity, variety of technologies and website structure differences, all of these make the Web 2.0 a network of heterogeneous data, where things are difficult to find for common users. The introductory part of the article describes methodology for gathering and processing data from Web 2.0. The next part of the article is focused on the evaluation and content analysis of obtained information, which write about presidential candidates.

Keywords: Sentiment Analysis, Web 2.0, Webometrics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3192
13161 Data Projects for “Social Good”: Challenges and Opportunities

Authors: Mikel Niño, Roberto V. Zicari, Todor Ivanov, Kim Hee, Naveed Mushtaq, Marten Rosselli, Concha Sánchez-Ocaña, Karsten Tolle, José Miguel Blanco, Arantza Illarramendi, Jörg Besier, Harry Underwood

Abstract:

One of the application fields for data analysis techniques and technologies gaining momentum is the area of social good or “common good”, covering cases related to humanitarian crises, global health care, or ecology and environmental issues, among others. The promotion of data-driven projects in this field aims at increasing the efficacy and efficiency of social initiatives, improving the way these actions help humanity in general and people in need in particular. This application field, however, poses its own barriers and challenges when developing data-driven projects, lagging behind in comparison with other scenarios. These challenges derive from aspects such as the scope and scale of the social issue to solve, cultural and political barriers, the skills of main stakeholders and the technological resources available, the motivation to be engaged in such projects, or the ethical and legal issues related to sensitive data. This paper analyzes the application of data projects in the field of social good, reviewing its current state and noteworthy initiatives, and presenting a framework covering the key aspects to analyze in such projects. The goal is to provide guidelines to understand the main challenges and opportunities for this type of data project, as well as identifying the main differential issues compared to “classical” data projects in general. A case study is presented on the initial steps and stakeholder analysis of a data project for the inclusion of refugees in the city of Frankfurt, Germany, in order to empirically confront the framework with a real example.

Keywords: Data-Driven projects, humanitarian operations, personal and sensitive data, social good, stakeholders analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1706
13160 Using RASCAL and ALOHA Codes to Establish an Analysis Methodology for Hydrogen Fluoride Evaluation

Authors: J. R. Wang, Y. Chiang, W. S. Hsu, H. C. Chen, S. H. Chen, J. H. Yang, S. W. Chen, C. Shih

Abstract:

In this study, the RASCAL and ALOHA codes are used to establish an analysis methodology for hydrogen fluoride (HF) evaluation. There are three main steps in this study. First, the UF6 data were collected. Second, one postulated case was analyzed by using the RASCAL and UF6 data. This postulated case assumes that fire occurring and UF6 is releasing from a building. Third, the results of RASCAL for HF mass were as the input data of ALOHA. Two postulated cases of HF were analyzed by using ALOHA code and the results of RASCAL. These postulated cases assume fire occurring and HF is releasing with no raining (Case 1) or raining (Case 2) condition. According to the analysis results of ALOHA, the HF concentration of Case 2 is smaller than Case 1. The results can be a reference for the preparing of emergency plans for the release of HF.

Keywords: RASCAL, ALOHA, UF6, hydrogen fluoride.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 695
13159 Intellectual Capital and Competitive Advantage: An Analysis of the Biotechnology Industry

Authors: Campisi Domenico, Costa Roberta

Abstract:

Intellectual capital measurement is a central aspect of knowledge management. The measurement and the evaluation of intangible assets play a key role in allowing an effective management of these assets as sources of competitiveness. For these reasons, managers and practitioners need conceptual and analytical tools taking into account the unique characteristics and economic significance of Intellectual Capital. Following this lead, we propose an efficiency and productivity analysis of Intellectual Capital, as a determinant factor of the company competitive advantage. The analysis is carried out by means of Data Envelopment Analysis (DEA) and Malmquist Productivity Index (MPI). These techniques identify Bests Practice companies that have accomplished competitive advantage implementing successful strategies of Intellectual Capital management, and offer to inefficient companies development paths by means of benchmarking. The proposed methodology is employed on the Biotechnology industry in the period 2007-2010.

Keywords: Data Envelopment Analysis, Innovation, Intangible assets, Intellectual Capital, Malmquist Productivity Index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1874
13158 Exploring Performance-Based Music Attributes for Stylometric Analysis

Authors: Abdellghani Bellaachia, Edward Jimenez

Abstract:

Music Information Retrieval (MIR) and modern data mining techniques are applied to identify style markers in midi music for stylometric analysis and author attribution. Over 100 attributes are extracted from a library of 2830 songs then mined using supervised learning data mining techniques. Two attributes are identified that provide high informational gain. These attributes are then used as style markers to predict authorship. Using these style markers the authors are able to correctly distinguish songs written by the Beatles from those that were not with a precision and accuracy of over 98 per cent. The identification of these style markers as well as the architecture for this research provides a foundation for future research in musical stylometry.

Keywords: Music Information Retrieval, Music Data Mining, Stylometry.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1630
13157 TRACE/FRAPTRAN Analysis of Kuosheng Nuclear Power Plant Dry-Storage System

Authors: J. R. Wang, Y. Chiang, W. Y. Li, H. T. Lin, H. C. Chen, C. Shih, S. W. Chen

Abstract:

The dry-storage systems of nuclear power plants (NPPs) in Taiwan have become one of the major safety concerns. There are two steps considered in this study. The first step is the verification of the TRACE by using VSC-17 experimental data. The results of TRACE were similar to the VSC-17 data. It indicates that TRACE has the respectable accuracy in the simulation and analysis of the dry-storage systems. The next step is the application of TRACE in the dry-storage system of Kuosheng NPP (BWR/6). Kuosheng NPP is the second BWR NPP of Taiwan Power Company. In order to solve the storage of the spent fuels, Taiwan Power Company developed the new dry-storage system for Kuosheng NPP. In this step, the dry-storage system model of Kuosheng NPP was established by TRACE. Then, the steady state simulation of this model was performed and the results of TRACE were compared with the Kuosheng NPP data. Finally, this model was used to perform the safety analysis of Kuosheng NPP dry-storage system. Besides, FRAPTRAN was used tocalculate the transient performance of fuel rods.

Keywords: BWR, TRACE, FRAPTRAN, Dry-Storage.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2031
13156 Analysis of Sequence Moves in Successful Chess Openings Using Data Mining with Association Rules

Authors: R.M.Rani

Abstract:

Chess is one of the indoor games, which improves the level of human confidence, concentration, planning skills and knowledge. The main objective of this paper is to help the chess players to improve their chess openings using data mining techniques. Budding Chess Players usually do practices by analyzing various existing openings. When they analyze and correlate thousands of openings it becomes tedious and complex for them. The work done in this paper is to analyze the best lines of Blackmar- Diemer Gambit(BDG) which opens with White D4... using data mining analysis. It is carried out on the collection of winning games by applying association rules. The first step of this analysis is assigning variables to each different sequence moves. In the second step, the sequence association rules were generated to calculate support and confidence factor which help us to find the best subsequence chess moves that may lead to winning position.

Keywords: Blackmar-Diemer Gambit(BDG), Confidence, sequence Association Rules, Support.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3023
13155 Turbine Trip without Bypass Analysis of Kuosheng Nuclear Power Plant Using TRACE Coupling with FRAPTRAN

Authors: J. R. Wang, H. T. Lin, H. C. Chang, W. K. Lin, W. Y. Li, C. Shih

Abstract:

This analysis of Kuosheng nuclear power plant (NPP) was performed mainly by TRACE, assisted with FRAPTRAN and FRAPCON. SNAP v2.2.1 and TRACE v5.0p3 are used to develop the Kuosheng NPP SPU TRACE model which can simulate the turbine trip without bypass transient. From the analysis of TRACE, the important parameters such as dome pressure, coolant temperature and pressure can be determined. Through these parameters, comparing with the criteria which were formulated by United States Nuclear Regulatory Commission (U.S. NRC), we can determine whether the Kuoshengnuclear power plant failed or not in the accident analysis. However, from the data of TRACE, the fuel rods status cannot be determined. With the information from TRACE and burn-up analysis obtained from FRAPCON, FRAPTRAN analyzes more details about the fuel rods in this transient. Besides, through the SNAP interface, the data results can be presented as an animation. From the animation, the TRACE and FRAPTRAN data can be merged together that may be realized by the readers more easily. In this research, TRACE showed that the maximum dome pressure of the reactor reaches to 8.32 MPa, which is lower than the acceptance limit 9.58 MPa. Furthermore, FRAPTRAN revels that the maximum strain is about 0.00165, which is below the criteria 0.01. In addition, cladding enthalpy is 52.44 cal/g which is lower than 170 cal/g specified by the USNRC NUREG-0800 Standard Review Plan.

Keywords: Turbine trip without bypass, Kuosheng NPP, TRACE, FRAPTRAN, SNAP animation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2427
13154 An Anomaly Detection Approach to Detect Unexpected Faults in Recordings from Test Drives

Authors: Andreas Theissler, Ian Dear

Abstract:

In the automotive industry test drives are being conducted during the development of new vehicle models or as a part of quality assurance of series-production vehicles. The communication on the in-vehicle network, data from external sensors, or internal data from the electronic control units is recorded by automotive data loggers during the test drives. The recordings are used for fault analysis. Since the resulting data volume is tremendous, manually analysing each recording in great detail is not feasible. This paper proposes to use machine learning to support domainexperts by preventing them from contemplating irrelevant data and rather pointing them to the relevant parts in the recordings. The underlying idea is to learn the normal behaviour from available recordings, i.e. a training set, and then to autonomously detect unexpected deviations and report them as anomalies. The one-class support vector machine “support vector data description” is utilised to calculate distances of feature vectors. SVDDSUBSEQ is proposed as a novel approach, allowing to classify subsequences in multivariate time series data. The approach allows to detect unexpected faults without modelling effort as is shown with experimental results on recordings from test drives.

Keywords: Anomaly detection, fault detection, test drive analysis, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2430
13153 Mining Multicity Urban Data for Sustainable Population Relocation

Authors: Xu Du, Aparna S. Varde

Abstract:

In this research, we propose to conduct diagnostic and predictive analysis about the key factors and consequences of urban population relocation. To achieve this goal, urban simulation models extract the urban development trends as land use change patterns from a variety of data sources. The results are treated as part of urban big data with other information such as population change and economic conditions. Multiple data mining methods are deployed on this data to analyze nonlinear relationships between parameters. The result determines the driving force of population relocation with respect to urban sprawl and urban sustainability and their related parameters. This work sets the stage for developing a comprehensive urban simulation model for catering to specific questions by targeted users. It contributes towards achieving sustainability as a whole.

Keywords: Data Mining, Environmental Modeling, Sustainability, Urban Planning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1728
13152 Straight Line Defect Detection with Feed Forward Neural Network

Authors: S. Liangwongsan, A. Oonsivilai

Abstract:

Nowadays, hard disk is one of the most popular storage components. In hard disk industry, the hard disk drive must pass various complex processes and tested systems. In each step, there are some failures. To reduce waste from these failures, we must find the root cause of those failures. Conventionall data analysis method is not effective enough to analyze the large capacity of data. In this paper, we proposed the Hough method for straight line detection that helps to detect straight line defect patterns that occurs in hard disk drive. The proposed method will help to increase more speed and accuracy in failure analysis.

Keywords: Hough Transform, Failure Analysis, Media, Hard Disk Drive

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2049
13151 Morphology of Parts of the Middle Benue Trough of Nigeria from Spectral Analysis of Aeromagnetic Data (Akiri Sheet 232 and Lafia Sheet 231)

Authors: B. S. Jatau, Nandom Abu

Abstract:

Structural interpretation of aeromagnetic data and Landsat imagery over the Middle Benue Trough was carried out to determine the depth to basement, delineate the basement morphology and relief, and the structural features within the basin. The aeromagnetic and Landsat data were subjected to various image and data enhancement and transformation routines. Results of the study revealed lineaments with trend directions in the N-S, NE-SW, NWSE and E-W directions, with the NE-SW trends been dominant. The depths to basement within the trough were established to be at 1.8, 0.3 and 0.8km, as shown from the spectral analysis plot. The Source Parameter Imaging (SPI) plot generated showed the centralsouth/ eastern portion of the study area as being deeper in contrast to the western-south-west portion. The basement morphology of the trough was interpreted as having parallel sets of micro-basins which could be considered as grabens and horsts in agreement with the general features interpreted by early workers.

Keywords: Morphology, Middle Benue Trough, Spectral Analysis, Source Parameter Imaging.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4019
13150 Reducing SAGE Data Using Genetic Algorithms

Authors: Cheng-Hong Yang, Tsung-Mu Shih, Li-Yeh Chuang

Abstract:

Serial Analysis of Gene Expression is a powerful quantification technique for generating cell or tissue gene expression data. The profile of the gene expression of cell or tissue in several different states is difficult for biologists to analyze because of the large number of genes typically involved. However, feature selection in machine learning can successfully reduce this problem. The method allows reducing the features (genes) in specific SAGE data, and determines only relevant genes. In this study, we used a genetic algorithm to implement feature selection, and evaluate the classification accuracy of the selected features with the K-nearest neighbor method. In order to validate the proposed method, we used two SAGE data sets for testing. The results of this study conclusively prove that the number of features of the original SAGE data set can be significantly reduced and higher classification accuracy can be achieved.

Keywords: Serial Analysis of Gene Expression, Feature selection, Genetic Algorithm, K-nearest neighbor method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1565
13149 Data and Control Flow Analysis of VDMµ Specifications

Authors: Mubina Nazmeen, Iram Rubab

Abstract:

Formal Specification languages are being widely used for system specification and testing. Highly critical systems such as real time systems, avionics, and medical systems are represented using Formal specification languages. Formal specifications based testing is mostly performed using black box testing approaches thus testing only the set of inputs and outputs of the system. The formal specification language such as VDMµ can be used for white box testing as they provide enough constructs as any other high level programming language. In this work, we perform data and control flow analysis of VDMµ class specifications. The proposed work is discussed with an example of SavingAccount.

Keywords: VDM-SL, VDMµ, data flow graph, control flowgraph, testing, formal specification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4327
13148 Database Compression for Intelligent On-board Vehicle Controllers

Authors: Ágoston Winkler, Sándor Juhász, Zoltán Benedek

Abstract:

The vehicle fleet of public transportation companies is often equipped with intelligent on-board passenger information systems. A frequently used but time and labor-intensive way for keeping the on-board controllers up-to-date is the manual update using different memory cards (e.g. flash cards) or portable computers. This paper describes a compression algorithm that enables data transmission using low bandwidth wireless radio networks (e.g. GPRS) by minimizing the amount of data traffic. In typical cases it reaches a compression rate of an order of magnitude better than that of the general purpose compressors. Compressed data can be easily expanded by the low-performance controllers, too.

Keywords: Data analysis, data compression, differentialencoding, run-length encoding, vehicle control.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1525
13147 A Safety Analysis Method for Multi-Agent Systems

Authors: Ching Louis Liu, Edmund Kazmierczak, Tim Miller

Abstract:

Safety analysis for multi-agent systems is complicated by the, potentially nonlinear, interactions between agents. This paper proposes a method for analyzing the safety of multi-agent systems by explicitly focusing on interactions and the accident data of systems that are similar in structure and function to the system being analyzed. The method creates a Bayesian network using the accident data from similar systems. A feature of our method is that the events in accident data are labeled with HAZOP guide words. Our method uses an Ontology to abstract away from the details of a multi-agent implementation. Using the ontology, our methods then constructs an “Interaction Map,” a graphical representation of the patterns of interactions between agents and other artifacts. Interaction maps combined with statistical data from accidents and the HAZOP classifications of events can be converted into a Bayesian Network. Bayesian networks allow designers to explore “what it” scenarios and make design trade-offs that maintain safety. We show how to use the Bayesian networks, and the interaction maps to improve multi-agent system designs.

Keywords: Multi-agent system, safety analysis, safety model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1048
13146 Health Assessment of Electronic Products using Mahalanobis Distance and Projection Pursuit Analysis

Authors: Sachin Kumar, Vasilis Sotiris, Michael Pecht

Abstract:

With increasing complexity in electronic systems there is a need for system level anomaly detection and fault isolation. Anomaly detection based on vector similarity to a training set is used in this paper through two approaches, one the preserves the original information, Mahalanobis Distance (MD), and the other that compresses the data into its principal components, Projection Pursuit Analysis. These methods have been used to detect deviations in system performance from normal operation and for critical parameter isolation in multivariate environments. The study evaluates the detection capability of each approach on a set of test data with known faults against a baseline set of data representative of such “healthy" systems.

Keywords: Mahalanobis distance, Principle components, Projection pursuit, Health assessment, Anomaly.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1636
13145 Using RASCAL Code to Analyze the Postulated UF6 Fire Accident

Authors: J. R. Wang, Y. Chiang, W. S. Hsu, S. H. Chen, J. H. Yang, S. W. Chen, C. Shih, Y. F. Chang, Y. H. Huang, B. R. Shen

Abstract:

In this research, the RASCAL code was used to simulate and analyze the postulated UF6 fire accident which may occur in the Institute of Nuclear Energy Research (INER). There are four main steps in this research. In the first step, the UF6 data of INER were collected. In the second step, the RASCAL analysis methodology and model was established by using these data. Third, this RASCAL model was used to perform the simulation and analysis of the postulated UF6 fire accident. Three cases were simulated and analyzed in this step. Finally, the analysis results of RASCAL were compared with the hazardous levels of the chemicals. According to the compared results of three cases, Case 3 has the maximum danger in human health.

Keywords: RASCAL, UF6, Safety, Hydrogen fluoride.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 805
13144 Effects of Video Games and Online Chat on Mathematics Performance in High School: An Approach of Multivariate Data Analysis

Authors: Lina Wu, Wenyi Lu, Ye Li

Abstract:

Regarding heavy video game players for boys and super online chat lovers for girls as a symbolic phrase in the current adolescent culture, this project of data analysis verifies the displacement effect on deteriorating mathematics performance. To evaluate correlation or regression coefficients between a factor of playing video games or chatting online and mathematics performance compared with other factors, we use multivariate analysis technique and take gender difference into account. We find the most important reason for the negative sign of the displacement effect on mathematics performance due to students’ poor academic background. Statistical analysis methods in this project could be applied to study internet users’ academic performance from the high school education to the college education.

Keywords: Correlation coefficients, displacement effect, gender difference, multivariate analysis technique, regression coefficients.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2125
13143 GA Based Optimal Feature Extraction Method for Functional Data Classification

Authors: Jun Wan, Zehua Chen, Yingwu Chen, Zhidong Bai

Abstract:

Classification is an interesting problem in functional data analysis (FDA), because many science and application problems end up with classification problems, such as recognition, prediction, control, decision making, management, etc. As the high dimension and high correlation in functional data (FD), it is a key problem to extract features from FD whereas keeping its global characters, which relates to the classification efficiency and precision to heavens. In this paper, a novel automatic method which combined Genetic Algorithm (GA) and classification algorithm to extract classification features is proposed. In this method, the optimal features and classification model are approached via evolutional study step by step. It is proved by theory analysis and experiment test that this method has advantages in improving classification efficiency, precision and robustness whereas using less features and the dimension of extracted classification features can be controlled.

Keywords: Classification, functional data, feature extraction, genetic algorithm, wavelet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1510