Search results for: Data stream analysis.

13225 The Robust Clustering with Reduction Dimension

Abstract:

A clustering is process to identify a homogeneous groups of object called as cluster. Clustering is one interesting topic on data mining. A group or class behaves similarly characteristics. This paper discusses a robust clustering process for data images with two reduction dimension approaches; i.e. the two dimensional principal component analysis (2DPCA) and principal component analysis (PCA). A standard approach to overcome this problem is dimension reduction, which transforms a high-dimensional data into a lower-dimensional space with limited loss of information. One of the most common forms of dimensionality reduction is the principal components analysis (PCA). The 2DPCA is often called a variant of principal component (PCA), the image matrices were directly treated as 2D matrices; they do not need to be transformed into a vector so that the covariance matrix of image can be constructed directly using the original image matrices. The decomposed classical covariance matrix is very sensitive to outlying observations. The objective of paper is to compare the performance of robust minimizing vector variance (MVV) in the two dimensional projection PCA (2DPCA) and the PCA for clustering on an arbitrary data image when outliers are hiden in the data set. The simulation aspects of robustness and the illustration of clustering images are discussed in the end of paper

Keywords: Breakdown point, Consistency, 2DPCA, PCA, Outlier, Vector Variance

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1657

13224 Benchmarking Cleaner Production Performance of Coal-fired Power Plants Using Two-stage Super-efficiency Data Envelopment Analysis

Authors: Shao-lun Zeng, Yu-long Ren

Abstract:

Benchmarking cleaner production performance is an effective way of pollution control and emission reduction in coal-fired power industry. A benchmarking method using two-stage super-efficiency data envelopment analysis for coal-fired power plants is proposed – firstly, to improve the cleaner production performance of DEA-inefficient or weakly DEA-efficient plants, then to select the benchmark from performance-improved power plants. An empirical study is carried out with the survey data of 24 coal-fired power plants. The result shows that in the first stage the performance of 16 plants is DEA-efficient and that of 8 plants is relatively inefficient. The target values for improving DEA-inefficient plants are acquired by projection analysis. The efficient performance of 24 power plants and the benchmarking plant is achieved in the second stage. The two-stage benchmarking method is practical to select the optimal benchmark in the cleaner production of coal-fired power industry and will continuously improve plants- cleaner production performance.

Keywords: benchmarking, cleaner production performance, coal-fired power plant, super-efficiency data envelopment analysis

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2382

13223 Energy Efficiency Analysis of Crossover Technologies in Industrial Applications

Authors: W. Schellong

Abstract:

Industry accounts for one-third of global final energy demand. Crossover technologies (e.g. motors, pumps, process heat, and air conditioning) play an important role in improving energy efficiency. These technologies are used in many applications independent of the production branch. Especially electrical power is used by drives, pumps, compressors, and lightning. The paper demonstrates the algorithm of the energy analysis by some selected case studies for typical industrial processes. The energy analysis represents an essential part of energy management systems (EMS). Generally, process control system (PCS) can support EMS. They provide information about the production process, and they organize the maintenance actions. Combining these tools into an integrated process allows the development of an energy critical equipment strategy. Thus, asset and energy management can use the same common data to improve the energy efficiency.

Keywords: Crossover technologies, data management, energy analysis, energy efficiency, process control.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 910

13222 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has, over recent years, seen big advances because of the spread of internet, which generates everyday a tremendous volume of data, and also the immense advances in technologies which facilitate the analysis of these data. In particular, classification techniques are a subdomain of Data Mining which determines in which group each data instance is related within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning. Each type of these techniques has its own limits. Nowadays, current data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which is different from actual algorithms by its reliability computations. Results of the proposed approach exceeded most common classification techniques with an f-measure exceeding 97% on the IRIS Dataset.

Keywords: Data mining, knowledge discovery, machine learning, similarity measurement, supervised classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1478

13221 Comparative Analysis of the Third Generation of Research Data for Evaluation of Solar Energy Potential

Authors: Claudineia Brazil, Elison Eduardo Jardim Bierhals, Luciane Teresa Salvi, Rafael Haag

Abstract:

Renewable energy sources are dependent on climatic variability, so for adequate energy planning, observations of the meteorological variables are required, preferably representing long-period series. Despite the scientific and technological advances that meteorological measurement systems have undergone in the last decades, there is still a considerable lack of meteorological observations that form series of long periods. The reanalysis is a system of assimilation of data prepared using general atmospheric circulation models, based on the combination of data collected at surface stations, ocean buoys, satellites and radiosondes, allowing the production of long period data, for a wide gamma. The third generation of reanalysis data emerged in 2010, among them is the Climate Forecast System Reanalysis (CFSR) developed by the National Centers for Environmental Prediction (NCEP), these data have a spatial resolution of 0.50 x 0.50. In order to overcome these difficulties, it aims to evaluate the performance of solar radiation estimation through alternative data bases, such as data from Reanalysis and from meteorological satellites that satisfactorily meet the absence of observations of solar radiation at global and/or regional level. The results of the analysis of the solar radiation data indicated that the reanalysis data of the CFSR model presented a good performance in relation to the observed data, with determination coefficient around 0.90. Therefore, it is concluded that these data have the potential to be used as an alternative source in locations with no seasons or long series of solar radiation, important for the evaluation of solar energy potential.

Keywords: Climate, reanalysis, renewable energy, solar radiation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 852

13220 Methods for Distinction of Cattle Using Supervised Learning

Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl

Abstract:

Machine learning represents a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. The data can present identification patterns which are used to classify into groups. The result of the analysis is the pattern which can be used for identification of data set without the need to obtain input data used for creation of this pattern. An important requirement in this process is careful data preparation validation of model used and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of the genetic diversity. In case of missing pedigree information, other methods can be used for traceability of animal´s origin. Genetic diversity written in genetic data is holding relatively useful information to identify animals originated from individual countries. We can conclude that the application of data mining for molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and identifying an individual.

Keywords: Genetic data, Pinzgau cattle, supervised learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2277

13219 Acute Coronary Syndrome Prediction Using Data Mining Techniques- An Application

Authors: Tahseen A. Jilani, Huda Yasin, Madiha Yasin, C. Ardil

Abstract:

In this paper we use data mining techniques to investigate factors that contribute significantly to enhancing the risk of acute coronary syndrome. We assume that the dependent variable is diagnosis – with dichotomous values showing presence or absence of disease. We have applied binary regression to the factors affecting the dependent variable. The data set has been taken from two different cardiac hospitals of Karachi, Pakistan. We have total sixteen variables out of which one is assumed dependent and other 15 are independent variables. For better performance of the regression model in predicting acute coronary syndrome, data reduction techniques like principle component analysis is applied. Based on results of data reduction, we have considered only 14 out of sixteen factors.

Keywords: Acute coronary syndrome (ACS), binary logistic regression analyses, myocardial ischemia (MI), principle component analysis, unstable angina (U.A.).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2074

13218 Spatial-Temporal Clustering Characteristics of Dengue in the Northern Region of Sri Lanka, 2010-2013

Authors: Sumiko Anno, Keiji Imaoka, Takeo Tadono, Tamotsu Igarashi, Subramaniam Sivaganesh, Selvam Kannathasan, Vaithehi Kumaran, Sinnathamby Noble Surendran

Abstract:

Dengue outbreaks are affected by biological, ecological, socio-economic and demographic factors that vary over time and space. These factors have been examined separately and still require systematic clarification. The present study aimed to investigate the spatial-temporal clustering relationships between these factors and dengue outbreaks in the northern region of Sri Lanka. Remote sensing (RS) data gathered from a plurality of satellites were used to develop an index comprising rainfall, humidity and temperature data. RS data gathered by ALOS/AVNIR-2 were used to detect urbanization, and a digital land cover map was used to extract land cover information. Other data on relevant factors and dengue outbreaks were collected through institutions and extant databases. The analyzed RS data and databases were integrated into geographic information systems, enabling temporal analysis, spatial statistical analysis and space-time clustering analysis. Our present results showed that increases in the number of the combination of ecological factor and socio-economic and demographic factors with above the average or the presence contribute to significantly high rates of space-time dengue clusters.

Keywords: ALOS/AVNIR-2, Dengue, Space-time clustering analysis, Sri Lanka.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2236

13217 Estimating Bridge Deterioration for Small Data Sets Using Regression and Markov Models

Authors: Yina F. Muñoz, Alexander Paz, Hanns De La Fuente-Mella, Joaquin V. Fariña, Guilherme M. Sales

Abstract:

The primary approach for estimating bridge deterioration uses Markov-chain models and regression analysis. Traditional Markov models have problems in estimating the required transition probabilities when a small sample size is used. Often, reliable bridge data have not been taken over large periods, thus large data sets may not be available. This study presents an important change to the traditional approach by using the Small Data Method to estimate transition probabilities. The results illustrate that the Small Data Method and traditional approach both provide similar estimates; however, the former method provides results that are more conservative. That is, Small Data Method provided slightly lower than expected bridge condition ratings compared with the traditional approach. Considering that bridges are critical infrastructures, the Small Data Method, which uses more information and provides more conservative estimates, may be more appropriate when the available sample size is small. In addition, regression analysis was used to calculate bridge deterioration. Condition ratings were determined for bridge groups, and the best regression model was selected for each group. The results obtained were very similar to those obtained when using Markov chains; however, it is desirable to use more data for better results.

Keywords: Concrete bridges, deterioration, Markov chains, probability matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1387

13216 Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Authors: Hoda A. Abdel Hafez

Abstract:

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

Keywords: Mining Big Data, Big Data, Machine learning, Data Streams, Telecommunication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2420

13215 Optimization of a Bioremediation Strategy for an Urban Stream of Matanza-Riachuelo Basin

Authors: María D. Groppa, Andrea Trentini, Myriam Zawoznik, Roxana Bigi, Carlos Nadra, Patricia L. Marconi

Abstract:

In the present work, a remediation bioprocess based on the use of a local isolate of the microalgae Chlorella vulgaris immobilized in alginate beads is proposed. This process was shown to be effective for the reduction of several chemical and microbial contaminants present in Cildáñez stream, a water course that is part of the Matanza-Riachuelo Basin (Buenos Aires, Argentina). The bioprocess, involving the culture of the microalga in autotrophic conditions in a stirred-tank bioreactor supplied with a marine propeller for 6 days, allowed a significant reduction of Escherichia coli and total coliform numbers (over 95%), as well as of ammoniacal nitrogen (96%), nitrates (86%), nitrites (98%), and total phosphorus (53%) contents. Pb content was also significantly diminished after the bioprocess (95%). Standardized cytotoxicity tests using Allium cepa seeds and Cildáñez water pre- and post-remediation were also performed. Germination rate and mitotic index of onion seeds imbibed in Cildáñez water subjected to the bioprocess was similar to that observed in seeds imbibed in distilled water and significantly superior to that registered when untreated Cildáñez water was used for imbibition. Our results demonstrate the potential of this simple and cost-effective technology to remove urban-water contaminants, offering as an additional advantage the possibility of an easy biomass recovery, which may become a source of alternative energy.

Keywords: Bioreactor, bioremediation, Chlorella vulgaris, Matanza-Riachuelo basin, microalgae.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 776

13214 The Comparison of Anchor and Star Schema from a Query Performance Perspective

Authors: Radek Němec

Abstract:

Today's business environment requires that companies have access to highly relevant information in a matter of seconds. Modern Business Intelligence tools rely on data structured mostly in traditional dimensional database schemas, typically represented by star schemas. Dimensional modeling is already recognized as a leading industry standard in the field of data warehousing although several drawbacks and pitfalls were reported. This paper focuses on the analysis of another data warehouse modeling technique - the anchor modeling, and its characteristics in context with the standardized dimensional modeling technique from a query performance perspective. The results of the analysis show information about performance of queries executed on database schemas structured according to principles of each database modeling technique.

Keywords: Data warehousing, anchor modeling, star schema, anchor schema, query performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3271

13213 Twitter Sentiment Analysis during the Lockdown on New Zealand

Authors: Smah Doeban Almotiri

Abstract:

One of the most common fields of natural language processing (NLP) is sentimental analysis. The inferred feeling in the text can be successfully mined for various events using sentiment analysis. Twitter is viewed as a reliable data point for sentimental analytics studies since people are using social media to receive and exchange different types of data on a broad scale during the COVID-19 epidemic. The processing of such data may aid in making critical decisions on how to keep the situation under control. The aim of this research is to look at how sentimental states differed in a single geographic region during the lockdown at two different times.1162 tweets were analyzed related to the COVID-19 pandemic lockdown using keywords hashtags (lockdown, COVID-19) for the first sample tweets were from March 23, 2020, until April 23, 2020, and the second sample for the following year was from March 1, 2021, until April 4, 2021. Natural language processing (NLP), which is a form of Artificial intelligent was used for this research to calculate the sentiment value of all of the tweets by using AFINN Lexicon sentiment analysis method. The findings revealed that the sentimental condition in both different times during the region's lockdown was positive in the samples of this study, which are unique to the specific geographical area of New Zealand. This research suggests applied machine learning sentimental method such as Crystal Feel and extended the size of the sample tweet by using multiple tweets over a longer period of time.

Keywords: sentiment analysis, Twitter analysis, lockdown, Covid-19, AFINN, NodeJS

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 469

13212 Developing Digital Competencies in Aboriginal Students through University-College Partnerships

Authors: W. S. Barber, S. L. King

Abstract:

This paper reports on a pilot project to develop a collaborative partnership between a community college in rural northern Ontario, Canada, and an urban university in the greater Toronto area in Oshawa, Canada. Partner institutions will collaborate to address learning needs of university applicants whose goals are to attain an undergraduate university BA in Educational Studies and Digital Technology degree, but who may not live in a geographical location that would facilitate this pathways process. The UOIT BA degree is attained through a 2+2 program, where students with a 2 year college diploma or equivalent can attain a four year undergraduate degree. The goals reported on the project are as: 1. Our aim is to expand the BA program to include an additional stream which includes serious educational games, simulations and virtual environments, 2. Develop fully (using both synchronous and asynchronous technologies) online learning modules for use by university applicants who otherwise are not geographically located close to a physical university site, 3. Assess the digital competencies of all students, including members of local, distance and Indigenous communities using a validated tool developed and tested by UOIT across numerous populations. This tool, the General Technical Competency Use and Scale (GTCU) will provide the collaborating institutions with data that will allow for analyzing how well students are prepared to succeed in fully online learning communities. Philosophically, the UOIT BA program is based on a fully online learning communities model (FOLC) that can be accessed from anywhere in the world through digital learning environments via audio video conferencing tools such as Adobe Connect. It also follows models of adult learning and mobile learning, and makes a university degree accessible to the increasing demographic of adult learners who may use mobile devices to learn anywhere anytime. The program is based on key principles of Problem Based Learning, allowing students to build their own understandings through the co-design of the learning environment in collaboration with the instructors and their peers. In this way, this degree allows students to personalize and individualize the learning based on their own culture, background and professional/personal experiences. Using modified flipped classroom strategies, students are able to interrogate video modules on their own time in preparation for one hour discussions occurring in video conferencing sessions. As a consequence of the program flexibility, students may continue to work full or part time. All of the partner institutions will co-develop four new modules, administer the GTCU and share data, while creating a new stream of the UOIT BA degree. This will increase accessibility for students to bridge from community colleges to university through a fully digital environment. We aim to work collaboratively with Indigenous elders, community members and distance education instructors to increase opportunities for more students to attain a university education.

Keywords: Aboriginal, college, competencies, digital, universities.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 734

13211 Development of Energy Benchmarks Using Mandatory Energy and Emissions Reporting Data: Ontario Post-Secondary Residences

Authors: C. Xavier Mendieta, J. J McArthur

Abstract:

Governments are playing an increasingly active role in reducing carbon emissions, and a key strategy has been the introduction of mandatory energy disclosure policies. These policies have resulted in a significant amount of publicly available data, providing researchers with a unique opportunity to develop location-specific energy and carbon emission benchmarks from this data set, which can then be used to develop building archetypes and used to inform urban energy models. This study presents the development of such a benchmark using the public reporting data. The data from Ontario’s Ministry of Energy for Post-Secondary Educational Institutions are being used to develop a series of building archetype dynamic building loads and energy benchmarks to fill a gap in the currently available building database. This paper presents the development of a benchmark for college and university residences within ASHRAE climate zone 6 areas in Ontario using the mandatory disclosure energy and greenhouse gas emissions data. The methodology presented includes data cleaning, statistical analysis, and benchmark development, and lessons learned from this investigation are presented and discussed to inform the development of future energy benchmarks from this larger data set. The key findings from this initial benchmarking study are: (1) the importance of careful data screening and outlier identification to develop a valid dataset; (2) the key features used to develop a model of the data are building age, size, and occupancy schedules and these can be used to estimate energy consumption; and (3) policy changes affecting the primary energy generation significantly affected greenhouse gas emissions, and consideration of these factors was critical to evaluate the validity of the reported data.

Keywords: Building archetypes, data analysis, energy benchmarks, GHG emissions.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 974

13210 Generic Data Warehousing for Consumer Electronics Retail Industry

Authors: S. Habte, K. Ouazzane, P. Patel, S. Patel

Abstract:

The dynamic and highly competitive nature of the consumer electronics retail industry means that businesses in this industry are experiencing different decision making challenges in relation to pricing, inventory control, consumer satisfaction and product offerings. To overcome the challenges facing retailers and create opportunities, we propose a generic data warehousing solution which can be applied to a wide range of consumer electronics retailers with a minimum configuration. The solution includes a dimensional data model, a template SQL script, a high level architectural descriptions, ETL tool developed using C#, a set of APIs, and data access tools. It has been successfully applied by ASK Outlets Ltd UK resulting in improved productivity and enhanced sales growth.

Keywords: Consumer electronics retail, dimensional data model, data analysis, generic data warehousing, reporting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1333

13209 A Proposal for U-City (Smart City) Service Method Using Real-Time Digital Map

Authors: SangWon Han, MuWook Pyeon, Sujung Moon, DaeKyo Seo

Abstract:

Recently, technologies based on three-dimensional (3D) space information are being developed and quality of life is improving as a result. Research on real-time digital map (RDM) is being conducted now to provide 3D space information. RDM is a service that creates and supplies 3D space information in real time based on location/shape detection. Research subjects on RDM include the construction of 3D space information with matching image data, complementing the weaknesses of image acquisition using multi-source data, and data collection methods using big data. Using RDM will be effective for space analysis using 3D space information in a U-City and for other space information utilization technologies.

Keywords: RDM, multi-source data, big data, U-City.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 762

13208 Interrelationships between Physicochemical Water Pollution Indicators: A Case Study of River Pandu

Authors: Sunita Verma , Divya Tiwari, Ajay Verma

Abstract:

Water samples were collected from river Pandu at six stations where human and animal activities were high. Composite samples were analyzed for dissolved oxygen (DO), biochemical oxygen demand (BOD), chemical oxygen demand (COD) , pH values during dry and wet seasons as well as the harmattan period. The total data points were used to establish relationships between the parameters and data were also subjected to statistical analysis and expressed as mean ± standard error of mean (SEM) at a level of significance of p<0.05. Regression analysis was carried out to establish relationships if any between studied parameters and relationships in form of scatter plots were obtained between DO/BOD, COD/DO, BOD/COD, COD/pH, BOD/pH and DO/pH. The high to moderate correlation coefficient observed, R2 ranged from 0.68 to 0.15 between these parameters.

Keywords: BOD, DO, COD, pH, Regression analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2084

13207 K-Means for Spherical Clusters with Large Variance in Sizes

Authors: A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, M. A. Ramadan

Abstract:

Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, this algorithm is suitable for spherical shaped clusters of similar sizes and densities. The quality of the resulting clusters decreases when the data set contains spherical shaped with large variance in sizes. In this paper, we introduce a competent procedure to overcome this problem. The proposed method is based on shifting the center of the large cluster toward the small cluster, and recomputing the membership of small cluster points, the experimental results reveal that the proposed algorithm produces satisfactory results.

Keywords: K-Means, Data Clustering, Cluster Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3241

13206 Studying the Effect of Froude Number and Densimetric Froude Number on Local Scours around Circular Bridge Piers

Authors: Md Abdullah Al Faruque

Abstract:

A very large percentage of bridge failures are attributed to scouring around bridge piers and this directly influences public safety. Experiments are carried out in a 12-m long rectangular open channel flume made of transparent tempered glass. A 300 mm thick bed made up of sand particles is leveled horizontally to create the test bed and a 50 mm hollow plastic cylinder is used as a model bridge pier. Tests are carried out with varying flow depths and velocities. Data points of various scour parameters such as scour depth, width, and length are collected based on different flow conditions and visual observations of changes in the stream bed downstream the bridge pier are also made as the scour progresses. Result shows that all three major flow characteristics (flow depth, Froude number and densimetric Froude number) have one way or other affect the scour profile.

Keywords: Bridge pier scour, densimetric Froude number, flow depth, Froude Number, sand.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 917

13205 Analysis of Data Gathering Schemes for Layered Sensor Networks with Multihop Polling

Authors: Bhed Bahadur Bista, Danda B. Rawat

Abstract:

In this paper, we investigate multihop polling and data gathering schemes in layered sensor networks in order to extend the life time of the networks. A network consists of three layers. The lowest layer contains sensors. The middle layer contains so called super nodes with higher computational power, energy supply and longer transmission range than sensor nodes. The top layer contains a sink node. A node in each layer controls a number of nodes in lower layer by polling mechanism to gather data. We will present four types of data gathering schemes: intermediate nodes do not queue data packet, queue single packet, queue multiple packets and aggregate data, to see which data gathering scheme is more energy efficient for multihop polling in layered sensor networks.

Keywords: layered sensor network, polling, data gatheringschemes.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1522

13204 A Review on the Comparison of EU Countries Based on Research and Development Efficiencies

Authors: Yeliz Ekinci, Raife Merve Ön

Abstract:

Nowadays, technological progress is one of the most important components of economic growth and the efficiency of R&D activities is particularly essential for countries. This study is an attempt to analyze the R&D efficiencies of EU countries. The indicators related to R&D efficiencies should be determined in advance in order to use DEA. For this reason a list of input and output indicators are derived from the literature review. Considering the data availability, a final list is given for the numerical analysis for future research.

Keywords: Data envelopment analysis, economic growth, EU Countries, R&D efficiency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2009

13203 Sentiment Analysis: Popularity of Candidates for the President of the United States

Authors: Radek Malinský, Ivan Jelínek

Abstract:

This article deals with the popularity of candidates for the president of the United States of America. The popularity is assessed according to public comments on the Web 2.0. Social networking, blogging and online forums (collectively Web 2.0) are for common Internet users the easiest way to share their personal opinions, thoughts, and ideas with the entire world. However, the web content diversity, variety of technologies and website structure differences, all of these make the Web 2.0 a network of heterogeneous data, where things are difficult to find for common users. The introductory part of the article describes methodology for gathering and processing data from Web 2.0. The next part of the article is focused on the evaluation and content analysis of obtained information, which write about presidential candidates.

Keywords: Sentiment Analysis, Web 2.0, Webometrics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3193

13202 Data Projects for “Social Good”: Challenges and Opportunities

Authors: Mikel Niño, Roberto V. Zicari, Todor Ivanov, Kim Hee, Naveed Mushtaq, Marten Rosselli, Concha Sánchez-Ocaña, Karsten Tolle, José Miguel Blanco, Arantza Illarramendi, Jörg Besier, Harry Underwood

Abstract:

One of the application fields for data analysis techniques and technologies gaining momentum is the area of social good or “common good”, covering cases related to humanitarian crises, global health care, or ecology and environmental issues, among others. The promotion of data-driven projects in this field aims at increasing the efficacy and efficiency of social initiatives, improving the way these actions help humanity in general and people in need in particular. This application field, however, poses its own barriers and challenges when developing data-driven projects, lagging behind in comparison with other scenarios. These challenges derive from aspects such as the scope and scale of the social issue to solve, cultural and political barriers, the skills of main stakeholders and the technological resources available, the motivation to be engaged in such projects, or the ethical and legal issues related to sensitive data. This paper analyzes the application of data projects in the field of social good, reviewing its current state and noteworthy initiatives, and presenting a framework covering the key aspects to analyze in such projects. The goal is to provide guidelines to understand the main challenges and opportunities for this type of data project, as well as identifying the main differential issues compared to “classical” data projects in general. A case study is presented on the initial steps and stakeholder analysis of a data project for the inclusion of refugees in the city of Frankfurt, Germany, in order to empirically confront the framework with a real example.

Keywords: Data-Driven projects, humanitarian operations, personal and sensitive data, social good, stakeholders analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1714

13201 Using RASCAL and ALOHA Codes to Establish an Analysis Methodology for Hydrogen Fluoride Evaluation

Authors: J. R. Wang, Y. Chiang, W. S. Hsu, H. C. Chen, S. H. Chen, J. H. Yang, S. W. Chen, C. Shih

Abstract:

In this study, the RASCAL and ALOHA codes are used to establish an analysis methodology for hydrogen fluoride (HF) evaluation. There are three main steps in this study. First, the UF₆ data were collected. Second, one postulated case was analyzed by using the RASCAL and UF₆ data. This postulated case assumes that fire occurring and UF₆ is releasing from a building. Third, the results of RASCAL for HF mass were as the input data of ALOHA. Two postulated cases of HF were analyzed by using ALOHA code and the results of RASCAL. These postulated cases assume fire occurring and HF is releasing with no raining (Case 1) or raining (Case 2) condition. According to the analysis results of ALOHA, the HF concentration of Case 2 is smaller than Case 1. The results can be a reference for the preparing of emergency plans for the release of HF.

Keywords: RASCAL, ALOHA, UF6, hydrogen fluoride.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 696

13200 Intellectual Capital and Competitive Advantage: An Analysis of the Biotechnology Industry

Authors: Campisi Domenico, Costa Roberta

Abstract:

Intellectual capital measurement is a central aspect of knowledge management. The measurement and the evaluation of intangible assets play a key role in allowing an effective management of these assets as sources of competitiveness. For these reasons, managers and practitioners need conceptual and analytical tools taking into account the unique characteristics and economic significance of Intellectual Capital. Following this lead, we propose an efficiency and productivity analysis of Intellectual Capital, as a determinant factor of the company competitive advantage. The analysis is carried out by means of Data Envelopment Analysis (DEA) and Malmquist Productivity Index (MPI). These techniques identify Bests Practice companies that have accomplished competitive advantage implementing successful strategies of Intellectual Capital management, and offer to inefficient companies development paths by means of benchmarking. The proposed methodology is employed on the Biotechnology industry in the period 2007-2010.

Keywords: Data Envelopment Analysis, Innovation, Intangible assets, Intellectual Capital, Malmquist Productivity Index.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1877

13199 Exploring Performance-Based Music Attributes for Stylometric Analysis

Authors: Abdellghani Bellaachia, Edward Jimenez

Abstract:

Music Information Retrieval (MIR) and modern data mining techniques are applied to identify style markers in midi music for stylometric analysis and author attribution. Over 100 attributes are extracted from a library of 2830 songs then mined using supervised learning data mining techniques. Two attributes are identified that provide high informational gain. These attributes are then used as style markers to predict authorship. Using these style markers the authors are able to correctly distinguish songs written by the Beatles from those that were not with a precision and accuracy of over 98 per cent. The identification of these style markers as well as the architecture for this research provides a foundation for future research in musical stylometry.

Keywords: Music Information Retrieval, Music Data Mining, Stylometry.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1633

13198 TRACE/FRAPTRAN Analysis of Kuosheng Nuclear Power Plant Dry-Storage System

Authors: J. R. Wang, Y. Chiang, W. Y. Li, H. T. Lin, H. C. Chen, C. Shih, S. W. Chen

Abstract:

The dry-storage systems of nuclear power plants (NPPs) in Taiwan have become one of the major safety concerns. There are two steps considered in this study. The first step is the verification of the TRACE by using VSC-17 experimental data. The results of TRACE were similar to the VSC-17 data. It indicates that TRACE has the respectable accuracy in the simulation and analysis of the dry-storage systems. The next step is the application of TRACE in the dry-storage system of Kuosheng NPP (BWR/6). Kuosheng NPP is the second BWR NPP of Taiwan Power Company. In order to solve the storage of the spent fuels, Taiwan Power Company developed the new dry-storage system for Kuosheng NPP. In this step, the dry-storage system model of Kuosheng NPP was established by TRACE. Then, the steady state simulation of this model was performed and the results of TRACE were compared with the Kuosheng NPP data. Finally, this model was used to perform the safety analysis of Kuosheng NPP dry-storage system. Besides, FRAPTRAN was used tocalculate the transient performance of fuel rods.

Keywords: BWR, TRACE, FRAPTRAN, Dry-Storage.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2032

13197 Analysis of Sequence Moves in Successful Chess Openings Using Data Mining with Association Rules

Authors: R.M.Rani

Abstract:

Chess is one of the indoor games, which improves the level of human confidence, concentration, planning skills and knowledge. The main objective of this paper is to help the chess players to improve their chess openings using data mining techniques. Budding Chess Players usually do practices by analyzing various existing openings. When they analyze and correlate thousands of openings it becomes tedious and complex for them. The work done in this paper is to analyze the best lines of Blackmar- Diemer Gambit(BDG) which opens with White D4... using data mining analysis. It is carried out on the collection of winning games by applying association rules. The first step of this analysis is assigning variables to each different sequence moves. In the second step, the sequence association rules were generated to calculate support and confidence factor which help us to find the best subsequence chess moves that may lead to winning position.

Keywords: Blackmar-Diemer Gambit(BDG), Confidence, sequence Association Rules, Support.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3029

13196 Turbine Trip without Bypass Analysis of Kuosheng Nuclear Power Plant Using TRACE Coupling with FRAPTRAN

Authors: J. R. Wang, H. T. Lin, H. C. Chang, W. K. Lin, W. Y. Li, C. Shih

Abstract:

This analysis of Kuosheng nuclear power plant (NPP) was performed mainly by TRACE, assisted with FRAPTRAN and FRAPCON. SNAP v2.2.1 and TRACE v5.0p3 are used to develop the Kuosheng NPP SPU TRACE model which can simulate the turbine trip without bypass transient. From the analysis of TRACE, the important parameters such as dome pressure, coolant temperature and pressure can be determined. Through these parameters, comparing with the criteria which were formulated by United States Nuclear Regulatory Commission (U.S. NRC), we can determine whether the Kuoshengnuclear power plant failed or not in the accident analysis. However, from the data of TRACE, the fuel rods status cannot be determined. With the information from TRACE and burn-up analysis obtained from FRAPCON, FRAPTRAN analyzes more details about the fuel rods in this transient. Besides, through the SNAP interface, the data results can be presented as an animation. From the animation, the TRACE and FRAPTRAN data can be merged together that may be realized by the readers more easily. In this research, TRACE showed that the maximum dome pressure of the reactor reaches to 8.32 MPa, which is lower than the acceptance limit 9.58 MPa. Furthermore, FRAPTRAN revels that the maximum strain is about 0.00165, which is below the criteria 0.01. In addition, cladding enthalpy is 52.44 cal/g which is lower than 170 cal/g specified by the USNRC NUREG-0800 Standard Review Plan.

Keywords: Turbine trip without bypass, Kuosheng NPP, TRACE, FRAPTRAN, SNAP animation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2429