Search results for: secondary data analysis
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 13650

13470 Spray Combustion Dynamics under Thermoacoustic Oscillations

Authors: Wajid A. Chishty, Stephen D. Lepera, Uri Vandsburger

Abstract:

Thermoacoustic instabilities in combustors have remained a topic of investigation for several decades because of the challenges they pose to the operation of low-emission gas turbines. For combustors burning liquid fuel, understanding the cause-and-effect relationship between spray combustion dynamics and thermoacoustic oscillations is imperative for the successful development of any control methodology for their mitigation. The paper presents some unique operating characteristics of a kerosene-fueled diffusion-type combustor undergoing limit-cycle oscillations. Combustor stability limits were mapped using three different-sized injectors. The results show that combustor instability depends on the characteristics of the fuel spray. A simple analytical study is also reported in support of a plausible explanation for the unique combustor behavior. The study indicates that high-amplitude acoustic pressure in the combustor may cause secondary breakup of fuel droplets, resulting in premixed, pre-vaporized type burning in the diffusion-type combustor.

Keywords: Secondary droplet breakup, Spray dynamics, Taylor Analogy Breakup Model, Thermoacoustic instabilities.

Downloads: 1829
13469 An Analysis of Abortion Laws and Sex Selective Abortion in India: A Case Study of Rajasthan

Authors: Priya Bhakat

Abstract:

In every Hindu society, a son is regarded as repaying his father the debt he owes him for his own life, whereas a girl child is treated as a burden, mainly when she is the first child. Even today, many communities in India do not welcome a girl child. Although the overall sex ratio has increased, the child sex ratio has declined continuously. This paper focuses on issues of sex selective abortion in Rajasthan based on secondary data. It is found that 90.0 percent of women in Rajasthan want at least one son. Around 34.3 percent of women want more sons than daughters, and only 1.5 percent want more daughters than sons. It is very common among rich and educated people.

Keywords: Rajasthan, Family Planning Program (FPP), Sex Selective Abortion (SSA), Sex Ratio at Birth (SRB).

Downloads: 3650
13468 Simulation Data Summarization Based on Spatial Histograms

Authors: Jing Zhao, Yoshiharu Ishikawa, Chuan Xiao, Kento Sugiura

Abstract:

In order to analyze large-scale scientific data, research on data exploration and visualization has gained popularity. In this paper, we focus on the exploration and visualization of scientific simulation data, and define a spatial V-Optimal histogram for data summarization. We propose histogram construction algorithms based on a general binary hierarchical partitioning as well as a more specific one, the l-grid partitioning. For effective data summarization and efficient data visualization in scientific data analysis, we propose an optimal algorithm as well as a heuristic algorithm for histogram construction. To verify the effectiveness and efficiency of the proposed methods, we conduct experiments on the massive evacuation simulation data.
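
The idea behind a V-Optimal histogram can be illustrated in one dimension: choose bucket boundaries so that the total squared error of the values around their bucket means is minimal. The sketch below is only the classic 1D dynamic-programming formulation, not the spatial or l-grid algorithm the paper proposes; the function name `v_optimal_histogram` and its parameters are illustrative assumptions.

```python
def v_optimal_histogram(values, num_buckets):
    """Split `values` into contiguous buckets minimizing total squared error.

    Illustrative 1D sketch only; the paper's spatial variant is not shown.
    """
    n = len(values)
    if not 1 <= num_buckets <= n:
        raise ValueError("num_buckets must be between 1 and len(values)")

    # Prefix sums give the squared error of any range in O(1).
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for i, v in enumerate(values):
        prefix[i + 1] = prefix[i] + v
        prefix_sq[i + 1] = prefix_sq[i] + v * v

    def sse(i, j):
        """Squared error of values[i:j] around its mean."""
        s = prefix[j] - prefix[i]
        return (prefix_sq[j] - prefix_sq[i]) - s * s / (j - i)

    inf = float("inf")
    # cost[b][j]: best error for the first j values using b buckets.
    cost = [[inf] * (n + 1) for _ in range(num_buckets + 1)]
    cut = [[0] * (n + 1) for _ in range(num_buckets + 1)]
    cost[0][0] = 0.0
    for b in range(1, num_buckets + 1):
        for j in range(b, n + 1):
            for i in range(b - 1, j):
                candidate = cost[b - 1][i] + sse(i, j)
                if candidate < cost[b][j]:
                    cost[b][j] = candidate
                    cut[b][j] = i

    # Walk the stored cut points backwards to recover bucket boundaries.
    bounds, j = [], n
    for b in range(num_buckets, 0, -1):
        i = cut[b][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return bounds, cost[num_buckets][n]
```

Each returned pair `(i, j)` is a half-open index range `values[i:j]`. The triple loop makes this O(B n^2), which is why heuristic construction becomes attractive for large simulation data.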

Keywords: Simulation data, data summarization, spatial histograms, exploration and visualization.

Downloads: 722
13467 Knowledge Continuity as a Part of Business Continuity Management

Authors: H. Urbancova, J. Urbanec

Abstract:

Today the intangible assets are the capital of knowledge and are the most important and the most valuable resource for organizations. All employees have knowledge independently of the kind of jobs they do. Knowledge is thus an asset, which influences business operations. The objective of this article is to identify knowledge continuity as an objective of business continuity management. The article has been prepared based on the analysis of secondary sources and the evaluation of primary sources of data by means of a quantitative survey conducted in the Czech Republic. The conclusion of the article is that organizations that apply business continuity management do not focus on the preservation of the knowledge of key employees. Organizations ensure knowledge continuity only intuitively, on a random basis, non-systematically and discontinuously. The non-ensuring of knowledge continuity represents a threat of loss of key knowledge for organizations and can also negatively affect business continuity.

Keywords: Business continuity, knowledge, organizations, survey.

Downloads: 3492
13466 Analysis of Web User Identification Methods

Authors: Renáta Iváncsy, Sándor Juhász

Abstract:

Web usage mining has become a popular research area, as a huge amount of data is available online. These data can be used for several purposes, such as web personalization, web structure enhancement, and web navigation prediction. However, the raw log files are not directly usable; they have to be preprocessed in order to transform them into a suitable format for different data mining tasks. One of the key issues in the preprocessing phase is to identify web users. Identifying users based on web log files is not a straightforward problem, thus various methods have been developed. There are several difficulties that have to be overcome, such as client-side caching and changing and shared IP addresses. This paper presents three different methods for identifying web users. Two of them are the most commonly used methods in web log mining systems, whereas the third one is our novel approach that uses a complex cookie-based method to identify web users. Furthermore, we also take steps towards identifying the individuals behind the impersonal web users. To demonstrate the efficiency of the new method, we developed an implementation called the Web Activity Tracking (WAT) system, which aims at a more precise distinction of web users based on log data. We present some statistical analysis created by the WAT on real data about the behavior of Hungarian web users, and a comprehensive analysis and comparison of the three methods.
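
As a rough illustration of log-based user identification, the sketch below groups log entries by the pair (IP address, user agent). This is a widely used heuristic in web log preprocessing, but the abstract does not say which two common methods the paper evaluates, so treating this as one of them is an assumption. The field names `ip`, `user_agent` and `url` are hypothetical.

```python
from collections import defaultdict


def identify_users(log_entries):
    """Group requests by (IP, user agent) as a proxy for distinct users.

    Assumed heuristic: two requests belong to the same user when both
    the IP address and the user-agent string match.
    """
    users = defaultdict(list)
    for entry in log_entries:
        key = (entry["ip"], entry["user_agent"])
        users[key].append(entry["url"])
    return dict(users)
```

The heuristic fails in exactly the cases the abstract lists: shared IP addresses merge several people into one key, and changing IP addresses split one person across several keys, which is the motivation for a cookie-based approach.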

Keywords: Data preparation, Tracking individuals, Web user identification, Web usage mining

Downloads: 4366
13465 Data Mining Classification Methods Applied in Drug Design

Authors: Mária Stachová, Lukáš Sobíšek

Abstract:

Data mining incorporates a group of statistical methods used to analyze a set of information, or a data set. It operates with models and algorithms, which are powerful tools with great potential. They can help people understand the patterns in a given body of information, so data mining tools have a wide range of applications. For example, in theoretical chemistry, data mining tools can be used to predict molecule properties or improve computer-assisted drug design. Classification analysis is one of the major data mining methodologies. The aim of the contribution is to create a classification model that can handle a huge data set with high accuracy. For this purpose, logistic regression, Bayesian logistic regression and random forest models were built using R software. A Bayesian logistic regression model was also created in Latent GOLD software. These classification methods belong to supervised learning methods. It was necessary to reduce the dimension of the data matrix before constructing the models, and thus factor analysis (FA) was used. The models were applied to predict the biological activity of molecules, potential new drug candidates.

Keywords: data mining, classification, drug design, QSAR

Downloads: 2817
13464 A Meta-Analytic Path Analysis of e-Learning Acceptance Model

Authors: David W.S. Tai, Ren-Cheng Zhang, Sheng-Hung Chang, Chin-Pin Chen, Jia-Ling Chen

Abstract:

This study reports the results of a meta-analytic path analysis of the e-learning acceptance model with k = 27 studies. Databases searched included the Information Sciences Institute (ISI) website. Variables recorded included perceived usefulness, perceived ease of use, attitude toward behavior, and behavioral intention to use e-learning. A correlation matrix of these variables was derived from meta-analytic data and then analyzed using structural path analysis to test the fit of the e-learning acceptance model to the observed aggregated data. Results showed the revised hypothesized model to be a reasonably good fit to the aggregated data. Furthermore, discussions and implications are given in this article.

Keywords: E-learning, Meta Analytic Path Analysis, Technology Acceptance Model

Downloads: 2413
13463 Using SMS Mobile Technology to Assess the Mastery of Subject Content Knowledge of Science and Mathematics Teachers of Secondary Schools in Tanzania

Authors: Joel S. Mtebe, Aron Kondoro, Mussa M. Kissaka, Elia Kibga

Abstract:

Sub-Saharan Africa is described as the second fastest growing region in the world in mobile phone penetration, ahead of the United States and the European Union. Mobile phones have provided many opportunities to improve people’s lives in the region, such as in banking, marketing, entertainment, and paying bills for water, TV, and electricity. However, the potential of mobile phones to enhance teaching and learning has not been explored. This study presents an experience of developing and delivering SMS-based quiz questions used to assess mastery of subject content knowledge among science and mathematics secondary school teachers in Tanzania. The SMS quizzes were used as a follow-up support mechanism for 500 teachers who participated in a project to upgrade teachers’ subject content knowledge in science and mathematics in Tanzania. Quizzes of 10-15 questions were sent to teachers each week for 8 weeks, and the results were analyzed using SPSS. Results show that teachers who participated in chemistry and biology subjects performed better than those who participated in mathematics and physics subjects. Teachers reported some challenges that led to poor performance. This research has several practical implications for those who are implementing or planning to use mobile phones in teaching and learning, especially in rural secondary schools in sub-Saharan Africa.

Keywords: Mobile learning, e-learning, educational technologies, SMS, secondary education, assessment.

Downloads: 2030
13462 Predicting the Minimum Free Energy RNA Secondary Structures using Harmony Search Algorithm

Authors: Abdulqader M. Mohsen, Ahamad Tajudin Khader, Dhanesh Ramachandram, Abdullatif Ghallab

Abstract:

The physical methods for RNA secondary structure prediction are time consuming and expensive, so computational prediction methods are a suitable alternative. Various algorithms have been used for RNA structure prediction, including dynamic programming and metaheuristic algorithms. Musician's behavior-inspired harmony search is a recently developed metaheuristic algorithm which has been successful in a wide variety of complex optimization problems. This paper proposes a harmony search algorithm (HSRNAFold) to find the RNA secondary structure with minimum free energy that is similar to the native structure. HSRNAFold is compared with the dynamic programming benchmark mfold and with metaheuristic algorithms (RnaPredict, SetPSO and HelixPSO). The results showed that HSRNAFold is comparable to mfold and better than the metaheuristics in finding the minimum free energies and the number of correct base pairs.
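
To make the harmony search idea concrete, here is a minimal generic version that minimizes a continuous objective function. It is not HSRNAFold: the RNA structure encoding and free-energy model are not described in the abstract, so a toy objective is assumed. The parameter names (`hms` for harmony memory size, `hmcr` for memory consideration rate, `par` for pitch adjustment rate, `bandwidth`) follow common usage in the harmony search literature; their default values here are arbitrary.

```python
import random


def harmony_search(objective, bounds, hms=10, hmcr=0.9, par=0.3,
                   bandwidth=0.1, iterations=2000, seed=0):
    """Minimize `objective` over box `bounds` with a basic harmony search."""
    rng = random.Random(seed)
    # Harmony memory: a pool of candidate solutions and their scores.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    scores = [objective(h) for h in memory]

    for _ in range(iterations):
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Memory consideration: reuse a value from a stored harmony.
                value = memory[rng.randrange(hms)][d]
                if rng.random() < par:
                    # Pitch adjustment: nudge the reused value slightly.
                    value += rng.uniform(-bandwidth, bandwidth)
            else:
                # Random selection from the allowed range.
                value = rng.uniform(lo, hi)
            new.append(min(max(value, lo), hi))

        # Replace the worst stored harmony if the new one is better.
        new_score = objective(new)
        worst = max(range(hms), key=scores.__getitem__)
        if new_score < scores[worst]:
            memory[worst], scores[worst] = new, new_score

    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]
```

For example, `harmony_search(lambda x: sum(v * v for v in x), [(-5, 5), (-5, 5)])` searches for the minimum of a simple sphere function. Because a new harmony only ever replaces the worst member, the best score in memory never gets worse as iterations proceed.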

Keywords: Metaheuristic algorithms, dynamic programming algorithms, harmony search optimization, RNA folding, Minimum free energy.

Downloads: 2309
13461 Big Data: Concepts, Technologies and Applications in the Public Sector

Authors: A. Alexandru, C. A. Alexandru, D. Coardos, E. Tudora

Abstract:

Big Data (BD) is associated with a new generation of technologies and architectures which can harness the value of extremely large volumes of very varied data through real time processing and analysis. It involves changes in (1) data types, (2) accumulation speed, and (3) data volume. This paper presents the main concepts related to the BD paradigm, and introduces architectures and technologies for BD and BD sets. The integration of BD with the Hadoop Framework is also underlined. BD has attracted a lot of attention in the public sector due to the newly emerging technologies that allow the availability of network access. The volume of different types of data has exponentially increased. Some applications of BD in the public sector in Romania are briefly presented.

Keywords: Big data, big data Analytics, Hadoop framework, cloud computing.

Downloads: 2288
13460 Extreme Temperature Forecast in Mbonge, Cameroon through Return Level Analysis of the Generalized Extreme Value (GEV) Distribution

Authors: Nkongho Ayuketang Arreyndip, Ebobenow Joseph

Abstract:

In this paper, temperature extremes are forecast by applying the block maxima method with the generalized extreme value (GEV) distribution to temperature data from the Cameroon Development Corporation (C.D.C). Considering two sets of data (raw and simulated) and two models of the GEV distribution (stationary and non-stationary), a return level analysis is carried out. In the stationary model, the return values are constant over time for the raw data, while for the simulated data they show an increasing trend with an upper bound. In the non-stationary model, the return levels of both the raw and simulated data show an increasing trend with an upper bound. This shows that although temperatures in the tropics show signs of increasing in the future, there is a maximum temperature that is not exceeded. The results of this paper are relevant to agricultural and environmental research.
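
The bounded return levels described above are what the standard GEV return-level formula gives when the shape parameter is negative. For location `mu`, scale `sigma` and shape `xi`, the level exceeded on average once every T blocks is mu - (sigma / xi) * (1 - y^(-xi)) with y = -log(1 - 1/T), and for xi < 0 the distribution has a finite upper end point mu - sigma / xi. The sketch below implements only these textbook formulas for the stationary case; the parameter values in any call are hypothetical and are not the paper's fitted estimates.

```python
import math


def gev_return_level(mu, sigma, xi, return_period):
    """Return level of a stationary GEV for a return period in blocks."""
    if sigma <= 0 or return_period <= 1:
        raise ValueError("need sigma > 0 and return_period > 1")
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV as the shape parameter tends to zero.
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


def gev_upper_bound(mu, sigma, xi):
    """Upper end point of the GEV; finite only when xi is negative."""
    return mu - sigma / xi if xi < 0 else math.inf
```

With a negative shape parameter, return levels grow with the return period but approach `gev_upper_bound` without reaching it, which matches the "increasing trend with an upper bound" pattern reported in the abstract.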

Keywords: Return level, Generalized extreme value (GEV), Meteorology, Forecasting.

Downloads: 2075
13459 Comparative Analysis of Commercial Property and Stock-Market Investments in Nigeria

Authors: Bello Nurudeen Akinsola

Abstract:

The study analyzed the risk and returns of commercial property in Southwestern Nigeria and selected stock-market investments between 2000 and 2009, and compared the inflation-hedging characteristics and diversification potential of investing in commercial property and the selected stock-market investments. Primary data on the characteristics, rental values and capital values of commercial properties were collected from their property managers by questionnaire. Secondary data on stock prices and dividends in the banking, insurance and conglomerates sectors were sourced from the Nigerian Stock Exchange (2000-2009). The results showed that the average return on all the selected stock investments was higher than that of commercial property. As regards risk, commercial property indicated lower risk than stocks. The stock investments also had better inflation-hedging capacity than commercial properties, and a combination of both had diversification potential. The study concluded that stock-market investment offered a higher return than commercial property, although with higher risk, and that there could be diversification benefits in combining commercial property with stock investment.
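
A comparison of this kind usually rests on two quantities per asset: the mean periodic return and its standard deviation as a risk measure. The abstract does not state the exact formulas used, so the sketch below assumes the conventional holding-period return, (end value - start value + income) / start value, where income is dividends for stocks or rent for property. Function names are illustrative.

```python
from statistics import mean, stdev


def holding_period_returns(values, incomes):
    """Per-period return: (V_t - V_{t-1} + income_t) / V_{t-1}.

    `values` are capital values or prices; `incomes[t]` is the rent or
    dividend received during period t (the entry at index 0 is unused).
    """
    return [
        (values[t] - values[t - 1] + incomes[t]) / values[t - 1]
        for t in range(1, len(values))
    ]


def risk_and_return(returns):
    """Mean return and sample standard deviation (needs 2+ periods)."""
    return mean(returns), stdev(returns)
```

Applying the same two functions to a property series and a stock series gives directly comparable (return, risk) pairs for the higher-return, higher-risk comparison the study draws.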

Keywords: Commercial-Property, Return, Risk, Stock Market

Downloads: 5159
13458 Analysis of Lead Time Delays in Supply Chain: A Case Study

Authors: Abdel-Aziz M. Mohamed, Nermeen Coutry

Abstract:

Lead time is a critical measure of a supply chain's performance. It impacts both customer satisfaction and the total cost of inventory. This paper presents the results of a study analyzing the customer order lead time for a multinational company. In the study, the lead time was divided into three stages: order entry, order fulfillment, and order delivery. A sample of 2,425 order lines was extracted from the company's records for this study. The sample data contain information on customer orders from the time of order entry until order delivery. Data on the lead time of each stage for different orders were also provided. Summary statistics on the lead time data reveal that about 30% of the orders were delivered later than the scheduled due date. Multiple linear regression analysis revealed that component type, logistics parameter, order size and customer type have significant impacts on lead time. Data analysis of the lead time stages indicates that stage 2 consumed over 50% of the lead time. Pareto analysis was performed to study the reasons for customer order delay in each stage. Recommendations were given to resolve the problem.
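
A Pareto analysis of delay reasons amounts to counting how often each reason occurs, sorting from most to least frequent, and tracking the cumulative share. The sketch below shows that computation; the reason labels passed in are placeholders, since the abstract does not list the actual delay causes.

```python
from collections import Counter


def pareto_table(reasons):
    """Return (reason, count, cumulative percent) sorted by frequency."""
    counts = Counter(reasons).most_common()
    total = sum(count for _, count in counts)
    table, running = [], 0
    for reason, count in counts:
        running += count
        table.append((reason, count, 100.0 * running / total))
    return table
```

Reading down the cumulative column shows how few reasons account for most delays, which is where corrective effort is normally directed first.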

Keywords: Lead time reduction, customer satisfaction, service quality, statistical analysis.

Downloads: 6638
13457 Enhanced Clustering Analysis and Visualization Using Kohonen's Self-Organizing Feature Map Networks

Authors: Kasthurirangan Gopalakrishnan, Siddhartha Khaitan, Anshu Manik

Abstract:

Cluster analysis is the name given to a diverse collection of techniques that can be used to classify objects (e.g. individuals, quadrats, species, etc.). While Kohonen's Self-Organizing Feature Map (SOFM) or Self-Organizing Map (SOM) networks have been successfully applied as a classification tool in various problem domains, including speech recognition, image data compression, image or character recognition, robot control and medical diagnosis, their potential as a robust substitute for clustering analysis remains relatively unresearched. SOM networks combine competitive learning with dimensionality reduction by smoothing the clusters with respect to an a priori grid and provide a powerful tool for data visualization. In this paper, SOM is used to create a toroidal mapping of a two-dimensional lattice to perform cluster analysis on the results of a chemical analysis of wines produced in the same region in Italy but derived from three different cultivars, referred to as the "wine recognition data" located in the University of California-Irvine database. The results are encouraging, and it is believed that SOM would make an appealing and powerful decision-support system tool for clustering tasks and for data visualization.

Keywords: Artificial neural networks, cluster analysis, Kohonen maps, wine recognition.

Downloads: 2098
13456 Analysis and Comparison of Image Encryption Algorithms

Authors: İsmet Öztürk, İbrahim Soğukpınar

Abstract:

With the rapid growth of electronic data exchange, information security is becoming more important in data storage and transmission. Because images are widely used in industrial processes, it is important to protect confidential image data from unauthorized access. In this paper, we analyze current image encryption algorithms and add compression to two of them (Mirror-like image encryption and Visual Cryptography). These two algorithms have been implemented for experimental purposes. The results of the analysis are given in this paper.

Keywords: image encryption, image cryptosystem, security, transmission

Downloads: 4922
13455 Using TRACE and SNAP Codes to Establish the Model of Maanshan PWR for SBO Accident

Authors: B. R. Shen, J. R. Wang, J. H. Yang, S. W. Chen, C. Shih, Y. Chiang, Y. F. Chang, Y. H. Huang

Abstract:

In this research, the TRACE code with the interface code SNAP was used to simulate and analyze the SBO (station blackout) accident which occurred in the Maanshan PWR (pressurized water reactor) nuclear power plant (NPP). There are four main steps in this research. First, the SBO accident data of Maanshan NPP were collected. Second, the TRACE/SNAP model of Maanshan NPP was established using these data. Third, this TRACE/SNAP model was used to perform the simulation and analysis of the SBO accident. Finally, the simulation and analysis of the SBO with mitigation equipment were performed. The analysis results of TRACE are consistent with the data of Maanshan NPP. According to the TRACE predictions, the mitigation equipment of Maanshan can maintain the safety of the plant during an SBO.

Keywords: PWR, TRACE, SBO, Maanshan.

Downloads: 718
13454 Plant Varieties Selection System

Authors: Kitti Koonsanit, Chuleerat Jaruskulchai, Poonsak Miphokasap, Apisit Eiumnoh

Abstract:

Nowadays, meteorological and environmental data are widely used in applications such as plant variety selection systems. Selecting plant varieties suited to the planted area is of utmost importance for all crops, including sugarcane, which has many varieties. Varieties that are not selected for the planted area may not be adapted to its climate or soil conditions. Poor growth, bloom drop, poor fruit, and low prices result from varieties that were not recommended for the planted area. This paper presents a plant variety selection system for planted areas in Thailand that applies decision tree techniques to meteorological and environmental data. Developed as an environmental data analysis tool, the software makes analysis of results easier and faster. Our software is a front end to WEKA that provides fundamental data mining functions such as classification, clustering, and analysis. It also supports pre-processing, analysis, and decision tree output with result export. Finally, our software can export results to the Google Maps API in order to display them and plot plant icons effectively.

Keywords: Plant varieties selection system, decision tree, expert recommendation.

Downloads: 1764
13453 Flow Visualization of Angled Supersonic Jets into a Supersonic Cross Flow

Authors: Yan Shao, Jin Zhou, Lin Lai, Haiyan Wu, Jing Lei

Abstract:

This paper describes Nano-particle based Planar Laser Scattering (NPLS) flow visualization of angled supersonic jets into a supersonic cross flow, based on the HYpersonic Low TEmperature (HYLTE) nozzle, which is widely used in DF chemical lasers. In order to investigate the non-reacting flowfield in the HYLTE nozzle, a test section with windows was designed and manufactured. The impact of secondary fluid orifice separation on mixing was examined. For narrow separation of orifices, the secondary fuel penetration increased markedly compared to diluent injection, which means that smaller separation of diluent and fuel orifices would enhance the mixing of fuel and oxidant. Secondary injections with angles of 30, 40 and 50 degrees were studied. It was found that the injectant penetration increased as the injection angle increased, while the interfacial surface area available to entrain the freestream fluid was largest at an injection angle of 40 degrees.

Keywords: HYLTE nozzle, NPLS, supersonic mixing, transverse injection

Downloads: 1807
13452 TanSSe-L System PIM Manual Transformation to Moodle as a TanSSe-L System Specific PIM

Authors: Kalinga Ellen A., Bagile Burchard B.

Abstract:

The Tanzania Secondary Schools e-Learning (TanSSe-L) system is a customized learning management system (LMS) developed to enable ICT support in teaching and learning functions. The methodologies involved in the development of the TanSSe-L system are: object-oriented system analysis and design with UML, used to create and model the TanSSe-L system database structure in the form of a design class diagram; Model Driven Architecture (MDA), used to provide a well-defined process for TanSSe-L system development, in which the MDA conceptual layers were integrated with the system development life cycle; and customization of an open source learning management system, used during the implementation stage to create a functional TanSSe-L system in a timely manner. Before customization, a base for customization was prepared. This was the manual transformation from the TanSSe-L system platform independent models (PIM) to the TanSSe-L system specific PIM. This paper presents how the Moodle open source LMS was analyzed and prepared to be the TanSSe-L system specific PIM as applied by MDA.

Keywords: Customization, e-Learning, MDA Transformation, Moodle, Secondary Schools, Tanzania.

Downloads: 1990
13451 Enhancement of Biogas Production from Bakery Waste by Pseudomonas aeruginosa

Authors: S. Potivichayanon, T. Sungmon, W. Chaikongmao, S. Kamvanin

Abstract:

Production of biogas from bakery waste was enhanced by adding bacterial cells. This study was divided into 2 steps. In the first step, grease waste from a bakery industry's grease trap was initially degraded by Pseudomonas aeruginosa. The concentration of byproducts, especially glycerol, was determined, and the glycerol concentration was found to increase from 12.83% to 48.10%. In the second step, 3 biodigesters were set up with 3 different substrates: non-degraded waste in the first biodigester, degraded waste in the second biodigester, and degraded waste mixed with swine manure in a 1:1 ratio in the third biodigester. The highest concentration of biogas was found in the third biodigester, at 44.33% methane and 63.71% carbon dioxide. A lower concentration, 24.90% methane and 18.98% carbon dioxide, was observed in the second biodigester, whereas the lowest was found in the non-degraded waste biodigester. It was demonstrated that biogas production was greatly increased by the initial degradation of grease waste by Pseudomonas aeruginosa.

Keywords: Biogas production, carbon dioxide, methane, Pseudomonas aeruginosa

Downloads: 3446
13450 A Study on the Factors Affecting Student Behavior Intention to Attend Robotics Courses at the Primary and Secondary School Levels

Authors: Jingwen Shan

Abstract:

To explore the key factors affecting school students' intention to learn robot programming, this study takes the technology acceptance model as its theoretical basis and invites 167 students from Jiading District of Shanghai as research subjects. A model of school students' learning behavior in the robot course is constructed. By verifying the causal path relationships between variables, it is concluded that teachers can enhance students’ perceived usefulness of robotics courses by strengthening subjective norms and entertainment perception and by reducing technical anxiety, for example by focusing on the gradual progress of programming and analyzing learner characteristics. Students can improve perceived ease of use by enhancing self-efficacy. At the same time, robot hardware designers can optimize for entertainment and interactivity, which will directly or indirectly increase the intention to learn in the robot course. By changing these factors, the learning behavior of primary and secondary school students can be made more sustainable.

Keywords: TAM, learning behavior intentions, robot courses, primary and secondary school students.

Downloads: 607
13449 A Survey of Semantic Integration Approaches in Bioinformatics

Authors: Chaimaa Messaoudi, Rachida Fissoune, Hassan Badir

Abstract:

Technological advances in computer science and data analysis are continuously producing huge volumes of biological data, which are available on the web. Such advances require powerful data integration techniques to extract pertinent knowledge and information for a specific question. Biomedical exploration of these big data often requires complex queries across multiple autonomous, heterogeneous and distributed data sources. Semantic integration is an active area of research in several disciplines, such as databases, information integration, and ontology. We provide a survey of some approaches and techniques for integrating biological data, focusing on those developed in the ontology community.

Keywords: Semantic data integration, biological ontology, linked data, semantic web, OWL, RDF.

Downloads: 1796
13448 A Review: Comparative Analysis of Different Categorical Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Over the past years, a great deal of work has been done in data clustering research under the unsupervised learning technique in data mining. Furthermore, several algorithms and methods have been proposed focusing on clustering different data types, the representation of cluster models, and the accuracy rates of the clusters. However, no single clustering algorithm proves to be the most efficient in providing the best results. Accordingly, to address this issue, a new technique called the cluster ensemble method has emerged. The cluster ensemble is a good alternative approach to the cluster analysis problem. The main aim of the cluster ensemble is to merge different clustering solutions so as to achieve accuracy and improve on the quality of individual data clusterings. The substantial and continuing development of new methods in data mining, and the constant interest in inventing new algorithms, make it necessary to analyze critically the existing techniques and future novelties. This paper presents a comparative study of different cluster ensemble methods along with their features, systematic working processes, and the average accuracy and error rates of each ensemble method. This comprehensive analysis should be very useful for the community of clustering practitioners and help in deciding on the most suitable method for the problem at hand.
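
One common building block for merging clustering solutions, and one of this entry's keywords, is the co-association matrix: for every pair of objects, the fraction of base clusterings that place them in the same cluster. The sketch below computes it from a list of label vectors. It illustrates the general idea only and is not tied to any specific method reviewed in the paper; the function name is an assumption.

```python
def co_association(partitions):
    """Fraction of base clusterings in which each pair shares a cluster.

    `partitions` is a list of label lists, one per base clustering,
    all covering the same objects in the same order.
    """
    if not partitions:
        raise ValueError("need at least one partition")
    n = len(partitions[0])
    matrix = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        if len(labels) != n:
            raise ValueError("all partitions must label the same objects")
        for i in range(n):
            for j in range(n):
                # Only co-membership matters, not the label values.
                if labels[i] == labels[j]:
                    matrix[i][j] += 1.0
    m = len(partitions)
    return [[value / m for value in row] for row in matrix]
```

Because only co-membership is compared, the arbitrary numbering of clusters in each base solution does not matter. A consensus function can then treat the matrix as a similarity measure, for example by running hierarchical clustering on it.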

Keywords: Clustering, Cluster Ensemble methods, Co-association matrix, Consensus function, Median partition.

Downloads: 2575
13447 Adaptive Kernel Principal Analysis for Online Feature Extraction

Authors: Mingtao Ding, Zheng Tian, Haixia Xu

Abstract:

The batch nature limits the standard kernel principal component analysis (KPCA) methods in numerous applications, especially for dynamic or large-scale data. In this paper, an efficient adaptive approach is presented for online extraction of the kernel principal components (KPC). The contribution of this paper may be divided into two parts. First, the kernel covariance matrix is correctly updated to adapt to the changing characteristics of data. Second, KPC are recursively formulated to overcome the batch nature of standard KPCA. This formulation is derived from the recursive eigen-decomposition of the kernel covariance matrix and indicates the KPC variation caused by the new data. The proposed method not only alleviates sub-optimality of the KPCA method for non-stationary data, but also maintains constant update speed and memory usage as the data size increases. Experiments on simulation data and real applications demonstrate that our approach yields improvements in terms of both computational speed and approximation accuracy.

Keywords: adaptive method, kernel principal component analysis, online extraction, recursive algorithm

Downloads 1525
13446 Analysis of Textual Data Based On Multiple 2-Class Classification Models

Authors: Shigeaki Sakurai, Ryohei Orihara

Abstract:

This paper proposes a new method for analyzing textual data. The method deals with items of textual data, where each item is described from various viewpoints. It acquires 2-class classification models for the viewpoints by applying an inductive learning method to items with multiple viewpoints, and uses these models to infer whether each viewpoint should be assigned to a new item. It then extracts expressions from the new items classified into each viewpoint and identifies characteristic expressions for the viewpoints by comparing the frequency of expressions among the viewpoints. This paper also applies the method to questionnaire data given by guests at a hotel and verifies its effect through numerical experiments.
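
The idea of one 2-class model per viewpoint can be sketched as follows. This is a rough illustration only: the abstract does not name the inductive learner, so a simple word-voting scorer stands in for it, and the viewpoints, training sentences, and function names are all hypothetical.

```python
from collections import Counter


def train_binary_model(texts, labels):
    """Stand-in learner (an assumption, not the paper's method): each word
    gets +1 for every positive item containing it and -1 for every
    negative item containing it."""
    scores = Counter()
    for text, is_positive in zip(texts, labels):
        for word in set(text.lower().split()):
            scores[word] += 1 if is_positive else -1
    return scores


def predict(model, text):
    """Assign the viewpoint when the summed word scores are positive."""
    return sum(model[word] for word in set(text.lower().split())) > 0


# Hypothetical hotel questionnaire items, each tagged with one or more viewpoints.
training_items = [
    ("the room was clean and quiet", {"room"}),
    ("breakfast was cold and the coffee was weak", {"meal"}),
    ("friendly staff served a tasty breakfast", {"meal", "staff"}),
    ("the staff cleaned the room late", {"room", "staff"}),
]
viewpoints = ["room", "meal", "staff"]
texts = [text for text, _ in training_items]

# One independent 2-class model per viewpoint.
models = {
    v: train_binary_model(texts, [v in tags for _, tags in training_items])
    for v in viewpoints
}

new_item = "the breakfast coffee was tasty"
assigned = [v for v in viewpoints if predict(models[v], new_item)]
print(assigned)
```

Training the models independently lets an item receive zero, one, or several viewpoints, which a single multi-class model would not allow.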

Keywords: Text mining, Multiple viewpoints, Differential analysis, Questionnaire data

Downloads 1267
13445 A Comprehensive Review on Different Mixed Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, M. Sharmila

Abstract:

An extensive amount of work has been done in data clustering research under the unsupervised learning technique in data mining during the past two decades. Moreover, several approaches and methods have emerged that focus on clustering diverse data types, features of cluster models, and similarity rates of clusters. However, no single clustering algorithm performs best at extracting efficient clusters. Consequently, to rectify this issue, a new and challenging technique called the Cluster Ensemble method has emerged. This approach offers an alternative method for the cluster analysis problem. The main objective of the Cluster Ensemble is to aggregate diverse clustering solutions in such a way as to attain accuracy and to improve on the quality of the individual clustering algorithms. Due to the massive and rapid development of new methods in the field of data mining, it is necessary to carry out a critical analysis of existing techniques and future directions. This paper presents a comparative analysis of different cluster ensemble methods along with their methodologies and salient features. This analysis should be useful for the community of clustering experts and should help in choosing the most appropriate method for the problem at hand.

Keywords: Clustering, Cluster Ensemble Methods, Coassociation matrix, Consensus Function, Median Partition.

Downloads 2081
13444 The Analysis of the Software Industry in Thailand

Authors: Danuvasin Charoen

Abstract:

The software industry has been considered a critical infrastructure for any nation. Several studies have indicated that national competitiveness increasingly depends upon Information and Communication Technology (ICT), and software is one of the major components of ICT, important for both large and small enterprises. Even though there has been strong growth in the software industry in Thailand, the industry has faced many challenges and problems that need to be resolved. For example, the amount of pirated software has been rising, and Thailand still has a large digital divide. Additionally, adoption among SMEs has been slow. This paper investigates various issues in the software industry in Thailand, using information acquired through analysis of secondary sources, observation, and focus groups. The results of this study can be used as “lessons learned” for the development of the software industry in any developing country.

Keywords: Software industry, developing nations.

Downloads 4443
13443 Dynamical Analysis of Circadian Gene Expression

Authors: Carla Layana, Luis Diambra

Abstract:

The microarray technique allows the simultaneous measurement of the expression levels of thousands of mRNAs. By mining these data, one can identify the dynamics of the gene expression time series. Using principal component analysis (PCA), we uncover the circadian rhythmic patterns underlying the gene expression profiles of the cyanobacterium Synechocystis. We applied PCA to reduce the dimensionality of the data set. Examination of the components also provides insight into the underlying factors measured in the experiments. Our results suggest that all the rhythmic content of the data can be reduced to three main components.
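
The dimensionality-reduction step can be sketched on synthetic data as follows. This is only an illustration: the profiles are invented mixtures of one 24-hour sine and cosine, so two components suffice here, and the example does not reproduce the paper's data or its three-component result. The power-iteration solver and all names are assumptions.

```python
import math


def covariance_matrix(rows):
    """Sample covariance between columns (time points) across rows (genes)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
            for a in range(d)]


def leading_eigenpair(M, iterations=300):
    """Power iteration for the largest eigenvalue and unit eigenvector."""
    n = len(M)
    v = [1.0 / (i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n))
    return eigenvalue, v


def deflate(M, eigenvalue, v):
    """Remove an extracted component so the next one can be found."""
    n = len(M)
    return [[M[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
            for i in range(n)]


# Synthetic expression profiles sampled every 2 hours over 48 hours.
hours = range(0, 48, 2)
sin_wave = [math.sin(2 * math.pi * t / 24) for t in hours]
cos_wave = [math.cos(2 * math.pi * t / 24) for t in hours]
weights = [(2.0, 0.5), (-2.0, 0.3), (1.5, -0.4),
           (-1.5, -0.4), (1.0, 0.2), (-1.0, -0.2)]
profiles = [[a * s + b * c for s, c in zip(sin_wave, cos_wave)]
            for a, b in weights]

cov = covariance_matrix(profiles)
total_variance = sum(cov[i][i] for i in range(len(cov)))

# Fraction of the total variance captured by each of the first two components.
explained = []
for _ in range(2):
    eigenvalue, vector = leading_eigenpair(cov)
    explained.append(eigenvalue / total_variance)
    cov = deflate(cov, eigenvalue, vector)
print(explained)
```

When a handful of components account for nearly all the variance, as in this toy case, the remaining dimensions can be dropped with little loss of the rhythmic signal.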

Keywords: circadian rhythms, clustering, gene expression, PCA.

Downloads 1568
13442 An Experimental Investigation on the Droplet Behavior Impacting a Hot Surface above the Leidenfrost Temperature

Authors: Khaleel Sami Hamdan, Dong-Eok Kim, Sang-Ki Moon

Abstract:

An appropriate model for predicting the size of the droplets produced by break-up on structures will help in better understanding and modeling two-phase flow in simulations of a reactor core loss-of-coolant accident (LOCA). The behavior of droplets impacting a hot surface above the Leidenfrost temperature was investigated. Droplets of known size and velocity were impacted on an inclined hot plate, and their behavior was observed with a high-speed camera. It was found that, for droplets with a Weber number above a certain value, the higher the Weber number of the droplet, the smaller the secondary droplets. The COBRA-TF model over-predicted the secondary droplet sizes measured in the present experiment. A simple model for the secondary droplet size was proposed using the mass conservation equation. The maximum spreading diameter of the droplets was also compared with previous correlations, and fairly good agreement was found. A better prediction of heat transfer in the case of a LOCA can be obtained with the presented model.
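
For readers unfamiliar with the quantities involved, the sketch below computes a droplet Weber number from its standard definition, We = ρv²d/σ, and shows the simplest possible mass-conservation estimate of fragment size. The second function assumes equal-sized spherical fragments of constant density; it is an illustration, not the model proposed in the paper or the COBRA-TF model. The fluid properties are approximate values for water near room temperature, and the droplet size and speed are arbitrary.

```python
def weber_number(density, velocity, diameter, surface_tension):
    """We = rho * v^2 * d / sigma: ratio of inertial to surface-tension forces."""
    return density * velocity ** 2 * diameter / surface_tension


def equal_fragment_diameter(parent_diameter, fragment_count):
    """Diameter of each of N equal spherical fragments that together hold the
    parent's volume: N * d^3 = D^3, so d = D / N^(1/3).
    Assumes constant density and no evaporation (illustrative only)."""
    return parent_diameter / fragment_count ** (1.0 / 3.0)


# Assumed example values in SI units: a 1 mm water droplet moving at 2 m/s.
we = weber_number(density=998.0, velocity=2.0,
                  diameter=1.0e-3, surface_tension=0.0728)
d_secondary = equal_fragment_diameter(parent_diameter=1.0e-3, fragment_count=8)
print(we, d_secondary)
```

Under the equal-fragment assumption, the secondary diameter shrinks only with the cube root of the fragment count, so eight fragments halve the diameter.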

Keywords: Break-up, droplet, impact, inclined hot plate, Leidenfrost temperature, LOCA.

Downloads 2344
13441 Applying Hybrid Graph Drawing and Clustering Methods on Stock Investment Analysis

Authors: Mouataz Zreika, Maria Estela Varua

Abstract:

Stock investment decisions are often made based on current events in the global economy and the analysis of historical data. Visual representation could also help investors gain a deeper understanding of, and better insight into, stock market trends more efficiently. The trend analysis is based on long-term data collection. The study adopts a hybrid method that combines a clustering algorithm and a force-directed algorithm to overcome the scalability problem when visualizing large data sets. This method reveals the potential relationships between stocks and determines their degree of strength and connectivity, which gives investors another view of stock relationships for reference. Information derived from the visualization will also help them make informed decisions. The results of the experiments show that the proposed method is able to produce aesthetically pleasing visualizations, providing clearer views of connectivity and edge weights.
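
The force-directed part of such a method can be sketched as below, in the style of the Fruchterman-Reingold algorithm with attraction scaled by edge weight. The clustering step is omitted, and the stock names, correlation weights, cooling schedule, and parameter values are assumptions for illustration rather than the authors' settings.

```python
import math


def force_directed_layout(nodes, weighted_edges, iterations=200, k=1.0):
    """Weighted force-directed layout: all node pairs repel each other,
    and nodes joined by an edge attract in proportion to the edge weight."""
    n = len(nodes)
    # Deterministic start: nodes evenly spaced on a unit circle.
    pos = {v: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
           for i, v in enumerate(nodes)}
    temperature = 0.1  # maximum movement per iteration
    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}
        # Repulsion between every pair of nodes: magnitude k^2 / distance.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                force = k * k / dist
                disp[u][0] += force * dx / dist
                disp[u][1] += force * dy / dist
                disp[v][0] -= force * dx / dist
                disp[v][1] -= force * dy / dist
        # Attraction along edges: magnitude weight * distance^2 / k.
        for (u, v), weight in weighted_edges.items():
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            force = weight * dist * dist / k
            disp[u][0] -= force * dx / dist
            disp[u][1] -= force * dy / dist
            disp[v][0] += force * dx / dist
            disp[v][1] += force * dy / dist
        # Move each node along its net force, capped by the temperature.
        for v in nodes:
            length = max(math.hypot(disp[v][0], disp[v][1]), 1e-9)
            step = min(length, temperature)
            pos[v][0] += disp[v][0] / length * step
            pos[v][1] += disp[v][1] / length * step
        temperature *= 0.98  # cool down so the layout settles
    return pos


def distance(pos, u, v):
    return math.hypot(pos[u][0] - pos[v][0], pos[u][1] - pos[v][1])


# Hypothetical stocks with pairwise correlation strengths as edge weights.
stocks = ["A", "B", "C", "D"]
correlations = {("A", "B"): 0.9, ("C", "D"): 0.9, ("B", "C"): 0.1}
layout = force_directed_layout(stocks, correlations)
print(distance(layout, "A", "B"), distance(layout, "A", "D"))
```

Strongly correlated pairs end up close together and unconnected pairs drift apart, which is how the layout conveys both connectivity and edge weight. Because every pair of nodes is compared in each iteration, the cost grows quadratically with the number of nodes, which may be one reason to group nodes by clustering before laying out a large graph.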

Keywords: Clustering, force-directed, graph drawing, stock investment analysis.

Downloads 1571