Search results for: Large Data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8890

8830 Incremental Algorithm to Cluster the Categorical Data with Frequency Based Similarity Measure

Authors: S. Aranganayagi, K. Thangavel

Abstract:

Clustering categorical data is more complicated than clustering numerical data because of its special properties. Scalability and memory constraints are the main challenges in clustering large data sets. This paper presents an incremental algorithm to cluster categorical data. Frequencies of attribute values contribute much to clustering similar categorical objects. We propose new similarity measures based on the frequencies of attribute values and their cardinalities. The proposed measures and the algorithm are evaluated on data sets from the UCI data repository. The results show that the proposed method generates better clusters than the existing one.
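
A rough sketch of the idea behind such an algorithm is shown below: each object is assigned to the cluster in which its attribute values are most frequent, and a new cluster is opened when no existing one is similar enough. The measure and threshold here are simplified stand-ins for the measures proposed in the paper.

```python
from collections import defaultdict

def similarity(obj, cluster):
    """Frequency-based similarity: for each attribute, the fraction of
    cluster members sharing the object's value (an illustrative measure,
    not the paper's exact formula)."""
    score = 0.0
    for i, value in enumerate(obj):
        score += cluster["freq"][i][value] / cluster["size"]
    return score / len(obj)

def incremental_cluster(objects, threshold=0.5):
    clusters = []
    for obj in objects:                        # single pass over the data
        best, best_sim = None, 0.0
        for c in clusters:
            s = similarity(obj, c)
            if s > best_sim:
                best, best_sim = c, s
        if best is None or best_sim < threshold:
            best = {"size": 0, "freq": [defaultdict(int) for _ in obj]}
            clusters.append(best)
        best["size"] += 1                      # update counts incrementally
        for i, value in enumerate(obj):
            best["freq"][i][value] += 1
    return clusters

data = [("red", "small"), ("red", "small"), ("blue", "large"), ("red", "medium")]
print(len(incremental_cluster(data)))
```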

Keywords: Clustering, Categorical, Incremental, Frequency, Domain

8829 Weka Based Desktop Data Mining as Web Service

Authors: Sujala D. Shetty, S. Vadivel, Sakshi Vaghella

Abstract:

Data mining is the process of sifting through large volumes of data, analyzing them from different perspectives and summarizing them into useful information. One widely used desktop application for data mining is Weka, a collection of machine learning algorithms implemented in Java and open-sourced under the GNU General Public License (GPL). A web service is a software system designed to support interoperable machine-to-machine interaction over a network using SOAP messages. Unlike a desktop application, a web service is easy to upgrade, deliver and access, and requires no installation on the client system. Keeping in mind the advantages of a web service over a desktop application, this paper demonstrates how this Java-based desktop data mining application can be implemented as a web service to support data mining across the Internet.
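
A hedged sketch of the same pattern follows, with scikit-learn and a plain HTTP/JSON endpoint standing in for Weka and SOAP; the /classify route and payload shape are invented for illustration.

```python
from flask import Flask, jsonify, request
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

app = Flask(__name__)
iris = load_iris()
model = DecisionTreeClassifier().fit(iris.data, iris.target)  # built once at startup

@app.route("/classify", methods=["POST"])
def classify():
    # Expects JSON such as {"features": [5.1, 3.5, 1.4, 0.2]}
    features = request.get_json()["features"]
    label = int(model.predict([features])[0])
    return jsonify({"class": str(iris.target_names[label])})

if __name__ == "__main__":
    app.run(port=8080)  # remote clients can now mine data over the network
```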

Keywords: desktop application, Weka mining, web service

8828 Large Strain Compression-Tension Behavior of AZ31B Rolled Sheet in the Rolling Direction

Authors: A. Yazdanmehr, H. Jahed

Abstract:

Magnesium (Mg), the lightest commercially available industrial metal, makes Mg alloys attractive for light-weighting. Expanding their application to different material processing methods requires Mg properties at large strains. Several room-temperature processes, such as shot and laser peening and hole cold expansion, need compressive large-strain data. Two methods have been proposed in the literature to obtain the stress-strain curve at high strains: 1) anti-buckling guides and 2) small cubic samples. In this paper, an anti-buckling fixture is used together with digital image correlation (DIC) to obtain the compression-tension (C-T) response of AZ31B-H24 rolled sheet at strains of up to 10.5%. The effect of the anti-buckling fixture on the stress-strain curves is evaluated experimentally by comparing the results with those of compression tests on cubic samples. For the latter, a new fixture has been designed to increase the accuracy of testing cubic samples with DIC strain measurements. Results show a negligible effect of the anti-buckling fixture on the stress-strain curves, specifically at high strain values.

Keywords: Large strain, compression-tension, loading-unloading, Mg alloys.

8827 Classifying Bio-Chip Data using an Ant Colony System Algorithm

Authors: Minsoo Lee, Yearn Jeong Kim, Yun-mi Kim, Sujeung Cheong, Sookyung Song

Abstract:

Bio-chips are used for experiments on genes and contain various kinds of information, such as genes and samples. Two-dimensional bio-chips, in which one axis represents genes and the other samples, are widely used these days. Instead of experimenting with real genes, which costs a lot of money and time, bio-chips are used for biological experiments, so extracting data from bio-chips with high accuracy and finding patterns or useful information in such data is very important. Bio-chip analysis systems extract data from various kinds of bio-chips and mine the data to obtain useful information. One of the commonly used mining methods is classification. The appropriate classification algorithm varies with the data types, their characteristics and so on. Considering that bio-chip data is extremely large, an algorithm that imitates an ecosystem, such as the ant algorithm, is suitable for classification. This paper focuses on finding classification rules in bio-chip data using the Ant Colony System algorithm, which imitates an ecosystem. The developed system takes the accuracy of the discovered rules into consideration when applying them to the bio-chip data to predict classes.
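
A much-simplified, Ant-Miner-style sketch of pheromone-guided rule discovery is given below; the toy records, update rule and parameters are invented for illustration and do not reproduce the paper's Ant Colony System.

```python
import random
from collections import Counter

# Toy bio-chip-like records: ({attribute: value}, class). Illustrative only.
data = [({"g1": "hi", "g2": "lo"}, "tumor"),
        ({"g1": "hi", "g2": "hi"}, "tumor"),
        ({"g1": "lo", "g2": "lo"}, "normal"),
        ({"g1": "lo", "g2": "hi"}, "normal")]
domains = {"g1": ["hi", "lo"], "g2": ["hi", "lo"]}
tau = {(a, v): 1.0 for a, vals in domains.items() for v in vals}  # pheromone

def quality(rule):
    """Rule confidence: majority-class fraction among covered records."""
    covered = [c for rec, c in data if all(rec[a] == v for a, v in rule)]
    if not covered:
        return 0.0, None
    label, hits = Counter(covered).most_common(1)[0]
    return hits / len(covered), label

best = (0.0, None, None)
for _ in range(100):                           # one ant per iteration
    rule = [(a, random.choices(vals, [tau[(a, v)] for v in vals])[0])
            for a, vals in domains.items()]    # pheromone-biased term choice
    q, label = quality(rule)
    for term in rule:                          # evaporate, then reinforce
        tau[term] = 0.9 * tau[term] + q
    if q > best[0]:
        best = (q, rule, label)

print("IF", best[1], "THEN", best[2], "(confidence %.2f)" % best[0])
```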

Keywords: Ant Colony System, DNA chip data, Classification.

8826 A Comparison between Heterogeneous and Homogeneous Gas Flow Model in Slurry Bubble Column Reactor for Direct Synthesis of DME

Authors: Sadegh Papari, Mohammad Kazemeini, Moslem Fattahi

Abstract:

In the present study, heterogeneous and homogeneous gas flow dispersion models were developed for the simulation and optimisation of a large-scale catalytic slurry reactor for the direct synthesis of dimethyl ether (DME) from syngas and CO2, operating in the churn-turbulent regime. In the heterogeneous gas flow model the gas phase was distributed into two bubble phases, small and large, whereas in the homogeneous one the gas phase was distributed into a single large bubble phase. The results indicated that the heterogeneous gas flow model agreed better with experimental pilot-plant data than the homogeneous one.

Keywords: Modelling, Slurry bubble column, Dimethyl ether synthesis, Homogeneous gas flow, Heterogeneous gas flow

8825 Post Mining: Discovering Valid Rules from Different-Sized Data Sources

Authors: R. Nedunchezhian, K. Anbumani

Abstract:

A big organization may have multiple branches spread across different locations. Processing data from these branches becomes a huge task when innumerable transactions take place. Moreover, branches may be reluctant to forward their raw data for centralized processing but are ready to pass on their association rules. Local mining may also generate a large number of rules, and it is not practically possible for all local data sources to be of the same size. A model is proposed for discovering valid rules, i.e. highly weighted rules, from different-sized data sources. These rules can be obtained from the high-frequency rules generated by each of the data sources. A data source selection procedure is included in order to synthesize rules efficiently. Support equalization is another proposed method, which focuses on eliminating low-frequency rules at the local sites themselves, thus reducing the number of rules by a significant amount.
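
A hypothetical sketch of the synthesis step: each source forwards (rule, support) pairs, and a rule's global weight is the size-weighted average of its local supports. The weighting scheme and threshold below are stand-ins for the paper's procedure.

```python
# Each source sends (rule, support) pairs rather than raw transactions.
sources = {
    "branch_a": {"size": 10000, "rules": {("milk",): 0.40, ("bread",): 0.30}},
    "branch_b": {"size": 2000,  "rules": {("milk",): 0.10, ("eggs",): 0.50}},
}

def synthesize(sources, min_weight=0.25):
    """Keep rules whose size-weighted average support clears a threshold."""
    total = sum(s["size"] for s in sources.values())
    weights = {}
    for s in sources.values():
        for rule, support in s["rules"].items():
            weights[rule] = weights.get(rule, 0.0) + support * s["size"] / total
    return {r: w for r, w in weights.items() if w >= min_weight}

print(synthesize(sources))   # e.g. {('milk',): ~0.35, ('bread',): 0.25}
```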

Keywords: Association rules, multiple data stores, synthesizing, valid rules.

8824 Modeling and Investigation of Volume Strain at Large Deformation under Uniaxial Cyclic Loading in Semi-Crystalline Polymer

Authors: Rida B. Arieby

Abstract:

This study deals with the experimental investigation and theoretical modeling of a semi-crystalline polymeric material with a rubbery amorphous phase (HDPE) subjected to uniaxial cyclic tests with various maximum strain levels, up to large deformations. Each cycle is loaded in tension up to a certain maximum strain and then unloaded down to zero stress, for N cycles. This work focuses on measuring the volume strain due to damage phenomena during this kind of test. On the basis of the thermodynamics of relaxation processes, a constitutive model for large-strain deformation has been developed that takes the damage effect into account to predict the complex elasto-viscoelastic-viscoplastic behavior of the material. A direct comparison between the model predictions and the experimental data shows that the model accurately captures the material response. The model is also capable of predicting the volume variation caused by the damage.

Keywords: Cyclic test, large strain, semi-crystalline polymers, volume strain, thermodynamics of irreversible processes.

8823 On Methodologies for Analysing Sickness Absence Data: An Insight into a New Method

Authors: Xiaoshu Lu, Päivi Leino-Arjas, Kustaa Piha, Akseli Aittomäki, Peppiina Saastamoinen, Ossi Rahkonen, Eero Lahelma

Abstract:

Sickness absence represents a major economic and social issue. The analysis of sick leave data is a recurrent challenge for analysts because of the complexity of the data structure, which is often time-dependent, highly skewed and clumped at zero. Ignoring these features when making statistical inferences is likely to be inefficient and misleading, and traditional approaches do not address these problems. In this study, we discuss modelling methodologies in terms of statistical techniques for addressing the difficulties with sick leave data. We also introduce and demonstrate a new method by performing a longitudinal assessment of long-term absenteeism, using as a working example a large register dataset from the Helsinki Health Study of municipal employees in Finland during the period 1990-1999. We present a comparative study of model selection and a critical analysis of the temporal trends and the occurrence and degree of long-term sickness absence among municipal employees. The strengths of this working example include the large sample size over a long follow-up period, providing strong evidence in support of the new model. Our main goal is to propose a way to select an appropriate model, to introduce a new methodology for analysing sickness absence data and to demonstrate the model's applicability to complicated longitudinal data.
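
As a generic illustration of handling counts that are zero-clumped and skewed (not the mixture method introduced in the paper), a zero-inflated Poisson regression can be fitted to simulated absence data:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedPoisson

rng = np.random.default_rng(0)
n = 1000
age = rng.uniform(25, 60, n)
X = sm.add_constant(age)

# Simulated sick-leave days: a large clump at zero plus a skewed count part
structural_zero = rng.random(n) < 0.6
days = np.where(structural_zero, 0,
                rng.poisson(np.exp(1.0 + 0.02 * (age - 40))))

model = ZeroInflatedPoisson(days, X, exog_infl=X, inflation="logit")
print(model.fit(maxiter=200, disp=False).summary())
```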

Keywords: Sickness absence, longitudinal data, methodologies, mix-distribution model.

8822 Mapping Complex, Large-Scale Spiking Networks on Neural VLSI

Authors: Christian Mayr, Matthias Ehrlich, Stephan Henker, Karsten Wendt, René Schüffny

Abstract:

Traditionally, VLSI implementations of spiking neural nets have featured large neuron counts for fixed computations or small exploratory, configurable nets. This paper presents the system architecture of a large configurable neural net system employing a dedicated mapping algorithm for projecting the targeted biology-analog nets and dynamics onto the hardware with its attendant constraints.

Keywords: Large scale VLSI neural net, topology mapping, complex pulse communication.

8821 Unstructured-Data Content Search Based on Optimized EEG Signal Processing and Multi-Objective Feature Extraction

Authors: Qais M. Yousef, Yasmeen A. Alshaer

Abstract:

Over the last few years, the amount of data available across the globe has increased rapidly. This came with the emergence of recent concepts such as big data and the Internet of Things, which have furnished a suitable solution for the availability of data all over the world. However, managing this massive amount of data remains a challenge due to its wide variety of types and distribution. Consequently, locating a required file, particularly on the first attempt, is not an easy task, because many files distributed on the web have highly similar names, and the accuracy and speed of search are negatively affected. This work presents a method that uses electroencephalography (EEG) signals to locate files based on their contents. Building on the concept of natural mind-wave processing, this work analyses the mind-wave signals of different people, extracting the most appropriate features using a multi-objective metaheuristic algorithm and then classifying them using an artificial neural network to distinguish among files with similar names. The aim of this work is to provide the ability to find files based on their contents using human thoughts alone. Implementing this approach and testing it on real people proved its ability to find the desired files accurately within a noticeably shorter time and to retrieve them as the first choice for the user.
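
A simplified sketch of such a pipeline is given below, with hand-rolled band-power features and a small neural network on synthetic signals; the multi-objective metaheuristic feature selection is omitted, and all names and shapes are illustrative assumptions.

```python
import numpy as np
from scipy.signal import welch
from sklearn.neural_network import MLPClassifier

def band_powers(signal, fs=256):
    """Mean spectral power in the classic delta/theta/alpha/beta EEG bands."""
    freqs, psd = welch(signal, fs=fs, nperseg=fs)
    bands = [(0.5, 4), (4, 8), (8, 13), (13, 30)]
    return [psd[(freqs >= lo) & (freqs < hi)].mean() for lo, hi in bands]

rng = np.random.default_rng(1)
epochs = rng.standard_normal((40, 256))        # 40 one-second epochs (synthetic)
labels = rng.integers(0, 2, 40)                # which of two files was thought of

features = np.array([band_powers(e) for e in epochs])
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=3000, random_state=0)
clf.fit(features, labels)
print("training accuracy:", clf.score(features, labels))
```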

Keywords: Artificial intelligence, data contents search, human active memory, mind wave, multi-objective optimization.

8820 An Efficient Data Mining Approach on Compressed Transactions

Authors: Jia-Yu Dai, Don-Lin Yang, Jungpin Wu, Ming-Chuan Hung

Abstract:

In an era of knowledge explosion, data volumes grow rapidly day by day. Since data storage is a limited resource, how to reduce the data space required during processing becomes a challenging issue. Data compression provides a good solution, as it can lower the required space. Data mining has seen many useful applications in recent years because it helps users discover interesting knowledge in large databases. However, existing compression algorithms are not appropriate for data mining. In [1, 2], two different approaches were proposed to compress databases and then perform the data mining process. However, both lack the ability to decompress the data to their original state and to improve data mining performance. In this research a new approach called Mining Merged Transactions with the Quantification Table (M2TQT) is proposed to solve these problems. M2TQT uses the relationships among transactions to merge related transactions and builds a quantification table to prune the candidate itemsets which cannot become frequent, in order to improve the performance of mining association rules. The experiments show that M2TQT performs better than existing approaches.
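
A loose sketch of the two ingredients named above, merging duplicate transactions and pruning candidates with a quantification table of item counts, is shown below; it illustrates the idea only and is not the published M2TQT procedure.

```python
from collections import Counter
from itertools import combinations

transactions = [("a", "b"), ("a", "b"), ("a", "c"), ("b", "c"), ("a", "b", "c")]
min_support = 3

# "Merge" duplicate transactions into (itemset, multiplicity) pairs
merged = Counter(transactions)

# Quantification table: total occurrences of each single item
quant = Counter()
for items, mult in merged.items():
    for item in items:
        quant[item] += mult

# Prune: a k-itemset can only be frequent if every member item is frequent
candidates = [c for c in combinations(sorted(quant), 2)
              if all(quant[i] >= min_support for i in c)]

def support(itemset):
    return sum(m for items, m in merged.items() if set(itemset) <= set(items))

print({c: support(c) for c in candidates if support(c) >= min_support})
```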

Keywords: Association rule, data mining, merged transaction, quantification table.

8819 EPR Hiding in Medical Images for Telemedicine

Authors: K. A. Navas, S. Archana Thampy, M. Sasikumar

Abstract:

Medical image data hiding has strict constraints such as high imperceptibility, high capacity and high robustness, and achieving these three requirements simultaneously is highly cumbersome. Some work has been reported in the literature on data hiding, watermarking and steganography suitable for telemedicine applications, but none is reliable in all aspects. Electronic Patient Record (EPR) data hiding for telemedicine demands that the scheme be blind and reversible. This paper proposes a novel approach to blind reversible data hiding based on the integer wavelet transform. Experimental results show that this scheme outperforms the prior art in terms of zero BER (Bit Error Rate), higher PSNR (Peak Signal-to-Noise Ratio) and large EPR data embedding capacity, with a WPSNR (Weighted Peak Signal-to-Noise Ratio) of around 53 dB, compared with existing reversible data hiding schemes.
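
The paper's scheme is built on the integer wavelet transform; as a stand-in that still demonstrates what blind, reversible embedding means, the sketch below hides EPR bits by Tian-style difference expansion (a related but different reversible technique; overflow handling omitted).

```python
import numpy as np

def embed_pair(a, b, bit):
    """Hide one bit in a pixel pair by expanding their difference.
    Reversible by construction; overflow checks omitted in this sketch."""
    l, h = (a + b) // 2, a - b
    h = 2 * h + bit
    a2 = l + (h + 1) // 2
    return a2, a2 - h

def extract_pair(a, b):
    """Recover the hidden bit and the original pixel pair."""
    l, h = (a + b) // 2, a - b
    bit, h = h & 1, h >> 1
    a0 = l + (h + 1) // 2
    return a0, a0 - h, bit

pixels = np.array([100, 103, 57, 55], dtype=int)
epr_bits = [1, 0]                              # EPR payload
stego = []
for (a, b), bit in zip(pixels.reshape(-1, 2), epr_bits):
    stego.extend(int(p) for p in embed_pair(int(a), int(b), bit))

recovered, payload = [], []
for a, b in np.array(stego).reshape(-1, 2):
    a0, b0, bit = extract_pair(int(a), int(b))
    recovered.extend([a0, b0]); payload.append(bit)

print("stego:", stego)
print("recovered:", recovered, "payload:", payload)  # matches the originals
```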

Keywords: Biomedical imaging, Data security, Data communication, Teleconferencing.

8818 Density Clustering Based On Radius of Data (DCBRD)

Authors: A.M. Fahim, A. M. Salem, F. A. Torkey, M. A. Ramadan

Abstract:

Clustering algorithms are attractive for the task of class identification in spatial databases. However, their application to large spatial databases raises the following requirements: minimal domain knowledge to determine the input parameters, discovery of clusters with arbitrary shape, and good efficiency on large databases. The well-known clustering algorithms offer no solution to the combination of these requirements. In this paper, a density-based clustering algorithm (DCBRD) is presented that relies on knowledge acquired from the data by dividing the data space into overlapped regions. The proposed algorithm discovers clusters of arbitrary shape, requires no input parameters and uses the same definitions as the DBSCAN algorithm. We performed an experimental evaluation of its effectiveness and efficiency and compared the results with those of DBSCAN. The results of our experiments demonstrate that the proposed algorithm is significantly efficient in discovering clusters of arbitrary shape and size.
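
Since DCBRD uses the same cluster definitions as DBSCAN, the baseline it is compared against can be sketched directly with scikit-learn; DCBRD's own parameter-free region decomposition is not reproduced here.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Arbitrary-shaped clusters that centroid methods such as k-means miss
X, _ = make_moons(n_samples=400, noise=0.05, random_state=0)

labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)
print(len(set(labels) - {-1}), "clusters;", (labels == -1).sum(), "noise points")
```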

Keywords: Clustering algorithms, arbitrary shape of clusters, cluster analysis.

8817 An Intelligent Approach of Rough Set in Knowledge Discovery Databases

Authors: Hrudaya Ku. Tripathy, B. K. Tripathy, Pradip K. Das

Abstract:

Knowledge Discovery in Databases (KDD) has evolved into an important and active area of research because of the theoretical challenges and practical applications associated with the problem of discovering (or extracting) interesting and previously unknown knowledge from very large real-world databases. Rough Set Theory (RST) is a mathematical formalism for representing uncertainty that can be considered an extension of classical set theory. It has been used in many different research areas, including those related to inductive machine learning and the reduction of knowledge in knowledge-based systems. One important concept related to RST is that of a rough relation. In this paper we present the current status of research on applying rough set theory to KDD, which is helpful for handling the characteristics of real-world databases. The main aim is to show how rough sets and rough set analysis can be effectively used to extract knowledge from large databases.
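
A minimal sketch of the core rough-set construction, computing the lower and upper approximations of a concept from the indiscernibility classes, is shown below on toy data.

```python
# Objects described by one attribute; the equivalence relation groups
# objects that are indiscernible (share the same attribute value).
records = {1: "red", 2: "red", 3: "blue", 4: "blue", 5: "green"}
target = {1, 2, 3}                     # the concept to approximate

classes = {}
for obj, value in records.items():
    classes.setdefault(value, set()).add(obj)

# Lower approximation: classes certainly inside the concept.
# Upper approximation: classes possibly touching the concept.
lower = set().union(*(c for c in classes.values() if c <= target))
upper = set().union(*(c for c in classes.values() if c & target))
print(lower, upper)                    # {1, 2} and {1, 2, 3, 4}
```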

Keywords: Data mining, Data tables, Knowledge discovery in database (KDD), Rough sets.

8816 Main Cause of Children's Deaths in Indigenous Wayuu Community from Department of La Guajira: Research Developed through the Use of Data Mining

Authors: Isaura Esther Solano Núñez, David Suarez

Abstract:

The main purpose of this research is to discover what causes death in children of the Wayuu community and to analyze those results in depth in order to take corrective measures to properly control infant mortality. We consider it important to determine the reasons producing early death in this specific population, since they are the most vulnerable to high-risk environmental conditions. In this way, the government, through the competent authorities, may develop prevention policies and the right measures to keep this tragic fact from increasing. The methodology used in this investigation is data mining, which consists of obtaining and examining large amounts of data to produce new and valuable information. Through this technique it has been possible to determine that the child population is dying mostly from malnutrition. In short, this technique has been very useful for this study; it has allowed us to transform large amounts of information into a conclusive and important statement, which has made it easier to take appropriate steps to resolve a particular situation.

Keywords: Malnutrition, data mining, analytical, descriptive, population, Wayuu, indigenous.

8815 Explorative Data Mining of Constructivist Learning Experiences and Activities with Multiple Dimensions

Authors: Patrick Wessa, Bart Baesens

Abstract:

This paper discusses the use of explorative data mining tools that allow the educator to explore new relationships between reported learning experiences and actual activities, even if there are multiple dimensions with a large number of measured items. The underlying technology is based on the so-called Compendium Platform for Reproducible Computing (http://www.freestatistics.org), which was built on top of the computational R framework (http://www.wessa.net).

Keywords: Reproducible computing, data mining, explorative data analysis, compendium technology, computer assisted education

8814 Choosing R-tree or Quadtree Spatial Data Indexing in One Oracle Spatial Database System to Make Faster Showing Geographical Map in Mobile Geographical Information System Technology

Authors: Maruto Masserie Sardadi, Mohd Shafry bin Mohd Rahim, Zahabidin Jupri, Daut bin Daman

Abstract:

The latest Geographic Information System (GIS) technology makes it possible to administer the spatial components of daily "business objects" in the corporate database and to apply suitable geographic analysis efficiently in a desktop-focused application. Wireless Internet technology can be used to transfer spatial data between server and client. However, the problem with the wireless Internet is system bottlenecks that can make the data transfer process inefficient, because of the large volume of spatial data involved. Optimizing the transfer and retrieval of data is therefore an essential issue, and an appropriate choice between the R-tree and Quadtree spatial data indexing methods can optimize this process. With the rapid proliferation of spatial databases over the past decade, extensive research has been conducted on the design of efficient data structures to enable fast spatial searching, and commercial database vendors like Oracle have implemented spatial indexing to cater to large and diverse GIS applications. This paper focuses on the choice between R-tree and Quadtree spatial indexing using an Oracle spatial database in a mobile GIS application. Under our test conditions, choosing the appropriate indexing method in a single spatial database saved up to 42.5% of the time.
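
As an illustration of R-tree indexing, the sketch below uses the Python rtree package (libspatialindex) rather than Oracle Spatial, which exposes the analogous choice through its CREATE INDEX ... INDEXTYPE IS MDSYS.SPATIAL_INDEX syntax; a window query then touches only features whose bounding boxes intersect the map view.

```python
from rtree import index

idx = index.Index()
features = {1: (13.0, 1.2, 13.1, 1.3),    # (minx, miny, maxx, maxy)
            2: (13.5, 1.0, 13.6, 1.1),
            3: (20.0, 5.0, 20.1, 5.1)}
for fid, bbox in features.items():
    idx.insert(fid, bbox)

# Window query for the current map view of the mobile client
view = (12.9, 0.9, 13.7, 1.4)
print(list(idx.intersection(view)))       # -> [1, 2]
```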

Keywords: Indexing, Mobile GIS, MapViewer, Oracle Spatial Database.

8813 Conceptual Multidimensional Model

Authors: Manpreet Singh, Parvinder Singh, Suman

Abstract:

Data is available in abundance in any business organization. It includes records for finance, maintenance, inventory, progress reports, etc. As time progresses, the data keeps accumulating, and the challenge is to extract information from this data bank. Knowledge discovery from these large and complex databases is the key problem of this era. Data mining and machine learning techniques are needed that can scale to the size of the problems and can be customized to the business application. To derive accurate and relevant information for a particular problem, business analysts need to develop multidimensional models that give reliable information so that they can take the right decision for the problem. If the multidimensional model does not possess advanced features, accuracy cannot be expected. The present work involves the development of a multidimensional data model incorporating advanced features. The criterion of computation is based on data precision and includes a slowly changing time dimension. The final results are displayed in graphical form.

Keywords: Multidimensional, data precision.

8812 Analyzing Current Transformer’s Transient and Steady State Behavior for Different Burden’s Using LabVIEW Data Acquisition Tool

Authors: D. Subedi, D. Sharma

Abstract:

Current transformers (CTs) are used to transform large primary currents into a small secondary current. Since most standard equipment is not designed to handle large primary currents, CTs play an important part in any electrical system for the purposes of metering and protection, both of which are integral to the power system. Nowadays, due to advances in solid-state technology, the operating times of protective relays have come down from a few seconds to a few cycles. In such a scenario it becomes important to study the transient response of current transformers, as it plays a vital role in the operation of protective devices.

This paper shows the steady-state and transient behavior of current transformers and how it changes with the connected burden. The transient and steady-state responses are captured using the LabVIEW data acquisition software, and the analysis is done on the real-time data gathered with it. The variation of current transformer characteristics with changes in burden is discussed.

Keywords: Accuracy, Accuracy limiting factor, Burden, Current Transformer, Instrument Security factor.

8811 A New Heuristic Approach for the Stock-Cutting Problems

Authors: Stephen C. H. Leung, Defu Zhang

Abstract:

This paper addresses a stock-cutting problem with rotation of items and without the guillotine cutting constraint. In order to solve the large-scale problem effectively and efficiently, we propose a simple but fast heuristic algorithm. It is shown that this heuristic outperforms the latest published algorithms for large-scale problem instances.
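
For context, a simple construction heuristic for this problem class is shelf (level) packing with 90-degree rotation; the sketch below is such a generic baseline, not the algorithm proposed in the paper.

```python
def shelf_pack(sheet_w, items):
    """Place rectangles left to right on shelves, opening a new shelf when
    the current one is full; rotation keeps each piece's long side flat."""
    placements, shelf_y, shelf_h, x = [], 0, 0, 0
    for w, h in sorted(items, key=lambda r: -max(r)):   # big pieces first
        if h > w:
            w, h = h, w                                 # rotate 90 degrees
        if x + w > sheet_w:                             # start a new shelf
            shelf_y, x, shelf_h = shelf_y + shelf_h, 0, 0
        placements.append((x, shelf_y, w, h))
        x += w
        shelf_h = max(shelf_h, h)
    return placements, shelf_y + shelf_h                # used sheet height

layout, height = shelf_pack(sheet_w=6, items=[(4, 2), (1, 3), (2, 2), (5, 1)])
print(layout, "height:", height)
```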

Keywords: Combinatorial optimization, heuristic, large-scale, stock-cutting.

8810 Clustering Mixed Data Using Non-normal Regression Tree for Process Monitoring

Authors: Youngji Yoo, Cheong-Sool Park, Jun Seok Kim, Young-Hak Lee, Sung-Shick Kim, Jun-Geol Baek

Abstract:

In the semiconductor manufacturing process, large amounts of data are collected from the sensors of multiple facilities. The data collected from the sensors have several different characteristics due to variables such as product type, former processes and recipes. In general, Statistical Quality Control (SQC) methods assume normality of the data when detecting out-of-control states of processes. Although the collected data have different characteristics, using them directly as inputs to SQC will increase the variation of the data, require wide control limits and decrease the ability to detect out-of-control states. It is therefore necessary to separate similar data groups from the mixed data for more accurate process control. In this paper, we propose a regression tree with a split algorithm based on the Pearson distribution system to handle non-normal distributions parametrically. The regression tree finds similar properties of the data across different variables. Experiments using real semiconductor manufacturing process data show improved fault detection performance.

Keywords: Semiconductor, non-normal mixed process data, clustering, Statistical Quality Control (SQC), regression tree, Pearson distribution system.

8809 Bus Transit Demand Modeling and Fare Structure Analysis of Kabul City

Authors: Ramin Mirzada, Takuya Maruyama

Abstract:

Kabul is the heart of political, commercial, cultural, educational and social life in Afghanistan and the fifth fastest-growing city in the world. Low incomes incline most Kabul residents to use public transport, especially buses, although there is no proper bus system, and due to the wars no proper fare structure exists in Kabul city. From 1992 to 2001, during the civil wars, Kabul suffered damage and destruction of its transportation facilities, including pavements, sidewalks, traffic circles, drainage systems, traffic signs and signals, trolleybuses and almost all of the public transport system (e.g. the Millie Bus). This research is mainly focused on Kabul city's transportation system. The data used were gathered by the Japan International Cooperation Agency (JICA) in 2008 and are used here to estimate demand and fare structure; additionally, a survey was conducted in 2016 to find the satisfaction level of Kabul residents with the fare structure. The aim of this research is to estimate the demand for large buses, compare it with the actual supply from the government, analyze the current fare structure and compare it with the proposed distance-based fare structure, which has already been analyzed. The outcome of this research shows that the demand of Kabul city residents for public transport (large buses) exceeds the current supply, so the current service is not sufficient for the city; it is worth mentioning that, to overcome this problem, there is no need to build new roads or exclusive busways. This research proposes that the government change the fare from a fixed fare to a distance-based fare, invest in public transportation and increase the number of large buses so that the current demand for public transport is met.
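
The fixed versus distance-based comparison reduces to a simple tariff function; in the sketch below the fare figures are made-up placeholders, not values from the JICA data or the survey.

```python
def fixed_fare(distance_km, flat=10.0):
    """Current scheme: the same fare regardless of trip length."""
    return flat

def distance_fare(distance_km, base=5.0, per_km=1.5):
    """Proposed scheme: a boarding charge plus a per-kilometre rate."""
    return base + per_km * distance_km

for d in (2, 8, 15):
    print(d, "km:", fixed_fare(d), "vs", round(distance_fare(d), 1))
```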

Keywords: Transportation, planning, public transport, large buses, fixed fare, distance based fare, Kabul, Afghanistan.

8808 Evaluating Performance of Quality-of-Service Routing in Large Networks

Authors: V. Narasimha Raghavan, M. Venkatesh, T. Peer Meera Labbai, Praveen Dwarakanath Prabhu

Abstract:

The performance and complexity of QoS routing depend on the complex interaction between a large set of parameters. This paper investigates the scaling properties of source-directed link-state routing in large core networks. The simulation results show that the routing algorithm, network topology and link cost function each have a significant impact on the probability of successfully routing new connections. The experiments confirm and extend the findings of other studies, and also lend new insight into designing efficient quality-of-service routing policies in large networks.
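
As a concrete reference point, source-directed link-state path selection is typically a Dijkstra search over advertised link costs; a minimal sketch with a toy topology and an illustrative cost function follows.

```python
import heapq

def dijkstra(graph, src, dst):
    """Cheapest path under an additive link-cost function (e.g., hop count
    or a load-dependent cost for QoS routing)."""
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue                       # stale queue entry
        for v, cost in graph[u]:
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], dist[dst]

# Link costs could encode residual bandwidth, e.g. cost = 1 / available_bw
graph = {"A": [("B", 1.0), ("C", 0.5)], "B": [("D", 1.0)],
         "C": [("D", 2.0)], "D": []}
print(dijkstra(graph, "A", "D"))           # (['A', 'B', 'D'], 2.0)
```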

Keywords: QoS, Link-State Routing, Dijkstra, Path Selection, Path Computation.

8807 An Analysis of Genetic Algorithm Based Test Data Compression Using Modified PRL Coding

Authors: K. S. Neelukumari, K. B. Jayanthi

Abstract:

In this paper, genetic-algorithm-based test data compression is targeted at improving the compression ratio and reducing the computation time. The genetic algorithm is built on extended pattern run-length coding. The test set contains a large number of X (don't-care) values that can be effectively exploited to improve test data compression. In this coding method, a reference pattern is set and its compatibility is checked. For this process, a genetic algorithm is proposed to reduce the computation time of the encoding algorithm. This coding technique encodes the 2n compatible pattern or the inversely compatible pattern into a single test data segment or multiple test data segments. The experimental results show that the compression ratio is improved and the computation time reduced.
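
A hedged sketch of the compatibility test at the heart of pattern run-length coding is shown below; the genetic search for a good reference pattern and the paper's extended encoding format are not reproduced.

```python
def compatible(a, b, invert=False):
    """Two blocks are compatible if they agree wherever neither has an X
    (don't-care), and inversely compatible if all defined bits differ."""
    for x, y in zip(a, b):
        if "X" in (x, y):
            continue                       # don't-care matches anything
        if (x != y) != invert:
            return False
    return True

blocks = ["01XX", "010X", "10X1", "0X00"]
reference = "0100"                         # candidate the GA would evolve

for blk in blocks:
    if compatible(reference, blk):
        print(blk, "-> encoded as 'repeat reference'")
    elif compatible(reference, blk, invert=True):
        print(blk, "-> encoded as 'repeat inverted reference'")
    else:
        print(blk, "-> stored literally")
```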

Keywords: Backtracking, test data compression (TDC), x-filling, x-propagating and genetic algorithm.

8806 Data-organization Before Learning Multi-Entity Bayesian Networks Structure

Authors: H. Bouhamed, A. Rebai, T. Lecroq, M. Jaoua

Abstract:

The objective of our work is to develop a new approach for discovering knowledge from a large mass of data; the result of applying this approach will be an expert system serving as a diagnostic tool for a phenomenon related to a huge information system. We first recall the general problem of learning a Bayesian network structure from data and suggest a solution for optimizing the complexity by using organizational and optimization methods on the data. Afterwards, we propose a new heuristic for learning Multi-Entity Bayesian Network structures. We have applied our approach to biological facts concerning complex hereditary illnesses, for which the biological literature identifies the responsible variables. Finally, we conclude on the limits reached by this work.

Keywords: Data-organization, data-optimization, automatic knowledge discovery, Multi-Entities Bayesian networks, score merging.

8805 Generating Concept Trees from Dynamic Self-organizing Map

Authors: Norashikin Ahmad, Damminda Alahakoon

Abstract:

The self-organizing map (SOM) provides both clustering and visualization capabilities in data mining. Dynamic self-organizing maps such as the Growing Self-Organizing Map (GSOM) have been developed to overcome the fixed structure of the SOM and enable better representation of the discovered patterns. However, in mining large datasets or historical data, the hierarchical structure of the data is also useful for viewing cluster formation at different levels of abstraction. In this paper, we present a technique to generate concept trees from the GSOM. The formation of trees from different spread-factor values of the GSOM is also investigated and the quality of the trees analyzed. The results show that concept trees can be generated from the GSOM, thus eliminating the need to re-cluster the data from scratch to obtain a hierarchical view of the data under study.

Keywords: dynamic self-organizing map, concept formation, clustering.

8804 The First Integral Approach in Stability Problem of Large Scale Nonlinear Dynamical Systems

Authors: M. Kidouche, H. Habbi, M. Zelmat, S. Grouni

Abstract:

In analyzing large-scale nonlinear dynamical systems, it is often desirable to treat the overall system as a collection of interconnected subsystems. Solution properties of the large-scale system are then deduced from the solution properties of the individual subsystems and the nature of the interconnections. In this paper a new approach is proposed for the stability analysis of large-scale systems, based on the concept of vector Lyapunov functions and decomposition methods. The present results make use of graph-theoretic decomposition techniques in which the overall system is partitioned into a hierarchy of strongly connected components. We then show that, under very reasonable assumptions, the overall system is stable once the strongly connected subsystems are stable. Finally, an example is given to illustrate the constructive methodology proposed.
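
The comparison principle invoked here can be stated in its standard textbook form (a generic formulation, not necessarily the paper's exact theorem):

```latex
% With V = (V_1,\dots,V_N)^{\mathsf T} collecting one Lyapunov function per
% strongly connected subsystem, require along trajectories (componentwise):
\dot{V}(x) \preceq A\, V(x), \qquad a_{ij} \ge 0 \ \text{for } i \neq j .
% If the linear comparison system \dot{r} = A r is asymptotically stable
% (A Hurwitz; equivalently -A is an M-matrix), then stability of the
% subsystems implies stability of the interconnected overall system.
```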

Keywords: Comparison principle, First integral, Large scale system, Lyapunov stability.

8803 A Modified Fuzzy C-Means Algorithm for Natural Data Exploration

Authors: Binu Thomas, Raju G., Sonam Wangmo

Abstract:

In data mining, fuzzy clustering algorithms have demonstrated advantages over crisp clustering algorithms in dealing with the challenges posed by large collections of vague and uncertain natural data. This paper reviews the concepts of fuzzy logic and fuzzy clustering. The classical fuzzy c-means algorithm is presented and its limitations are highlighted. Based on a study of the fuzzy c-means algorithm and its extensions, we propose a modification of the c-means algorithm that overcomes its limitations in calculating the new cluster centers and in finding the membership values for natural data. The efficiency of the new modified method is demonstrated on real data collected for Bhutan's Gross National Happiness (GNH) program.
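
For reference, the classical fuzzy c-means loop that the paper modifies alternates center and membership updates; the NumPy sketch below implements the standard algorithm, not the proposed modification.

```python
import numpy as np

def fcm(X, c, m=2.0, iters=100, seed=0):
    """Classical fuzzy c-means: alternate center and membership updates."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # memberships sum to one
    for _ in range(iters):
        W = U ** m                             # fuzzified memberships
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1))            # u_ik proportional to d_ik^(-2/(m-1))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U

rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
centers, U = fcm(X, c=2)
print(np.round(centers, 2))                    # two centers near (0,0) and (3,3)
```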

Keywords: Adaptive fuzzy clustering, clustering, fuzzy logic, fuzzy clustering, c-means.

8802 Success Factors of Large Scale ERP Implementation in Thailand

Authors: Siriluck Rotchanakitumnuai

Abstract:

The objective of the study is to examine the determinants of success in large scale ERP implementation. The results indicate that large scale ERP implementation success consists of eight factors: project management competence, knowledge sharing, ERP system quality, understanding, user involvement, business process re-engineering, top management support and organizational readiness.

Keywords: large scale ERP, implementation success factors, Thailand

8801 A Monte Carlo Method to Data Stream Analysis

Authors: Kittisak Kerdprasop, Nittaya Kerdprasop, Pairote Sattayatham

Abstract:

Data stream analysis is the process of computing various summaries and derived values from large amounts of data which are continuously generated at a rapid rate. The nature of a stream does not allow revisiting each data element. Furthermore, data processing must be fast to produce timely analysis results. These requirements impose constraints on the design of the algorithms, which must balance correctness against timely responses. Several techniques have been proposed over the past few years to address these challenges. They can be categorized as either data-oriented or task-oriented. The data-oriented approach analyzes a subset of data or a smaller transformed representation, whereas the task-oriented scheme solves the problem directly via approximation techniques. We propose a hybrid approach to the data stream analysis problem: the stream is both statistically transformed to a smaller size and computationally approximated in its characteristics. We adopt a Monte Carlo method in the approximation step. The data reduction is performed horizontally and vertically through our EMR sampling method. The proposed method is analyzed by a series of experiments. We apply our algorithm to clustering and classification tasks to evaluate the utility of our approach.
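
The reduction step can be illustrated with a classical one-pass stream sampler; Vitter's Algorithm R below is a generic stand-in for the paper's EMR sampling, on top of which summaries such as density estimates would be computed.

```python
import random

def reservoir_sample(stream, k, seed=42):
    """Uniform k-sample over a stream in one pass (Vitter's Algorithm R)."""
    random.seed(seed)
    sample = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            sample.append(item)            # fill the reservoir first
        else:
            j = random.randrange(n)        # keep item with probability k/n
            if j < k:
                sample[j] = item
    return sample

# Summaries (e.g., a density estimate) are then computed on the sample
print(reservoir_sample(range(1_000_000), k=5))
```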

Keywords: Data stream, Monte Carlo, sampling, density estimation.
