Search results for: Data mining classification algorithms
8737 An Evaluation of Algorithms for Single-Echo Biosonar Target Classification
Authors: Turgay Temel, John Hallam
Abstract:
A recent neurospiking coding scheme for feature extraction from biosonar echoes of various plants is examined with avariety of stochastic classifiers. Feature vectors derived are employedin well-known stochastic classifiers, including nearest-neighborhood,single Gaussian and a Gaussian mixture with EM optimization.Classifiers' performances are evaluated by using cross-validation and bootstrapping techniques. It is shown that the various classifers perform equivalently and that the modified preprocessing configuration yields considerably improved results.
Keywords: Classification, neuro-spike coding, non-parametricmodel, parametric model, Gaussian mixture, EM algorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16708736 Machine Learning-Enabled Classification of Climbing Using Small Data
Authors: Nicholas Milburn, Yu Liang, Dalei Wu
Abstract:
Athlete performance scoring within the climbing domain presents interesting challenges as the sport does not have an objective way to assign skill. Assessing skill levels within any sport is valuable as it can be used to mark progress while training, and it can help an athlete choose appropriate climbs to attempt. Machine learning-based methods are popular for complex problems like this. The dataset available was composed of dynamic force data recorded during climbing; however, this dataset came with challenges such as data scarcity, imbalance, and it was temporally heterogeneous. Investigated solutions to these challenges include data augmentation, temporal normalization, conversion of time series to the spectral domain, and cross validation strategies. The investigated solutions to the classification problem included light weight machine classifiers KNN and SVM as well as the deep learning with CNN. The best performing model had an 80% accuracy. In conclusion, there seems to be enough information within climbing force data to accurately categorize climbers by skill.
Keywords: Classification, climbing, data imbalance, data scarcity, machine learning, time sequence.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5688735 Enhance the Power of Sentiment Analysis
Authors: Yu Zhang, Pedro Desouza
Abstract:
Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers with three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduced a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modeling and testing work was done in R and Greenplum in-database analytic tools.
Keywords: Sentiment Analysis, Social Media, Twitter, Amazon, Data Mining, Machine Learning, Text Mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 35188734 Applying Biosensors’ Electromyography Signals through an Artificial Neural Network to Control a Small Unmanned Aerial Vehicle
Authors: Mylena McCoggle, Shyra Wilson, Andrea Rivera, Rocio Alba-Flores, Valentin Soloiu
Abstract:
This work describes a system that uses electromyography (EMG) signals obtained from muscle sensors and an Artificial Neural Network (ANN) for signal classification and pattern recognition that is used to control a small unmanned aerial vehicle using specific arm movements. The main objective of this endeavor is the development of an intelligent interface that allows the user to control the flight of a drone beyond direct manual control. The sensor used were the MyoWare Muscle sensor which contains two EMG electrodes used to collect signals from the posterior (extensor) and anterior (flexor) forearm, and the bicep. The collection of the raw signals from each sensor was performed using an Arduino Uno. Data processing algorithms were developed with the purpose of classifying the signals generated by the arm’s muscles when performing specific movements, namely: flexing, resting, and motion of the arm. With these arm motions roll control of the drone was achieved. MATLAB software was utilized to condition the signals and prepare them for the classification. To generate the input vector for the ANN and perform the classification, the root mean square and the standard deviation were processed for the signals from each electrode. The neuromuscular information was trained using an ANN with a single 10 neurons hidden layer to categorize the four targets. The result of the classification shows that an accuracy of 97.5% was obtained. Afterwards, classification results are used to generate the appropriate control signals from the computer to the drone through a Wi-Fi network connection. These procedures were successfully tested, where the drone responded successfully in real time to the commanded inputs.
Keywords: Biosensors, electromyography, Artificial Neural Network, Arduino, drone flight control, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5568733 Selection of Appropriate Classification Technique for Lithological Mapping of Gali Jagir Area, Pakistan
Authors: Khunsa Fatima, Umar K. Khattak, Allah Bakhsh Kausar
Abstract:
Satellite images interpretation and analysis assist geologists by providing valuable information about geology and minerals of an area to be surveyed. A test site in Fatejang of district Attock has been studied using Landsat ETM+ and ASTER satellite images for lithological mapping. Five different supervised image classification techniques namely maximum likelihood, parallelepiped, minimum distance to mean, mahalanobis distance and spectral angle mapper have been performed upon both satellite data images to find out the suitable classification technique for lithological mapping in the study area. Results of these five image classification techniques were compared with the geological map produced by Geological Survey of Pakistan. Result of maximum likelihood classification technique applied on ASTER satellite image has highest correlation of 0.66 with the geological map. Field observations and XRD spectra of field samples also verified the results. A lithological map was then prepared based on the maximum likelihood classification of ASTER satellite image.
Keywords: ASTER, Landsat-ETM+, Satellite, Image classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 29208732 CART Method for Modeling the Output Power of Copper Bromide Laser
Authors: Iliycho P. Iliev, Desislava S. Voynikova, Snezhana G. Gocheva-Ilieva
Abstract:
This paper examines the available experiment data for a copper bromide vapor laser (CuBr laser), emitting at two wavelengths - 510.6 and 578.2nm. Laser output power is estimated based on 10 independent input physical parameters. A classification and regression tree (CART) model is obtained which describes 97% of data. The resulting binary CART tree specifies which input parameters influence considerably each of the classification groups. This allows for a technical assessment that indicates which of these are the most significant for the manufacture and operation of the type of laser under consideration. The predicted values of the laser output power are also obtained depending on classification. This aids the design and development processes considerably.
Keywords: Classification and regression trees (CART), Copper Bromide laser (CuBr laser), laser generation, nonparametric statistical model.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18258731 A Web Designer Agent, Based On Usage Mining Online Behavior of Visitors
Authors: Babak Abedin, Babak Sohrabi
Abstract:
Website plays a significant role in success of an e-business. It is the main start point of any organization and corporation for its customers, so it's important to customize and design it according to the visitors' preferences. Also, websites are a place to introduce services of an organization and highlight new service to the visitors and audiences. In this paper, we will use web usage mining techniques, as a new field of research in data mining and knowledge discovery, in an Iranian government website. Using the results, a framework for web content layour is proposed. An agent is designed to dynamically update and improve web links locations and layout. Then, we will explain how it is used to directly enable top managers of the organization to influence on the arrangement of web contents and also to enhance customization of web site navigation due to online users' behaviors.
Keywords: Web usage mining, website design, agent, website customization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19318730 Data Quality Enhancement with String Length Distribution
Authors: Qi Xiu, Hiromu Hota, Yohsuke Ishii, Takuya Oda
Abstract:
Recently, collectable manufacturing data are rapidly increasing. On the other hand, mega recall is getting serious as a social problem. Under such circumstances, there are increasing needs for preventing mega recalls by defect analysis such as root cause analysis and abnormal detection utilizing manufacturing data. However, the time to classify strings in manufacturing data by traditional method is too long to meet requirement of quick defect analysis. Therefore, we present String Length Distribution Classification method (SLDC) to correctly classify strings in a short time. This method learns character features, especially string length distribution from Product ID, Machine ID in BOM and asset list. By applying the proposal to strings in actual manufacturing data, we verified that the classification time of strings can be reduced by 80%. As a result, it can be estimated that the requirement of quick defect analysis can be fulfilled.Keywords: Data quality, feature selection, probability distribution, string classification, string length.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13288729 Ensemble Approach for Predicting Student's Academic Performance
Authors: L. A. Muhammad, M. S. Argungu
Abstract:
Educational data mining (EDM) has recorded substantial considerations. Techniques of data mining in one way or the other have been proposed to dig out out-of-sight knowledge in educational data. The result of the study got assists academic institutions in further enhancing their process of learning and methods of passing knowledge to students. Consequently, the performance of students boasts and the educational products are by no doubt enhanced. This study adopted a student performance prediction model premised on techniques of data mining with Students' Essential Features (SEF). SEF are linked to the learner's interactivity with the e-learning management system. The performance of the student's predictive model is assessed by a set of classifiers, viz. Bayes Network, Logistic Regression, and Reduce Error Pruning Tree (REP). Consequently, ensemble methods of Bagging, Boosting, and Random Forest (RF) are applied to improve the performance of these single classifiers. The study reveals that the result shows a robust affinity between learners' behaviors and their academic attainment. Result from the study shows that the REP Tree and its ensemble record the highest accuracy of 83.33% using SEF. Hence, in terms of the Receiver Operating Curve (ROC), boosting method of REP Tree records 0.903, which is the best. This result further demonstrates the dependability of the proposed model.
Keywords: Ensemble, bagging, Random Forest, boosting, data mining, classifiers, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 7658728 Classification of Right and Left-Hand Movement Using Multi-Resolution Analysis Method
Authors: Nebi Gedik
Abstract:
The aim of the brain-computer interface studies on electroencephalogram (EEG) signals containing motor imagery is to extract the effective features that will provide the highest possible classification accuracy for the detection of the desired motor movement. However, achieving this goal is difficult as the most suitable frequency band and time frame vary from subject to subject. In this study, the classification success of the two-feature data obtained from raw EEG signals and the coefficients of the multi-resolution analysis method applied to the EEG signals were analyzed comparatively. The method was applied to several EEG channels (C3, Cz and C4) signals obtained from the EEG data set belonging to the publicly available BCI competition III.
Keywords: Motor imagery, EEG, wave atom transform, k-NN.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5908727 FCNN-MR: A Parallel Instance Selection Method Based on Fast Condensed Nearest Neighbor Rule
Authors: Lu Si, Jie Yu, Shasha Li, Jun Ma, Lei Luo, Qingbo Wu, Yongqi Ma, Zhengji Liu
Abstract:
Instance selection (IS) technique is used to reduce the data size to improve the performance of data mining methods. Recently, to process very large data set, several proposed methods divide the training set into some disjoint subsets and apply IS algorithms independently to each subset. In this paper, we analyze the limitation of these methods and give our viewpoint about how to divide and conquer in IS procedure. Then, based on fast condensed nearest neighbor (FCNN) rule, we propose a large data sets instance selection method with MapReduce framework. Besides ensuring the prediction accuracy and reduction rate, it has two desirable properties: First, it reduces the work load in the aggregation node; Second and most important, it produces the same result with the sequential version, which other parallel methods cannot achieve. We evaluate the performance of FCNN-MR on one small data set and two large data sets. The experimental results show that it is effective and practical.Keywords: Instance selection, data reduction, MapReduce, kNN.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10188726 Investigation on Bio-Inspired Population Based Metaheuristic Algorithms for Optimization Problems in Ad Hoc Networks
Authors: C. Rajan, K. Geetha, C. Rasi Priya, R. Sasikala
Abstract:
Nature is a great source of inspiration for solving complex problems in networks. It helps to find the optimal solution. Metaheuristic algorithm is one of the nature-inspired algorithm which helps in solving routing problem in networks. The dynamic features, changing of topology frequently and limited bandwidth make the routing, challenging in MANET. Implementation of appropriate routing algorithms leads to the efficient transmission of data in mobile ad hoc networks. The algorithms that are inspired by the principles of naturally-distributed/collective behavior of social colonies have shown excellence in dealing with complex optimization problems. Thus some of the bio-inspired metaheuristic algorithms help to increase the efficiency of routing in ad hoc networks. This survey work presents the overview of bio-inspired metaheuristic algorithms which support the efficiency of routing in mobile ad hoc networks.
Keywords: Ant colony optimization algorithm, Genetic algorithm, naturally inspired algorithms and particle swarm optimization algorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 36108725 A Cumulative Learning Approach to Data Mining Employing Censored Production Rules (CPRs)
Authors: Rekha Kandwal, Kamal K.Bharadwaj
Abstract:
Knowledge is indispensable but voluminous knowledge becomes a bottleneck for efficient processing. A great challenge for data mining activity is the generation of large number of potential rules as a result of mining process. In fact sometimes result size is comparable to the original data. Traditional data mining pruning activities such as support do not sufficiently reduce the huge rule space. Moreover, many practical applications are characterized by continual change of data and knowledge, thereby making knowledge voluminous with each change. The most predominant representation of the discovered knowledge is the standard Production Rules (PRs) in the form If P Then D. Michalski & Winston proposed Censored Production Rules (CPRs), as an extension of production rules, that exhibit variable precision and supports an efficient mechanism for handling exceptions. A CPR is an augmented production rule of the form: If P Then D Unless C, where C (Censor) is an exception to the rule. Such rules are employed in situations in which the conditional statement 'If P Then D' holds frequently and the assertion C holds rarely. By using a rule of this type we are free to ignore the exception conditions, when the resources needed to establish its presence, are tight or there is simply no information available as to whether it holds or not. Thus the 'If P Then D' part of the CPR expresses important information while the Unless C part acts only as a switch changes the polarity of D to ~D. In this paper a scheme based on Dempster-Shafer Theory (DST) interpretation of a CPR is suggested for discovering CPRs from the discovered flat PRs. The discovery of CPRs from flat rules would result in considerable reduction of the already discovered rules. The proposed scheme incrementally incorporates new knowledge and also reduces the size of knowledge base considerably with each episode. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested cumulative learning scheme would be useful in mining data streams.
Keywords: Censored production rules, cumulative learning, data mining, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14858724 Visualization of Searching and Sorting Algorithms
Authors: Bremananth R, Radhika.V, Thenmozhi.S
Abstract:
Sequences of execution of algorithms in an interactive manner using multimedia tools are employed in this paper. It helps to realize the concept of fundamentals of algorithms such as searching and sorting method in a simple manner. Visualization gains more attention than theoretical study and it is an easy way of learning process. We propose methods for finding runtime sequence of each algorithm in an interactive way and aims to overcome the drawbacks of the existing character systems. System illustrates each and every step clearly using text and animation. Comparisons of its time complexity have been carried out and results show that our approach provides better perceptive of algorithms.Keywords: Algorithms, Searching, Sorting, Visualization.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21148723 Post Mining- Discovering Valid Rules from Different Sized Data Sources
Authors: R. Nedunchezhian, K. Anbumani
Abstract:
A big organization may have multiple branches spread across different locations. Processing of data from these branches becomes a huge task when innumerable transactions take place. Also, branches may be reluctant to forward their data for centralized processing but are ready to pass their association rules. Local mining may also generate a large amount of rules. Further, it is not practically possible for all local data sources to be of the same size. A model is proposed for discovering valid rules from different sized data sources where the valid rules are high weighted rules. These rules can be obtained from the high frequency rules generated from each of the data sources. A data source selection procedure is considered in order to efficiently synthesize rules. Support Equalization is another method proposed which focuses on eliminating low frequency rules at the local sites itself thus reducing the rules by a significant amount.
Keywords: Association rules, multiple data stores, synthesizing, valid rules.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14048722 Artificial Intelligence Applications in Aggregate Quarries: A Reality
Authors: J. E. Ortiz, P. Plaza, J. Herrero, I. Cabria, J. L. Blanco, J. Gavilanes, J. I. Escavy, I. López-Cilla, V. Yagüe, C. Pérez, S. Rodríguez, J. Rico, C. Serrano, J. Bernat
Abstract:
The development of Artificial Intelligence services in mining processes, specifically in aggregate quarries, is facilitating automation and improving numerous aspects of operations. Ultimately, AI is transforming the mining industry by improving efficiency, safety and sustainability. With the ability to analyze large amounts of data and make autonomous decisions, AI offers great opportunities to optimize mining operations and maximize the economic and social benefits of this vital industry. Within the framework of the European DIGIECOQUARRY project, various services were developed for the identification of material quality, production estimation, detection of anomalies and prediction of consumption and production automatically with good results.
Keywords: Aggregates, artificial intelligence, automatization, mining operations.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 328721 Text Mining of Twitter Data Using a Latent Dirichlet Allocation Topic Model and Sentiment Analysis
Authors: Sidi Yang, Haiyi Zhang
Abstract:
Twitter is a microblogging platform, where millions of users daily share their attitudes, views, and opinions. Using a probabilistic Latent Dirichlet Allocation (LDA) topic model to discern the most popular topics in the Twitter data is an effective way to analyze a large set of tweets to find a set of topics in a computationally efficient manner. Sentiment analysis provides an effective method to show the emotions and sentiments found in each tweet and an efficient way to summarize the results in a manner that is clearly understood. The primary goal of this paper is to explore text mining, extract and analyze useful information from unstructured text using two approaches: LDA topic modelling and sentiment analysis by examining Twitter plain text data in English. These two methods allow people to dig data more effectively and efficiently. LDA topic model and sentiment analysis can also be applied to provide insight views in business and scientific fields.
Keywords: Text mining, Twitter, topic model, sentiment analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18088720 Class Outliers Mining: Distance-Based Approach
Authors: Nabil M. Hewahi, Motaz K. Saad
Abstract:
In large datasets, identifying exceptional or rare cases with respect to a group of similar cases is considered very significant problem. The traditional problem (Outlier Mining) is to find exception or rare cases in a dataset irrespective of the class label of these cases, they are considered rare events with respect to the whole dataset. In this research, we pose the problem that is Class Outliers Mining and a method to find out those outliers. The general definition of this problem is “given a set of observations with class labels, find those that arouse suspicions, taking into account the class labels". We introduce a novel definition of Outlier that is Class Outlier, and propose the Class Outlier Factor (COF) which measures the degree of being a Class Outlier for a data object. Our work includes a proposal of a new algorithm towards mining of the Class Outliers, presenting experimental results applied on various domains of real world datasets and finally a comparison study with other related methods is performed.Keywords: Class Outliers, Distance-Based Approach, Outliers Mining.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 33888719 Variational EM Inference Algorithm for Gaussian Process Classification Model with Multiclass and Its Application to Human Action Classification
Authors: Wanhyun Cho, Soonja Kang, Sangkyoon Kim, Soonyoung Park
Abstract:
In this paper, we propose the variational EM inference algorithm for the multi-class Gaussian process classification model that can be used in the field of human behavior recognition. This algorithm can drive simultaneously both a posterior distribution of a latent function and estimators of hyper-parameters in a Gaussian process classification model with multiclass. Our algorithm is based on the Laplace approximation (LA) technique and variational EM framework. This is performed in two steps: called expectation and maximization steps. First, in the expectation step, using the Bayesian formula and LA technique, we derive approximately the posterior distribution of the latent function indicating the possibility that each observation belongs to a certain class in the Gaussian process classification model. Second, in the maximization step, using a derived posterior distribution of latent function, we compute the maximum likelihood estimator for hyper-parameters of a covariance matrix necessary to define prior distribution for latent function. These two steps iteratively repeat until a convergence condition satisfies. Moreover, we apply the proposed algorithm with human action classification problem using a public database, namely, the KTH human action data set. Experimental results reveal that the proposed algorithm shows good performance on this data set.
Keywords: Bayesian rule, Gaussian process classification model with multiclass, Gaussian process prior, human action classification, laplace approximation, variational EM algorithm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17598718 Pose Normalization Network for Object Classification
Authors: Bingquan Shen
Abstract:
Convolutional Neural Networks (CNN) have demonstrated their effectiveness in synthesizing 3D views of object instances at various viewpoints. Given the problem where one have limited viewpoints of a particular object for classification, we present a pose normalization architecture to transform the object to existing viewpoints in the training dataset before classification to yield better classification performance. We have demonstrated that this Pose Normalization Network (PNN) can capture the style of the target object and is able to re-render it to a desired viewpoint. Moreover, we have shown that the PNN improves the classification result for the 3D chairs dataset and ShapeNet airplanes dataset when given only images at limited viewpoint, as compared to a CNN baseline.Keywords: Convolutional neural networks, object classification, pose normalization, viewpoint invariant.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11208717 A New History Based Method to Handle the Recurring Concept Shifts in Data Streams
Authors: Hossein Morshedlou, Ahmad Abdollahzade Barforoush
Abstract:
Recent developments in storage technology and networking architectures have made it possible for broad areas of applications to rely on data streams for quick response and accurate decision making. Data streams are generated from events of real world so existence of associations, which are among the occurrence of these events in real world, among concepts of data streams is logical. Extraction of these hidden associations can be useful for prediction of subsequent concepts in concept shifting data streams. In this paper we present a new method for learning association among concepts of data stream and prediction of what the next concept will be. Knowing the next concept, an informed update of data model will be possible. The results of conducted experiments show that the proposed method is proper for classification of concept shifting data streams.Keywords: Data Stream, Classification, Concept Shift, History.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12788716 Generating Concept Trees from Dynamic Self-organizing Map
Authors: Norashikin Ahmad, Damminda Alahakoon
Abstract:
Self-organizing map (SOM) provides both clustering and visualization capabilities in mining data. Dynamic self-organizing maps such as Growing Self-organizing Map (GSOM) has been developed to overcome the problem of fixed structure in SOM to enable better representation of the discovered patterns. However, in mining large datasets or historical data the hierarchical structure of the data is also useful to view the cluster formation at different levels of abstraction. In this paper, we present a technique to generate concept trees from the GSOM. The formation of tree from different spread factor values of GSOM is also investigated and the quality of the trees analyzed. The results show that concept trees can be generated from GSOM, thus, eliminating the need for re-clustering of the data from scratch to obtain a hierarchical view of the data under study.
Keywords: dynamic self-organizing map, concept formation, clustering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14598715 Lean Models Classification: Towards a Holistic View
Authors: Y. Tiamaz, N. Souissi
Abstract:
The purpose of this paper is to present a classification of Lean models which aims to capture all the concepts related to this approach and thus facilitate its implementation. This classification allows the identification of the most relevant models according to several dimensions. From this perspective, we present a review and an analysis of Lean models literature and we propose dimensions for the classification of the current proposals while respecting among others the axes of the Lean approach, the maturity of the models as well as their application domains. This classification allowed us to conclude that researchers essentially consider the Lean approach as a toolbox also they design their models to solve problems related to a specific environment. Since Lean approach is no longer intended only for the automotive sector where it was invented, but to all fields (IT, Hospital, ...), we consider that this approach requires a generic model that is capable of being implemented in all areas.
Keywords: Lean approach, lean models, classification, dimensions, holistic view.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12498714 A Rigid Point Set Registration of Remote Sensing Images Based on Genetic Algorithms and Hausdorff Distance
Authors: F. Meskine, N. Taleb, M. Chikr El-Mezouar, K. Kpalma, A. Almhdie
Abstract:
Image registration is the process of establishing point by point correspondence between images obtained from a same scene. This process is very useful in remote sensing, medicine, cartography, computer vision, etc. Then, the task of registration is to place the data into a common reference frame by estimating the transformations between the data sets. In this work, we develop a rigid point registration method based on the application of genetic algorithms and Hausdorff distance. First, we extract the feature points from both images based on the algorithm of global and local curvature corner. After refining the feature points, we use Hausdorff distance as similarity measure between the two data sets and for optimizing the search space we use genetic algorithms to achieve high computation speed for its inertial parallel. The results show the efficiency of this method for registration of satellite images.Keywords: Feature extraction, Genetic algorithms, Hausdorff distance, Image registration, Point registration.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19318713 What the Future Holds for Social Media Data Analysis
Authors: P. Wlodarczak, J. Soar, M. Ally
Abstract:
The dramatic rise in the use of Social Media (SM) platforms such as Facebook and Twitter provide access to an unprecedented amount of user data. Users may post reviews on products and services they bought, write about their interests, share ideas or give their opinions and views on political issues. There is a growing interest in the analysis of SM data from organisations for detecting new trends, obtaining user opinions on their products and services or finding out about their online reputations. A recent research trend in SM analysis is making predictions based on sentiment analysis of SM. Often indicators of historic SM data are represented as time series and correlated with a variety of real world phenomena like the outcome of elections, the development of financial indicators, box office revenue and disease outbreaks. This paper examines the current state of research in the area of SM mining and predictive analysis and gives an overview of the analysis methods using opinion mining and machine learning techniques.
Keywords: Social Media, text mining, knowledge discovery, predictive analysis, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 38498712 Selective Mutation for Genetic Algorithms
Authors: Sung Hoon Jung
Abstract:
In this paper, we propose a selective mutation method for improving the performances of genetic algorithms. In selective mutation, individuals are first ranked and then additionally mutated one bit in a part of their strings which is selected corresponding to their ranks. This selective mutation helps genetic algorithms to fast approach the global optimum and to quickly escape local optima. This results in increasing the performances of genetic algorithms. We measured the effects of selective mutation with four function optimization problems. It was found from extensive experiments that the selective mutation can significantly enhance the performances of genetic algorithms.Keywords: Genetic algorithm, selective mutation, function optimization
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18368711 Cloud Computing Support for Diagnosing Researches
Authors: A. Amirov, O. Gerget, V. Kochegurov
Abstract:
One of the main biomedical problem lies in detecting dependencies in semi structured data. Solution includes biomedical portal and algorithms (integral rating health criteria, multidimensional data visualization methods). Biomedical portal allows to process diagnostic and research data in parallel mode using Microsoft System Center 2012, Windows HPC Server cloud technologies. Service does not allow user to see internal calculations instead it provides practical interface. When data is sent for processing user may track status of task and will achieve results as soon as computation is completed. Service includes own algorithms and allows diagnosing and predicating medical cases. Approved methods are based on complex system entropy methods, algorithms for determining the energy patterns of development and trajectory models of biological systems and logical–probabilistic approach with the blurring of images.
Keywords: Biomedical portal, cloud computing, diagnostic and prognostic research, mathematical data analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16448710 Developing Structured Sizing Systems for Manufacturing Ready-Made Garments of Indian Females Using Decision Tree-Based Data Mining
Authors: Hina Kausher, Sangita Srivastava
Abstract:
In India, there is a lack of standard, systematic sizing approach for producing readymade garments. Garments manufacturing companies use their own created size tables by modifying international sizing charts of ready-made garments. The purpose of this study is to tabulate the anthropometric data which cover the variety of figure proportions in both height and girth. 3,000 data have been collected by an anthropometric survey undertaken over females between the ages of 16 to 80 years from the some states of India to produce the sizing system suitable for clothing manufacture and retailing. The data are used for the statistical analysis of body measurements, the formulation of sizing systems and body measurements tables. Factor analysis technique is used to filter the control body dimensions from the large number of variables. Decision tree-based data mining is used to cluster the data. The standard and structured sizing system can facilitate pattern grading and garment production. Moreover, it can exceed buying ratios and upgrade size allocations to retail segments.Keywords: Anthropometric data, data mining, decision tree, garments manufacturing, ready-made garments, sizing systems.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9638709 An Efficient Obstacle Detection Algorithm Using Colour and Texture
Authors: Chau Nguyen Viet, Ian Marshall
Abstract:
This paper presents a new classification algorithm using colour and texture for obstacle detection. Colour information is computationally cheap to learn and process. However in many cases, colour alone does not provide enough information for classification. Texture information can improve classification performance but usually comes at an expensive cost. Our algorithm uses both colour and texture features but texture is only needed when colour is unreliable. During the training stage, texture features are learned specifically to improve the performance of a colour classifier. The algorithm learns a set of simple texture features and only the most effective features are used in the classification stage. Therefore our algorithm has a very good classification rate while is still fast enough to run on a limited computer platform. The proposed algorithm was tested with a challenging outdoor image set. Test result shows the algorithm achieves a much better trade-off between classification performance and efficiency than a typical colour classifier.
Keywords: Colour, texture, classification, obstacle detection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18238708 Density Clustering Based On Radius of Data (DCBRD)
Authors: A.M. Fahim, A. M. Salem, F. A. Torkey, M. A. Ramadan
Abstract:
Clustering algorithms are attractive for the task of class identification in spatial databases. However, the application to large spatial databases rises the following requirements for clustering algorithms: minimal requirements of domain knowledge to determine the input parameters, discovery of clusters with arbitrary shape and good efficiency on large databases. The well-known clustering algorithms offer no solution to the combination of these requirements. In this paper, a density based clustering algorithm (DCBRD) is presented, relying on a knowledge acquired from the data by dividing the data space into overlapped regions. The proposed algorithm discovers arbitrary shaped clusters, requires no input parameters and uses the same definitions of DBSCAN algorithm. We performed an experimental evaluation of the effectiveness and efficiency of it, and compared this results with that of DBSCAN. The results of our experiments demonstrate that the proposed algorithm is significantly efficient in discovering clusters of arbitrary shape and size.
Keywords: Clustering Algorithms, Arbitrary Shape of clusters, cluster Analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1876