Search results for: Pattern classification.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1939

Search results for: Pattern classification.

1729 Satellite Data Classification Accuracy Assessment Based from Reference Dataset

Authors: Mohd Hasmadi Ismail, Kamaruzaman Jusoff

Abstract:

In order to develop forest management strategies in tropical forest in Malaysia, surveying the forest resources and monitoring the forest area affected by logging activities is essential. There are tremendous effort has been done in classification of land cover related to forest resource management in this country as it is a priority in all aspects of forest mapping using remote sensing and related technology such as GIS. In fact classification process is a compulsory step in any remote sensing research. Therefore, the main objective of this paper is to assess classification accuracy of classified forest map on Landsat TM data from difference number of reference data (200 and 388 reference data). This comparison was made through observation (200 reference data), and interpretation and observation approaches (388 reference data). Five land cover classes namely primary forest, logged over forest, water bodies, bare land and agricultural crop/mixed horticultural can be identified by the differences in spectral wavelength. Result showed that an overall accuracy from 200 reference data was 83.5 % (kappa value 0.7502459; kappa variance 0.002871), which was considered acceptable or good for optical data. However, when 200 reference data was increased to 388 in the confusion matrix, the accuracy slightly improved from 83.5% to 89.17%, with Kappa statistic increased from 0.7502459 to 0.8026135, respectively. The accuracy in this classification suggested that this strategy for the selection of training area, interpretation approaches and number of reference data used were importance to perform better classification result.

Keywords: Image Classification, Reference Data, Accuracy Assessment, Kappa Statistic, Forest Land Cover

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3141
1728 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data

Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad

Abstract:

Advances in spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars, and sensors consists of important geographical information that can be used for remote sensing applications such as region planning, disaster management. Spatial data classification and object recognition are important tasks for many applications. However, classifying objects and identifying them manually from images is a difficult task. Object recognition is often considered as a classification problem, this task can be performed using machine-learning techniques. Despite of many machine-learning algorithms, the classification is done using supervised classifiers such as Support Vector Machines (SVM) as the area of interest is known. We proposed a classification method, which considers neighboring pixels in a region for feature extraction and it evaluates classifications precisely according to neighboring classes for semantic interpretation of region of interest (ROI). A dataset has been created for training and testing purpose; we generated the attributes by considering pixel intensity values and mean values of reflectance. We demonstrated the benefits of using knowledge discovery and data-mining techniques, which can be on image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.

Keywords: Remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2055
1727 Classifying and Predicting Efficiencies Using Interval DEA Grid Setting

Authors: Yiannis G. Smirlis

Abstract:

The classification and the prediction of efficiencies in Data Envelopment Analysis (DEA) is an important issue, especially in large scale problems or when new units frequently enter the under-assessment set. In this paper, we contribute to the subject by proposing a grid structure based on interval segmentations of the range of values for the inputs and outputs. Such intervals combined, define hyper-rectangles that partition the space of the problem. This structure, exploited by Interval DEA models and a dominance relation, acts as a DEA pre-processor, enabling the classification and prediction of efficiency scores, without applying any DEA models.

Keywords: Data envelopment analysis, interval DEA, efficiency classification, efficiency prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 938
1726 Statistics over Lyapunov Exponents for Feature Extraction: Electroencephalographic Changes Detection Case

Authors: Elif Derya UBEYLI, Inan GULER

Abstract:

A new approach based on the consideration that electroencephalogram (EEG) signals are chaotic signals was presented for automated diagnosis of electroencephalographic changes. This consideration was tested successfully using the nonlinear dynamics tools, like the computation of Lyapunov exponents. This paper presented the usage of statistics over the set of the Lyapunov exponents in order to reduce the dimensionality of the extracted feature vectors. Since classification is more accurate when the pattern is simplified through representation by important features, feature extraction and selection play an important role in classifying systems such as neural networks. Multilayer perceptron neural network (MLPNN) architectures were formulated and used as basis for detection of electroencephalographic changes. Three types of EEG signals (EEG signals recorded from healthy volunteers with eyes open, epilepsy patients in the epileptogenic zone during a seizure-free interval, and epilepsy patients during epileptic seizures) were classified. The selected Lyapunov exponents of the EEG signals were used as inputs of the MLPNN trained with Levenberg- Marquardt algorithm. The classification results confirmed that the proposed MLPNN has potential in detecting the electroencephalographic changes.

Keywords: Chaotic signal, Electroencephalogram (EEG) signals, Feature extraction/selection, Lyapunov exponents

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2510
1725 Active Segment Selection Method in EEG Classification Using Fractal Features

Authors: Samira Vafaye Eslahi

Abstract:

BCI (Brain Computer Interface) is a communication machine that translates brain massages to computer commands. These machines with the help of computer programs can recognize the tasks that are imagined. Feature extraction is an important stage of the process in EEG classification that can effect in accuracy and the computation time of processing the signals. In this study we process the signal in three steps of active segment selection, fractal feature extraction, and classification. One of the great challenges in BCI applications is to improve classification accuracy and computation time together. In this paper, we have used student’s 2D sample t-statistics on continuous wavelet transforms for active segment selection to reduce the computation time. In the next level, the features are extracted from some famous fractal dimension estimation of the signal. These fractal features are Katz and Higuchi. In the classification stage we used ANFIS (Adaptive Neuro-Fuzzy Inference System) classifier, FKNN (Fuzzy K-Nearest Neighbors), LDA (Linear Discriminate Analysis), and SVM (Support Vector Machines). We resulted that active segment selection method would reduce the computation time and Fractal dimension features with ANFIS analysis on selected active segments is the best among investigated methods in EEG classification.

Keywords: EEG, Student’s t- statistics, BCI, Fractal Features, ANFIS, FKNN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2120
1724 Comparing and Combining the Axial with the Network Maps for Analyzing Urban Street Pattern

Authors: Nophaket Napong

Abstract:

Rooted in the study of social functioning of space in architecture, Space Syntax (SS) and the more recent Network Pattern (NP) researches demonstrate the 'spatial structures' of city, i.e. the hierarchical patterns of streets, junctions and alley ends. Applying SS and NP models, planners can conceptualize the real city-s patterns. Although, both models yield the optimal path of the city their underpinning displays of the city-s spatial configuration differ. The Axial Map analyzes the topological non-distance-based connectivity structure, whereas, the Central-Node Map and the Shortcut-Path Map, in contrast, analyze the metrical distance-based structures. This research contrasts and combines them to understand various forms of city-s structures. It concludes that, while they reveal different spatial structures, Space Syntax and Network Pattern urban models support each the other. Combining together they simulate the global access and the locally compact structures namely the central nodes and the shortcuts for the city.

Keywords: Street pattern, space syntax, syntactic and metrical models, network pattern models.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1461
1723 Development of Fake News Model Using Machine Learning through Natural Language Processing

Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini

Abstract:

Fake news detection research is still in the early stage as this is a relatively new phenomenon in the interest raised by society. Machine learning helps to solve complex problems and to build AI systems nowadays and especially in those cases where we have tacit knowledge or the knowledge that is not known. We used machine learning algorithms and for identification of fake news; we applied three classifiers; Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification is not completely correct in fake news detection because classification methods are not specialized for fake news. With the integration of machine learning and text-based processing, we can detect fake news and build classifiers that can classify the news data. Text classification mainly focuses on extracting various features of text and after that incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake due to the unavailability of corpora. We applied three different machine learning classifiers on two publicly available datasets. Experimental analysis based on the existing dataset indicates a very encouraging and improved performance.

Keywords: Fake news detection, types of fake news, machine learning, natural language processing, classification techniques.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1514
1722 Decomposition Method for Neural Multiclass Classification Problem

Authors: H. El Ayech, A. Trabelsi

Abstract:

In this article we are going to discuss the improvement of the multi classes- classification problem using multi layer Perceptron. The considered approach consists in breaking down the n-class problem into two-classes- subproblems. The training of each two-class subproblem is made independently; as for the phase of test, we are going to confront a vector that we want to classify to all two classes- models, the elected class will be the strongest one that won-t lose any competition with the other classes. Rates of recognition gotten with the multi class-s approach by two-class-s decomposition are clearly better that those gotten by the simple multi class-s approach.

Keywords: Artificial neural network, letter-recognition, Multi class Classification, Multi Layer Perceptron.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1572
1721 Stochastic Modeling and Combined Spatial Pattern Analysis of Epidemic Spreading

Authors: S. Chadsuthi, W. Triampo, C. Modchang, P. Kanthang, D. Triampo, N. Nuttavut

Abstract:

We present analysis of spatial patterns of generic disease spread simulated by a stochastic long-range correlation SIR model, where individuals can be infected at long distance in a power law distribution. We integrated various tools, namely perimeter, circularity, fractal dimension, and aggregation index to characterize and investigate spatial pattern formations. Our primary goal was to understand for a given model of interest which tool has an advantage over the other and to what extent. We found that perimeter and circularity give information only for a case of strong correlation– while the fractal dimension and aggregation index exhibit the growth rule of pattern formation, depending on the degree of the correlation exponent (β). The aggregation index method used as an alternative method to describe the degree of pathogenic ratio (α). This study may provide a useful approach to characterize and analyze the pattern formation of epidemic spreading

Keywords: spatial pattern epidemics, aggregation index, fractaldimension, stochastic, long-rang epidemics

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1676
1720 An Analysis of Genetic Algorithm Based Test Data Compression Using Modified PRL Coding

Authors: K. S. Neelukumari, K. B. Jayanthi

Abstract:

In this paper genetic based test data compression is targeted for improving the compression ratio and for reducing the computation time. The genetic algorithm is based on extended pattern run-length coding. The test set contains a large number of X value that can be effectively exploited to improve the test data compression. In this coding method, a reference pattern is set and its compatibility is checked. For this process, a genetic algorithm is proposed to reduce the computation time of encoding algorithm. This coding technique encodes the 2n compatible pattern or the inversely compatible pattern into a single test data segment or multiple test data segment. The experimental result shows that the compression ratio and computation time is reduced.

Keywords: Backtracking, test data compression (TDC), x-filling, x-propagating and genetic algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1870
1719 Fingerprint Identification using Discretization Technique

Authors: W. Y. Leng, S. M. Shamsuddin

Abstract:

Fingerprint based identification system; one of a well known biometric system in the area of pattern recognition and has always been under study through its important role in forensic science that could help government criminal justice community. In this paper, we proposed an identification framework of individuals by means of fingerprint. Different from the most conventional fingerprint identification frameworks the extracted Geometrical element features (GEFs) will go through a Discretization process. The intention of Discretization in this study is to attain individual unique features that could reflect the individual varianceness in order to discriminate one person from another. Previously, Discretization has been shown a particularly efficient identification on English handwriting with accuracy of 99.9% and on discrimination of twins- handwriting with accuracy of 98%. Due to its high discriminative power, this method is adopted into this framework as an independent based method to seek for the accuracy of fingerprint identification. Finally the experimental result shows that the accuracy rate of identification of the proposed system using Discretization is 100% for FVC2000, 93% for FVC2002 and 89.7% for FVC2004 which is much better than the conventional or the existing fingerprint identification system (72% for FVC2000, 26% for FVC2002 and 32.8% for FVC2004). The result indicates that Discretization approach manages to boost up the classification effectively, and therefore prove to be suitable for other biometric features besides handwriting and fingerprint.

Keywords: Discretization, fingerprint identification, geometrical features, pattern recognition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2361
1718 Classification and Resolving Urban Problems by Means of Fuzzy Approach

Authors: F. Habib, A. Shokoohi

Abstract:

Urban problems are problems of organized complexity. Thus, many models and scientific methods to resolve urban problems are failed. This study is concerned with proposing of a fuzzy system driven approach for classification and solving urban problems. The proposed study investigated mainly the selection of the inputs and outputs of urban systems for classification of urban problems. In this research, five categories of urban problems, respect to fuzzy system approach had been recognized: control, polytely, optimizing, open and decision making problems. Grounded Theory techniques were then applied to analyze the data and develop new solving method for each category. The findings indicate that the fuzzy system methods are powerful processes and analytic tools for helping planners to resolve urban complex problems. These tools can be successful where as others have failed because both incorporate or address uncertainty and risk; complexity and systems interacting with other systems.

Keywords: Classification, complexity, Fuzzy theory, urban problems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2114
1717 Performance Analysis of Genetic Algorithm with kNN and SVM for Feature Selection in Tumor Classification

Authors: C. Gunavathi, K. Premalatha

Abstract:

Tumor classification is a key area of research in the field of bioinformatics. Microarray technology is commonly used in the study of disease diagnosis using gene expression levels. The main drawback of gene expression data is that it contains thousands of genes and a very few samples. Feature selection methods are used to select the informative genes from the microarray. These methods considerably improve the classification accuracy. In the proposed method, Genetic Algorithm (GA) is used for effective feature selection. Informative genes are identified based on the T-Statistics, Signal-to-Noise Ratio (SNR) and F-Test values. The initial candidate solutions of GA are obtained from top-m informative genes. The classification accuracy of k-Nearest Neighbor (kNN) method is used as the fitness function for GA. In this work, kNN and Support Vector Machine (SVM) are used as the classifiers. The experimental results show that the proposed work is suitable for effective feature selection. With the help of the selected genes, GA-kNN method achieves 100% accuracy in 4 datasets and GA-SVM method achieves in 5 out of 10 datasets. The GA with kNN and SVM methods are demonstrated to be an accurate method for microarray based tumor classification.

Keywords: F-Test, Gene Expression, Genetic Algorithm, k- Nearest-Neighbor, Microarray, Signal-to-Noise Ratio, Support Vector Machine, T-statistics, Tumor Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4540
1716 Combined Feature Based Hyperspectral Image Classification Technique Using Support Vector Machines

Authors: Mrs.K.Kavitha, S.Arivazhagan

Abstract:

A spatial classification technique incorporating a State of Art Feature Extraction algorithm is proposed in this paper for classifying a heterogeneous classes present in hyper spectral images. The classification accuracy can be improved if and only if both the feature extraction and classifier selection are proper. As the classes in the hyper spectral images are assumed to have different textures, textural classification is entertained. Run Length feature extraction is entailed along with the Principal Components and Independent Components. A Hyperspectral Image of Indiana Site taken by AVIRIS is inducted for the experiment. Among the original 220 bands, a subset of 120 bands is selected. Gray Level Run Length Matrix (GLRLM) is calculated for the selected forty bands. From GLRLMs the Run Length features for individual pixels are calculated. The Principle Components are calculated for other forty bands. Independent Components are calculated for next forty bands. As Principal & Independent Components have the ability to represent the textural content of pixels, they are treated as features. The summation of Run Length features, Principal Components, and Independent Components forms the Combined Features which are used for classification. SVM with Binary Hierarchical Tree is used to classify the hyper spectral image. Results are validated with ground truth and accuracies are calculated.

Keywords: Multi-class, Run Length features, PCA, ICA, classification and Support Vector Machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1523
1715 Discriminant Analysis as a Function of Predictive Learning to Select Evolutionary Algorithms in Intelligent Transportation System

Authors: Jorge A. Ruiz-Vanoye, Ocotlán Díaz-Parra, Alejandro Fuentes-Penna, Daniel Vélez-Díaz, Edith Olaco García

Abstract:

In this paper, we present the use of the discriminant analysis to select evolutionary algorithms that better solve instances of the vehicle routing problem with time windows. We use indicators as independent variables to obtain the classification criteria, and the best algorithm from the generic genetic algorithm (GA), random search (RS), steady-state genetic algorithm (SSGA), and sexual genetic algorithm (SXGA) as the dependent variable for the classification. The discriminant classification was trained with classic instances of the vehicle routing problem with time windows obtained from the Solomon benchmark. We obtained a classification of the discriminant analysis of 66.7%.

Keywords: Intelligent transportation systems, data-mining techniques, evolutionary algorithms, discriminant analysis, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1547
1714 Air Classification of Dust from Steel Converter Secondary De-dusting for Zinc Enrichment

Authors: C. Lanzerstorfer

Abstract:

The off-gas from the basic oxygen furnace (BOF), where pig iron is converted into steel, is treated in the primary ventilation system. This system is in full operation only during oxygen-blowing when the BOF converter vessel is in a vertical position. When pig iron and scrap are charged into the BOF and when slag or steel are tapped, the vessel is tilted. The generated emissions during charging and tapping cannot be captured by the primary off-gas system. To capture these emissions, a secondary ventilation system is usually installed. The emissions are captured by a canopy hood installed just above the converter mouth in tilted position. The aim of this study was to investigate the dependence of Zn and other components on the particle size of BOF secondary ventilation dust. Because of the high temperature of the BOF process it can be expected that Zn will be enriched in the fine dust fractions. If Zn is enriched in the fine fractions, classification could be applied to split the dust into two size fractions with a different content of Zn. For this air classification experiments with dust from the secondary ventilation system of a BOF were performed. The results show that Zn and Pb are highly enriched in the finest dust fraction. For Cd, Cu and Sb the enrichment is less. In contrast, the non-volatile metals Al, Fe, Mn and Ti were depleted in the fine fractions. Thus, air classification could be considered for the treatment of dust from secondary BOF off-gas cleaning.

Keywords: Air classification, converter dust, recycling, zinc.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1220
1713 Satellite Imagery Classification Based on Deep Convolution Network

Authors: Zhong Ma, Zhuping Wang, Congxin Liu, Xiangzeng Liu

Abstract:

Satellite imagery classification is a challenging problem with many practical applications. In this paper, we designed a deep convolution neural network (DCNN) to classify the satellite imagery. The contributions of this paper are twofold — First, to cope with the large-scale variance in the satellite image, we introduced the inception module, which has multiple filters with different size at the same level, as the building block to build our DCNN model. Second, we proposed a genetic algorithm based method to efficiently search the best hyper-parameters of the DCNN in a large search space. The proposed method is evaluated on the benchmark database. The results of the proposed hyper-parameters search method show it will guide the search towards better regions of the parameter space. Based on the found hyper-parameters, we built our DCNN models, and evaluated its performance on satellite imagery classification, the results show the classification accuracy of proposed models outperform the state of the art method.

Keywords: Satellite imagery classification, deep convolution network, genetic algorithm, hyper-parameter optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2348
1712 Brainwave Classification for Brain Balancing Index (BBI) via 3D EEG Model Using k-NN Technique

Authors: N. Fuad, M. N. Taib, R. Jailani, M. E. Marwan

Abstract:

In this paper, the comparison between k-Nearest Neighbor (kNN) algorithms for classifying the 3D EEG model in brain balancing is presented. The EEG signal recording was conducted on 51 healthy subjects. Development of 3D EEG models involves pre-processing of raw EEG signals and construction of spectrogram images. Then, maximum PSD values were extracted as features from the model. There are three indexes for balanced brain; index 3, index 4 and index 5. There are significant different of the EEG signals due to the brain balancing index (BBI). Alpha-α (8–13 Hz) and beta-β (13–30 Hz) were used as input signals for the classification model. The k-NN classification result is 88.46% accuracy. These results proved that k-NN can be used in order to predict the brain balancing application.

Keywords: Brain balancing, kNN, power spectral density, 3D EEG model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2630
1711 Classification of Right and Left-Hand Movement Using Multi-Resolution Analysis Method

Authors: Nebi Gedik

Abstract:

The aim of the brain-computer interface studies on electroencephalogram (EEG) signals containing motor imagery is to extract the effective features that will provide the highest possible classification accuracy for the detection of the desired motor movement. However, achieving this goal is difficult as the most suitable frequency band and time frame vary from subject to subject. In this study, the classification success of the two-feature data obtained from raw EEG signals and the coefficients of the multi-resolution analysis method applied to the EEG signals were analyzed comparatively. The method was applied to several EEG channels (C3, Cz and C4) signals obtained from the EEG data set belonging to the publicly available BCI competition III.

Keywords: Motor imagery, EEG, wave atom transform, k-NN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 590
1710 Prediction Modeling of Alzheimer’s Disease and Its Prodromal Stages from Multimodal Data with Missing Values

Authors: M. Aghili, S. Tabarestani, C. Freytes, M. Shojaie, M. Cabrerizo, A. Barreto, N. Rishe, R. E. Curiel, D. Loewenstein, R. Duara, M. Adjouadi

Abstract:

A major challenge in medical studies, especially those that are longitudinal, is the problem of missing measurements which hinders the effective application of many machine learning algorithms. Furthermore, recent Alzheimer's Disease studies have focused on the delineation of Early Mild Cognitive Impairment (EMCI) and Late Mild Cognitive Impairment (LMCI) from cognitively normal controls (CN) which is essential for developing effective and early treatment methods. To address the aforementioned challenges, this paper explores the potential of using the eXtreme Gradient Boosting (XGBoost) algorithm in handling missing values in multiclass classification. We seek a generalized classification scheme where all prodromal stages of the disease are considered simultaneously in the classification and decision-making processes. Given the large number of subjects (1631) included in this study and in the presence of almost 28% missing values, we investigated the performance of XGBoost on the classification of the four classes of AD, NC, EMCI, and LMCI. Using 10-fold cross validation technique, XGBoost is shown to outperform other state-of-the-art classification algorithms by 3% in terms of accuracy and F-score. Our model achieved an accuracy of 80.52%, a precision of 80.62% and recall of 80.51%, supporting the more natural and promising multiclass classification.

Keywords: eXtreme Gradient Boosting, missing data, Alzheimer disease, early mild cognitive impairment, late mild cognitive impairment, multiclass classification, ADNI, support vector machine, random forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 958
1709 A Hybrid Data Mining Method for the Medical Classification of Chest Pain

Authors: Sung Ho Ha, Seong Hyeon Joo

Abstract:

Data mining techniques have been used in medical research for many years and have been known to be effective. In order to solve such problems as long-waiting time, congestion, and delayed patient care, faced by emergency departments, this study concentrates on building a hybrid methodology, combining data mining techniques such as association rules and classification trees. The methodology is applied to real-world emergency data collected from a hospital and is evaluated by comparing with other techniques. The methodology is expected to help physicians to make a faster and more accurate classification of chest pain diseases.

Keywords: Data mining, medical decisions, medical domainknowledge, chest pain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2223
1708 Application of Pattern Search Method to Power System Security Constrained Economic Dispatch

Authors: A. K. Al-Othman, K. M. EL-Nagger

Abstract:

Direct search methods are evolutionary algorithms used to solve optimization problems. (DS) methods do not require any information about the gradient of the objective function at hand while searching for an optimum solution. One of such methods is Pattern Search (PS) algorithm. This paper presents a new approach based on a constrained pattern search algorithm to solve a security constrained power system economic dispatch problem (SCED). Operation of power systems demands a high degree of security to keep the system satisfactorily operating when subjected to disturbances, while and at the same time it is required to pay attention to the economic aspects. Pattern recognition technique is used first to assess dynamic security. Linear classifiers that determine the stability of electric power system are presented and added to other system stability and operational constraints. The problem is formulated as a constrained optimization problem in a way that insures a secure-economic system operation. Pattern search method is then applied to solve the constrained optimization formulation. In particular, the method is tested using one system. Simulation results of the proposed approach are compared with those reported in literature. The outcome is very encouraging and proves that pattern search (PS) is very applicable for solving security constrained power system economic dispatch problem (SCED).

Keywords: Security Constrained Economic Dispatch, Direct Search method, optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2208
1707 On the Network Packet Loss Tolerance of SVM Based Activity Recognition

Authors: Gamze Uslu, Sebnem Baydere, Alper K. Demir

Abstract:

In this study, data loss tolerance of Support Vector Machines (SVM) based activity recognition model and multi activity classification performance when data are received over a lossy wireless sensor network is examined. Initially, the classification algorithm we use is evaluated in terms of resilience to random data loss with 3D acceleration sensor data for sitting, lying, walking and standing actions. The results show that the proposed classification method can recognize these activities successfully despite high data loss. Secondly, the effect of differentiated quality of service performance on activity recognition success is measured with activity data acquired from a multi hop wireless sensor network, which introduces  high data loss. The effect of number of nodes on the reliability and multi activity classification success is demonstrated in simulation environment. To the best of our knowledge, the effect of data loss in a wireless sensor network on activity detection success rate of an SVM based classification algorithm has not been studied before.

Keywords: Activity recognition, support vector machines, acceleration sensor, wireless sensor networks, packet loss.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2871
1706 Determination of the Bank's Customer Risk Profile: Data Mining Applications

Authors: Taner Ersoz, Filiz Ersoz, Seyma Ozbilge

Abstract:

In this study, the clients who applied to a bank branch for loan were analyzed through data mining. The study was composed of the information such as amounts of loans received by personal and SME clients working with the bank branch, installment numbers, number of delays in loan installments, payments available in other banks and number of banks to which they are in debt between 2010 and 2013. The client risk profile was examined through Classification and Regression Tree (CART) analysis, one of the decision tree classification methods. At the end of the study, 5 different types of customers have been determined on the decision tree. The classification of these types of customers has been created with the rating of those posing a risk for the bank branch and the customers have been classified according to the risk ratings.

Keywords: Client classification, loan suitability, risk rating, CART analysis, decision tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1076
1705 Neural Network Based Speech to Text in Malay Language

Authors: H. F. A. Abdul Ghani, R. R. Porle

Abstract:

Speech to text in Malay language is a system that converts Malay speech into text. The Malay language recognition system is still limited, thus, this paper aims to investigate the performance of ten Malay words obtained from the online Malay news. The methodology consists of three stages, which are preprocessing, feature extraction, and speech classification. In preprocessing stage, the speech samples are filtered using pre emphasis. After that, feature extraction method is applied to the samples using Mel Frequency Cepstrum Coefficient (MFCC). Lastly, speech classification is performed using Feedforward Neural Network (FFNN). The accuracy of the classification is further investigated based on the hidden layer size. From experimentation, the classifier with 40 hidden neurons shows the highest classification rate which is 94%.  

Keywords: Feed-Forward Neural Network, FFNN, Malay speech recognition, Mel Frequency Cepstrum Coefficient, MFCC, speech-to-text.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 746
1704 Effect of Plant Growth Promoting Rhizobacteria (PGPR) and Planting Pattern on Yield and Its Components of Rice (Oryza sativa L.) in Ilam Province, Iran

Authors: Ali Rahmani, Abbas Maleki, Mohammad Mirzaeiheydari, Rahim Naseri

Abstract:

Most parts of the world such as Iran are facing the excessive consumption of fertilizers, that are used to achieve high yield, but increase the cost of production of fertilizer and degradation of soil and water resources. This experiment was carried out to study the effect of PGPR and planting pattern on yield and yield components of rice (Oryza sativa L.) using split plot based on randomized complete block design with three replications in Ilam province, Iran. Bio-fertilizer including Azotobacter, Nitroxin and control treatment (without consumption) were designed as a main plot and planting pattern including 15 × 10, 15 × 15 and 15 × 20 and the number of plant in hill including 3, 4 and 5 plants in hill were considered as a sub-plots. The results showed that the effect of bio-fertilizers, planting pattern and the number of plants in hill were significant affect on yield and yield components. Interaction effect between bio-fertilizer and planting pattern had important difference on the number spikelet of panicle and harvest index. Interaction effect between bio-fertilizer and the number of plants in hill were significant affect on the number of spikelet per panicle. The maximum grain yield was obtained by inoculation with Nitroxin, planting pattern of 15 × 15 and 4 plants in hill with mean of 1110.6 g.m-2, 959.9 g.m-2 and 928.4 g.m-2, respectively.

Keywords: Bio-fertilizer, Grain yield, Planting pattern, Rice.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1840
1703 Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques

Authors: Sannikumar Patel, Brian Nolan, Markus Hofmann, Philip Owende, Kunjan Patel

Abstract:

Sentiment analysis and opinion mining have become emerging topics of research in recent years but most of the work is focused on data in the English language. A comprehensive research and analysis are essential which considers multiple languages, machine translation techniques, and different classifiers. This paper presents, a comparative analysis of different approaches for multilingual sentiment analysis. These approaches are divided into two parts: one using classification of text without language translation and second using the translation of testing data to a target language, such as English, before classification. The presented research and results are useful for understanding whether machine translation should be used for multilingual sentiment analysis or building language specific sentiment classification systems is a better approach. The effects of language translation techniques, features, and accuracy of various classifiers for multilingual sentiment analysis is also discussed in this study.

Keywords: Cross-language analysis, machine learning, machine translation, sentiment analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1667
1702 Data Mining Using Learning Automata

Authors: M. R. Aghaebrahimi, S. H. Zahiri, M. Amiri

Abstract:

In this paper a data miner based on the learning automata is proposed and is called LA-miner. The LA-miner extracts classification rules from data sets automatically. The proposed algorithm is established based on the function optimization using learning automata. The experimental results on three benchmarks indicate that the performance of the proposed LA-miner is comparable with (sometimes better than) the Ant-miner (a data miner algorithm based on the Ant Colony optimization algorithm) and CNZ (a well-known data mining algorithm for classification).

Keywords: Data mining, Learning automata, Classification rules, Knowledge discovery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1935
1701 Feature Extraction for Surface Classification – An Approach with Wavelets

Authors: Smriti H. Bhandari, S. M. Deshpande

Abstract:

Surface metrology with image processing is a challenging task having wide applications in industry. Surface roughness can be evaluated using texture classification approach. Important aspect here is appropriate selection of features that characterize the surface. We propose an effective combination of features for multi-scale and multi-directional analysis of engineering surfaces. The features include standard deviation, kurtosis and the Canny edge detector. We apply the method by analyzing the surfaces with Discrete Wavelet Transform (DWT) and Dual-Tree Complex Wavelet Transform (DT-CWT). We used Canberra distance metric for similarity comparison between the surface classes. Our database includes the surface textures manufactured by three machining processes namely Milling, Casting and Shaping. The comparative study shows that DT-CWT outperforms DWT giving correct classification performance of 91.27% with Canberra distance metric.

Keywords: Dual-tree complex wavelet transform, surface metrology, surface roughness, texture classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2245
1700 Analytical Authentication of Butter Using Fourier Transform Infrared Spectroscopy Coupled with Chemometrics

Authors: M. Bodner, M. Scampicchio

Abstract:

Fourier Transform Infrared (FT-IR) spectroscopy coupled with chemometrics was used to distinguish between butter samples and non-butter samples. Further, quantification of the content of margarine in adulterated butter samples was investigated. Fingerprinting region (1400-800 cm–1) was used to develop unsupervised pattern recognition (Principal Component Analysis, PCA), supervised modeling (Soft Independent Modelling by Class Analogy, SIMCA), classification (Partial Least Squares Discriminant Analysis, PLS-DA) and regression (Partial Least Squares Regression, PLS-R) models. PCA of the fingerprinting region shows a clustering of the two sample types. All samples were classified in their rightful class by SIMCA approach; however, nine adulterated samples (between 1% and 30% w/w of margarine) were classified as belonging both at the butter class and at the non-butter one. In the two-class PLS-DA model’s (R2 = 0.73, RMSEP, Root Mean Square Error of Prediction = 0.26% w/w) sensitivity was 71.4% and Positive Predictive Value (PPV) 100%. Its threshold was calculated at 7% w/w of margarine in adulterated butter samples. Finally, PLS-R model (R2 = 0.84, RMSEP = 16.54%) was developed. PLS-DA was a suitable classification tool and PLS-R a proper quantification approach. Results demonstrate that FT-IR spectroscopy combined with PLS-R can be used as a rapid, simple and safe method to identify pure butter samples from adulterated ones and to determine the grade of adulteration of margarine in butter samples.

Keywords: Adulterated butter, margarine, PCA, PLS-DA, PLS-R, SIMCA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 782