Search results for: classification techniques.
3244 Normal and Peaberry Coffee Beans Classification from Green Coffee Bean Images Using Convolutional Neural Networks and Support Vector Machine
Authors: Hira Lal Gope, Hidekazu Fukai
Abstract:
The aim of this study is to develop a low-cost system that can identify and sort peaberries automatically for coffee producers in developing countries. This paper focuses on classifying peaberries and normal coffee beans using image processing and machine learning techniques. A peaberry is neither a defective bean nor a normal one: it develops as a single, relatively round seed in a coffee cherry instead of the usual flat-sided pair of beans, and it has its own value and flavor. To improve the taste of the coffee, peaberries and normal beans must be separated before the green coffee beans are roasted; otherwise the flavors mix and the taste suffers. During roasting, all beans should be uniform in shape, size, and weight; otherwise, larger beans take longer to roast through. Peaberries differ from normal beans in size and shape even when they have the same weight, and they roast more slowly. Therefore, neither technique provides a good option for selecting peaberries. Defective beans (e.g., sour, broken, black, and faded beans) are easy to check and pick out by hand. Picking out peaberries, on the other hand, is very difficult even for trained specialists because their shape and color are similar to those of normal beans. In this study, we use image processing and machine learning techniques to discriminate between normal and peaberry beans as part of the sorting system. As a first step, we applied deep Convolutional Neural Networks (CNN) and a Support Vector Machine (SVM) to this task. CNN achieved better performance than SVM in discriminating peaberries. The neural network trained in this work on a high-performance CPU and GPU will be installed on an inexpensive, low-compute Raspberry Pi system. We assume that this system will be used in underdeveloped countries. The study evaluates and compares the feasibility of the methods in terms of classification accuracy and processing speed.
Keywords: Convolutional neural networks, coffee bean, peaberry, sorting, support vector machine.
Downloads 1551
3243 An SVM based Classification Method for Cancer Data using Minimum Microarray Gene Expressions
Authors: R. Mallika, V. Saravanan
Abstract:
This paper gives a novel method for improving classification performance for cancer classification with very few microarray gene expression data. The method employs classification with individual gene ranking and gene subset ranking. For selection and classification, the proposed method uses the same classifier. The method is applied to three publicly available cancer gene expression datasets: Lymphoma, Liver and Leukaemia. Three different classifiers, namely Support vector machines-one against all (SVM-OAA), K nearest neighbour (KNN) and Linear Discriminant analysis (LDA), were tested, and the results indicate improved performance for the SVM-OAA classifier, with satisfactory results on all three datasets, when compared with the other two classifiers.
Keywords: Support vector machines-one against all, cancer classification, Linear Discriminant analysis, K nearest neighbour, microarray gene expression, gene pair ranking.
Downloads 2561
3242 Performance Evaluation of Music and Minimum Norm Eigenvector Algorithms in Resolving Noisy Multiexponential Signals
Authors: Abdussamad U. Jibia, Momoh-Jimoh E. Salami
Abstract:
Eigenvector methods are gaining increasing acceptance in the area of spectrum estimation. This paper presents a successful attempt at testing and evaluating the performance of two of the most popular types of subspace techniques in determining the parameters of multiexponential signals with real decay constants buried in noise. In particular, MUSIC (Multiple Signal Classification) and minimum-norm techniques are examined. It is shown that these methods perform almost equally well on multiexponential signals with MUSIC displaying better defined peaks.
Keywords: Eigenvector, minimum norm, multiexponential, subspace.
Downloads 1737
3241 Efficiency of Floristic and Molecular Markers to Determine Diversity in Iranian Populations of T. boeoticum
Authors: M. R. Naghavi, M. Maleki, S. F. Tabatabaei
Abstract:
To study the floristic and molecular classification of common wild wheat (Triticum boeoticum Boiss.), an analysis was conducted on populations of Triticum boeoticum collected from different regions of Iran. Considering all floristic compositions of habitats, six floristic groups (syntaxa) were identified within the populations. A high level of variation in T. boeoticum was also detected using SSR markers. Our results showed that the molecular method confirmed the grouping of the floristic method. In other words, the results of our study indicate that floristic classifications are still useful, efficient, and economical tools for characterizing the amount and distribution of genetic variation in natural populations of T. boeoticum. Nevertheless, molecular markers appear to be useful and complementary techniques for identification and for evaluation of genetic diversity in the studied populations.
Keywords: T. boeoticum, diversity, floristic, SSRs.
Downloads 1349
3240 A Novel Approach to Fault Classification and Fault Location for Medium Voltage Cables Based on Artificial Neural Network
Authors: H. Khorashadi-Zadeh, M. R. Aghaebrahimi
Abstract:
A novel application of the neural network approach to fault classification and fault location of medium voltage cables is demonstrated in this paper. Different faults on a protected cable should be classified and located correctly. This paper presents the use of neural networks as a pattern classifier algorithm to perform these tasks. The proposed scheme is insensitive to variation of different parameters such as fault type, fault resistance, and fault inception angle. Studies show that the proposed technique is able to offer high accuracy in both the fault classification and fault location tasks.
Keywords: Artificial neural networks, cable, fault location and fault classification.
Downloads 1851
3239 Analysis of Classifications of Unsolicited Bulk Emails
Authors: Jatinderkumar R. Saini, Apurva A. Desai
Abstract:
In recent times, the problem of Unsolicited Bulk Email (UBE), commonly known as spam email, has grown at a tremendous rate. We present an analysis of a survey based on classifications of UBE in various research works. There are many research instances for classification between spam and non-spam emails, but very few are available for classification of spam emails per se. This paper does not intend to assert that some UBE classification is better than the others, nor does it propose any new classification, but it bemoans the lack of harmony on the number and definition of categories proposed by different researchers. The paper also elaborates on factors like the intent of the spammer, the content of UBE and ambiguity in different categories as proposed in related research works on classifications of UBE.
Keywords: E-mail, Scams, Spam Email, Unsolicited Bulk Email (UBE)
Downloads 1726
3238 Ensemble Learning with Decision Tree for Remote Sensing Classification
Authors: Mahesh Pal
Abstract:
In recent years, a number of works proposing the combination of multiple classifiers to produce a single classification have been reported in the remote sensing literature. The resulting classifier, referred to as an ensemble classifier, is generally found to be more accurate than any of the individual classifiers making up the ensemble. As accuracy is the primary concern, much of the research in the field of land cover classification is focused on improving classification accuracy. This study compares the performance of four ensemble approaches (boosting, bagging, DECORATE and random subspace) with a univariate decision tree as the base classifier. Two training datasets, one without any noise and the other with 20 percent noise, were used to judge the performance of the different ensemble approaches. Results with the noise-free data set suggest an improvement of about 4% in classification accuracy with all ensemble approaches in comparison to the results provided by the univariate decision tree classifier. The highest classification accuracy of 87.43% was achieved by the boosted decision tree. A comparison of results with the noisy data set suggests that the bagging, DECORATE and random subspace approaches work well with this data, whereas the performance of the boosted decision tree degrades: it achieves a classification accuracy of 79.7%, which is even lower than the 80.02% achieved by the unboosted decision tree classifier.
Keywords: Ensemble learning, decision tree, remote sensing classification.
Downloads 2583
3237 A New Method for Image Classification Based on Multi-level Neural Networks
Authors: Samy Sadek, Ayoub Al-Hamadi, Bernd Michaelis, Usama Sayed
Abstract:
In this paper, we propose a supervised method for color image classification based on a multilevel sigmoidal neural network (MSNN) model. In this method, images are classified into five categories, i.e., “Car”, “Building”, “Mountain”, “Farm” and “Coast”. This classification is performed without any segmentation processes. To verify the learning capabilities of the proposed method, we compare our MSNN model with the traditional Sigmoidal Neural Network (SNN) model. Results of comparison have shown that the MSNN model performs better than the traditional SNN model in the context of training run time and classification rate. Both color moments and multi-level wavelets decomposition technique are used to extract features from images. The proposed method has been tested on a variety of real and synthetic images.
Keywords: Image classification, multi-level neural networks, feature extraction, wavelets decomposition.
Downloads 1647
3236 A Dataset of Program Educational Objectives Mapped to ABET Outcomes: Data Cleansing, Exploratory Data Analysis and Modeling
Authors: Addin Osman, Anwar Ali Yahya, Mohammed Basit Kamal
Abstract:
Datasets or collections are becoming important assets in themselves, and they can now be accepted as a primary intellectual output of research. The quality and usage of datasets depend mainly on the context in which they have been collected, processed, analyzed, validated, and interpreted. This paper presents a collection of program educational objectives mapped to student outcomes, collected from self-study reports prepared by 32 engineering programs accredited by ABET. The manual mapping (classification) of this data is a notoriously tedious, time-consuming process. In addition, it requires experts in the area, who are mostly not available. The operational settings under which the collection was produced are described. The collection has been cleansed and preprocessed, some features have been selected, and a preliminary exploratory data analysis has been performed to illustrate the properties and usefulness of the collection. Finally, the collection has been benchmarked using nine of the most widely used supervised multiclass classification techniques (Binary Relevance, Label Powerset, Classifier Chains, Pruned Sets, Random k-label sets, Ensemble of Classifier Chains, Ensemble of Pruned Sets, Multi-Label k-Nearest Neighbors and Back-Propagation Multi-Label Learning). The techniques have been compared to each other using five well-known measurements (Accuracy, Hamming Loss, Micro-F, Macro-F, and Macro-F). The Ensemble of Classifier Chains and Ensemble of Pruned Sets have achieved encouraging performance compared to the other multi-label classification methods tested. The Classifier Chains method has shown the worst performance. To recap, the benchmark has achieved promising results by utilizing the preliminary exploratory data analysis performed on the collection, proposing new trends for research and providing a baseline for future studies.
Keywords: Benchmark collection, program educational objectives, student outcomes, ABET, Accreditation, machine learning, supervised multiclass classification, text mining.
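As an illustration of three of the measurements named in the abstract above (Hamming Loss, Micro-F and Macro-F), the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the label names and toy predictions are made up for the example, and F is taken to be the F1 score.

```python
from typing import List, Set


def hamming_loss(y_true: List[Set[str]], y_pred: List[Set[str]], labels: List[str]) -> float:
    """Fraction of (sample, label) pairs that are predicted incorrectly."""
    # The symmetric difference holds labels that were wrongly added or missed.
    errors = sum(len(t ^ p) for t, p in zip(y_true, y_pred))
    return errors / (len(y_true) * len(labels))


def _f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Pool the counts over all labels, then compute a single F1."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))
    return _f1(tp, fp, fn)


def macro_f1(y_true: List[Set[str]], y_pred: List[Set[str]], labels: List[str]) -> float:
    """Compute F1 per label, then take the unweighted mean."""
    scores = []
    for lab in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if lab in t and lab in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if lab not in t and lab in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if lab in t and lab not in p)
        scores.append(_f1(tp, fp, fn))
    return sum(scores) / len(scores)


# Hypothetical toy data: three samples, three possible labels.
LABELS = ["a", "b", "c"]
Y_TRUE = [{"a", "b"}, {"c"}, {"a"}]
Y_PRED = [{"a"}, {"c"}, {"a", "b"}]

print(hamming_loss(Y_TRUE, Y_PRED, LABELS))
print(micro_f1(Y_TRUE, Y_PRED))
print(macro_f1(Y_TRUE, Y_PRED, LABELS))
```

Micro-averaging weights every label decision equally, so frequent labels dominate; macro-averaging weights every label equally, so a rare label that is always missed pulls the score down sharply.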
Downloads 836
3235 The Performance of Predictive Classification Using Empirical Bayes
Authors: N. Deetae, S. Sukparungsee, Y. Areepong, K. Jampachaisri
Abstract:
This research aims to compare the percentage of correct classification of the Empirical Bayes (EB) method with that of the Classical method when data are constructed as near normal, short-tailed and long-tailed symmetric, and short-tailed and long-tailed asymmetric. The study is performed using a conjugate prior and a normal distribution with known mean and unknown variance. The estimated hyper-parameters obtained from the EB method are substituted into the posterior predictive probability and used to predict new observations. Data are generated, consisting of a training set and a test set, with sample sizes of 100, 200 and 500 for binary classification. The results showed that the EB method exhibited improved performance over the Classical method in all situations under study.
Keywords: Classification, Empirical Bayes, Posterior predictive probability.
Downloads 1596
3234 Gene Expression Signature for Classification of Metastasis Positive and Negative Oral Cancer in Homosapiens
Authors: A. Shukla, A. Tarsauliya, R. Tiwari, S. Sharma
Abstract:
Classifying cancers into their corresponding cohorts has been a key area of research in bioinformatics, aiming at better prognosis of the disease. The high dimensionality of gene data makes this a complex task and requires a significance-based data identification technique to reduce the dimensionality and identify significant information. In this paper, we propose a novel approach for classifying oral cancer patients as metastasis positive or negative. We have used significance analysis of microarrays (SAM) to identify the significant genes that constitute a gene signature. Three different gene signatures were identified using SAM from three different combinations of training datasets, and their classification accuracy was calculated on the corresponding testing datasets using k-Nearest Neighbour (kNN), Fuzzy C-Means Clustering (FCM), Support Vector Machine (SVM) and Backpropagation Neural Network (BPNN). A final gene signature of only 9 genes was obtained from the above three individual gene signatures. The classification capability of the 9-gene signature was compared using the same classifiers on the same testing datasets. The experimental results show that the 9-gene signature classified all samples in the testing dataset accurately, while individual genes could not classify all samples accurately.
Keywords: Cancer, Gene Signature, SAM, Classification.
Downloads 2075
3233 Electronic Nose Based On Metal Oxide Semiconductor Sensors as an Alternative Technique for the Spoilage Classification of Oat Milk
Authors: A. Deswal, N. S. Deora, H. N. Mishra
Abstract:
The aim of the present study was to develop a rapid electronic nose method for online quality control of oat milk. Electronic nose analysis and bacteriological measurements were performed to analyze the spoilage kinetics of oat milk samples stored at room temperature and under refrigerated conditions for up to 15 days. Principal component analysis (PCA), Discriminant Factorial Analysis (DFA) and Soft Independent Modelling by Class Analogy (SIMCA) classification techniques were used to differentiate the samples of oat milk at different days. The total plate count (bacteriological method) was selected as the reference method to consistently train the electronic nose system. The e-nose was able to differentiate between oat milk samples of varying microbial load. The results obtained from the total viable bacteria counts showed that the shelf-life of oat milk stored at room temperature and under refrigerated conditions was 20 hours and 13 days, respectively. The models built classified oat milk samples based on the total microbial population into “unspoiled” and “spoiled”.
Keywords: Electronic-nose, bacteriological, shelf-life, classification.
Downloads 3271
3232 Application of Functional Network to Solving Classification Problems
Authors: Yong-Quan Zhou, Deng-Xu He, Zheng Nong
Abstract:
In this paper, two models using a functional network were employed to solve a classification problem. Functional networks are generalized neural networks, which permit the specification of their initial topology using knowledge about the problem at hand. In this case, after analyzing the available data and their relations, we systematically discuss a numerical analysis method used for functional networks, and apply two functional network models to solve the XOR problem. The XOR problem, which cannot be solved with a two-layered neural network, can be solved by a two-layered functional network, which reveals the potent computational power of functional networks. The performance of the proposed model was validated using classification problems.
Keywords: Functional network, neural network, XOR problem, classification, numerical analysis method.
Downloads 1308
3231 A Kernel Based Rejection Method for Supervised Classification
Authors: Abdenour Bounsiar, Edith Grall, Pierre Beauseroy
Abstract:
In this paper we are interested in classification problems with a performance constraint on error probability. In such problems, if the constraint cannot be satisfied, then a rejection option is introduced. For binary labelled classification, a number of SVM based methods with a rejection option have been proposed over the past few years. All of these methods use two thresholds on the SVM output. However, in previous works, we have shown on synthetic data that using thresholds on the output of the optimal SVM may lead to poor results for classification tasks with a performance constraint. In this paper a new method for supervised classification with a rejection option is proposed. It consists of two different classifiers jointly optimized to minimize the rejection probability subject to a given constraint on error rate. This method uses a new kernel based linear learning machine that we have recently presented. This learning machine is characterized by its simplicity and high training speed, which makes the simultaneous optimization of the two classifiers computationally reasonable. The proposed classification method with rejection option is compared to an SVM based rejection method proposed in recent literature. Experiments show the superiority of the proposed method.
Keywords: rejection, Chow's rule, error-reject tradeoff, Support Vector Machine.
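The two-threshold baseline that the abstract above argues against can be sketched as follows. This is only an illustration of thresholding a real-valued classifier score with a reject band, not the authors' jointly optimized method. The scores, labels and threshold values are hypothetical, and the error rate is counted over all samples, which is one of several possible conventions.

```python
from typing import List, Optional, Tuple


def classify_with_reject(score: float, t_low: float, t_high: float) -> Optional[int]:
    """Return +1 or -1, or None when the score falls in the reject band."""
    if t_low > t_high:
        raise ValueError("t_low must not exceed t_high")
    if score >= t_high:
        return 1
    if score <= t_low:
        return -1
    return None  # too close to the decision boundary: reject


def error_and_reject_rates(
    scores: List[float], labels: List[int], t_low: float, t_high: float
) -> Tuple[float, float]:
    """Error rate and reject rate, both as fractions of all samples."""
    decisions = [classify_with_reject(s, t_low, t_high) for s in scores]
    rejected = sum(1 for d in decisions if d is None)
    errors = sum(1 for d, y in zip(decisions, labels) if d is not None and d != y)
    n = len(scores)
    return errors / n, rejected / n


# Hypothetical classifier outputs and true labels.
SCORES = [-2.0, -0.3, 0.1, 0.4, 1.5]
LABELS = [-1, 1, -1, 1, 1]

# Widening the band trades errors for rejections.
print(error_and_reject_rates(SCORES, LABELS, 0.0, 0.0))
print(error_and_reject_rates(SCORES, LABELS, -0.5, 0.5))
```

Sweeping the two thresholds and recording each (error, reject) pair traces out the error-reject tradeoff named in the keywords.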
Downloads 1444
3230 Synthetic Aperture Radar Remote Sensing Classification Using the Bag of Visual Words Model to Land Cover Studies
Authors: Reza Mohammadi, Mahmod R. Sahebi, Mehrnoosh Omati, Milad Vahidi
Abstract:
Classification of high resolution polarimetric Synthetic Aperture Radar (PolSAR) images plays an important role in land cover and land use management. Recently, classification algorithms based on the Bag of Visual Words (BOVW) model have attracted significant interest among scholars and researchers in and out of the field of remote sensing. In this paper, a BOVW model with pixel based low-level features has been implemented to classify a subset of a San Francisco bay PolSAR image, acquired by RADARSAT-2 in C-band. We have used a segment-based decision-making strategy and compared the result with that of a traditional Support Vector Machine (SVM) classifier. The 90.95% overall classification accuracy of the proposed algorithm shows that it is comparable with state-of-the-art methods. In addition to increasing the classification accuracy, the proposed method has reduced the undesirable speckle effect of SAR images.
Keywords: Bag of Visual Words, classification, feature extraction, land cover management, Polarimetric Synthetic Aperture Radar.
Downloads 773
3229 Wavelet-Based ECG Signal Analysis and Classification
Authors: Madina Hamiane, May Hashim Ali
Abstract:
This paper presents the processing and analysis of ECG signals. The study is based on the wavelet transform and uses exclusively the MATLAB environment. It includes removing baseline wander and further de-noising through the wavelet transform; metrics such as the signal-to-noise ratio (SNR), peak signal-to-noise ratio (PSNR) and mean squared error (MSE) are used to assess the efficiency of the de-noising techniques. Feature extraction is subsequently performed, whereby signal features such as heart rate and rise and fall levels are extracted and the QRS complex is detected, which helps in classifying the ECG signal. Classification is the last step in the analysis of the ECG signals, and it is shown that they are successfully classified as Normal rhythm or Abnormal rhythm. The final result proved the adequacy of using the wavelet transform for the analysis of ECG signals.
Keywords: ECG Signal, QRS detection, thresholding, wavelet decomposition, feature extraction.
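The three de-noising metrics named in the abstract above can be written down in a few lines. The sketch below uses plain Python rather than the MATLAB environment the paper reports, and it assumes one common set of definitions: SNR as the ratio of reference signal energy to error energy, and PSNR with the peak taken as the largest absolute value of the reference. The abstract does not state which definitions the authors used, and the sample values are made up.

```python
import math
from typing import Sequence


def _check(reference: Sequence[float], estimate: Sequence[float]) -> None:
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("signals must be non-empty and of equal length")


def mse(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Mean squared error between a clean reference and a de-noised estimate."""
    _check(reference, estimate)
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def snr_db(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Signal-to-noise ratio in dB: reference energy over residual error energy."""
    _check(reference, estimate)
    signal_energy = sum(r ** 2 for r in reference)
    noise_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if noise_energy == 0:
        return math.inf
    return 10 * math.log10(signal_energy / noise_energy)


def psnr_db(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Peak SNR in dB, with the peak taken as max |reference| (an assumption)."""
    error = mse(reference, estimate)
    if error == 0:
        return math.inf
    peak = max(abs(r) for r in reference)
    return 10 * math.log10(peak ** 2 / error)


# Hypothetical clean samples and a slightly perturbed de-noised estimate.
CLEAN = [0.0, 1.0, 0.0, -1.0]
DENOISED = [0.1, 0.9, -0.1, -0.9]

print(mse(CLEAN, DENOISED), snr_db(CLEAN, DENOISED), psnr_db(CLEAN, DENOISED))
```

A lower MSE and higher SNR and PSNR all indicate that the de-noised signal is closer to the reference.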
Downloads 1272
3228 Cardiac Disorder Classification Based On Extreme Learning Machine
Authors: Chul Kwak, Oh-Wook Kwon
Abstract:
In this paper, an extreme learning machine with an automatic segmentation algorithm is applied to heart disorder classification by heart sound signals. From continuous heart sound signals, the starting points of the first (S1) and the second heart pulses (S2) are extracted and corrected by utilizing an inter-pulse histogram. From the corrected pulse positions, a single period of heart sound signals is extracted and converted to a feature vector including the mel-scaled filter bank energy coefficients and the envelope coefficients of uniform-sized sub-segments. An extreme learning machine is used to classify the feature vector. In our cardiac disorder classification and detection experiments with 9 cardiac disorder categories, the proposed method shows significantly better performance than multi-layer perceptron, support vector machine, and hidden Markov model; it achieves the classification accuracy of 81.6% and the detection accuracy of 96.9%.
Keywords: Heart sound classification, extreme learning machine
Downloads 1932
3227 Investigation of Wave Atom Sub-Bands via Breast Cancer Classification
Authors: Nebi Gedik, Ayten Atasoy
Abstract:
This paper investigates successful sub-bands of wave atom transform via classification of mammograms, when the coefficients of sub-bands are used as features. A computer-aided diagnosis system is constructed by using wave atom transform, support vector machine and k-nearest neighbor classifiers. Two-class classification is studied in detail using two data sets, separately. The successful sub-bands are determined according to the accuracy rates, coefficient numbers, and sensitivity rates.
Keywords: Breast cancer, wave atom transform, SVM, k-NN.
Downloads 1070
3226 Mapping Paddy Rice Agriculture using Multi-temporal FORMOSAT-2 Images
Authors: Yi-Shiang Shiu, Meng-Lung Lin, Kang-Tsung Chang, Tzu-How Chu
Abstract:
Most paddy rice fields in East Asia are small parcels, and the weather conditions during the growing season are usually cloudy. FORMOSAT-2 multi-spectral images have an 8-meter resolution and one-day recurrence, ideal for mapping paddy rice fields in East Asia. To map rice fields, this study first determined the transplanting and the most active tillering stages of paddy rice and then used multi-temporal images to distinguish different growing characteristics between paddy rice and other ground covers. The unsupervised ISODATA (iterative self-organizing data analysis techniques) and supervised maximum likelihood were both used to discriminate paddy rice fields, with training areas automatically derived from ten-year cultivation parcels in Taiwan. Besides original bands in multi-spectral images, we also generated normalized difference vegetation index and experimented with object-based pre-classification and post-classification. This paper discusses results of different image classification methods in an attempt to find a precise and automatic solution to mapping paddy rice in Taiwan.
Keywords: paddy rice fields, multi-temporal, FORMOSAT-2 images, normalized difference vegetation index, object-based classification.
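The normalized difference vegetation index mentioned in the abstract above has a standard definition, NDVI = (NIR - Red) / (NIR + Red), computed per pixel from the near-infrared and red bands. The sketch below applies it to small nested lists. The reflectance values are made up, and returning 0.0 where both bands are zero is a convention chosen for this example, not something taken from the paper.

```python
from typing import List


def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # assumed convention for pixels with no signal in either band
    return (nir - red) / total


def ndvi_image(nir_band: List[List[float]], red_band: List[List[float]]) -> List[List[float]]:
    """Apply NDVI pixel by pixel to two equally sized bands."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical 2x2 reflectance values for the near-infrared and red bands.
NIR = [[0.5, 0.2], [0.6, 0.0]]
RED = [[0.1, 0.2], [0.2, 0.0]]

print(ndvi_image(NIR, RED))
```

Dense green vegetation reflects strongly in the near-infrared and absorbs red light, so it yields values near the top of the [-1, 1] range, which is why an NDVI time series is useful for following crop growth stages.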
Downloads 1795
3225 A General Framework for Knowledge Discovery Using High Performance Machine Learning Algorithms
Authors: S. Nandagopalan, N. Pradeep
Abstract:
The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers, namely physical storage, object identification, knowledge discovery, and user level. Techniques such as an active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, a universal model for image retrieval, a Bayesian method for classification, and parallel algorithms for image segmentation were employed. Using the feature vector database that has been efficiently constructed, one can perform various data mining tasks such as clustering and classification with efficient algorithms, along with image mining given a query image. All these facilities are included in the framework, which is supported by a state-of-the-art user interface (UI). The algorithms were tested with actual patient data and the Coral image database, and the results show that their performance is better than the results reported already.
Keywords: Active Contour, Bayesian, Echocardiographic image, Feature vector.
Downloads 1712
3224 A New Approaches for Seismic Signals Discrimination
Authors: M. Benbrahim, K. Benjelloun, A. Ibenbrahim, M. Kasmi, E. Ardil
Abstract:
The automatic discrimination of seismic signals is an important practical goal for the earth-science observatories due to the large amount of information that they receive continuously. An essential discrimination task is to allocate the incoming signal to a group associated with the kind of physical phenomena producing it. In this paper, we present new techniques for seismic signals classification: local, regional and global discrimination. These techniques were tested on seismic signals from the data base of the National Geophysical Institute of the Centre National pour la Recherche Scientifique et Technique (Morocco) by using the Moroccan software for seismic signals analysis.
Keywords: Seismic signals, local discrimination, regional discrimination, global discrimination, Moroccan software for seismic signals analysis.
Downloads 1556
3223 Using Data Mining Technique for Scholarship Disbursement
Authors: J. K. Alhassan, S. A. Lawal
Abstract:
This work is on decision tree-based classification for the disbursement of scholarships. A tree-based data mining classification technique is used in order to determine the generic rule to be used to disburse the scholarship. Based on the rules defined from the tree, the system is able to determine the class (status) to which an applicant belongs: Granted or Not Granted. Applicants that fall into the Granted class have successfully acquired a scholarship, while those in the Not Granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from the tree-based classification was also developed. Tree-based classification is adopted because of its efficiency, effectiveness, and easy-to-comprehend features. The system was tested with data from the National Information Technology Development Agency (NITDA) Abuja, a parastatal of the Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found to work according to the specification. It is therefore recommended for all scholarship disbursement organizations.
Keywords: Decision tree, classification, data mining, scholarship.
Downloads 2156
3222 A Constrained Clustering Algorithm for the Classification of Industrial Ores
Authors: Luciano Nieddu, Giuseppe Manfredi
Abstract:
In this paper a pattern recognition algorithm based on a constrained version of the k-means clustering algorithm will be presented. The proposed algorithm is a non-parametric supervised statistical pattern recognition algorithm, i.e., it works under very mild assumptions on the dataset. The performance of the algorithm will be tested, together with a feature extraction technique that captures the information on the closed two-dimensional contour of an image, on images of industrial mineral ores.
Keywords: K-means, Industrial ores classification, Invariant Features, Supervised Classification.
Downloads 1380
3221 Variational EM Inference Algorithm for Gaussian Process Classification Model with Multiclass and Its Application to Human Action Classification
Authors: Wanhyun Cho, Soonja Kang, Sangkyoon Kim, Soonyoung Park
Abstract:
In this paper, we propose the variational EM inference algorithm for the multi-class Gaussian process classification model that can be used in the field of human behavior recognition. This algorithm can drive simultaneously both a posterior distribution of a latent function and estimators of hyper-parameters in a Gaussian process classification model with multiclass. Our algorithm is based on the Laplace approximation (LA) technique and variational EM framework. This is performed in two steps: called expectation and maximization steps. First, in the expectation step, using the Bayesian formula and LA technique, we derive approximately the posterior distribution of the latent function indicating the possibility that each observation belongs to a certain class in the Gaussian process classification model. Second, in the maximization step, using a derived posterior distribution of latent function, we compute the maximum likelihood estimator for hyper-parameters of a covariance matrix necessary to define prior distribution for latent function. These two steps iteratively repeat until a convergence condition satisfies. Moreover, we apply the proposed algorithm with human action classification problem using a public database, namely, the KTH human action data set. Experimental results reveal that the proposed algorithm shows good performance on this data set.
Keywords: Bayesian rule, Gaussian process classification model with multiclass, Gaussian process prior, human action classification, Laplace approximation, variational EM algorithm.
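For orientation, the two steps described in this abstract can be written in one common generic form. The notation is assumed here for illustration and is not taken from the paper: latent vector f, labels y, hyper-parameters θ, and a zero-mean Gaussian process prior with covariance matrix K_θ. The authors' exact derivation may differ.

```latex
% E-step: Laplace approximation of the posterior at the current \theta^{(t)}
\[
\hat{\mathbf{f}} = \arg\max_{\mathbf{f}}
  \left[ \log p(\mathbf{y} \mid \mathbf{f}) + \log p(\mathbf{f} \mid \theta^{(t)}) \right],
\qquad
W = -\nabla\nabla \log p(\mathbf{y} \mid \mathbf{f}) \,\big|_{\mathbf{f} = \hat{\mathbf{f}}},
\]
\[
q(\mathbf{f}) = \mathcal{N}\!\left( \mathbf{f} \,\middle|\, \hat{\mathbf{f}},\;
  \bigl( K_{\theta^{(t)}}^{-1} + W \bigr)^{-1} \right).
\]
% M-step: the likelihood term does not depend on \theta, so only the prior term remains
\[
\theta^{(t+1)} = \arg\max_{\theta} \;
  \mathbb{E}_{q(\mathbf{f})} \left[ \log p(\mathbf{f} \mid \theta) \right].
\]
```

The two updates alternate until the change in θ, or in the objective, falls below a chosen tolerance.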
Downloads 1757
3220 A Case-Based Reasoning-Decision Tree Hybrid System for Stock Selection
Authors: Yaojun Wang, Yaoqing Wang
Abstract:
Stock selection is an important decision-making problem. Many machine learning and data mining technologies are employed to build automatic stock-selection systems. A profitable stock-selection system should consider both the stock’s investment value and the market timing. In this paper, we present a hybrid system that addresses both for stock selection. This system uses a case-based reasoning (CBR) model to perform the stock classification and a decision-tree model to help with market timing and stock selection. The experiments show that this hybrid system performs better than other techniques with regard to classification accuracy, average return, and the Sharpe ratio.
Keywords: Case-based reasoning, decision tree, stock selection, machine learning.
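One of the evaluation measures named in this abstract, the Sharpe ratio, can be sketched as follows. This is a minimal per-period version under stated assumptions: the risk-free rate defaults to zero, no annualization is applied, and the return series is hypothetical. The paper may define or scale the ratio differently.

```python
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free=0.0):
    """Per-period Sharpe ratio: mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    spread = stdev(excess)  # sample standard deviation (n - 1 denominator)
    if spread == 0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return mean(excess) / spread


# Hypothetical per-period returns, for illustration only.
example_returns = [0.02, -0.01, 0.03, 0.00, 0.01]
ratio = sharpe_ratio(example_returns)
```

A higher value means more return per unit of volatility, which is why it complements raw average return when comparing selection strategies.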
Downloads 1703
3219 Analysis of Relation between Unlabeled and Labeled Data to Self-Taught Learning Performance
Authors: Ekachai Phaisangittisagul, Rapeepol Chongprachawat
Abstract:
Obtaining labeled data for supervised learning is often difficult and expensive, so the trained learning algorithm tends to overfit because of the small amount of training data. As a result, some researchers have focused on using unlabeled data, which need not follow the same generative distribution as the labeled data, to construct high-level features that improve performance on supervised learning tasks. In this paper, we investigate the impact of the relationship between unlabeled and labeled data on classification performance. Specifically, we apply different unlabeled data sets, which have different degrees of relation to the labeled data, to a handwritten digit classification task based on the MNIST dataset. Our experimental results show that the higher the degree of relation between unlabeled and labeled data, the better the classification performance. Although unlabeled data drawn from a completely different generative distribution than the labeled data gives the lowest classification performance, we still achieve high classification performance. This expands the applicability of supervised learning algorithms through the use of unsupervised learning.
Keywords: Autoencoder, high-level feature, MNIST dataset, self-taught learning, supervised learning.
Downloads 1831
3218 Comparative Study Using Weka for Red Blood Cells Classification
Authors: Jameela Ali Alkrimi, Hamid A. Jalab, Loay E. George, Abdul Rahim Ahmad, Azizah Suliman, Karim Al-Jashamy
Abstract:
Red blood cells (RBCs) are the most common type of blood cell and are the most intensively studied in cell biology. A lack of RBCs, a condition in which the hemoglobin level is lower than normal, is referred to as “anemia”. Abnormalities in RBCs affect the exchange of oxygen. This paper presents a comparative study of various techniques for classifying RBCs as normal or abnormal (anemic) using WEKA. WEKA is open-source software that comprises different machine learning algorithms for data mining applications. The algorithms tested are the Radial Basis Function neural network, the Support Vector Machine, and the K-Nearest Neighbors algorithm. Two sets of combined features were used for classifying the blood cell images. The first set, consisting exclusively of geometrical features, was used to identify whether the tested blood cell has a spherical or non-spherical shape, while the second set, consisting mainly of textural features, was used to recognize the types of the spherical cells. We provide an evaluation based on applying these classification methods to our RBC image dataset, which was obtained from Serdang Hospital, Malaysia, and measuring the accuracy of the test results. The best classification rates achieved are 97%, 98%, and 79% for the Support Vector Machine, the Radial Basis Function neural network, and the K-Nearest Neighbors algorithm, respectively.
Keywords: K-Nearest Neighbors, Neural Network, Radial Basis Function, Red blood cells, Support vector machine.
Downloads 2993
3217 Curvelet Transform Based Two Class Motor Imagery Classification
Authors: Nebi Gedik
Abstract:
One of the important parts of brain-computer interface (BCI) studies is the classification of motor imagery (MI) obtained by electroencephalography (EEG). The major goal is to provide non-muscular communication and control via assistive technologies to people with severe motor disorders so that they can communicate with the outside world. In this study, an EEG signal classification approach based on a multiscale and multiresolution transform method is presented. The proposed approach decomposes the EEG signal containing motor imagery information (right- and left-hand movement imagery). The decomposition is performed using the curvelet transform, a multiscale and multiresolution analysis method, and the transform output is used as feature data. The resulting feature set is subjected to a feature selection process based on t-tests to obtain the most effective features. SVM and k-NN algorithms are used for classification.
Keywords: motor imagery, EEG, curvelet transform, SVM, k-NN
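The last two stages of the pipeline in this abstract, t-test feature selection followed by k-NN classification, can be sketched as below. Several assumptions are made that the abstract does not confirm: Welch's form of the t statistic, ranking features by its magnitude, Euclidean distance, and k = 3. The feature values are hypothetical stand-ins for curvelet coefficients, and the curvelet transform itself is not reproduced.

```python
import math
from collections import Counter
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic (assumes the pooled standard error is non-zero)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def select_features(X, y, keep):
    """Indices of the `keep` features with the largest |t| between classes 0 and 1."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append((abs(welch_t(a, b)), j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:keep])


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k training points closest to x (Euclidean distance)."""
    nearest = sorted(zip(X_train, y_train), key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical two-feature data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [3.0, 4.5], [3.1, 3.2], [2.9, 4.1]]
y = [0, 0, 0, 1, 1, 1]

selected = select_features(X, y, keep=1)
X_selected = [[row[j] for j in selected] for row in X]
prediction = knn_predict(X_selected, y, [1.1])
```

Selecting features before classification keeps the distance computation in k-NN from being dominated by uninformative coefficients.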
Downloads 619
3216 An Attribute-Centre Based Decision Tree Classification Algorithm
Authors: Gökhan Silahtaroğlu
Abstract:
Decision tree algorithms have a very important place among the classification models of data mining. In the literature, algorithms use the entropy concept or the Gini index to form the tree. The shape of the classes and their closeness to each other are some of the factors that affect the performance of the algorithm. In this paper, we introduce a new decision tree algorithm that employs a data (attribute) folding method and the variation of the class variables over the branches to be created. A comparative performance analysis has been carried out between the proposed algorithm and C4.5.
Keywords: Classification, decision tree, split, pruning, entropy, gini.
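The two conventional split criteria mentioned in this abstract, entropy and the Gini index, can be sketched as follows. The code illustrates those standard measures only, not the attribute-centre algorithm the paper proposes. The function names and the label lists are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of the class distribution in `labels`."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of the class distribution in `labels`."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_gain(parent, children, impurity):
    """Impurity of the parent minus the size-weighted impurity of its children."""
    n = len(parent)
    weighted = sum(len(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted


# Hypothetical node with two balanced classes and one candidate binary split.
parent = ["a"] * 4 + ["b"] * 4
children = [["a", "a", "a", "b"], ["a", "b", "b", "b"]]

entropy_gain = split_gain(parent, children, entropy)
gini_gain = split_gain(parent, children, gini)
```

A tree builder evaluates such a gain for every candidate split and keeps the best one. C4.5 additionally normalizes the entropy gain by the split information (the gain ratio), which this sketch omits.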
Downloads 1368
3215 SVM Based Model as an Optimal Classifier for the Classification of Sonar Signals
Authors: Suresh S. Salankar, Balasaheb M. Patre
Abstract:
Research into the problem of classification of sonar signals has been taken up as a challenging task for neural networks. This paper investigates the design of an optimal classifier using a Multilayer Perceptron Neural Network (MLP NN) and Support Vector Machines (SVM). Results obtained using sonar data sets suggest that the SVM classifier performs well in comparison with the well-known MLP NN classifier. An average classification accuracy of 91.974% is achieved with the SVM classifier and 90.3609% with the MLP NN classifier on the test instances. The area under the Receiver Operating Characteristics (ROC) curve for the proposed SVM classifier on the test data set is found to be 0.981183, which is very close to unity and confirms the excellent quality of the proposed classifier. The SVM classifier employed in this paper, implemented using the kernel Adatron algorithm, is seen to be robust and relatively insensitive to parameter initialization in comparison with the MLP NN.
Keywords: Classification, MLP NN, backpropagation algorithm, SVM, Receiver Operating Characteristics.
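The area under the ROC curve reported in this abstract can be computed from classifier scores with the rank-based (Mann-Whitney) formulation sketched below. It gives the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as half. The labels and scores are hypothetical and are not the paper's sonar data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A correctly ordered pair counts 1, a tie counts 0.5, a misordered pair counts 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = positive class) and classifier scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
```

A value of 1.0 means every positive example outranks every negative one, and 0.5 corresponds to random ordering, which is why a value near unity is read as a sign of good separation.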
Downloads 1819