Search results for: K-nearest neighbor classifier
141 Facial Recognition on the Basis of Facial Fragments
Authors: Tetyana Baydyk, Ernst Kussul, Sandra Bonilla Meza
Abstract:
There are many articles that attempt to establish the role of different facial fragments in face recognition. Various approaches are used to estimate this role. Frequently, authors calculate the entropy corresponding to the fragment. This approach can only give approximate estimation. In this paper, we propose to use a more direct measure of the importance of different fragments for face recognition. We propose to select a recognition method and a face database and experimentally investigate the recognition rate using different fragments of faces. We present two such experiments in the paper. We selected the PCNC neural classifier as a method for face recognition and parts of the LFW (Labeled Faces in the Wild) face database as training and testing sets. The recognition rate of the best experiment is comparable with the recognition rate obtained using the whole face.
Keywords: Face recognition, Labeled Faces in the Wild (LFW) database, Random Local Descriptor (RLD), random features.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1014140 Indonesian News Classification using Support Vector Machine
Authors: Dewi Y. Liliana, Agung Hardianto, M. Ridok
Abstract:
Digital news with a variety topics is abundant on the internet. The problem is to classify news based on its appropriate category to facilitate user to find relevant news rapidly. Classifier engine is used to split any news automatically into the respective category. This research employs Support Vector Machine (SVM) to classify Indonesian news. SVM is a robust method to classify binary classes. The core processing of SVM is in the formation of an optimum separating plane to separate the different classes. For multiclass problem, a mechanism called one against one is used to combine the binary classification result. Documents were taken from the Indonesian digital news site, www.kompas.com. The experiment showed a promising result with the accuracy rate of 85%. This system is feasible to be implemented on Indonesian news classification.Keywords: classification, Indonesian news, text processing, support vector machine
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3489139 Improved Wavelet Neural Networks for Early Cancer Diagnosis Using Clustering Algorithms
Authors: Zarita Zainuddin, Ong Pauline
Abstract:
Wavelet neural networks (WNNs) have emerged as a vital alternative to the vastly studied multilayer perceptrons (MLPs) since its first implementation. In this paper, we applied various clustering algorithms, namely, K-means (KM), Fuzzy C-means (FCM), symmetry-based K-means (SBKM), symmetry-based Fuzzy C-means (SBFCM) and modified point symmetry-based K-means (MPKM) clustering algorithms in choosing the translation parameter of a WNN. These modified WNNs are further applied to the heterogeneous cancer classification using benchmark microarray data and were compared against the conventional WNN with random initialization method. Experimental results showed that a WNN classifier with the MPKM algorithm is more precise than the conventional WNN as well as the WNNs with other clustering algorithms.
Keywords: Clustering, microarray, symmetry, wavelet neural networks.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1619138 Balancing Neural Trees to Improve Classification Performance
Authors: Asha Rani, Christian Micheloni, Gian Luca Foresti
Abstract:
In this paper, a neural tree (NT) classifier having a simple perceptron at each node is considered. A new concept for making a balanced tree is applied in the learning algorithm of the tree. At each node, if the perceptron classification is not accurate and unbalanced, then it is replaced by a new perceptron. This separates the training set in such a way that almost the equal number of patterns fall into each of the classes. Moreover, each perceptron is trained only for the classes which are present at respective node and ignore other classes. Splitting nodes are employed into the neural tree architecture to divide the training set when the current perceptron node repeats the same classification of the parent node. A new error function based on the depth of the tree is introduced to reduce the computational time for the training of a perceptron. Experiments are performed to check the efficiency and encouraging results are obtained in terms of accuracy and computational costs.Keywords: Neural Tree, Pattern Classification, Perceptron, Splitting Nodes.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1226137 Frontal EEG Asymmetry Based Classification of Emotional Valence using Common Spatial Patterns
Authors: Irene Winkler, Mark Jager, Vojkan Mihajlovic, Tsvetomira Tsoneva
Abstract:
In this work we evaluate the possibility of predicting the emotional state of a person based on the EEG. We investigate the problem of classifying valence from EEG signals during the presentation of affective pictures, utilizing the "frontal EEG asymmetry" phenomenon. To distinguish positive and negative emotions, we applied the Common Spatial Patterns algorithm. In contrast to our expectations, the affective pictures did not reliably elicit changes in frontal asymmetry. The classifying task thereby becomes very hard as reflected by the poor classifier performance. We suspect that the masking of the source of the brain activity related to emotions, coming mostly from deeper structures in the brain, and the insufficient emotional engagement are among main reasons why it is difficult to predict the emotional state of a person.Keywords: Emotion, Valence, EEG, Common Spatial Patterns(CSP).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2611136 Automatic Microaneurysm Quantification for Diabetic Retinopathy Screening
Authors: A. Sopharak, B. Uyyanonvara, S. Barman
Abstract:
Microaneurysm is a key indicator of diabetic retinopathy that can potentially cause damage to retina. Early detection and automatic quantification are the keys to prevent further damage. In this paper, which focuses on automatic microaneurysm detection in images acquired through non-dilated pupils, we present a series of experiments on feature selection and automatic microaneurysm pixel classification. We found that the best feature set is a combination of 10 features: the pixel-s intensity of shade corrected image, the pixel hue, the standard deviation of shade corrected image, DoG4, the area of the candidate MA, the perimeter of the candidate MA, the eccentricity of the candidate MA, the circularity of the candidate MA, the mean intensity of the candidate MA on shade corrected image and the ratio of the major axis length and minor length of the candidate MA. The overall sensitivity, specificity, precision, and accuracy are 84.82%, 99.99%, 89.01%, and 99.99%, respectively.
Keywords: Diabetic retinopathy, microaneurysm, naive Bayes classifier
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2190135 Diagnosis of Diabetes Using Computer Methods: Soft Computing Methods for Diabetes Detection Using Iris
Authors: Piyush Samant, Ravinder Agarwal
Abstract:
Complementary and Alternative Medicine (CAM) techniques are quite popular and effective for chronic diseases. Iridology is more than 150 years old CAM technique which analyzes the patterns, tissue weakness, color, shape, structure, etc. for disease diagnosis. The objective of this paper is to validate the use of iridology for the diagnosis of the diabetes. The suggested model was applied in a systemic disease with ocular effects. 200 subject data of 100 each diabetic and non-diabetic were evaluated. Complete procedure was kept very simple and free from the involvement of any iridologist. From the normalized iris, the region of interest was cropped. All 63 features were extracted using statistical, texture analysis, and two-dimensional discrete wavelet transformation. A comparison of accuracies of six different classifiers has been presented. The result shows 89.66% accuracy by the random forest classifier.
Keywords: Complementary and alternative medicine, Iridology, iris, feature extraction, classification, disease prediction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1859134 Real Time Lidar and Radar High-Level Fusion for Obstacle Detection and Tracking with Evaluation on a Ground Truth
Authors: Hatem Hajri, Mohamed-Cherif Rahal
Abstract:
Both Lidars and Radars are sensors for obstacle detection. While Lidars are very accurate on obstacles positions and less accurate on their velocities, Radars are more precise on obstacles velocities and less precise on their positions. Sensor fusion between Lidar and Radar aims at improving obstacle detection using advantages of the two sensors. The present paper proposes a real-time Lidar/Radar data fusion algorithm for obstacle detection and tracking based on the global nearest neighbour standard filter (GNN). This algorithm is implemented and embedded in an automative vehicle as a component generated by a real-time multisensor software. The benefits of data fusion comparing with the use of a single sensor are illustrated through several tracking scenarios (on a highway and on a bend) and using real-time kinematic sensors mounted on the ego and tracked vehicles as a ground truth.Keywords: Ground truth, Hungarian algorithm, lidar Radar data fusion, global nearest neighbor filter.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 963133 ANFIS Approach for Locating Faults in Underground Cables
Authors: Magdy B. Eteiba, Wael Ismael Wahba, Shimaa Barakat
Abstract:
This paper presents a fault identification, classification and fault location estimation method based on Discrete Wavelet Transform and Adaptive Network Fuzzy Inference System (ANFIS) for medium voltage cable in the distribution system.
Different faults and locations are simulated by ATP/EMTP, and then certain selected features of the wavelet transformed signals are used as an input for a training process on the ANFIS. Then an accurate fault classifier and locator algorithm was designed, trained and tested using current samples only. The results obtained from ANFIS output were compared with the real output. From the results, it was found that the percentage error between ANFIS output and real output is less than three percent. Hence, it can be concluded that the proposed technique is able to offer high accuracy in both of the fault classification and fault location.
Keywords: ANFIS, Fault location, Underground Cable, Wavelet Transform.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2741132 Deep-Learning Based Approach to Facial Emotion Recognition Through Convolutional Neural Network
Authors: Nouha Khediri, Mohammed Ben Ammar, Monji Kherallah
Abstract:
Recently, facial emotion recognition (FER) has become increasingly essential to understand the state of the human mind. However, accurately classifying emotion from the face is a challenging task. In this paper, we present a facial emotion recognition approach named CV-FER benefiting from deep learning, especially CNN and VGG16. First, the data are pre-processed with data cleaning and data rotation. Then, we augment the data and proceed to our FER model, which contains five convolutions layers and five pooling layers. Finally, a softmax classifier is used in the output layer to recognize emotions. Based on the above contents, this paper reviews the works of facial emotion recognition based on deep learning. Experiments show that our model outperforms the other methods using the same FER2013 database and yields a recognition rate of 92%. We also put forward some suggestions for future work.
Keywords: CNN, deep-learning, facial emotion recognition, machine learning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 710131 Liver Tumor Detection by Classification through FD Enhancement of CT Image
Authors: N. Ghatwary, A. Ahmed, H. Jalab
Abstract:
In this paper, an approach for the liver tumor detection in computed tomography (CT) images is represented. The detection process is based on classifying the features of target liver cell to either tumor or non-tumor. Fractional differential (FD) is applied for enhancement of Liver CT images, with the aim of enhancing texture and edge features. Later on, a fusion method is applied to merge between the various enhanced images and produce a variety of feature improvement, which will increase the accuracy of classification. Each image is divided into NxN non-overlapping blocks, to extract the desired features. Support vector machines (SVM) classifier is trained later on a supplied dataset different from the tested one. Finally, the block cells are identified whether they are classified as tumor or not. Our approach is validated on a group of patients’ CT liver tumor datasets. The experiment results demonstrated the efficiency of detection in the proposed technique.Keywords: Fractional differential (FD), Computed Tomography (CT), fusion.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1682130 Optimizing Mobile Agents Migration Based on Decision Tree Learning
Authors: Yasser k. Ali, Hesham N. Elmahdy, Sanaa El Olla Hanfy Ahmed
Abstract:
Mobile agents are a powerful approach to develop distributed systems since they migrate to hosts on which they have the resources to execute individual tasks. In a dynamic environment like a peer-to-peer network, Agents have to be generated frequently and dispatched to the network. Thus they will certainly consume a certain amount of bandwidth of each link in the network if there are too many agents migration through one or several links at the same time, they will introduce too much transferring overhead to the links eventually, these links will be busy and indirectly block the network traffic, therefore, there is a need of developing routing algorithms that consider about traffic load. In this paper we seek to create cooperation between a probabilistic manner according to the quality measure of the network traffic situation and the agent's migration decision making to the next hop based on decision tree learning algorithms.
Keywords: Agent Migration, Decision Tree learning, ID3 algorithm, Naive Bayes Classifier
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1991129 WiPoD Wireless Positioning System based on 802.11 WLAN Infrastructure
Authors: Haluk Gümüskaya, Hüseyin Hakkoymaz
Abstract:
This paper describes WiPoD (Wireless Position Detector) which is a pure software based location determination and tracking (positioning) system. It uses empirical signal strength measurements from different wireless access points for mobile user positioning. It is designed to determine the location of users having 802.11 enabled mobile devices in an 802.11 WLAN infrastructure and track them in real time. WiPoD is the first main module in our LBS (Location Based Services) framework. We tested K-Nearest Neighbor and Triangulation algorithms to estimate the position of a mobile user. We also give the analysis results of these algorithms for real time operations. In this paper, we propose a supportable, i.e. understandable, maintainable, scalable and portable wireless positioning system architecture for an LBS framework. The WiPoD software has a multithreaded structure and was designed and implemented with paying attention to supportability features and real-time constraints and using object oriented design principles. We also describe the real-time software design issues of a wireless positioning system which will be part of an LBS framework.Keywords: Indoor location determination and tracking, positioning in Wireless LAN.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1994128 Evaluation of Graph-based Analysis for Forest Fire Detections
Authors: Young Gi Byun, Yong Huh, Kiyun Yu, Yong Il Kim
Abstract:
Spatial outliers in remotely sensed imageries represent observed quantities showing unusual values compared to their neighbor pixel values. There have been various methods to detect the spatial outliers based on spatial autocorrelations in statistics and data mining. These methods may be applied in detecting forest fire pixels in the MODIS imageries from NASA-s AQUA satellite. This is because the forest fire detection can be referred to as finding spatial outliers using spatial variation of brightness temperature. This point is what distinguishes our approach from the traditional fire detection methods. In this paper, we propose a graph-based forest fire detection algorithm which is based on spatial outlier detection methods, and test the proposed algorithm to evaluate its applicability. For this the ordinary scatter plot and Moran-s scatter plot were used. In order to evaluate the proposed algorithm, the results were compared with the MODIS fire product provided by the NASA MODIS Science Team, which showed the possibility of the proposed algorithm in detecting the fire pixels.Keywords: Spatial Outlier Detection, MODIS, Forest Fire
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2226127 Evaluation of Haar Cascade Classifiers Designed for Face Detection
Authors: R. Padilla, C. F. F. Costa Filho, M. G. F. Costa
Abstract:
In the past years a lot of effort has been made in the field of face detection. The human face contains important features that can be used by vision-based automated systems in order to identify and recognize individuals. Face location, the primary step of the vision-based automated systems, finds the face area in the input image. An accurate location of the face is still a challenging task. Viola-Jones framework has been widely used by researchers in order to detect the location of faces and objects in a given image. Face detection classifiers are shared by public communities, such as OpenCV. An evaluation of these classifiers will help researchers to choose the best classifier for their particular need. This work focuses of the evaluation of face detection classifiers minding facial landmarks.Keywords: Face datasets, face detection, facial landmarking, haar wavelets, Viola-Jones detectors.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5410126 The Performance Improvement of Automatic Modulation Recognition Using Simple Feature Manipulation, Analysis of the HOS, and Voted Decision
Authors: Heroe Wijanto, Sugihartono, Suhartono Tjondronegoro, Kuspriyanto
Abstract:
The use of High Order Statistics (HOS) analysis is expected to provide so many candidates of features that can be selected for pattern recognition. More candidates of the feature can be extracted using simple manipulation through a specific mathematical function prior to the HOS analysis. Feature extraction method using HOS analysis combined with Difference to the Nth-Power manipulation has been examined in application for Automatic Modulation Recognition (AMR) to perform scheme recognition of three digital modulation signal, i.e. QPSK-16QAM-64QAM in the AWGN transmission channel. The simulation results is reported when the analysis of HOS up to order-12 and the manipulation of Difference to the Nth-Power up to N = 4. The obtained accuracy rate of AMR using the method of Simple Decision obtained 90% in SNR > 10 dB in its classifier, while using the method of Voted Decision is 96% in SNR > 2 dB.Keywords: modulation, automatic modulation recognition, feature analysis, feature manipulation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2122125 Comparative Study of Filter Characteristics as Statistical Vocal Correlates of Clinical Psychiatric State in Human
Authors: Thaweesak Yingthawornsuk, Chusak Thanawattano
Abstract:
Acoustical properties of speech have been shown to be related to mental states of speaker with symptoms: depression and remission. This paper describes way to address the issue of distinguishing depressed patients from remitted subjects based on measureable acoustics change of their spoken sound. The vocal-tract related frequency characteristics of speech samples from female remitted and depressed patients were analyzed via speech processing techniques and consequently, evaluated statistically by cross-validation with Support Vector Machine. Our results comparatively show the classifier's performance with effectively correct separation of 93% determined from testing with the subjectbased feature model and 88% from the frame-based model based on the same speech samples collected from hospital visiting interview sessions between patients and psychiatrists.Keywords: Depression, SVM, Vocal Extract, Vocal Tract
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1541124 Searchable Encryption in Cloud Storage
Authors: Ren-Junn Hwang, Chung-Chien Lu, Jain-Shing Wu
Abstract:
Cloud outsource storage is one of important services in cloud computing. Cloud users upload data to cloud servers to reduce the cost of managing data and maintaining hardware and software. To ensure data confidentiality, users can encrypt their files before uploading them to a cloud system. However, retrieving the target file from the encrypted files exactly is difficult for cloud server. This study proposes a protocol for performing multikeyword searches for encrypted cloud data by applying k-nearest neighbor technology. The protocol ranks the relevance scores of encrypted files and keywords, and prevents cloud servers from learning search keywords submitted by a cloud user. To reduce the costs of file transfer communication, the cloud server returns encrypted files in order of relevance. Moreover, when a cloud user inputs an incorrect keyword and the number of wrong alphabet does not exceed a given threshold; the user still can retrieve the target files from cloud server. In addition, the proposed scheme satisfies security requirements for outsourced data storage.
Keywords: Fault-tolerance search, multi-keywords search, outsource storage, ranked search, searchable encryption.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3081123 Empirical Mode Decomposition with Wavelet Transform Based Analytic Signal for Power Quality Assessment
Authors: Sudipta Majumdar, Amarendra Kumar Mishra
Abstract:
This paper proposes empirical mode decomposition (EMD) together with wavelet transform (WT) based analytic signal for power quality (PQ) events assessment. EMD decomposes the complex signals into several intrinsic mode functions (IMF). As the PQ events are non stationary, instantaneous parameters have been calculated from these IMFs using analytic signal obtained form WT. We obtained three parameters from IMFs and then used KNN classifier for classification of PQ disturbance. We compared the classification of proposed method for PQ events by obtaining the features using Hilbert transform (HT) method. The classification efficiency using WT based analytic method is 97.5% and using HT based analytic signal is 95.5%.Keywords: Empirical mode decomposition, Hilbert transform, wavelet transform.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1291122 Educational Data Mining: The Case of Department of Mathematics and Computing in the Period 2009-2018
Authors: M. Sitoe, O. Zacarias
Abstract:
University education is influenced by several factors that range from the adoption of strategies to strengthen the whole process to the academic performance improvement of the students themselves. This work uses data mining techniques to develop a predictive model to identify students with a tendency to evasion and retention. To this end, a database of real students’ data from the Department of University Admission (DAU) and the Department of Mathematics and Informatics (DMI) was used. The data comprised 388 undergraduate students admitted in the years 2009 to 2014. The Weka tool was used for model building, using three different techniques, namely: K-nearest neighbor, random forest, and logistic regression. To allow for training on multiple train-test splits, a cross-validation approach was employed with a varying number of folds. To reduce bias variance and improve the performance of the models, ensemble methods of Bagging and Stacking were used. After comparing the results obtained by the three classifiers, Logistic Regression using Bagging with seven folds obtained the best performance, showing results above 90% in all evaluated metrics: accuracy, rate of true positives, and precision. Retention is the most common tendency.
Keywords: Evasion and retention, cross validation, bagging, stacking.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 121121 Intrusion Detection Using a New Particle Swarm Method and Support Vector Machines
Authors: Essam Al Daoud
Abstract:
Intrusion detection is a mechanism used to protect a system and analyse and predict the behaviours of system users. An ideal intrusion detection system is hard to achieve due to nonlinearity, and irrelevant or redundant features. This study introduces a new anomaly-based intrusion detection model. The suggested model is based on particle swarm optimisation and nonlinear, multi-class and multi-kernel support vector machines. Particle swarm optimisation is used for feature selection by applying a new formula to update the position and the velocity of a particle; the support vector machine is used as a classifier. The proposed model is tested and compared with the other methods using the KDD CUP 1999 dataset. The results indicate that this new method achieves better accuracy rates than previous methods.Keywords: Feature selection, Intrusion detection, Support vector machine, Particle swarm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1990120 Cognition Technique for Developing a World Music
Authors: Haider Javed Uppal, Javed Yunas Uppal
Abstract:
In today's globalized world, it is necessary to develop a form of music that is able to evoke equal emotional responses among people from diverse cultural backgrounds. Indigenous cultures throughout history have developed their own music cognition, specifically in terms of the connections between music and mood. With the advancements in artificial intelligence technologies, it has become possible to analyze and categorize music features such as timbre, harmony, melody, and rhythm, and relate them to the resulting mood effects experienced by listeners. This paper presents a model that utilizes a screenshot translator to convert music from different origins into waveforms, which are then analyzed using machine learning and information retrieval techniques. By connecting these waveforms with Thayer's matrix of moods, a mood classifier has been developed using fuzzy logic algorithms to determine the emotional impact of different types of music on listeners from various cultures.
Keywords: Cognition, world music, artificial intelligence, Thayer’s matrix.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 162119 Computer Aided Classification of Architectural Distortion in Mammograms Using Texture Features
Authors: Birmohan Singh, V. K. Jain
Abstract:
Computer aided diagnosis systems provide vital opinion to radiologists in the detection of early signs of breast cancer from mammogram images. Architectural distortions, masses and microcalcifications are the major abnormalities. In this paper, a computer aided diagnosis system has been proposed for distinguishing abnormal mammograms with architectural distortion from normal mammogram. Four types of texture features GLCM texture, GLRLM texture, fractal texture and spectral texture features for the regions of suspicion are extracted. Support vector machine has been used as classifier in this study. The proposed system yielded an overall sensitivity of 96.47% and an accuracy of 96% for mammogram images collected from digital database for screening mammography database.Keywords: Architecture Distortion, GLCM Texture features, GLRLM Texture Features, Mammograms, Support Vector Machine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2261118 Binary Classification Tree with Tuned Observation-based Clustering
Authors: Maythapolnun Athimethphat, Boontarika Lerteerawong
Abstract:
There are several approaches for handling multiclass classification. Aside from one-against-one (OAO) and one-against-all (OAA), hierarchical classification technique is also commonly used. A binary classification tree is a hierarchical classification structure that breaks down a k-class problem into binary sub-problems, each solved by a binary classifier. In each node, a set of classes is divided into two subsets. A good class partition should be able to group similar classes together. Many algorithms measure similarity in term of distance between class centroids. Classes are grouped together by a clustering algorithm when distances between their centroids are small. In this paper, we present a binary classification tree with tuned observation-based clustering (BCT-TOB) that finds a class partition by performing clustering on observations instead of class centroids. A merging step is introduced to merge any insignificant class split. The experiment shows that performance of BCT-TOB is comparable to other algorithms.
Keywords: multiclass classification, hierarchical classification, binary classification tree, clustering, observation-based clustering
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1733117 A Hybrid Approach for Selection of Relevant Features for Microarray Datasets
Authors: R. K. Agrawal, Rajni Bala
Abstract:
Developing an accurate classifier for high dimensional microarray datasets is a challenging task due to availability of small sample size. Therefore, it is important to determine a set of relevant genes that classify the data well. Traditionally, gene selection method often selects the top ranked genes according to their discriminatory power. Often these genes are correlated with each other resulting in redundancy. In this paper, we have proposed a hybrid method using feature ranking and wrapper method (Genetic Algorithm with multiclass SVM) to identify a set of relevant genes that classify the data more accurately. A new fitness function for genetic algorithm is defined that focuses on selecting the smallest set of genes that provides maximum accuracy. Experiments have been carried on four well-known datasets1. The proposed method provides better results in comparison to the results found in the literature in terms of both classification accuracy and number of genes selected.
Keywords: Gene selection, genetic algorithm, microarray datasets, multi-class SVM.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2059116 Modeling of Cross Flow Classifier with Water Injection
Authors: E. Pikushchak, J. Dueck, L. Minkov
Abstract:
In hydrocyclones, the particle separation efficiency is limited by the suspended fine particles, which are discharged with the coarse product in the underflow. It is well known that injecting water in the conical part of the cyclone reduces the fine particle fraction in the underflow. This paper presents a mathematical model that simulates the water injection in the conical component. The model accounts for the fluid flow and the particle motion. Particle interaction, due to hindered settling caused by increased density and viscosity of the suspension, and fine particle entrainment by settling coarse particles are included in the model. Water injection in the conical part of the hydrocyclone is performed to reduce fine particle discharge in the underflow. The model demonstrates the impact of the injection rate, injection velocity, and injection location on the shape of the partition curve. The simulations are compared with experimental data of a 50-mm cyclone.Keywords: Classification, fine particle processing, hydrocyclone, water injection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1954115 Fusion of ETM+ Multispectral and Panchromatic Texture for Remote Sensing Classification
Authors: Mahesh Pal
Abstract:
This paper proposes to use ETM+ multispectral data and panchromatic band as well as texture features derived from the panchromatic band for land cover classification. Four texture features including one 'internal texture' and three GLCM based textures namely correlation, entropy, and inverse different moment were used in combination with ETM+ multispectral data. Two data sets involving combination of multispectral, panchromatic band and its texture were used and results were compared with those obtained by using multispectral data alone. A decision tree classifier with and without boosting were used to classify different datasets. Results from this study suggest that the dataset consisting of panchromatic band, four of its texture features and multispectral data was able to increase the classification accuracy by about 2%. In comparison, a boosted decision tree was able to increase the classification accuracy by about 3% with the same dataset.Keywords: Internal texture; GLCM; decision tree; boosting; classification accuracy.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1738114 Recognition of Grocery Products in Images Captured by Cellular Phones
Authors: Farshideh Einsele, Hassan Foroosh
Abstract:
In this paper, we present a robust algorithm to recognize extracted text from grocery product images captured by mobile phone cameras. Recognition of such text is challenging since text in grocery product images varies in its size, orientation, style, illumination, and can suffer from perspective distortion. Pre-processing is performed to make the characters scale and rotation invariant. Since text degradations can not be appropriately defined using well-known geometric transformations such as translation, rotation, affine transformation and shearing, we use the whole character black pixels as our feature vector. Classification is performed with minimum distance classifier using the maximum likelihood criterion, which delivers very promising Character Recognition Rate (CRR) of 89%. We achieve considerably higher Word Recognition Rate (WRR) of 99% when using lower level linguistic knowledge about product words during the recognition process.
Keywords: Camera-based OCR, Feature extraction, Document and image processing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2470113 Multivariate Analysis of Spectroscopic Data for Agriculture Applications
Authors: Asmaa M. Hussein, Amr Wassal, Ahmed Farouk Al-Sadek, A. F. Abd El-Rahman
Abstract:
In this study, a multivariate analysis of potato spectroscopic data was presented to detect the presence of brown rot disease or not. Near-Infrared (NIR) spectroscopy (1,350-2,500 nm) combined with multivariate analysis was used as a rapid, non-destructive technique for the detection of brown rot disease in potatoes. Spectral measurements were performed in 565 samples, which were chosen randomly at the infection place in the potato slice. In this study, 254 infected and 311 uninfected (brown rot-free) samples were analyzed using different advanced statistical analysis techniques. The discrimination performance of different multivariate analysis techniques, including classification, pre-processing, and dimension reduction, were compared. Applying a random forest algorithm classifier with different pre-processing techniques to raw spectra had the best performance as the total classification accuracy of 98.7% was achieved in discriminating infected potatoes from control.
Keywords: Brown rot disease, NIR spectroscopy, potato, random forest.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 887112 Assessing and Visualizing the Stability of Feature Selectors: A Case Study with Spectral Data
Authors: R.Guzman-Martinez, Oscar Garcia-Olalla, R.Alaiz-Rodriguez
Abstract:
Feature selection plays an important role in applications with high dimensional data. The assessment of the stability of feature selection/ranking algorithms becomes an important issue when the dataset is small and the aim is to gain insight into the underlying process by analyzing the most relevant features. In this work, we propose a graphical approach that enables to analyze the similarity between feature ranking techniques as well as their individual stability. Moreover, it works with whatever stability metric (Canberra distance, Spearman's rank correlation coefficient, Kuncheva's stability index,...). We illustrate this visualization technique evaluating the stability of several feature selection techniques on a spectral binary dataset. Experimental results with a neural-based classifier show that stability and ranking quality may not be linked together and both issues have to be studied jointly in order to offer answers to the domain experts.
Keywords: Feature Selection Stability, Spectral data, Data visualization
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1526