Search results for: feature selection methods
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5427

Search results for: feature selection methods

5397 sEMG Interface Design for Locomotion Identification

Authors: Rohit Gupta, Ravinder Agarwal

Abstract:

Surface electromyographic (sEMG) signal has the potential to identify the human activities and intention. This potential is further exploited to control the artificial limbs using the sEMG signal from residual limbs of amputees. The paper deals with the development of multichannel cost efficient sEMG signal interface for research application, along with evaluation of proposed class dependent statistical approach of the feature selection method. The sEMG signal acquisition interface was developed using ADS1298 of Texas Instruments, which is a front-end interface integrated circuit for ECG application. Further, the sEMG signal is recorded from two lower limb muscles for three locomotions namely: Plane Walk (PW), Stair Ascending (SA), Stair Descending (SD). A class dependent statistical approach is proposed for feature selection and also its performance is compared with 12 preexisting feature vectors. To make the study more extensive, performance of five different types of classifiers are compared. The outcome of the current piece of work proves the suitability of the proposed feature selection algorithm for locomotion recognition, as compared to other existing feature vectors. The SVM Classifier is found as the outperformed classifier among compared classifiers with an average recognition accuracy of 97.40%. Feature vector selection emerges as the most dominant factor affecting the classification performance as it holds 51.51% of the total variance in classification accuracy. The results demonstrate the potentials of the developed sEMG signal acquisition interface along with the proposed feature selection algorithm.

Keywords: Classifiers, feature selection, locomotion, sEMG.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1440
5396 Feature Selection with Kohonen Self Organizing Classification Algorithm

Authors: Francesco Maiorana

Abstract:

In this paper a one-dimension Self Organizing Map algorithm (SOM) to perform feature selection is presented. The algorithm is based on a first classification of the input dataset on a similarity space. From this classification for each class a set of positive and negative features is computed. This set of features is selected as result of the procedure. The procedure is evaluated on an in-house dataset from a Knowledge Discovery from Text (KDT) application and on a set of publicly available datasets used in international feature selection competitions. These datasets come from KDT applications, drug discovery as well as other applications. The knowledge of the correct classification available for the training and validation datasets is used to optimize the parameters for positive and negative feature extractions. The process becomes feasible for large and sparse datasets, as the ones obtained in KDT applications, by using both compression techniques to store the similarity matrix and speed up techniques of the Kohonen algorithm that take advantage of the sparsity of the input matrix. These improvements make it feasible, by using the grid, the application of the methodology to massive datasets.

Keywords: Clustering algorithm, Data mining, Feature selection, Grid, Kohonen Self Organizing Map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3003
5395 Feature Selection for Breast Cancer Diagnosis: A Case-Based Wrapper Approach

Authors: Mohammad Darzi, Ali AsgharLiaei, Mahdi Hosseini, HabibollahAsghari

Abstract:

This article addresses feature selection for breast cancer diagnosis. The present process contains a wrapper approach based on Genetic Algorithm (GA) and case-based reasoning (CBR). GA is used for searching the problem space to find all of the possible subsets of features and CBR is employed to estimate the evaluation result of each subset. The results of experiment show that the proposed model is comparable to the other models on Wisconsin breast cancer (WDBC) dataset.

Keywords: Case-based reasoning; Breast cancer diagnosis; Genetic algorithm; Wrapper feature selection

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2800
5394 Multiclass Support Vector Machines with Simultaneous Multi-Factors Optimization for Corporate Credit Ratings

Authors: Hyunchul Ahn, William X. S. Wong

Abstract:

Corporate credit rating prediction is one of the most important topics, which has been studied by researchers in the last decade. Over the last decade, researchers are pushing the limit to enhance the exactness of the corporate credit rating prediction model by applying several data-driven tools including statistical and artificial intelligence methods. Among them, multiclass support vector machine (MSVM) has been widely applied due to its good predictability. However, heuristics, for example, parameters of a kernel function, appropriate feature and instance subset, has become the main reason for the critics on MSVM, as they have dictate the MSVM architectural variables. This study presents a hybrid MSVM model that is intended to optimize all the parameter such as feature selection, instance selection, and kernel parameter. Our model adopts genetic algorithm (GA) to simultaneously optimize multiple heterogeneous design factors of MSVM.

Keywords: Corporate credit rating prediction, feature selection, genetic algorithms, instance selection, multiclass support vector machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1372
5393 Genetic Algorithms and Kernel Matrix-based Criteria Combined Approach to Perform Feature and Model Selection for Support Vector Machines

Authors: A. Perolini

Abstract:

Feature and model selection are in the center of attention of many researches because of their impact on classifiers- performance. Both selections are usually performed separately but recent developments suggest using a combined GA-SVM approach to perform them simultaneously. This approach improves the performance of the classifier identifying the best subset of variables and the optimal parameters- values. Although GA-SVM is an effective method it is computationally expensive, thus a rough method can be considered. The paper investigates a joined approach of Genetic Algorithm and kernel matrix criteria to perform simultaneously feature and model selection for SVM classification problem. The purpose of this research is to improve the classification performance of SVM through an efficient approach, the Kernel Matrix Genetic Algorithm method (KMGA).

Keywords: Feature and model selection, Genetic Algorithms, Support Vector Machines, kernel matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1549
5392 Feature Selection and Predictive Modeling of Housing Data Using Random Forest

Authors: Bharatendra Rai

Abstract:

Predictive data analysis and modeling involving machine learning techniques become challenging in presence of too many explanatory variables or features. Presence of too many features in machine learning is known to not only cause algorithms to slow down, but they can also lead to decrease in model prediction accuracy. This study involves housing dataset with 79 quantitative and qualitative features that describe various aspects people consider while buying a new house. Boruta algorithm that supports feature selection using a wrapper approach build around random forest is used in this study. This feature selection process leads to 49 confirmed features which are then used for developing predictive random forest models. The study also explores five different data partitioning ratios and their impact on model accuracy are captured using coefficient of determination (r-square) and root mean square error (rsme).

Keywords: Housing data, feature selection, random forest, Boruta algorithm, root mean square error.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1640
5391 Cost Sensitive Feature Selection in Decision-Theoretic Rough Set Models for Customer Churn Prediction: The Case of Telecommunication Sector Customers

Authors: Emel Kızılkaya Aydogan, Mihrimah Ozmen, Yılmaz Delice

Abstract:

In recent days, there is a change and the ongoing development of the telecommunications sector in the global market. In this sector, churn analysis techniques are commonly used for analysing why some customers terminate their service subscriptions prematurely. In addition, customer churn is utmost significant in this sector since it causes to important business loss. Many companies make various researches in order to prevent losses while increasing customer loyalty. Although a large quantity of accumulated data is available in this sector, their usefulness is limited by data quality and relevance. In this paper, a cost-sensitive feature selection framework is developed aiming to obtain the feature reducts to predict customer churn. The framework is a cost based optional pre-processing stage to remove redundant features for churn management. In addition, this cost-based feature selection algorithm is applied in a telecommunication company in Turkey and the results obtained with this algorithm.

Keywords: Churn prediction, data mining, decision-theoretic rough set, feature selection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1721
5390 Improving Worm Detection with Artificial Neural Networks through Feature Selection and Temporal Analysis Techniques

Authors: Dima Stopel, Zvi Boger, Robert Moskovitch, Yuval Shahar, Yuval Elovici

Abstract:

Computer worm detection is commonly performed by antivirus software tools that rely on prior explicit knowledge of the worm-s code (detection based on code signatures). We present an approach for detection of the presence of computer worms based on Artificial Neural Networks (ANN) using the computer's behavioral measures. Identification of significant features, which describe the activity of a worm within a host, is commonly acquired from security experts. We suggest acquiring these features by applying feature selection methods. We compare three different feature selection techniques for the dimensionality reduction and identification of the most prominent features to capture efficiently the computer behavior in the context of worm activity. Additionally, we explore three different temporal representation techniques for the most prominent features. In order to evaluate the different techniques, several computers were infected with five different worms and 323 different features of the infected computers were measured. We evaluated each technique by preprocessing the dataset according to each one and training the ANN model with the preprocessed data. We then evaluated the ability of the model to detect the presence of a new computer worm, in particular, during heavy user activity on the infected computers.

Keywords: Artificial Neural Networks, Feature Selection, Temporal Analysis, Worm Detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1684
5389 Supplier Selection Criteria and Methods in Supply Chains: A Review

Authors: Om Pal, Amit Kumar Gupta, R. K. Garg

Abstract:

An effective supplier selection process is very important to the success of any manufacturing organization. The main objective of supplier selection process is to reduce purchase risk, maximize overall value to the purchaser, and develop closeness and long-term relationships between buyers and suppliers in today’s competitive industrial scenario. The literature on supplier selection criteria and methods is full of various analytical and heuristic approaches. Some researchers have developed hybrid models by combining more than one type of selection methods. It is felt that supplier selection criteria and method is still a critical issue for the manufacturing industries therefore in the present paper the literature has been thoroughly reviewed and critically analyzed to address the issue.

Keywords: Supplier selection, AHP, ANP, TOPSIS, Mathematical Programming

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 28778
5388 Intrusion Detection Using a New Particle Swarm Method and Support Vector Machines

Authors: Essam Al Daoud

Abstract:

Intrusion detection is a mechanism used to protect a system and analyse and predict the behaviours of system users. An ideal intrusion detection system is hard to achieve due to nonlinearity, and irrelevant or redundant features. This study introduces a new anomaly-based intrusion detection model. The suggested model is based on particle swarm optimisation and nonlinear, multi-class and multi-kernel support vector machines. Particle swarm optimisation is used for feature selection by applying a new formula to update the position and the velocity of a particle; the support vector machine is used as a classifier. The proposed model is tested and compared with the other methods using the KDD CUP 1999 dataset. The results indicate that this new method achieves better accuracy rates than previous methods.

Keywords: Feature selection, Intrusion detection, Support vector machine, Particle swarm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1939
5387 Multi-Layer Perceptron Neural Network Classifier with Binary Particle Swarm Optimization Based Feature Selection for Brain-Computer Interfaces

Authors: K. Akilandeswari, G. M. Nasira

Abstract:

Brain-Computer Interfaces (BCIs) measure brain signals activity, intentionally and unintentionally induced by users, and provides a communication channel without depending on the brain’s normal peripheral nerves and muscles output pathway. Feature Selection (FS) is a global optimization machine learning problem that reduces features, removes irrelevant and noisy data resulting in acceptable recognition accuracy. It is a vital step affecting pattern recognition system performance. This study presents a new Binary Particle Swarm Optimization (BPSO) based feature selection algorithm. Multi-layer Perceptron Neural Network (MLPNN) classifier with backpropagation training algorithm and Levenberg-Marquardt training algorithm classify selected features.

Keywords: Brain-Computer Interfaces (BCI), Feature Selection (FS), Walsh–Hadamard Transform (WHT), Binary Particle Swarm Optimization (BPSO), Multi-Layer Perceptron (MLP), Levenberg–Marquardt algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2133
5386 Automatic Landmark Selection Based on Feature Clustering for Visual Autonomous Unmanned Aerial Vehicle Navigation

Authors: Paulo Fernando Silva Filho, Elcio Hideiti Shiguemori

Abstract:

The selection of specific landmarks for an Unmanned Aerial Vehicles’ Visual Navigation systems based on Automatic Landmark Recognition has significant influence on the precision of the system’s estimated position. At the same time, manual selection of the landmarks does not guarantee a high recognition rate, which would also result on a poor precision. This work aims to develop an automatic landmark selection that will take the image of the flight area and identify the best landmarks to be recognized by the Visual Navigation Landmark Recognition System. The criterion to select a landmark is based on features detected by ORB or AKAZE and edges information on each possible landmark. Results have shown that disposition of possible landmarks is quite different from the human perception.

Keywords: Clustering, edges, feature points, landmark selection, X-Means.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 753
5385 Multi-Objective Evolutionary Computation Based Feature Selection Applied to Behaviour Assessment of Children

Authors: F. Jiménez, R. Jódar, M. Martín, G. Sánchez, G. Sciavicco

Abstract:

Abstract—Attribute or feature selection is one of the basic strategies to improve the performances of data classification tasks, and, at the same time, to reduce the complexity of classifiers, and it is a particularly fundamental one when the number of attributes is relatively high. Its application to unsupervised classification is restricted to a limited number of experiments in the literature. Evolutionary computation has already proven itself to be a very effective choice to consistently reduce the number of attributes towards a better classification rate and a simpler semantic interpretation of the inferred classifiers. We present a feature selection wrapper model composed by a multi-objective evolutionary algorithm, the clustering method Expectation-Maximization (EM), and the classifier C4.5 for the unsupervised classification of data extracted from a psychological test named BASC-II (Behavior Assessment System for Children - II ed.) with two objectives: Maximizing the likelihood of the clustering model and maximizing the accuracy of the obtained classifier. We present a methodology to integrate feature selection for unsupervised classification, model evaluation, decision making (to choose the most satisfactory model according to a a posteriori process in a multi-objective context), and testing. We compare the performance of the classifier obtained by the multi-objective evolutionary algorithms ENORA and NSGA-II, and the best solution is then validated by the psychologists that collected the data.

Keywords: Feature selection, multi-objective evolutionary computation, unsupervised classification, behavior assessment system for children.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1399
5384 Wave Atom Transform Based Two Class Motor Imagery Classification

Authors: Nebi Gedik

Abstract:

Electroencephalography (EEG) investigations of the brain computer interfaces are based on the electrical signals resulting from neural activities in the brain. In this paper, it is offered a method for classifying motor imagery EEG signals. The suggested method classifies EEG signals into two classes using the wave atom transform, and the transform coefficients are assessed, creating the feature set. Classification is done with SVM and k-NN algorithms with and without feature selection. For feature selection t-test approaches are utilized. A test of the approach is performed on the BCI competition III dataset IIIa.

Keywords: motor imagery, EEG, wave atom transform, SVM, k-NN, t-test

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 416
5383 Feature Reduction of Nearest Neighbor Classifiers using Genetic Algorithm

Authors: M. Analoui, M. Fadavi Amiri

Abstract:

The design of a pattern classifier includes an attempt to select, among a set of possible features, a minimum subset of weakly correlated features that better discriminate the pattern classes. This is usually a difficult task in practice, normally requiring the application of heuristic knowledge about the specific problem domain. The selection and quality of the features representing each pattern have a considerable bearing on the success of subsequent pattern classification. Feature extraction is the process of deriving new features from the original features in order to reduce the cost of feature measurement, increase classifier efficiency, and allow higher classification accuracy. Many current feature extraction techniques involve linear transformations of the original pattern vectors to new vectors of lower dimensionality. While this is useful for data visualization and increasing classification efficiency, it does not necessarily reduce the number of features that must be measured since each new feature may be a linear combination of all of the features in the original pattern vector. In this paper a new approach is presented to feature extraction in which feature selection, feature extraction, and classifier training are performed simultaneously using a genetic algorithm. In this approach each feature value is first normalized by a linear equation, then scaled by the associated weight prior to training, testing, and classification. A knn classifier is used to evaluate each set of feature weights. The genetic algorithm optimizes a vector of feature weights, which are used to scale the individual features in the original pattern vectors in either a linear or a nonlinear fashion. By this approach, the number of features used in classifying can be finely reduced.

Keywords: Feature reduction, genetic algorithm, pattern classification, nearest neighbor rule classifiers (k-NNR).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1727
5382 A New DIDS Design Based on a Combination Feature Selection Approach

Authors: Adel Sabry Eesa, Adnan Mohsin Abdulazeez Brifcani, Zeynep Orman

Abstract:

Feature selection has been used in many fields such as classification, data mining and object recognition and proven to be effective for removing irrelevant and redundant features from the original dataset. In this paper, a new design of distributed intrusion detection system using a combination feature selection model based on bees and decision tree. Bees algorithm is used as the search strategy to find the optimal subset of features, whereas decision tree is used as a judgment for the selected features. Both the produced features and the generated rules are used by Decision Making Mobile Agent to decide whether there is an attack or not in the networks. Decision Making Mobile Agent will migrate through the networks, moving from node to another, if it found that there is an attack on one of the nodes, it then alerts the user through User Interface Agent or takes some action through Action Mobile Agent. The KDD Cup 99 dataset is used to test the effectiveness of the proposed system. The results show that even if only four features are used, the proposed system gives a better performance when it is compared with the obtained results using all 41 features.

Keywords: Distributed intrusion detection system, mobile agent, feature selection, Bees Algorithm, decision tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1883
5381 Simultaneous Clustering and Feature Selection Method for Gene Expression Data

Authors: T. Chandrasekhar, K. Thangavel, E. N. Sathishkumar

Abstract:

Microarrays are made it possible to simultaneously monitor the expression profiles of thousands of genes under various experimental conditions. It is used to identify the co-expressed genes in specific cells or tissues that are actively used to make proteins. This method is used to analysis the gene expression, an important task in bioinformatics research. Cluster analysis of gene expression data has proved to be a useful tool for identifying co-expressed genes, biologically relevant groupings of genes and samples. In this work K-Means algorithms has been applied for clustering of Gene Expression Data. Further, rough set based Quick reduct algorithm has been applied for each cluster in order to select the most similar genes having high correlation. Then the ACV measure is used to evaluate the refined clusters and classification is used to evaluate the proposed method. They could identify compact clusters with feature selection method used to genes are selected.

Keywords: Clustering, Feature selection, Gene expression data, Quick reduct.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1926
5380 Improving Fake News Detection Using K-means and Support Vector Machine Approaches

Authors: Kasra Majbouri Yazdi, Adel Majbouri Yazdi, Saeid Khodayi, Jingyu Hou, Wanlei Zhou, Saeed Saedy

Abstract:

Fake news and false information are big challenges of all types of media, especially social media. There is a lot of false information, fake likes, views and duplicated accounts as big social networks such as Facebook and Twitter admitted. Most information appearing on social media is doubtful and in some cases misleading. They need to be detected as soon as possible to avoid a negative impact on society. The dimensions of the fake news datasets are growing rapidly, so to obtain a better result of detecting false information with less computation time and complexity, the dimensions need to be reduced. One of the best techniques of reducing data size is using feature selection method. The aim of this technique is to choose a feature subset from the original set to improve the classification performance. In this paper, a feature selection method is proposed with the integration of K-means clustering and Support Vector Machine (SVM) approaches which work in four steps. First, the similarities between all features are calculated. Then, features are divided into several clusters. Next, the final feature set is selected from all clusters, and finally, fake news is classified based on the final feature subset using the SVM method. The proposed method was evaluated by comparing its performance with other state-of-the-art methods on several specific benchmark datasets and the outcome showed a better classification of false information for our work. The detection performance was improved in two aspects. On the one hand, the detection runtime process decreased, and on the other hand, the classification accuracy increased because of the elimination of redundant features and the reduction of datasets dimensions.

Keywords: Fake news detection, feature selection, support vector machine, K-means clustering, machine learning, social media.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4377
5379 Novel Hybrid Method for Gene Selection and Cancer Prediction

Authors: Liping Jing, Michael K. Ng, Tieyong Zeng

Abstract:

Microarray data profiles gene expression on a whole genome scale, therefore, it provides a good way to study associations between gene expression and occurrence or progression of cancer. More and more researchers realized that microarray data is helpful to predict cancer sample. However, the high dimension of gene expressions is much larger than the sample size, which makes this task very difficult. Therefore, how to identify the significant genes causing cancer becomes emergency and also a hot and hard research topic. Many feature selection algorithms have been proposed in the past focusing on improving cancer predictive accuracy at the expense of ignoring the correlations between the features. In this work, a novel framework (named by SGS) is presented for stable gene selection and efficient cancer prediction . The proposed framework first performs clustering algorithm to find the gene groups where genes in each group have higher correlation coefficient, and then selects the significant genes in each group with Bayesian Lasso and important gene groups with group Lasso, and finally builds prediction model based on the shrinkage gene space with efficient classification algorithm (such as, SVM, 1NN, Regression and etc.). Experiment results on real world data show that the proposed framework often outperforms the existing feature selection and prediction methods, say SAM, IG and Lasso-type prediction model.

Keywords: Gene Selection, Cancer Prediction, Lasso, Clustering, Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1997
5378 Detecting HCC Tumor in Three Phasic CT Liver Images with Optimization of Neural Network

Authors: Mahdieh Khalilinezhad, Silvana Dellepiane, Gianni Vernazza

Abstract:

The aim of this work is to build a model based on tissue characterization that is able to discriminate pathological and non-pathological regions from three-phasic CT images. With our research and based on a feature selection in different phases, we are trying to design a neural network system with an optimal neuron number in a hidden layer. Our approach consists of three steps: feature selection, feature reduction, and classification. For each region of interest (ROI), 6 distinct sets of texture features are extracted such as: first order histogram parameters, absolute gradient, run-length matrix, co-occurrence matrix, autoregressive model, and wavelet, for a total of 270 texture features. When analyzing more phases, we show that the injection of liquid cause changes to the high relevant features in each region. Our results demonstrate that for detecting HCC tumor phase 3 is the best one in most of the features that we apply to the classification algorithm. The percentage of detection between pathology and healthy classes, according to our method, relates to first order histogram parameters with accuracy of 85% in phase 1, 95% in phase 2, and 95% in phase 3.

Keywords: Feature selection, Multi-phasic liver images, Neural network, Texture analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2493
5377 Implementation of a Multimodal Biometrics Recognition System with Combined Palm Print and Iris Features

Authors: Rabab M. Ramadan, Elaraby A. Elgallad

Abstract:

With extensive application, the performance of unimodal biometrics systems has to face a diversity of problems such as signal and background noise, distortion, and environment differences. Therefore, multimodal biometric systems are proposed to solve the above stated problems. This paper introduces a bimodal biometric recognition system based on the extracted features of the human palm print and iris. Palm print biometric is fairly a new evolving technology that is used to identify people by their palm features. The iris is a strong competitor together with face and fingerprints for presence in multimodal recognition systems. In this research, we introduced an algorithm to the combination of the palm and iris-extracted features using a texture-based descriptor, the Scale Invariant Feature Transform (SIFT). Since the feature sets are non-homogeneous as features of different biometric modalities are used, these features will be concatenated to form a single feature vector. Particle swarm optimization (PSO) is used as a feature selection technique to reduce the dimensionality of the feature. The proposed algorithm will be applied to the Institute of Technology of Delhi (IITD) database and its performance will be compared with various iris recognition algorithms found in the literature.

Keywords: Iris recognition, particle swarm optimization, feature extraction, feature selection, palm print, scale invariant feature transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 835
5376 Rotation Invariant Face Recognition Based on Hybrid LPT/DCT Features

Authors: Rehab F. Abdel-Kader, Rabab M. Ramadan, Rawya Y. Rizk

Abstract:

The recognition of human faces, especially those with different orientations is a challenging and important problem in image analysis and classification. This paper proposes an effective scheme for rotation invariant face recognition using Log-Polar Transform and Discrete Cosine Transform combined features. The rotation invariant feature extraction for a given face image involves applying the logpolar transform to eliminate the rotation effect and to produce a row shifted log-polar image. The discrete cosine transform is then applied to eliminate the row shift effect and to generate the low-dimensional feature vector. A PSO-based feature selection algorithm is utilized to search the feature vector space for the optimal feature subset. Evolution is driven by a fitness function defined in terms of maximizing the between-class separation (scatter index). Experimental results, based on the ORL face database using testing data sets for images with different orientations; show that the proposed system outperforms other face recognition methods. The overall recognition rate for the rotated test images being 97%, demonstrating that the extracted feature vector is an effective rotation invariant feature set with minimal set of selected features.

Keywords: Discrete Cosine Transform, Face Recognition, Feature Extraction, Log Polar Transform, Particle SwarmOptimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1826
5375 Hierarchical PSO-Adaboost Based Classifiers for Fast and Robust Face Detection

Authors: Hong Pan, Yaping Zhu, Liang Zheng Xia

Abstract:

We propose a fast and robust hierarchical face detection system which finds and localizes face images with a cascade of classifiers. Three modules contribute to the efficiency of our detector. First, heterogeneous feature descriptors are exploited to enrich feature types and feature numbers for face representation. Second, a PSO-Adaboost algorithm is proposed to efficiently select discriminative features from a large pool of available features and reinforce them into the final ensemble classifier. Compared with the standard exhaustive Adaboost for feature selection, the new PSOAdaboost algorithm reduces the training time up to 20 times. Finally, a three-stage hierarchical classifier framework is developed for rapid background removal. In particular, candidate face regions are detected more quickly by using a large size window in the first stage. Nonlinear SVM classifiers are used instead of decision stump functions in the last stage to remove those remaining complex nonface patterns that can not be rejected in the previous two stages. Experimental results show our detector achieves superior performance on the CMU+MIT frontal face dataset.

Keywords: Adaboost, Face detection, Feature selection, PSO

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2143
5374 Input Textural Feature Selection By Mutual Information For Multispectral Image Classification

Authors: Mounir Ait kerroum, Ahmed Hammouch, Driss Aboutajdine

Abstract:

Texture information plays increasingly an important role in remotely sensed imagery classification and many pattern recognition applications. However, the selection of relevant textural features to improve this classification accuracy is not a straightforward task. This work investigates the effectiveness of two Mutual Information Feature Selector (MIFS) algorithms to select salient textural features that contain highly discriminatory information for multispectral imagery classification. The input candidate features are extracted from a SPOT High Resolution Visible(HRV) image using Wavelet Transform (WT) at levels (l = 1,2). The experimental results show that the selected textural features according to MIFS algorithms make the largest contribution to improve the classification accuracy than classical approaches such as Principal Components Analysis (PCA) and Linear Discriminant Analysis (LDA).

Keywords: Feature Selection, Texture, Mutual Information, Wavelet Transform, SVM classification, SPOT Imagery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1507
5373 Active Segment Selection Method in EEG Classification Using Fractal Features

Authors: Samira Vafaye Eslahi

Abstract:

BCI (Brain Computer Interface) is a communication machine that translates brain massages to computer commands. These machines with the help of computer programs can recognize the tasks that are imagined. Feature extraction is an important stage of the process in EEG classification that can effect in accuracy and the computation time of processing the signals. In this study we process the signal in three steps of active segment selection, fractal feature extraction, and classification. One of the great challenges in BCI applications is to improve classification accuracy and computation time together. In this paper, we have used student’s 2D sample t-statistics on continuous wavelet transforms for active segment selection to reduce the computation time. In the next level, the features are extracted from some famous fractal dimension estimation of the signal. These fractal features are Katz and Higuchi. In the classification stage we used ANFIS (Adaptive Neuro-Fuzzy Inference System) classifier, FKNN (Fuzzy K-Nearest Neighbors), LDA (Linear Discriminate Analysis), and SVM (Support Vector Machines). We resulted that active segment selection method would reduce the computation time and Fractal dimension features with ANFIS analysis on selected active segments is the best among investigated methods in EEG classification.

Keywords: EEG, Student’s t- statistics, BCI, Fractal Features, ANFIS, FKNN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2077
5372 Feature Subset Selection approach based on Maximizing Margin of Support Vector Classifier

Authors: Khin May Win, Nan Sai Moon Kham

Abstract:

Identification of cancer genes that might anticipate the clinical behaviors from different types of cancer disease is challenging due to the huge number of genes and small number of patients samples. The new method is being proposed based on supervised learning of classification like support vector machines (SVMs).A new solution is described by the introduction of the Maximized Margin (MM) in the subset criterion, which permits to get near the least generalization error rate. In class prediction problem, gene selection is essential to improve the accuracy and to identify genes for cancer disease. The performance of the new method was evaluated with real-world data experiment. It can give the better accuracy for classification.

Keywords: Microarray data, feature selection, recursive featureelimination, support vector machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1502
5371 Learning to Recognize Faces by Local Feature Design and Selection

Authors: Yanwei Pang, Lei Zhang, Zhengkai Liu

Abstract:

Studies in neuroscience suggest that both global and local feature information are crucial for perception and recognition of faces. It is widely believed that local feature is less sensitive to variations caused by illumination, expression and illumination. In this paper, we target at designing and learning local features for face recognition. We designed three types of local features. They are semi-global feature, local patch feature and tangent shape feature. The designing of semi-global feature aims at taking advantage of global-like feature and meanwhile avoiding suppressing AdaBoost algorithm in boosting weak classifies established from small local patches. The designing of local patch feature targets at automatically selecting discriminative features, and is thus different with traditional ways, in which local patches are usually selected manually to cover the salient facial components. Also, shape feature is considered in this paper for frontal view face recognition. These features are selected and combined under the framework of boosting algorithm and cascade structure. The experimental results demonstrate that the proposed approach outperforms the standard eigenface method and Bayesian method. Moreover, the selected local features and observations in the experiments are enlightening to researches in local feature design in face recognition.

Keywords: Face recognition, local feature, AdaBoost, subspace analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1551
5370 Frequent Itemset Mining Using Rough-Sets

Authors: Usman Qamar, Younus Javed

Abstract:

Frequent pattern mining is the process of finding a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set. It was proposed in the context of frequent itemsets and association rule mining. Frequent pattern mining is used to find inherent regularities in data. What products were often purchased together? Its applications include basket data analysis, cross-marketing, catalog design, sale campaign analysis, Web log (click stream) analysis, and DNA sequence analysis. However, one of the bottlenecks of frequent itemset mining is that as the data increase the amount of time and resources required to mining the data increases at an exponential rate. In this investigation a new algorithm is proposed which can be uses as a pre-processor for frequent itemset mining. FASTER (FeAture SelecTion using Entropy and Rough sets) is a hybrid pre-processor algorithm which utilizes entropy and roughsets to carry out record reduction and feature (attribute) selection respectively. FASTER for frequent itemset mining can produce a speed up of 3.1 times when compared to original algorithm while maintaining an accuracy of 71%.

Keywords: Rough-sets, Classification, Feature Selection, Entropy, Outliers, Frequent itemset mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2394
5369 Reducing SAGE Data Using Genetic Algorithms

Authors: Cheng-Hong Yang, Tsung-Mu Shih, Li-Yeh Chuang

Abstract:

Serial Analysis of Gene Expression is a powerful quantification technique for generating cell or tissue gene expression data. The profile of the gene expression of cell or tissue in several different states is difficult for biologists to analyze because of the large number of genes typically involved. However, feature selection in machine learning can successfully reduce this problem. The method allows reducing the features (genes) in specific SAGE data, and determines only relevant genes. In this study, we used a genetic algorithm to implement feature selection, and evaluate the classification accuracy of the selected features with the K-nearest neighbor method. In order to validate the proposed method, we used two SAGE data sets for testing. The results of this study conclusively prove that the number of features of the original SAGE data set can be significantly reduced and higher classification accuracy can be achieved.

Keywords: Serial Analysis of Gene Expression, Feature selection, Genetic Algorithm, K-nearest neighbor method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1566
5368 Data Preprocessing for Supervised Leaning

Authors: S. B. Kotsiantis, D. Kanellopoulos, P. E. Pintelas

Abstract:

Many factors affect the success of Machine Learning (ML) on a given task. The representation and quality of the instance data is first and foremost. If there is much irrelevant and redundant information present or noisy and unreliable data, then knowledge discovery during the training phase is more difficult. It is well known that data preparation and filtering steps take considerable amount of processing time in ML problems. Data pre-processing includes data cleaning, normalization, transformation, feature extraction and selection, etc. The product of data pre-processing is the final training set. It would be nice if a single sequence of data pre-processing algorithms had the best performance for each data set but this is not happened. Thus, we present the most well know algorithms for each step of data pre-processing so that one achieves the best performance for their data set.

Keywords: Data mining, feature selection, data cleaning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5935