Search results for: feature selection feature subset selection feature extraction/transformation
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 6850

Search results for: feature selection feature subset selection feature extraction/transformation

6790 Genetic Algorithms for Feature Generation in the Context of Audio Classification

Authors: José A. Menezes, Giordano Cabral, Bruno T. Gomes

Abstract:

Choosing good features is an essential part of machine learning. Recent techniques aim to automate this process. For instance, feature learning intends to learn the transformation of raw data into a useful representation to machine learning tasks. In automatic audio classification tasks, this is interesting since the audio, usually complex information, needs to be transformed into a computationally convenient input to process. Another technique tries to generate features by searching a feature space. Genetic algorithms, for instance, have being used to generate audio features by combining or modifying them. We find this approach particularly interesting and, despite the undeniable advances of feature learning approaches, we wanted to take a step forward in the use of genetic algorithms to find audio features, combining them with more conventional methods, like PCA, and inserting search control mechanisms, such as constraints over a confusion matrix. This work presents the results obtained on particular audio classification problems.

Keywords: feature generation, feature learning, genetic algorithm, music information retrieval

Procedia PDF Downloads 403
6789 Feature Location Restoration for Under-Sampled Photoplethysmogram Using Spline Interpolation

Authors: Hangsik Shin

Abstract:

The purpose of this research is to restore the feature location of under-sampled photoplethysmogram using spline interpolation and to investigate feasibility for feature shape restoration. We obtained 10 kHz-sampled photoplethysmogram and decimated it to generate under-sampled dataset. Decimated dataset has 5 kHz, 2.5 k Hz, 1 kHz, 500 Hz, 250 Hz, 25 Hz and 10 Hz sampling frequency. To investigate the restoration performance, we interpolated under-sampled signals with 10 kHz, then compared feature locations with feature locations of 10 kHz sampled photoplethysmogram. Features were upper and lower peak of photplethysmography waveform. Result showed that time differences were dramatically decreased by interpolation. Location error was lesser than 1 ms in both feature types. In 10 Hz sampled cases, location error was also deceased a lot, however, they were still over 10 ms.

Keywords: peak detection, photoplethysmography, sampling, signal reconstruction

Procedia PDF Downloads 343
6788 Feature Extraction of MFCC Based on Fisher-Ratio and Correlated Distance Criterion for Underwater Target Signal

Authors: Han Xue, Zhang Lanyue

Abstract:

In order to seek more effective feature extraction technology, feature extraction method based on MFCC combined with vector hydrophone is exposed in the paper. The sound pressure signal and particle velocity signal of two kinds of ships are extracted by using MFCC and its evolution form, and the extracted features are fused by using fisher-ratio and correlated distance criterion. The features are then identified by BP neural network. The results showed that MFCC, First-Order Differential MFCC and Second-Order Differential MFCC features can be used as effective features for recognition of underwater targets, and the fusion feature can improve the recognition rate. Moreover, the results also showed that the recognition rate of the particle velocity signal is higher than that of the sound pressure signal, and it reflects the superiority of vector signal processing.

Keywords: vector information, MFCC, differential MFCC, fusion feature, BP neural network

Procedia PDF Downloads 494
6787 A Hybrid Feature Selection and Deep Learning Algorithm for Cancer Disease Classification

Authors: Niousha Bagheri Khulenjani, Mohammad Saniee Abadeh

Abstract:

Learning from very big datasets is a significant problem for most present data mining and machine learning algorithms. MicroRNA (miRNA) is one of the important big genomic and non-coding datasets presenting the genome sequences. In this paper, a hybrid method for the classification of the miRNA data is proposed. Due to the variety of cancers and high number of genes, analyzing the miRNA dataset has been a challenging problem for researchers. The number of features corresponding to the number of samples is high and the data suffer from being imbalanced. The feature selection method has been used to select features having more ability to distinguish classes and eliminating obscures features. Afterward, a Convolutional Neural Network (CNN) classifier for classification of cancer types is utilized, which employs a Genetic Algorithm to highlight optimized hyper-parameters of CNN. In order to make the process of classification by CNN faster, Graphics Processing Unit (GPU) is recommended for calculating the mathematic equation in a parallel way. The proposed method is tested on a real-world dataset with 8,129 patients, 29 different types of tumors, and 1,046 miRNA biomarkers, taken from The Cancer Genome Atlas (TCGA) database.

Keywords: cancer classification, feature selection, deep learning, genetic algorithm

Procedia PDF Downloads 90
6786 Automatic Extraction of Water Bodies Using Whole-R Method

Authors: Nikhat Nawaz, S. Srinivasulu, P. Kesava Rao

Abstract:

Feature extraction plays an important role in many remote sensing applications. Automatic extraction of water bodies is of great significance in many remote sensing applications like change detection, image retrieval etc. This paper presents a procedure for automatic extraction of water information from remote sensing images. The algorithm uses the relative location of R-colour component of the chromaticity diagram. This method is then integrated with the effectiveness of the spatial scale transformation of whole method. The whole method is based on water index fitted from spectral library. Experimental results demonstrate the improved accuracy and effectiveness of the integrated method for automatic extraction of water bodies.

Keywords: feature extraction, remote sensing, image retrieval, chromaticity, water index, spectral library, integrated method

Procedia PDF Downloads 349
6785 An Automatic Feature Extraction Technique for 2D Punch Shapes

Authors: Awais Ahmad Khan, Emad Abouel Nasr, H. M. A. Hussein, Abdulrahman Al-Ahmari

Abstract:

Sheet-metal parts have been widely applied in electronics, communication and mechanical industries in recent decades; but the advancement in sheet-metal part design and manufacturing is still behind in comparison with the increasing importance of sheet-metal parts in modern industry. This paper presents a methodology for automatic extraction of some common 2D internal sheet metal features. The features used in this study are taken from Unipunch ™ catalogue. The extraction process starts with the data extraction from STEP file using an object oriented approach and with the application of suitable algorithms and rules, all features contained in the catalogue are automatically extracted. Since the extracted features include geometry and engineering information, they will be effective for downstream application such as feature rebuilding and process planning.

Keywords: feature extraction, internal features, punch shapes, sheet metal

Procedia PDF Downloads 591
6784 Multi-Stage Classification for Lung Lesion Detection on CT Scan Images Applying Medical Image Processing Technique

Authors: Behnaz Sohani, Sahand Shahalinezhad, Amir Rahmani, Aliyu Aliyu

Abstract:

Recently, medical imaging and specifically medical image processing is becoming one of the most dynamically developing areas of medical science. It has led to the emergence of new approaches in terms of the prevention, diagnosis, and treatment of various diseases. In the process of diagnosis of lung cancer, medical professionals rely on computed tomography (CT) scans, in which failure to correctly identify masses can lead to incorrect diagnosis or sampling of lung tissue. Identification and demarcation of masses in terms of detecting cancer within lung tissue are critical challenges in diagnosis. In this work, a segmentation system in image processing techniques has been applied for detection purposes. Particularly, the use and validation of a novel lung cancer detection algorithm have been presented through simulation. This has been performed employing CT images based on multilevel thresholding. The proposed technique consists of segmentation, feature extraction, and feature selection and classification. More in detail, the features with useful information are selected after featuring extraction. Eventually, the output image of lung cancer is obtained with 96.3% accuracy and 87.25%. The purpose of feature extraction applying the proposed approach is to transform the raw data into a more usable form for subsequent statistical processing. Future steps will involve employing the current feature extraction method to achieve more accurate resulting images, including further details available to machine vision systems to recognise objects in lung CT scan images.

Keywords: lung cancer detection, image segmentation, lung computed tomography (CT) images, medical image processing

Procedia PDF Downloads 65
6783 Deep Feature Augmentation with Generative Adversarial Networks for Class Imbalance Learning in Medical Images

Authors: Rongbo Shen, Jianhua Yao, Kezhou Yan, Kuan Tian, Cheng Jiang, Ke Zhou

Abstract:

This study proposes a generative adversarial networks (GAN) framework to perform synthetic sampling in feature space, i.e., feature augmentation, to address the class imbalance problem in medical image analysis. A feature extraction network is first trained to convert images into feature space. Then the GAN framework incorporates adversarial learning to train a feature generator for the minority class through playing a minimax game with a discriminator. The feature generator then generates features for minority class from arbitrary latent distributions to balance the data between the majority class and the minority class. Additionally, a data cleaning technique, i.e., Tomek link, is employed to clean up undesirable conflicting features introduced from the feature augmentation and thus establish well-defined class clusters for the training. The experiment section evaluates the proposed method on two medical image analysis tasks, i.e., mass classification on mammogram and cancer metastasis classification on histopathological images. Experimental results suggest that the proposed method obtains superior or comparable performance over the state-of-the-art counterparts. Compared to all counterparts, our proposed method improves more than 1.5 percentage of accuracy.

Keywords: class imbalance, synthetic sampling, feature augmentation, generative adversarial networks, data cleaning

Procedia PDF Downloads 106
6782 Verbal Prefix Selection in Old Japanese: A Corpus-Based Study

Authors: Zixi You

Abstract:

There are a number of verbal prefixes in Old Japanese. However, the selection or the compatibility of verbs and verbal prefixes is among the least investigated topics on Old Japanese language. Unlike other types of prefixes, verbal prefixes in dictionaries are more often than not listed with very brief information such as ‘unknown meaning’ or ‘rhythmic function only’. To fill in a part of this knowledge gap, this paper presents an exhaustive investigation based on the newly developed ‘Oxford Corpus of Old Japanese’ (OCOJ), which included nearly all existing resource of Old Japanese language, with detailed linguistics information in TEI-XML tags. In this paper, we propose the possibility that the following three prefixes, i-, sa-, ta- (with ta- being considered as a variation of sa-), are relevant to split intransitivity in Old Japanese, with evidence that unergative verbs favor i- and that unergative verbs favor sa-(ta-). This might be undermined by the fact that transitives are also found to follow i-. However, with several manifestations of split intransitivity in Old Japanese discussed, the behavior of transitives in verbal prefix selection is no longer as surprising as it may seem to be when one look at the selection of verbal prefix in isolation. It is possible that there are one or more features that played essential roles in determining the selection of i-, and the attested transitive verbs happen to have these features. The data suggest that this feature is a sense of ‘change’ of location or state involved in the event donated by the verb, which is a feature of typical unaccusatives. This is further discussed in the ‘affectedness’ hierarchy. The presentation of this paper, which includes a brief demonstration of the OCOJ, is expected to be of the interest of both specialists and general audiences.

Keywords: old Japanese, split intransitivity, unaccusatives, unergatives, verbal prefix selection

Procedia PDF Downloads 388
6781 Sentiment Analysis: An Enhancement of Ontological-Based Features Extraction Techniques and Word Equations

Authors: Mohd Ridzwan Yaakub, Muhammad Iqbal Abu Latiffi

Abstract:

Online business has become popular recently due to the massive amount of information and medium available on the Internet. This has resulted in the huge number of reviews where the consumers share their opinion, criticisms, and satisfaction on the products they have purchased on the websites or the social media such as Facebook and Twitter. However, to analyze customer’s behavior has become very important for organizations to find new market trends and insights. The reviews from the websites or the social media are in structured and unstructured data that need a sentiment analysis approach in analyzing customer’s review. In this article, techniques used in will be defined. Definition of the ontology and description of its possible usage in sentiment analysis will be defined. It will lead to empirical research that related to mobile phones used in research and the ontology used in the experiment. The researcher also will explore the role of preprocessing data and feature selection methodology. As the result, ontology-based approach in sentiment analysis can help in achieving high accuracy for the classification task.

Keywords: feature selection, ontology, opinion, preprocessing data, sentiment analysis

Procedia PDF Downloads 178
6780 Features Reduction Using Bat Algorithm for Identification and Recognition of Parkinson Disease

Authors: P. Shrivastava, A. Shukla, K. Verma, S. Rungta

Abstract:

Parkinson's disease is a chronic neurological disorder that directly affects human gait. It leads to slowness of movement, causes muscle rigidity and tremors. Gait serve as a primary outcome measure for studies aiming at early recognition of disease. Using gait techniques, this paper implements efficient binary bat algorithm for an early detection of Parkinson's disease by selecting optimal features required for classification of affected patients from others. The data of 166 people, both fit and affected is collected and optimal feature selection is done using PSO and Bat algorithm. The reduced dataset is then classified using neural network. The experiments indicate that binary bat algorithm outperforms traditional PSO and genetic algorithm and gives a fairly good recognition rate even with the reduced dataset.

Keywords: parkinson, gait, feature selection, bat algorithm

Procedia PDF Downloads 518
6779 Product Feature Modelling for Integrating Product Design and Assembly Process Planning

Authors: Baha Hasan, Jan Wikander

Abstract:

This paper describes a part of the integrating work between assembly design and assembly process planning domains (APP). The work is based, in its first stage, on modelling assembly features to support APP. A multi-layer architecture, based on feature-based modelling, is proposed to establish a dynamic and adaptable link between product design using CAD tools and APP. The proposed approach is based on deriving “specific function” features from the “generic” assembly and form features extracted from the CAD tools. A hierarchal structure from “generic” to “specific” and from “high level geometrical entities” to “low level geometrical entities” is proposed in order to integrate geometrical and assembly data extracted from geometrical and assembly modelers to the required processes and resources in APP. The feature concept, feature-based modelling, and feature recognition techniques are reviewed.

Keywords: assembly feature, assembly process planning, feature, feature-based modelling, form feature, ontology

Procedia PDF Downloads 281
6778 From Type-I to Type-II Fuzzy System Modeling for Diagnosis of Hepatitis

Authors: Shahabeddin Sotudian, M. H. Fazel Zarandi, I. B. Turksen

Abstract:

Hepatitis is one of the most common and dangerous diseases that affects humankind, and exposes millions of people to serious health risks every year. Diagnosis of Hepatitis has always been a challenge for physicians. This paper presents an effective method for diagnosis of hepatitis based on interval Type-II fuzzy. This proposed system includes three steps: pre-processing (feature selection), Type-I and Type-II fuzzy classification, and system evaluation. KNN-FD feature selection is used as the preprocessing step in order to exclude irrelevant features and to improve classification performance and efficiency in generating the classification model. In the fuzzy classification step, an “indirect approach” is used for fuzzy system modeling by implementing the exponential compactness and separation index for determining the number of rules in the fuzzy clustering approach. Therefore, we first proposed a Type-I fuzzy system that had an accuracy of approximately 90.9%. In the proposed system, the process of diagnosis faces vagueness and uncertainty in the final decision. Thus, the imprecise knowledge was managed by using interval Type-II fuzzy logic. The results that were obtained show that interval Type-II fuzzy has the ability to diagnose hepatitis with an average accuracy of 93.94%. The classification accuracy obtained is the highest one reached thus far. The aforementioned rate of accuracy demonstrates that the Type-II fuzzy system has a better performance in comparison to Type-I and indicates a higher capability of Type-II fuzzy system for modeling uncertainty.

Keywords: hepatitis disease, medical diagnosis, type-I fuzzy logic, type-II fuzzy logic, feature selection

Procedia PDF Downloads 280
6777 Detecting HCC Tumor in Three Phasic CT Liver Images with Optimization of Neural Network

Authors: Mahdieh Khalilinezhad, Silvana Dellepiane, Gianni Vernazza

Abstract:

The aim of the present work is to build a model based on tissue characterization that is able to discriminate pathological and non-pathological regions from three-phasic CT images. Based on feature selection in different phases, in this research, we design a neural network system that has optimal neuron number in a hidden layer. Our approach consists of three steps: feature selection, feature reduction, and classification. For each ROI, 6 distinct set of texture features are extracted such as first order histogram parameters, absolute gradient, run-length matrix, co-occurrence matrix, autoregressive model, and wavelet, for a total of 270 texture features. We show that with the injection of liquid and the analysis of more phases the high relevant features in each region changed. Our results show that for detecting HCC tumor phase3 is the best one in most of the features that we apply to the classification algorithm. The percentage of detection between these two classes according to our method, relates to first order histogram parameters with the accuracy of 85% in phase 1, 95% phase 2, and 95% in phase 3.

Keywords: multi-phasic liver images, texture analysis, neural network, hidden layer

Procedia PDF Downloads 243
6776 Barnard Feature Point Detector for Low-Contractperiapical Radiography Image

Authors: Chih-Yi Ho, Tzu-Fang Chang, Chih-Chia Huang, Chia-Yen Lee

Abstract:

In dental clinics, the dentists use the periapical radiography image to assess the effectiveness of endodontic treatment of teeth with chronic apical periodontitis. Periapical radiography images are taken at different times to assess alveolar bone variation before and after the root canal treatment, and furthermore to judge whether the treatment was successful. Current clinical assessment of apical tissue recovery relies only on dentist personal experience. It is difficult to have the same standard and objective interpretations due to the dentist or radiologist personal background and knowledge. If periapical radiography images at the different time could be registered well, the endodontic treatment could be evaluated. In the image registration area, it is necessary to assign representative control points to the transformation model for good performances of registration results. However, detection of representative control points (feature points) on periapical radiography images is generally very difficult. Regardless of which traditional detection methods are practiced, sufficient feature points may not be detected due to the low-contrast characteristics of the x-ray image. Barnard detector is an algorithm for feature point detection based on grayscale value gradients, which can obtain sufficient feature points in the case of gray-scale contrast is not obvious. However, the Barnard detector would detect too many feature points, and they would be too clustered. This study uses the local extrema of clustering feature points and the suppression radius to overcome the problem, and compared different feature point detection methods. In the preliminary result, the feature points could be detected as representative control points by the proposed method.

Keywords: feature detection, Barnard detector, registration, periapical radiography image, endodontic treatment

Procedia PDF Downloads 426
6775 Automatic Moment-Based Texture Segmentation

Authors: Tudor Barbu

Abstract:

An automatic moment-based texture segmentation approach is proposed in this paper. First, we describe the related work in this computer vision domain. Our texture feature extraction, the first part of the texture recognition process, produces a set of moment-based feature vectors. For each image pixel, a texture feature vector is computed as a sequence of area moments. Second, an automatic pixel classification approach is proposed. The feature vectors are clustered using some unsupervised classification algorithm, the optimal number of clusters being determined using a measure based on validation indexes. From the resulted pixel classes one determines easily the desired texture regions of the image.

Keywords: image segmentation, moment-based, texture analysis, automatic classification, validation indexes

Procedia PDF Downloads 390
6774 Feature Weighting Comparison Based on Clustering Centers in the Detection of Diabetic Retinopathy

Authors: Kemal Polat

Abstract:

In this paper, three feature weighting methods have been used to improve the classification performance of diabetic retinopathy (DR). To classify the diabetic retinopathy, features extracted from the output of several retinal image processing algorithms, such as image-level, lesion-specific and anatomical components, have been used and fed them into the classifier algorithms. The dataset used in this study has been taken from University of California, Irvine (UCI) machine learning repository. Feature weighting methods including the fuzzy c-means clustering based feature weighting, subtractive clustering based feature weighting, and Gaussian mixture clustering based feature weighting, have been used and compered with each other in the classification of DR. After feature weighting, five different classifier algorithms comprising multi-layer perceptron (MLP), k- nearest neighbor (k-NN), decision tree, support vector machine (SVM), and Naïve Bayes have been used. The hybrid method based on combination of subtractive clustering based feature weighting and decision tree classifier has been obtained the classification accuracy of 100% in the screening of DR. These results have demonstrated that the proposed hybrid scheme is very promising in the medical data set classification.

Keywords: machine learning, data weighting, classification, data mining

Procedia PDF Downloads 307
6773 Video Text Information Detection and Localization in Lecture Videos Using Moments

Authors: Belkacem Soundes, Guezouli Larbi

Abstract:

This paper presents a robust and accurate method for text detection and localization over lecture videos. Frame regions are classified into text or background based on visual feature analysis. However, lecture video shows significant degradation mainly related to acquisition conditions, camera motion and environmental changes resulting in low quality videos. Hence, affecting feature extraction and description efficiency. Moreover, traditional text detection methods cannot be directly applied to lecture videos. Therefore, robust feature extraction methods dedicated to this specific video genre are required for robust and accurate text detection and extraction. Method consists of a three-step process: Slide region detection and segmentation; Feature extraction and non-text filtering. For robust and effective features extraction moment functions are used. Two distinct types of moments are used: orthogonal and non-orthogonal. For orthogonal Zernike Moments, both Pseudo Zernike moments are used, whereas for non-orthogonal ones Hu moments are used. Expressivity and description efficiency are given and discussed. Proposed approach shows that in general, orthogonal moments show high accuracy in comparison to the non-orthogonal one. Pseudo Zernike moments are more effective than Zernike with better computation time.

Keywords: text detection, text localization, lecture videos, pseudo zernike moments

Procedia PDF Downloads 128
6772 Dimensionality Reduction in Modal Analysis for Structural Health Monitoring

Authors: Elia Favarelli, Enrico Testi, Andrea Giorgetti

Abstract:

Autonomous structural health monitoring (SHM) of many structures and bridges became a topic of paramount importance for maintenance purposes and safety reasons. This paper proposes a set of machine learning (ML) tools to perform automatic feature selection and detection of anomalies in a bridge from vibrational data and compare different feature extraction schemes to increase the accuracy and reduce the amount of data collected. As a case study, the Z-24 bridge is considered because of the extensive database of accelerometric data in both standard and damaged conditions. The proposed framework starts from the first four fundamental frequencies extracted through operational modal analysis (OMA) and clustering, followed by density-based time-domain filtering (tracking). The fundamental frequencies extracted are then fed to a dimensionality reduction block implemented through two different approaches: feature selection (intelligent multiplexer) that tries to estimate the most reliable frequencies based on the evaluation of some statistical features (i.e., mean value, variance, kurtosis), and feature extraction (auto-associative neural network (ANN)) that combine the fundamental frequencies to extract new damage sensitive features in a low dimensional feature space. Finally, one class classifier (OCC) algorithms perform anomaly detection, trained with standard condition points, and tested with normal and anomaly ones. In particular, a new anomaly detector strategy is proposed, namely one class classifier neural network two (OCCNN2), which exploit the classification capability of standard classifiers in an anomaly detection problem, finding the standard class (the boundary of the features space in normal operating conditions) through a two-step approach: coarse and fine boundary estimation. The coarse estimation uses classics OCC techniques, while the fine estimation is performed through a feedforward neural network (NN) trained that exploits the boundaries estimated in the coarse step. The detection algorithms vare then compared with known methods based on principal component analysis (PCA), kernel principal component analysis (KPCA), and auto-associative neural network (ANN). In many cases, the proposed solution increases the performance with respect to the standard OCC algorithms in terms of F1 score and accuracy. In particular, by evaluating the correct features, the anomaly can be detected with accuracy and an F1 score greater than 96% with the proposed method.

Keywords: anomaly detection, frequencies selection, modal analysis, neural network, sensor network, structural health monitoring, vibration measurement

Procedia PDF Downloads 98
6771 A New Approach of Preprocessing with SVM Optimization Based on PSO for Bearing Fault Diagnosis

Authors: Tawfik Thelaidjia, Salah Chenikher

Abstract:

Bearing fault diagnosis has attracted significant attention over the past few decades. It consists of two major parts: vibration signal feature extraction and condition classification for the extracted features. In this paper, feature extraction from faulty bearing vibration signals is performed by a combination of the signal’s Kurtosis and features obtained through the preprocessing of the vibration signal samples using Db2 discrete wavelet transform at the fifth level of decomposition. In this way, a 7-dimensional vector of the vibration signal feature is obtained. After feature extraction from vibration signal, the support vector machine (SVM) was applied to automate the fault diagnosis procedure. To improve the classification accuracy for bearing fault prediction, particle swarm optimization (PSO) is employed to simultaneously optimize the SVM kernel function parameter and the penalty parameter. The results have shown feasibility and effectiveness of the proposed approach

Keywords: condition monitoring, discrete wavelet transform, fault diagnosis, kurtosis, machine learning, particle swarm optimization, roller bearing, rotating machines, support vector machine, vibration measurement

Procedia PDF Downloads 413
6770 Smartphone-Based Human Activity Recognition by Machine Learning Methods

Authors: Yanting Cao, Kazumitsu Nawata

Abstract:

As smartphones upgrading, their software and hardware are getting smarter, so the smartphone-based human activity recognition will be described as more refined, complex, and detailed. In this context, we analyzed a set of experimental data obtained by observing and measuring 30 volunteers with six activities of daily living (ADL). Due to the large sample size, especially a 561-feature vector with time and frequency domain variables, cleaning these intractable features and training a proper model becomes extremely challenging. After a series of feature selection and parameters adjustment, a well-performed SVM classifier has been trained.

Keywords: smart sensors, human activity recognition, artificial intelligence, SVM

Procedia PDF Downloads 122
6769 The Acquisition of Case in Biological Domain Based on Text Mining

Authors: Shen Jian, Hu Jie, Qi Jin, Liu Wei Jie, Chen Ji Yi, Peng Ying Hong

Abstract:

In order to settle the problem of acquiring case in biological related to design problems, a biometrics instance acquisition method based on text mining is presented. Through the construction of corpus text vector space and knowledge mining, the feature selection, similarity measure and case retrieval method of text in the field of biology are studied. First, we establish a vector space model of the corpus in the biological field and complete the preprocessing steps. Then, the corpus is retrieved by using the vector space model combined with the functional keywords to obtain the biological domain examples related to the design problems. Finally, we verify the validity of this method by taking the example of text.

Keywords: text mining, vector space model, feature selection, biologically inspired design

Procedia PDF Downloads 232
6768 Optimal Feature Extraction Dimension in Finger Vein Recognition Using Kernel Principal Component Analysis

Authors: Amir Hajian, Sepehr Damavandinejadmonfared

Abstract:

In this paper the issue of dimensionality reduction is investigated in finger vein recognition systems using kernel Principal Component Analysis (KPCA). One aspect of KPCA is to find the most appropriate kernel function on finger vein recognition as there are several kernel functions which can be used within PCA-based algorithms. In this paper, however, another side of PCA-based algorithms -particularly KPCA- is investigated. The aspect of dimension of feature vector in PCA-based algorithms is of importance especially when it comes to the real-world applications and usage of such algorithms. It means that a fixed dimension of feature vector has to be set to reduce the dimension of the input and output data and extract the features from them. Then a classifier is performed to classify the data and make the final decision. We analyze KPCA (Polynomial, Gaussian, and Laplacian) in details in this paper and investigate the optimal feature extraction dimension in finger vein recognition using KPCA.

Keywords: biometrics, finger vein recognition, principal component analysis (PCA), kernel principal component analysis (KPCA)

Procedia PDF Downloads 342
6767 Exploring Syntactic and Semantic Features for Text-Based Authorship Attribution

Authors: Haiyan Wu, Ying Liu, Shaoyun Shi

Abstract:

Authorship attribution is to extract features to identify authors of anonymous documents. Many previous works on authorship attribution focus on statistical style features (e.g., sentence/word length), content features (e.g., frequent words, n-grams). Modeling these features by regression or some transparent machine learning methods gives a portrait of the authors' writing style. But these methods do not capture the syntactic (e.g., dependency relationship) or semantic (e.g., topics) information. In recent years, some researchers model syntactic trees or latent semantic information by neural networks. However, few works take them together. Besides, predictions by neural networks are difficult to explain, which is vital in authorship attribution tasks. In this paper, we not only utilize the statistical style and content features but also take advantage of both syntactic and semantic features. Different from an end-to-end neural model, feature selection and prediction are two steps in our method. An attentive n-gram network is utilized to select useful features, and logistic regression is applied to give prediction and understandable representation of writing style. Experiments show that our extracted features can improve the state-of-the-art methods on three benchmark datasets.

Keywords: authorship attribution, attention mechanism, syntactic feature, feature extraction

Procedia PDF Downloads 110
6766 Multimodal Convolutional Neural Network for Musical Instrument Recognition

Authors: Yagya Raj Pandeya, Joonwhoan Lee

Abstract:

The dynamic behavior of music and video makes it difficult to evaluate musical instrument playing in a video by computer system. Any television or film video clip with music information are rich sources for analyzing musical instruments using modern machine learning technologies. In this research, we integrate the audio and video information sources using convolutional neural network (CNN) and pass network learned features through recurrent neural network (RNN) to preserve the dynamic behaviors of audio and video. We use different pre-trained CNN for music and video feature extraction and then fine tune each model. The music network use 2D convolutional network and video network use 3D convolution (C3D). Finally, we concatenate each music and video feature by preserving the time varying features. The long short term memory (LSTM) network is used for long-term dynamic feature characterization and then use late fusion with generalized mean. The proposed network performs better performance to recognize the musical instrument using audio-video multimodal neural network.

Keywords: multimodal, 3D convolution, music-video feature extraction, generalized mean

Procedia PDF Downloads 191
6765 Exploring Multi-Feature Based Action Recognition Using Multi-Dimensional Dynamic Time Warping

Authors: Guoliang Lu, Changhou Lu, Xueyong Li

Abstract:

In action recognition, previous studies have demonstrated the effectiveness of using multiple features to improve the recognition performance. We focus on two practical issues: i) most studies use a direct way of concatenating/accumulating multi features to evaluate the similarity between two actions. This way could be too strong since each kind of feature can include different dimensions, quantities, etc; ii) in many studies, the employed classification methods lack of a flexible and effective mechanism to add new feature(s) into classification. In this paper, we explore an unified scheme based on recently-proposed multi-dimensional dynamic time warping (MD-DTW). Experiments demonstrated the scheme's effectiveness of combining multi-feature and the flexibility of adding new feature(s) to increase the recognition performance. In addition, the explored scheme also provides us an open architecture for using new advanced classification methods in the future to enhance action recognition.

Keywords: action recognition, multi features, dynamic time warping, feature combination

Procedia PDF Downloads 418
6764 Retina Registration for Biometrics Based on Characterization of Retinal Feature Points

Authors: Nougrara Zineb

Abstract:

The unique structure of the blood vessels in the retina has been used for biometric identification. The retina blood vessel pattern is a unique pattern in each individual and it is almost impossible to forge that pattern in a false individual. The retina biometrics’ advantages include high distinctiveness, universality, and stability overtime of the blood vessel pattern. Once the creases have been extracted from the images, a registration stage is necessary, since the position of the retinal vessel structure could change between acquisitions due to the movements of the eye. Image registration consists of following steps: Feature detection, feature matching, transform model estimation and image resembling and transformation. In this paper, we present an algorithm of registration; it is based on the characterization of retinal feature points. For experiments, retinal images from the DRIVE database have been tested. The proposed methodology achieves good results for registration in general.

Keywords: fovea, optic disc, registration, retinal images

Procedia PDF Downloads 240
6763 A Comprehensive Study and Evaluation on Image Fashion Features Extraction

Authors: Yuanchao Sang, Zhihao Gong, Longsheng Chen, Long Chen

Abstract:

Clothing fashion represents a human’s aesthetic appreciation towards everyday outfits and appetite for fashion, and it reflects the development of status in society, humanity, and economics. However, modelling fashion by machine is extremely challenging because fashion is too abstract to be efficiently described by machines. Even human beings can hardly reach a consensus about fashion. In this paper, we are dedicated to answering a fundamental fashion-related problem: what image feature best describes clothing fashion? To address this issue, we have designed and evaluated various image features, ranging from traditional low-level hand-crafted features to mid-level style awareness features to various current popular deep neural network-based features, which have shown state-of-the-art performance in various vision tasks. In summary, we tested the following 9 feature representations: color, texture, shape, style, convolutional neural networks (CNNs), CNNs with distance metric learning (CNNs&DML), AutoEncoder, CNNs with multiple layer combination (CNNs&MLC) and CNNs with dynamic feature clustering (CNNs&DFC). Finally, we validated the performance of these features on two publicly available datasets. Quantitative and qualitative experimental results on both intra-domain and inter-domain fashion clothing image retrieval showed that deep learning based feature representations far outweigh traditional hand-crafted feature representation. Additionally, among all deep learning based methods, CNNs with explicit feature clustering performs best, which shows feature clustering is essential for discriminative fashion feature representation.

Keywords: convolutional neural network, feature representation, image processing, machine modelling

Procedia PDF Downloads 116
6762 Automatic Multi-Label Image Annotation System Guided by Firefly Algorithm and Bayesian Method

Authors: Saad M. Darwish, Mohamed A. El-Iskandarani, Guitar M. Shawkat

Abstract:

Nowadays, the amount of available multimedia data is continuously on the rise. The need to find a required image for an ordinary user is a challenging task. Content based image retrieval (CBIR) computes relevance based on the visual similarity of low-level image features such as color, textures, etc. However, there is a gap between low-level visual features and semantic meanings required by applications. The typical method of bridging the semantic gap is through the automatic image annotation (AIA) that extracts semantic features using machine learning techniques. In this paper, a multi-label image annotation system guided by Firefly and Bayesian method is proposed. Firstly, images are segmented using the maximum variance intra cluster and Firefly algorithm, which is a swarm-based approach with high convergence speed, less computation rate and search for the optimal multiple threshold. Feature extraction techniques based on color features and region properties are applied to obtain the representative features. After that, the images are annotated using translation model based on the Net Bayes system, which is efficient for multi-label learning with high precision and less complexity. Experiments are performed using Corel Database. The results show that the proposed system is better than traditional ones for automatic image annotation and retrieval.

Keywords: feature extraction, feature selection, image annotation, classification

Procedia PDF Downloads 563
6761 Customer Churn Prediction by Using Four Machine Learning Algorithms Integrating Features Selection and Normalization in the Telecom Sector

Authors: Alanoud Moraya Aldalan, Abdulaziz Almaleh

Abstract:

A crucial component of maintaining a customer-oriented business as in the telecom industry is understanding the reasons and factors that lead to customer churn. Competition between telecom companies has greatly increased in recent years. It has become more important to understand customers’ needs in this strong market of telecom industries, especially for those who are looking to turn over their service providers. So, predictive churn is now a mandatory requirement for retaining those customers. Machine learning can be utilized to accomplish this. Churn Prediction has become a very important topic in terms of machine learning classification in the telecommunications industry. Understanding the factors of customer churn and how they behave is very important to building an effective churn prediction model. This paper aims to predict churn and identify factors of customers’ churn based on their past service usage history. Aiming at this objective, the study makes use of feature selection, normalization, and feature engineering. Then, this study compared the performance of four different machine learning algorithms on the Orange dataset: Logistic Regression, Random Forest, Decision Tree, and Gradient Boosting. Evaluation of the performance was conducted by using the F1 score and ROC-AUC. Comparing the results of this study with existing models has proven to produce better results. The results showed the Gradients Boosting with feature selection technique outperformed in this study by achieving a 99% F1-score and 99% AUC, and all other experiments achieved good results as well.

Keywords: machine learning, gradient boosting, logistic regression, churn, random forest, decision tree, ROC, AUC, F1-score

Procedia PDF Downloads 109