Search results for: Features extraction
1848 Voice Features as the Diagnostic Marker of Autism
Authors: Elena Lyakso, Olga Frolova, Yuri Matveev
Abstract:
The aim of the study is to determine the acoustic features of voice and speech of children with autism spectrum disorders (ASD) as a possible additional diagnostic criterion. The participants in the study were 95 children with ASD aged 5-16 years, 150 typically developing (TD) children, and 103 adults who listened to the children’s speech samples. Three types of experimental methods for speech analysis were performed: spectrographic, perceptual by listeners, and automatic recognition. In the speech of children with ASD, the pitch values, pitch range, and values of frequency and intensity of the third formant (emotional), which lead to the “atypical” spectrogram of vowels, are higher than the corresponding parameters in the speech of TD children. High values of the vowel articulation index (VAI) are specific to the speech signals of children with ASD. These acoustic features can be considered a diagnostic marker of autism. The ability of humans and of automatic recognition to identify the psychoneurological state of children from their speech is determined.
Keywords: Autism spectrum disorders, biomarker of autism, child speech, voice features.
Downloads: 621
1847 Distinguishing Innocent Murmurs from Murmurs caused by Aortic Stenosis by Recurrence Quantification Analysis
Authors: Christer Ahlstrom, Katja Höglund, Peter Hult, Jens Häggström, Clarence Kvart, Per Ask
Abstract:
It is sometimes difficult to differentiate between innocent murmurs and pathological murmurs during auscultation. In these difficult cases, an intelligent stethoscope with decision support abilities would be of great value. In this study, using a dog model, phonocardiographic recordings were obtained from 27 boxer dogs with various degrees of aortic stenosis (AS) severity. As a reference for severity assessment, continuous wave Doppler was used. The data were analyzed with recurrence quantification analysis (RQA) with the aim of finding features able to distinguish innocent murmurs from murmurs caused by AS. Four out of eight investigated RQA features showed significant differences between innocent murmurs and pathological murmurs. Using a plain linear discriminant analysis classifier, the best pair of features (recurrence rate and entropy) resulted in a sensitivity of 90% and a specificity of 88%. In conclusion, RQA provides valid features which can be used for differentiation between innocent murmurs and murmurs caused by AS.
Keywords: Bioacoustics, murmur, phonocardiographic signal, recurrence quantification analysis.
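The two RQA features named in the abstract above, recurrence rate and entropy, can be illustrated with a minimal sketch. It assumes a one-dimensional signal with no time-delay embedding, a fixed distance threshold `radius`, and the common definition of RQA entropy as the Shannon entropy of diagonal line lengths; the function names and the toy signal are hypothetical, and the abstract does not state which settings the study used.

```python
import math


def recurrence_matrix(signal, radius):
    """R[i][j] is 1 when samples i and j lie within `radius` of each other."""
    n = len(signal)
    return [
        [1 if abs(signal[i] - signal[j]) <= radius else 0 for j in range(n)]
        for i in range(n)
    ]


def recurrence_rate(matrix):
    """Fraction of recurrent points (main diagonal included in this sketch)."""
    n = len(matrix)
    return sum(sum(row) for row in matrix) / (n * n)


def diagonal_entropy(matrix, min_length=2):
    """Shannon entropy (natural log) of diagonal line lengths above the main diagonal."""
    n = len(matrix)
    counts = {}
    for k in range(1, n):  # k-th upper diagonal
        run = 0
        for i in range(n - k):
            if matrix[i][i + k]:
                run += 1
            else:
                if run >= min_length:
                    counts[run] = counts.get(run, 0) + 1
                run = 0
        if run >= min_length:  # line that reaches the edge of the matrix
            counts[run] = counts.get(run, 0) + 1
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Toy periodic signal (assumed for illustration, not data from the study).
toy_signal = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
toy_matrix = recurrence_matrix(toy_signal, radius=0.1)
toy_rate = recurrence_rate(toy_matrix)
toy_entropy = diagonal_entropy(toy_matrix)
```

In practice the signal would usually be embedded in a higher-dimensional phase space before the recurrence matrix is built, and the resulting feature pair would be passed to a classifier such as linear discriminant analysis.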
Downloads: 2005
1846 Evaluation of Classification Algorithms for Road Environment Detection
Authors: T. Anbu, K. Aravind Kumar
Abstract:
Accurate road environment information is needed for applications such as road maintenance and virtual 3D city modeling. Mobile laser scanning (MLS) efficiently produces dense point clouds of large areas, from which the road and its environment can be modeled in detail. Objects such as buildings, cars and trees are an important part of road environments. Different methods have been developed for detecting such objects, but accuracy is still lacking because of illumination problems, environmental changes, and multiple objects with the same features. In this work, different classifiers such as multiclass SVM, kNN and multiclass LDA are compared for road environment detection. Finally, kNN with the LBP feature achieved a classification accuracy of 93.3%, higher than the other classifiers.
Keywords: Classifiers, feature extraction, mobile-based laser scanning, object location estimation.
Downloads: 775
1845 An Edge-based Text Region Extraction Algorithm for Indoor Mobile Robot Navigation
Authors: Jagath Samarabandu, Xiaoqing Liu
Abstract:
Using bottom-up image processing algorithms to predict human eye fixations and extract the relevant embedded information in images has been widely applied in the design of active machine vision systems. Scene text is an important feature to be extracted, especially in vision-based mobile robot navigation as many potential landmarks such as nameplates and information signs contain text. This paper proposes an edge-based text region extraction algorithm, which is robust with respect to font sizes, styles, color/intensity, orientations, and effects of illumination, reflections, shadows, perspective distortion, and the complexity of image backgrounds. Performance of the proposed algorithm is compared against a number of widely used text localization algorithms and the results show that this method can quickly and effectively localize and extract text regions from real scenes and can be used in mobile robot navigation under an indoor environment to detect text based landmarks.
Keywords: Landmarks, mobile robot navigation, scene text, text localization and extraction.
Downloads: 2924
1844 Reduction of False Positives in Head-Shoulder Detection Based on Multi-Part Color Segmentation
Authors: Lae-Jeong Park
Abstract:
The paper presents a method that uses figure-ground color segmentation to extract an effective global feature for reducing false positives in head-shoulder detection. Conventional detectors, which rely on local features such as HOG because of real-time operation, suffer from false positives. The color cue in an input image provides salient information on a global characteristic, which is necessary to alleviate the false positives of local feature based detectors. An effective approach that uses figure-ground color segmentation has been presented in an effort to reduce the false positives in object detection. In this paper, an extended version of the approach is presented that adopts separate multipart foregrounds instead of a single prior foreground and performs the figure-ground color segmentation with each of the foregrounds. The multipart foregrounds include the parts of the head-shoulder shape and additional auxiliary foregrounds optimized by a search algorithm. A classifier is constructed with a feature that consists of the set of resulting segmentations. Experimental results show that the presented method can reject more false positives than the single prior shape-based classifier as well as detectors with the local features. The improvement is possible because the presented approach can reduce the false positives that have the same colors in the head and shoulder foregrounds.
Keywords: Pedestrian detection, color segmentation, false positives, feature extraction.
Downloads: 1144
1843 Unsupervised Feature Selection Using Feature Density Functions
Authors: Mina Alibeigi, Sattar Hashemi, Ali Hamzeh
Abstract:
Since dealing with high dimensional data is computationally complex and sometimes even intractable, several feature reduction methods have recently been developed to reduce the dimensionality of the data and simplify the analysis in applications such as text categorization, signal processing, image retrieval, and gene expression. Among feature reduction techniques, feature selection is one of the most popular methods because it preserves the original features. In this paper, we propose a new unsupervised feature selection method which removes redundant features from the original feature space by using the probability density functions of the features. To show the effectiveness of the proposed method, popular feature selection methods have been implemented and compared. Experimental results on several datasets derived from the UCI repository illustrate the effectiveness of the proposed method in comparison with the other methods in terms of both classification accuracy and the number of selected features.
Keywords: Feature, Feature Selection, Filter, Probability Density Function
Downloads: 2077
1842 Classification Influence Index and its Application for k-Nearest Neighbor Classifier
Authors: Sejong Oh
Abstract:
Classification is an important topic in machine learning and bioinformatics. Many datasets have been introduced for classification tasks. A dataset contains multiple features, and the quality of features influences the classification accuracy of the dataset. The power of classification for each feature differs. In this study, we suggest the Classification Influence Index (CII) as an indicator of classification power for each feature. CII enables evaluation of the features in a dataset and improved classification accuracy by transformation of the dataset. By conducting experiments using CII and the k-nearest neighbor classifier to analyze real datasets, we confirmed that the proposed index provided meaningful improvement of the classification accuracy.
Keywords: accuracy, classification, dataset, data preprocessing
Downloads: 1495
1841 Remaining Useful Life Estimation of Bearings Based on Nonlinear Dimensional Reduction Combined with Timing Signals
Authors: Zhongmin Wang, Wudong Fan, Hengshan Zhang, Yimin Zhou
Abstract:
In data-driven prognostic methods, the accuracy of remaining useful life estimation for bearings depends mainly on the performance of the health indicators, which are usually fused from statistical features extracted from vibration signals. However, the existing health indicators have the following two drawbacks: (1) The different ranges of the statistical features contribute differently to the construction of the health indicators, so expert knowledge is required to extract the features. (2) When convolutional neural networks are used to handle time-frequency features of signals, the time series of the signals is not considered. To overcome these drawbacks, this study proposes a method that combines a convolutional neural network with a gated recurrent unit to extract time-frequency image features. The extracted features are used to construct the health indicator and predict the remaining useful life of bearings. First, original signals are converted into time-frequency images by using the continuous wavelet transform so as to form the original feature sets. Second, with the convolutional and pooling layers of convolutional neural networks, the most sensitive features of the time-frequency images are selected from the original feature sets. Finally, these selected features are fed into the gated recurrent unit to construct the health indicator. The results show that the proposed method outperforms related studies that used the same bearing dataset provided by PRONOSTIA.
Keywords: Continuous wavelet transform, convolution neural network, gated recurrent unit, health indicators, remaining useful life.
Downloads: 769
1840 Adaptive Kernel Principal Analysis for Online Feature Extraction
Authors: Mingtao Ding, Zheng Tian, Haixia Xu
Abstract:
The batch nature limits the standard kernel principal component analysis (KPCA) methods in numerous applications, especially for dynamic or large-scale data. In this paper, an efficient adaptive approach is presented for online extraction of the kernel principal components (KPC). The contribution of this paper may be divided into two parts. First, kernel covariance matrix is correctly updated to adapt to the changing characteristics of data. Second, KPC are recursively formulated to overcome the batch nature of standard KPCA. This formulation is derived from the recursive eigen-decomposition of kernel covariance matrix and indicates the KPC variation caused by the new data. The proposed method not only alleviates sub-optimality of the KPCA method for non-stationary data, but also maintains constant update speed and memory usage as the data-size increases. Experiments for simulation data and real applications demonstrate that our approach yields improvements in terms of both computational speed and approximation accuracy.
Keywords: adaptive method, kernel principal component analysis, online extraction, recursive algorithm
Downloads: 1552
1839 Key Frame Based Video Summarization via Dependency Optimization
Authors: Janya Sainui
Abstract:
With the rapid growth of digital videos and data communications, video summarization, which provides a shorter version of a video for fast browsing and retrieval, is necessary. Key frame extraction is one of the mechanisms for generating a video summary. In general, the extracted key frames should both represent the entire video content and contain minimum redundancy. However, most of the existing approaches select key frames heuristically; hence, the selected key frames may not be the most different frames and/or may not cover the entire content of a video. In this paper, we propose a method of video summarization which provides reasonable objective functions for selecting key frames. In particular, we apply a statistical dependency measure called quadratic mutual information as our objective function for maximizing the coverage of the entire video content as well as minimizing the redundancy among selected key frames. The proposed key frame extraction algorithm finds key frames by solving an optimization problem. Through experiments, we demonstrate that the proposed video summarization approach produces video summaries with better coverage of the entire video content and less redundancy among key frames compared with the state-of-the-art approaches.
Keywords: Video summarization, key frame extraction, dependency measure, quadratic mutual information, optimization.
Downloads: 964
1838 Development of Algorithms for the Study of the Image in Digital Form for Satellite Applications: Extraction of a Road Network and Its Nodes
Authors: Z. Nougrara
Abstract:
In this paper we propose a novel methodology for extracting a road network and its nodes from satellite images of Algeria. The technique builds on our previous research. It is founded on information theory and mathematical morphology, which are combined to extract and link the road segments to form a road network and its nodes. We therefore have to define objects as sets of pixels and to study the shape of these objects and the relations that exist between them. In this approach, geometric and radiometric features of roads are integrated by a cost function and a set of selected points of a crossing road. Its performance was tested on satellite images of Algeria.
Keywords: Satellite image, road network, nodes.
Downloads: 1698
1837 Intelligent Assistive Methods for Diagnosis of Rheumatoid Arthritis Using Histogram Smoothing and Feature Extraction of Bone Images
Authors: SP. Chokkalingam, K. Komathy
Abstract:
Advances in the field of image processing envision a new era of evaluation techniques and application of procedures in various fields. One such field is the biomedical field, for prognosis as well as diagnosis of diseases. Although this plethora of methods provides a wide range of options to select from, it also makes it difficult to select the most suitable process. Our objective is to use a series of techniques on bone scans so as to detect the occurrence of rheumatoid arthritis (RA) as accurately as possible. Among other techniques existing in the field, our proposed system tends to be more effective as it depends on new methodologies that have been shown to be better and more consistent than others. Computer aided diagnosis will provide a more accurate and consistent result, which will help to improve the efficiency of the system. The image first undergoes histogram smoothing and specification, a morphing operation, boundary detection by an edge following algorithm, and finally image subtraction to determine the presence of rheumatoid arthritis in a more efficient and effective way. Noise is removed from the images by preprocessing, the region of interest is found by segmentation, and histogram smoothing is applied to a specific portion of the images. Gray level co-occurrence matrix (GLCM) features such as mean, median, energy, correlation, and bone mineral density (BMD) are then computed. Once all the features have been found, they are stored in a database. This dataset is trained with inflamed and noninflamed values; with the help of a neural network, all new images are checked for their status, and rough sets are applied for further reduction.
Keywords: Computer Aided Diagnosis, Edge Detection, Histogram Smoothing, Rheumatoid Arthritis.
Downloads: 2479
1836 2D Gabor Functions and FCMI Algorithm for Flaws Detection in Ultrasonic Images
Authors: Kechida Ahmed, Drai Redouane, Khelil Mohamed
Abstract:
In this paper we present a new approach to detecting a flaw in T.O.F.D (Time Of Flight Diffraction) type ultrasonic image based on texture features. Texture is one of the most important features used in recognizing patterns in an image. The paper describes texture features based on 2D Gabor functions, i.e., Gaussian shaped band-pass filters, with dyadic treatment of the radial spatial frequency range and multiple orientations, which represent an appropriate choice for tasks requiring simultaneous measurement in both space and frequency domains. The most relevant features are used as input data on a Fuzzy c-mean clustering classifier. The classes that exist are only two: 'defects' or 'no defects'. The proposed approach is tested on the T.O.F.D image achieved at the laboratory and on the industrial field.
Keywords: 2D Gabor Functions, flaw detection, fuzzy c-mean clustering, non destructive testing, texture analysis, T.O.F.D Image (Time of Flight Diffraction).
Downloads: 1753
1835 Development of Sleep Quality Index Using Heart Rate
Authors: Dongjoo Kim, Chang-Sik Son, Won-Seok Kang
Abstract:
Adequate sleep affects various parts of one’s overall physical and mental life. As one of the methods in determining the appropriate amount of sleep, this research presents a heart rate based sleep quality index. In order to evaluate sleep quality using the heart rate, sleep data from 280 subjects taken over one month are used. Their sleep data are categorized by a three-part heart rate range. After categorizing, some features are extracted, and the statistical significances are verified for these features. The results show that some features of this sleep quality index model have statistical significance. Thus, this heart rate based sleep quality index may be a useful discriminator of sleep.
Keywords: Sleep, sleep quality, heart rate, statistical analysis.
Downloads: 1504
1834 A Fast Object Detection Method with Rotation Invariant Features
Authors: Zilong He, Yuesheng Zhu
Abstract:
Based on combined shape and texture features, a fast object detection method with rotation invariant features is proposed in this paper. A quick template matching scheme based on online learning, designed for online applications, is also introduced. The experimental results show that the proposed approach has lower computational complexity and a higher detection rate while keeping almost the same performance as the HOG-based method, and can be more suitable for run time applications.
Keywords: gradient feature, online learning, rotation invariance, template feature
Downloads: 2477
1833 Comparison of Parameterization Methods in Recognizing Spoken Arabic Digits
Authors: Ali Ganoun
Abstract:
This paper presents an evaluation of sound parameterization methods in recognizing some spoken Arabic words, namely digits from zero to nine. Each isolated spoken word is represented by a single template based on a specific recognition feature, and the recognition is based on the Euclidean distance from those templates. The performance analysis of recognition is based on four parameterization features: the Burg Spectrum Analysis, the Walsh Spectrum Analysis, the Thomson Multitaper Spectrum Analysis and the Mel Frequency Cepstral Coefficients (MFCC) features. The main aim of this paper was to compare, analyze, and discuss the outcomes of spoken Arabic digits recognition systems based on the selected recognition features. The results acquired confirm that the use of MFCC features is a very promising method in recognizing spoken Arabic digits.
Keywords: Speech Recognition, Spectrum Analysis, Burg Spectrum, Walsh Spectrum Analysis, Thomson Multitaper Spectrum, MFCC.
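The recognition rule described in the abstract above, one template per word and a decision by Euclidean distance, can be sketched as a nearest-template classifier. The feature vectors below are made-up two-dimensional placeholders rather than Burg, Walsh, multitaper, or MFCC features, and the names `euclidean`, `classify`, and `toy_templates` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(features, templates):
    """Return the label of the template closest to `features`."""
    return min(templates, key=lambda label: euclidean(features, templates[label]))


# Placeholder templates (assumed values, not taken from the paper).
toy_templates = {
    "zero": [0.0, 1.0],
    "one": [1.0, 0.0],
    "two": [1.0, 1.0],
}
predicted = classify([0.9, 0.2], toy_templates)
```

With real data each template would typically be a longer vector, for example an average of feature vectors over several utterances of the same digit, which is an assumption here since the abstract does not say how the templates were built.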
Downloads: 1593
1832 Fusing Local Binary Patterns with Wavelet Features for Ethnicity Identification
Authors: S. Hma Salah, H. Du, N. Al-Jawad
Abstract:
Ethnicity identification of face images is of interest in many areas of application, but existing methods are few and limited. This paper presents a fusion scheme that uses block-based uniform local binary patterns and Haar wavelet transform to combine local and global features. In particular, the LL subband coefficients of the whole face are fused with the histograms of uniform local binary patterns from block partitions of the face. We applied the principal component analysis on the fused features and managed to reduce the dimensionality of the feature space from 536 down to around 15 without sacrificing too much accuracy. We have conducted a number of preliminary experiments using a collection of 746 subject face images. The test results show good accuracy and demonstrate the potential of fusing global and local features. The fusion approach is robust, making it easy to further improve the identification at both feature and score levels.
Keywords: Ethnicity identification, fusion, local binary patterns, wavelet.
Downloads: 2992
1831 Comparison of Domain and Hydrophobicity Features for the Prediction of Protein-Protein Interactions using Support Vector Machines
Authors: Hany Alashwal, Safaai Deris, Razib M. Othman
Abstract:
The protein domain structure has been widely used as the most informative sequence feature to computationally predict protein-protein interactions. However, in a recent study, a research group has reported a very high accuracy of 94% using the hydrophobicity feature. Therefore, in this study we compare and verify the usefulness of protein domain structure and hydrophobicity properties as the sequence features. Using Support Vector Machines (SVM) as the learning system, our results indicate that both features achieved an accuracy of nearly 80%. Furthermore, domain structure had a receiver operating characteristic (ROC) score of 0.8480 with a running time of 34 seconds, while hydrophobicity had a ROC score of 0.8159 with a running time of 20,571 seconds (5.7 hours). These results indicate that protein-protein interaction can be predicted from domain structure with reliable accuracy and acceptable running time.
Keywords: Bioinformatics, protein-protein interactions, support vector machines, protein features.
Downloads: 1919
1830 Normalization Discriminant Independent Component Analysis
Authors: Liew Yee Ping, Pang Ying Han, Lau Siong Hoe, Ooi Shih Yin, Housam Khalifa Bashier Babiker
Abstract:
In face recognition, feature extraction techniques attempt to search for an appropriate representation of the data. However, when the feature dimension is larger than the sample size, performance degrades. Hence, we propose a method called Normalization Discriminant Independent Component Analysis (NDICA). The input data are regularized to obtain the most reliable features from the data and processed using Independent Component Analysis (ICA). The proposed method is evaluated on three face databases: Olivetti Research Ltd (ORL), Face Recognition Technology (FERET) and Face Recognition Grand Challenge (FRGC). NDICA showed its effectiveness compared with other unsupervised and supervised techniques.
Keywords: Face recognition, small sample size, regularization, independent component analysis.
Downloads: 1955
1829 Breast Cancer Survivability Prediction via Classifier Ensemble
Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia
Abstract:
This paper presents a classifier ensemble approach for predicting the survivability of breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components: feature selection and classifier ensemble. The feature selection component divides the features in the SEER database into four groups. It then tries to find the most important features among the four groups that maximize the weighted average F-score of a certain classification algorithm. The ensemble component uses three different classifiers, each of which models a different set of features from SEER through the feature selection module. On top of them, another classifier is used to give the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found uses the decision tree, Bayesian network, and Naïve Bayes algorithms for the underlying classifiers and Naïve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same data of SEER (period of 1973-2002). It gives 87.39% weighted average F-score compared to 85.82% and 81.34% of the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held out unseen test set.
Keywords: Classifier ensemble, breast cancer survivability, data mining, SEER.
Downloads: 1672
1828 Extracting Human Body based on Background Estimation in Modified HLS Color Space
Authors: Jang-Hee Yoo, Doosung Hwang, Jong-Wook Han, Ki-Young Moon
Abstract:
The ability to recognize humans and their activities by computer vision is a very important task, with many potential applications. The study of human motion analysis is related to several research areas of computer vision such as motion capture, detection, tracking and segmentation of people. In this paper, we describe a segmentation method for extracting the human body contour in a modified HLS color space. To estimate a background, the modified HLS color space is proposed, and the background features are estimated by using the HLS color components. Here, a large human dataset, which was collected from DV cameras, is pre-processed. The human body and its contour are successfully extracted from the image sequences.
Keywords: Background Subtraction, Human Silhouette Extraction, HLS Color Space, and Object Segmentation
Downloads: 2437
1827 Using PFA in Feature Analysis and Selection for H.264 Adaptation
Authors: Nora A. Naguib, Ahmed E. Hussein, Hesham A. Keshk, Mohamed I. El-Adawy
Abstract:
Classification of video sequences based on their contents is a vital process for adaptation techniques. It helps decide which adaptation technique best fits the resource reduction requested by the client. In this paper we used the principal feature analysis algorithm to select a reduced subset of video features. The main idea is to select only one feature from each class based on the similarities between the features within that class. Our results showed that using this feature reduction technique the source video features can be completely omitted from future classification of video sequences.
Keywords: Adaptation, feature selection, H.264, Principal Feature Analysis (PFA)
Downloads: 1607
1826 A 3D Approach for Extraction of the Coronary Artery and Quantification of the Stenosis
Authors: Mahdi Mazinani, S. D. Qanadli, Rahil Hosseini, Tim Ellis, Jamshid Dehmeshki
Abstract:
Segmentation and quantification of stenosis is an important task in assessing coronary artery disease. One of the main challenges is measuring the real diameter of curved vessels. Moreover, uncertainty in the segmentation of different tissues in the narrow vessel is an important issue that affects accuracy. This paper proposes an algorithm to extract coronary arteries and measure the degree of stenosis. A Markovian fuzzy clustering method is applied to model the uncertainty arising from the partial volume effect. The algorithm comprises segmentation, centreline extraction, estimation of the plane orthogonal to the centreline, and measurement of the degree of stenosis. To evaluate accuracy and reproducibility, the approach has been applied to a vascular phantom and the results are compared with the real diameter. The results of 10 patient datasets have been visually judged by a qualified radiologist. The results reveal the superiority of the proposed method compared to the conventional thresholding method (CTM) on both datasets.
Keywords: 3D coronary artery tree extraction, segmentation, quantification, fuzzy clustering, and Markov random field
Downloads: 1582
1825 Feature Selection and Predictive Modeling of Housing Data Using Random Forest
Authors: Bharatendra Rai
Abstract:
Predictive data analysis and modeling involving machine learning techniques become challenging in the presence of too many explanatory variables or features. Too many features are known not only to slow algorithms down but also to decrease model prediction accuracy. This study involves a housing dataset with 79 quantitative and qualitative features that describe various aspects people consider while buying a new house. The Boruta algorithm, which supports feature selection using a wrapper approach built around random forest, is used in this study. This feature selection process leads to 49 confirmed features, which are then used for developing predictive random forest models. The study also explores five different data partitioning ratios, and their impact on model accuracy is captured using the coefficient of determination (R-squared) and root mean square error (RMSE).
Keywords: Housing data, feature selection, random forest, Boruta algorithm, root mean square error.
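The two accuracy measures named in the abstract above can be written out directly. The sketch below uses the standard definitions, RMSE as the square root of the mean squared residual and R-squared as one minus the ratio of residual to total sum of squares; the function names and the toy numbers are assumptions for illustration and are not results from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy values (assumed): a model that always predicts the mean of the targets.
toy_actual = [1.0, 2.0, 3.0]
toy_predicted = [2.0, 2.0, 2.0]
toy_rmse = rmse(toy_actual, toy_predicted)
toy_r2 = r_squared(toy_actual, toy_predicted)
```

Predicting the mean everywhere gives an R-squared of zero, which is a convenient sanity check when comparing models trained on different partitioning ratios.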
Downloads: 1717
1824 Ottoman Script Recognition Using Hidden Markov Model
Authors: Ayşe Onat, Ferruh Yildiz, Mesut Gündüz
Abstract:
In this study, an OCR system for segmentation, feature extraction and recognition of Ottoman Scripts has been developed using handwritten characters. Detection of handwritten characters written by humans is a difficult process. Segmentation and feature extraction stages are based on geometrical feature analysis, followed by the chain code transformation of the main strokes of each character. The output of segmentation is well-defined segments that can be fed into any classification approach. The classes of main strokes are identified through left-right Hidden Markov Model (HMM).
Keywords: Chain Code, HMM, Ottoman Script Recognition, OCR
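The chain code transformation mentioned in the abstract above can be illustrated with a Freeman 8-direction code, a common choice that is assumed here because the abstract does not specify the coding scheme. The sketch turns an ordered list of 8-connected pixel coordinates into direction codes, using image coordinates with x to the right and y downward; `DIRECTIONS`, `chain_code`, and the toy stroke are hypothetical.

```python
# Freeman 8-direction codes in image coordinates (x to the right, y downward):
# 0 = right, then counter-clockwise as seen on screen.
DIRECTIONS = {
    (1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
    (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7,
}


def chain_code(points):
    """Encode an ordered, 8-connected pixel path as a list of direction codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"{(x0, y0)} and {(x1, y1)} are not 8-connected neighbours")
        codes.append(DIRECTIONS[step])
    return codes


# Toy stroke (assumed): right, then down, then left.
toy_stroke = [(0, 0), (1, 0), (1, 1), (0, 1)]
toy_codes = chain_code(toy_stroke)
```

A sequence of such codes is a natural observation sequence for a left-right HMM, since each stroke is traversed in one direction from start to end.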
Downloads 2321
1823 Survey on Image Mining Using Genetic Algorithm
Authors: Jyoti Dua
Abstract:
One image is worth more than a thousand words. Images, if analyzed, can reveal useful information. Low-level image processing deals with the extraction of specific features from a single image. The question then arises: what technique should be used to extract patterns from a very large and detailed image database? The answer is "Image Mining". Image mining deals with the extraction of image data relationships, implicit knowledge, and other patterns from a collection of images or an image database. It is an extension of data mining. In this paper, we not only scrutinize the current techniques of image mining but also present a new technique for mining images using a genetic algorithm.
Keywords: Image Mining, Data Mining, Genetic Algorithm.
Downloads 2446
1822 Bangla Vowel Characterization Based on Analysis by Synthesis
Authors: Syed Akhter Hossain, M. Lutfar Rahman, Farruk Ahmed
Abstract:
Bangla vowel characterization determines the spectral properties of Bangla vowels for efficient synthesis as well as recognition. In this paper, Bangla vowels in isolated words have been analyzed based on a speech production model within the framework of Analysis-by-Synthesis. This has led to the extraction of spectral parameters for the production model in order to produce different Bangla vowel sounds. The real and synthetic spectra are compared, and a weighted square error has been computed along with the error in the formant bandwidths for efficient representation of Bangla vowels. The extracted features produced a good representation of the targeted Bangla vowels. Such a representation also plays an essential role in low bit rate speech coding and vocoders.
Keywords: Speech, vowel, formant, synthesis, spectrum, LPC.
Downloads 2372
1821 Genetic Algorithms for Feature Generation in the Context of Audio Classification
Authors: José A. Menezes, Giordano Cabral, Bruno T. Gomes
Abstract:
Choosing good features is an essential part of machine learning. Recent techniques aim to automate this process. For instance, feature learning intends to learn the transformation of raw data into a representation useful for machine learning tasks. In automatic audio classification tasks, this is interesting since audio, usually complex information, needs to be transformed into an input that is computationally convenient to process. Another technique tries to generate features by searching a feature space. Genetic algorithms, for instance, have been used to generate audio features by combining or modifying them. We find this approach particularly interesting and, despite the undeniable advances of feature learning approaches, we wanted to take a step forward in the use of genetic algorithms to find audio features, combining them with more conventional methods, like PCA, and inserting search control mechanisms, such as constraints over a confusion matrix. This work presents the results obtained on particular audio classification problems.
Keywords: Feature generation, feature learning, genetic algorithm, music information retrieval.
Downloads 1080
1820 Extraction of Symbolic Rules from Artificial Neural Networks
Authors: S. M. Kamruzzaman, Md. Monirul Islam
Abstract:
Although backpropagation ANNs generally predict better than decision trees do for pattern classification problems, they are often regarded as black boxes, i.e., their predictions cannot be explained as those of decision trees can. In many applications, it is desirable to extract knowledge from trained ANNs so that users can gain a better understanding of how the networks solve the problems. A new rule extraction algorithm, called rule extraction from artificial neural networks (REANN), is proposed and implemented to extract symbolic rules from ANNs. A standard three-layer feedforward ANN is the basis of the algorithm. A four-phase training algorithm is proposed for backpropagation learning. Explicitness of the extracted rules is supported by comparing them to the symbolic rules generated by other methods. Extracted rules are comparable with those of other methods in terms of number of rules, average number of conditions per rule, and predictive accuracy. Extensive experimental studies on several benchmark classification problems, such as breast cancer, iris, diabetes, and season classification, demonstrate the effectiveness of the proposed approach with good generalization ability.
Keywords: Backpropagation, clustering algorithm, constructive algorithm, continuous activation function, pruning algorithm, rule extraction algorithm, symbolic rules.
Downloads 1616
1819 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features
Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova
Abstract:
Emotion recognition is a challenging problem that remains open from the perspective of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. Support Vector Machine classifiers are built using raw data from video recordings. The results obtained for emotion recognition are given, and a discussion about the validity and the expressiveness of different emotions is presented. A comparison is made between the classifiers built from facial data only, from voice data only, and from the combination of both. The need for a better combination of the information from facial expression and voice data is argued.
Keywords: Emotion recognition, facial recognition, signal processing, machine learning.
Downloads 2019