Search results for: Feature Subset Selection
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1880

Search results for: Feature Subset Selection

1310 Flocking Behaviors for Multiple Groups with Heterogeneous Agents

Authors: Jae Moon Lee

Abstract:

Most of researches for conventional simulations were studied focusing on flocks with a single species. While there exist the flocking behaviors with a single species in nature, the flocking behaviors are frequently observed with multi-species. This paper studies on the flocking simulation for heterogeneous agents. In order to simulate the flocks for heterogeneous agents, the conventional method uses the identifier of flock, while the proposed method defines the feature vector of agent and uses the similarity between agents by comparing with those feature vectors. Based on the similarity, the paper proposed the attractive force and repulsive force and then executed the simulation by applying two forces. The results of simulation showed that flock formation with heterogeneous agents is very natural in both cases. In addition, it showed that unlike the existing method, the proposed method can not only control the density of the flocks, but also be possible for two different groups of agents to flock close to each other if they have a high similarity.

Keywords: Flocking behavior, heterogeneous agents, similarity, simulation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1599
1309 Video Shot Detection and Key Frame Extraction Using Faber Shauder DWT and SVD

Authors: Assma Azeroual, Karim Afdel, Mohamed El Hajji, Hassan Douzi

Abstract:

Key frame extraction methods select the most representative frames of a video, which can be used in different areas of video processing such as video retrieval, video summary, and video indexing. In this paper we present a novel approach for extracting key frames from video sequences. The frame is characterized uniquely by his contours which are represented by the dominant blocks. These dominant blocks are located on the contours and its near textures. When the video frames have a noticeable changement, its dominant blocks changed, then we can extracte a key frame. The dominant blocks of every frame is computed, and then feature vectors are extracted from the dominant blocks image of each frame and arranged in a feature matrix. Singular Value Decomposition is used to calculate sliding windows ranks of those matrices. Finally the computed ranks are traced and then we are able to extract key frames of a video. Experimental results show that the proposed approach is robust against a large range of digital effects used during shot transition.

Keywords: Key Frame Extraction, Shot detection, FSDWT, Singular Value Decomposition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2520
1308 Efficient Feature Fusion for Noise Iris in Unconstrained Environment

Authors: Yao-Hong Tsai

Abstract:

This paper presents an efficient fusion algorithm for iris images to generate stable feature for recognition in unconstrained environment. Recently, iris recognition systems are focused on real scenarios in our daily life without the subject’s cooperation. Under large variation in the environment, the objective of this paper is to combine information from multiple images of the same iris. The result of image fusion is a new image which is more stable for further iris recognition than each original noise iris image. A wavelet-based approach for multi-resolution image fusion is applied in the fusion process. The detection of the iris image is based on Adaboost algorithm and then local binary pattern (LBP) histogram is then applied to texture classification with the weighting scheme. Experiment showed that the generated features from the proposed fusion algorithm can improve the performance for verification system through iris recognition.

Keywords: Image fusion, iris recognition, local binary pattern, wavelet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2217
1307 PSO-based Possibilistic Portfolio Model with Transaction Costs

Authors: Wei Chen, Cui-you Yao, Yue Qiu

Abstract:

This paper deals with a portfolio selection problem based on the possibility theory under the assumption that the returns of assets are LR-type fuzzy numbers. A possibilistic portfolio model with transaction costs is proposed, in which the possibilistic mean value of the return is termed measure of investment return, and the possibilistic variance of the return is termed measure of investment risk. Due to considering transaction costs, the existing traditional optimization algorithms usually fail to find the optimal solution efficiently and heuristic algorithms can be the best method. Therefore, a particle swarm optimization is designed to solve the corresponding optimization problem. At last, a numerical example is given to illustrate our proposed effective means and approaches.

Keywords: Possibility theory, portfolio selection, transaction costs, particle swarm optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1534
1306 A Non-Parametric Based Mapping Algorithm for Use in Audio Fingerprinting

Authors: Analise Borg, Paul Micallef

Abstract:

Over the past few years, the online multimedia collection has grown at a fast pace. Several companies showed interest to study the different ways to organise the amount of audio information without the need of human intervention to generate metadata. In the past few years, many applications have emerged on the market which are capable of identifying a piece of music in a short time. Different audio effects and degradation make it much harder to identify the unknown piece. In this paper, an audio fingerprinting system which makes use of a non-parametric based algorithm is presented. Parametric analysis is also performed using Gaussian Mixture Models (GMMs). The feature extraction methods employed are the Mel Spectrum Coefficients and the MPEG-7 basic descriptors. Bin numbers replaced the extracted feature coefficients during the non-parametric modelling. The results show that nonparametric analysis offer potential results as the ones mentioned in the literature.

Keywords: Audio fingerprinting, mapping algorithm, Gaussian Mixture Models, MFCC, MPEG-7.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2285
1305 Feature Preserving Image Interpolation and Enhancement Using Adaptive Bidirectional Flow

Authors: Shujun Fu, Qiuqi Ruan, Wenqia Wang

Abstract:

Image interpolation is a common problem in imaging applications. However, most interpolation algorithms in existence suffer visually to some extent the effects of blurred edges and jagged artifacts in the image. This paper presents an adaptive feature preserving bidirectional flow process, where an inverse diffusion is performed to enhance edges along the normal directions to the isophote lines (edges), while a normal diffusion is done to remove artifacts (''jaggies'') along the tangent directions. In order to preserve image features such as edges, angles and textures, the nonlinear diffusion coefficients are locally adjusted according to the first and second order directional derivatives of the image. Experimental results on synthetic images and nature images demonstrate that our interpolation algorithm substantially improves the subjective quality of the interpolated images over conventional interpolations.

Keywords: anisotropic diffusion, bidirectional flow, directionalderivatives, edge enhancement, image interpolation, inverse flow, shock filter.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1498
1304 Breast Cancer Survivability Prediction via Classifier Ensemble

Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia

Abstract:

This paper presents a classifier ensemble approach for predicting the survivability of the breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components; features selection and classifier ensemble components. The features selection component divides the features in SEER database into four groups. After that it tries to find the most important features among the four groups that maximizes the weighted average F-score of a certain classification algorithm. The ensemble component uses three different classifiers, each of which models different set of features from SEER through the features selection module. On top of them, another classifier is used to give the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found is by using the decision tree, Bayesian network, and Na¨ıve Bayes algorithms for the underlying classifiers and Na¨ıve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same data of SEER (period of 1973-2002). It gives 87.39% weighted average F-score compared to 85.82% and 81.34% of the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held out unseen test set.

Keywords: Classifier ensemble, breast cancer survivability, data mining, SEER.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1671
1303 Microscopic Simulation of Toll Plaza Safety and Operations

Authors: Bekir O. Bartin, Kaan Ozbay, Sandeep Mudigonda, Hong Yang

Abstract:

The use of microscopic traffic simulation in evaluating the operational and safety conditions at toll plazas is demonstrated. Two toll plazas in New Jersey are selected as case studies and were developed and validated in Paramics traffic simulation software. In order to simulate drivers’ lane selection behavior in Paramics, a utility-based lane selection approach is implemented in Paramics Application Programming Interface (API). For each vehicle approaching the toll plaza, a utility value is assigned to each toll lane by taking into account the factors that are likely to impact drivers’ lane selection behavior, such as approach lane, exit lane and queue lengths. The results demonstrate that similar operational conditions, such as lane-by-lane toll plaza traffic volume can be attained using this approach. In addition, assessment of safety at toll plazas is conducted via a surrogate safety measure. In particular, the crash index (CI), an improved surrogate measure of time-to-collision (TTC), which reflects the severity of a crash is used in the simulation analyses. The results indicate that the spatial and temporal frequency of observed crashes can be simulated using the proposed methodology. Further analyses can be conducted to evaluate and compare various different operational decisions and safety measures using microscopic simulation models.

Keywords: Microscopic simulation, toll plaza, surrogate safety, application programming interface.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 788
1302 Low Cost Chip Set Selection Algorithm for Multi-way Partitioning of Digital System

Authors: Jae Young Park, Soongyu Kwon, Kyu Han Kim, Hyeong Geon Lee, Jong Tae Kim

Abstract:

This paper considers the problem of finding low cost chip set for a minimum cost partitioning of a large logic circuits. Chip sets are selected from a given library. Each chip in the library has a different price, area, and I/O pin. We propose a low cost chip set selection algorithm. Inputs to the algorithm are a netlist and a chip information in the library. Output is a list of chip sets satisfied with area and maximum partitioning number and it is sorted by cost. The algorithm finds the sorted list of chip sets from minimum cost to maximum cost. We used MCNC benchmark circuits for experiments. The experimental results show that all of chip sets found satisfy the multiple partitioning constraints.

Keywords: lowest cost chip set, MCNC benchmark, multi-way partitioning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1503
1301 Cardiac Disorder Classification Based On Extreme Learning Machine

Authors: Chul Kwak, Oh-Wook Kwon

Abstract:

In this paper, an extreme learning machine with an automatic segmentation algorithm is applied to heart disorder classification by heart sound signals. From continuous heart sound signals, the starting points of the first (S1) and the second heart pulses (S2) are extracted and corrected by utilizing an inter-pulse histogram. From the corrected pulse positions, a single period of heart sound signals is extracted and converted to a feature vector including the mel-scaled filter bank energy coefficients and the envelope coefficients of uniform-sized sub-segments. An extreme learning machine is used to classify the feature vector. In our cardiac disorder classification and detection experiments with 9 cardiac disorder categories, the proposed method shows significantly better performance than multi-layer perceptron, support vector machine, and hidden Markov model; it achieves the classification accuracy of 81.6% and the detection accuracy of 96.9%.

Keywords: Heart sound classification, extreme learning machine

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1934
1300 Project Portfolio Management Phases: A Technique for Strategy Alignment

Authors: Amaral, António, Araújo, Madalena

Abstract:

This paper seeks to give a general idea of the universe of project portfolio management, from its multidisciplinary nature, to the many challenges it raises, passing through the different techniques, models and tools used to solve the multiple problems known. It is intended to contribute to the clarification, with great depth, of the impacts and relationships involved in managing the projects- portfolio. It aims at proposing a technique for the project alignment with the organisational strategy, in order to select projects that later on will be considered in the analysis and selection of the portfolio. We consider the development of a methodology for assessing the project alignment index very relevant in the global market scenario. It can help organisations to gain a greater awareness of market dynamics, speed up the decision process and increase its consistency, thus enabling the strategic alignment and the improvement of the organisational performance.

Keywords: Project Portfolio Management Cycle, Project Portfolio Selection, Resource Assignment, Strategy Alignment technique

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3456
1299 Evaluation of Robust Feature Descriptors for Texture Classification

Authors: Jia-Hong Lee, Mei-Yi Wu, Hsien-Tsung Kuo

Abstract:

Texture is an important characteristic in real and synthetic scenes. Texture analysis plays a critical role in inspecting surfaces and provides important techniques in a variety of applications. Although several descriptors have been presented to extract texture features, the development of object recognition is still a difficult task due to the complex aspects of texture. Recently, many robust and scaling-invariant image features such as SIFT, SURF and ORB have been successfully used in image retrieval and object recognition. In this paper, we have tried to compare the performance for texture classification using these feature descriptors with k-means clustering. Different classifiers including K-NN, Naive Bayes, Back Propagation Neural Network , Decision Tree and Kstar were applied in three texture image sets - UIUCTex, KTH-TIPS and Brodatz, respectively. Experimental results reveal SIFTS as the best average accuracy rate holder in UIUCTex, KTH-TIPS and SURF is advantaged in Brodatz texture set. BP neuro network works best in the test set classification among all used classifiers.

Keywords: Texture classification, texture descriptor, SIFT, SURF, ORB.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1601
1298 A Psychophysiological Evaluation of an Effective Recognition Technique Using Interactive Dynamic Virtual Environments

Authors: Mohammadhossein Moghimi, Robert Stone, Pia Rotshtein

Abstract:

Recording psychological and physiological correlates of human performance within virtual environments and interpreting their impacts on human engagement, ‘immersion’ and related emotional or ‘effective’ states is both academically and technologically challenging. By exposing participants to an effective, real-time (game-like) virtual environment, designed and evaluated in an earlier study, a psychophysiological database containing the EEG, GSR and Heart Rate of 30 male and female gamers, exposed to 10 games, was constructed. Some 174 features were subsequently identified and extracted from a number of windows, with 28 different timing lengths (e.g. 2, 3, 5, etc. seconds). After reducing the number of features to 30, using a feature selection technique, K-Nearest Neighbour (KNN) and Support Vector Machine (SVM) methods were subsequently employed for the classification process. The classifiers categorised the psychophysiological database into four effective clusters (defined based on a 3-dimensional space – valence, arousal and dominance) and eight emotion labels (relaxed, content, happy, excited, angry, afraid, sad, and bored). The KNN and SVM classifiers achieved average cross-validation accuracies of 97.01% (±1.3%) and 92.84% (±3.67%), respectively. However, no significant differences were found in the classification process based on effective clusters or emotion labels.

Keywords: Virtual Reality, effective computing, effective VR, emotion-based effective physiological database.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 994
1297 A Review of Genetic Algorithm Optimization: Operations and Applications to Water Pipeline Systems

Authors: I. Abuiziah, N. Shakarneh

Abstract:

Genetic Algorithm (GA) is a powerful technique for solving optimization problems. It follows the idea of survival of the fittest - Better and better solutions evolve from previous generations until a near optimal solution is obtained. GA uses the main three operations, the selection, crossover and mutation to produce new generations from the old ones. GA has been widely used to solve optimization problems in many applications such as traveling salesman problem, airport traffic control, information retrieval (IR), reactive power optimization, job shop scheduling, and hydraulics systems such as water pipeline systems. In water pipeline systems we need to achieve some goals optimally such as minimum cost of construction, minimum length of pipes and diameters, and the place of protection devices. GA shows high performance over the other optimization techniques, moreover, it is easy to implement and use. Also, it searches a limited number of solutions.

Keywords: Genetic Algorithm, optimization, pipeline systems, selection, cross over.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5100
1296 Reducing the Imbalance Penalty through Artificial Intelligence Methods Geothermal Production Forecasting: A Case Study for Turkey

Authors: H. Anıl, G. Kar

Abstract:

In addition to being rich in renewable energy resources, Turkey is one of the countries that promise potential in geothermal energy production with its high installed power, cheapness, and sustainability. Increasing imbalance penalties become an economic burden for organizations, since the geothermal generation plants cannot maintain the balance of supply and demand due to the inadequacy of the production forecasts given in the day-ahead market. A better production forecast reduces the imbalance penalties of market participants and provides a better imbalance in the day ahead market. In this study, using machine learning, deep learning and time series methods, the total generation of the power plants belonging to Zorlu Doğal Electricity Generation, which has a high installed capacity in terms of geothermal, was predicted for the first one-week and first two-weeks of March, then the imbalance penalties were calculated with these estimates and compared with the real values. These modeling operations were carried out on two datasets, the basic dataset and the dataset created by extracting new features from this dataset with the feature engineering method. According to the results, Support Vector Regression from traditional machine learning models outperformed other models and exhibited the best performance. In addition, the estimation results in the feature engineering dataset showed lower error rates than the basic dataset. It has been concluded that the estimated imbalance penalty calculated for the selected organization is lower than the actual imbalance penalty, optimum and profitable accounts.

Keywords: Machine learning, deep learning, time series models, feature engineering, geothermal energy production forecasting.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 204
1295 A Text Mining Technique Using Association Rules Extraction

Authors: Hany Mahgoub, Dietmar Rösner, Nabil Ismail, Fawzy Torkey

Abstract:

This paper describes text mining technique for automatically extracting association rules from collections of textual documents. The technique called, Extracting Association Rules from Text (EART). It depends on keyword features for discover association rules amongst keywords labeling the documents. In this work, the EART system ignores the order in which the words occur, but instead focusing on the words and their statistical distributions in documents. The main contributions of the technique are that it integrates XML technology with Information Retrieval scheme (TFIDF) (for keyword/feature selection that automatically selects the most discriminative keywords for use in association rules generation) and use Data Mining technique for association rules discovery. It consists of three phases: Text Preprocessing phase (transformation, filtration, stemming and indexing of the documents), Association Rule Mining (ARM) phase (applying our designed algorithm for Generating Association Rules based on Weighting scheme GARW) and Visualization phase (visualization of results). Experiments applied on WebPages news documents related to the outbreak of the bird flu disease. The extracted association rules contain important features and describe the informative news included in the documents collection. The performance of the EART system compared with another system that uses the Apriori algorithm throughout the execution time and evaluating extracted association rules.

Keywords: Text mining, data mining, association rule mining

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4437
1294 Evaluating and Selecting Optimization Software Packages: A Framework for Business Applications

Authors: Waleed Abohamad, Amr Arisha

Abstract:

Owing the fact that optimization of business process is a crucial requirement to navigate, survive and even thrive in today-s volatile business environment, this paper presents a framework for selecting a best-fit optimization package for solving complex business problems. Complexity level of the problem and/or using incorrect optimization software can lead to biased solutions of the optimization problem. Accordingly, the proposed framework identifies a number of relevant factors (e.g. decision variables, objective functions, and modeling approach) to be considered during the evaluation and selection process. Application domain, problem specifications, and available accredited optimization approaches are also to be regarded. A recommendation of one or two optimization software is the output of the framework which is believed to provide the best results of the underlying problem. In addition to a set of guidelines and recommendations on how managers can conduct an effective optimization exercise is discussed.

Keywords: Complex Business Problems, Optimization, Selection Criteria, Software Evaluation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2910
1293 A New Ridge Orientation based Method of Computation for Feature Extraction from Fingerprint Images

Authors: Jayadevan R., Jayant V. Kulkarni, Suresh N. Mali, Hemant K. Abhyankar

Abstract:

An important step in studying the statistics of fingerprint minutia features is to reliably extract minutia features from the fingerprint images. A new reliable method of computation for minutiae feature extraction from fingerprint images is presented. A fingerprint image is treated as a textured image. An orientation flow field of the ridges is computed for the fingerprint image. To accurately locate ridges, a new ridge orientation based computation method is proposed. After ridge segmentation a new method of computation is proposed for smoothing the ridges. The ridge skeleton image is obtained and then smoothed using morphological operators to detect the features. A post processing stage eliminates a large number of false features from the detected set of minutiae features. The detected features are observed to be reliable and accurate.

Keywords: Minutia, orientation field, ridge segmentation, textured image.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1853
1292 Analysis of Relation between Unlabeled and Labeled Data to Self-Taught Learning Performance

Authors: Ekachai Phaisangittisagul, Rapeepol Chongprachawat

Abstract:

Obtaining labeled data in supervised learning is often difficult and expensive, and thus the trained learning algorithm tends to be overfitting due to small number of training data. As a result, some researchers have focused on using unlabeled data which may not necessary to follow the same generative distribution as the labeled data to construct a high-level feature for improving performance on supervised learning tasks. In this paper, we investigate the impact of the relationship between unlabeled and labeled data for classification performance. Specifically, we will apply difference unlabeled data which have different degrees of relation to the labeled data for handwritten digit classification task based on MNIST dataset. Our experimental results show that the higher the degree of relation between unlabeled and labeled data, the better the classification performance. Although the unlabeled data that is completely from different generative distribution to the labeled data provides the lowest classification performance, we still achieve high classification performance. This leads to expanding the applicability of the supervised learning algorithms using unsupervised learning.

Keywords: Autoencoder, high-level feature, MNIST dataset, selftaught learning, supervised learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1832
1291 Strength Optimization of Induction Hardened Splined Shaft – Material and Geometric Aspects

Authors: I. Barsoum, F. Khan

Abstract:

the current study presents a modeling framework to determine the torsion strength of an induction hardened splined shaft by considering geometry and material aspects with the aim to optimize the static torsion strength by selection of spline geometry and hardness depth. Six different spline geometries and seven different hardness profiles including non-hardened and throughhardened shafts have been considered. The results reveal that the torque that causes initial yielding of the induction hardened splined shaft is strongly dependent on the hardness depth and the geometry of the spline teeth. Guidelines for selection of the appropriate hardness depth and spline geometry are given such that an optimum static torsion strength of the component can be achieved.

Keywords: Static strength, splined shaft, torsion, induction hardening, hardness profile, finite element, optimization, design.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4970
1290 Personal Authentication Using FDOST in Finger Knuckle-Print Biometrics

Authors: N. B. Mahesh Kumar, K. Premalatha

Abstract:

The inherent skin patterns created at the joints in the finger exterior are referred as finger knuckle-print. It is exploited to identify a person in a unique manner because the finger knuckle print is greatly affluent in textures. In biometric system, the region of interest is utilized for the feature extraction algorithm. In this paper, local and global features are extracted separately. Fast Discrete Orthonormal Stockwell Transform is exploited to extract the local features. Global feature is attained by escalating the size of Fast Discrete Orthonormal Stockwell Transform to infinity. Two features are fused to increase the recognition accuracy. A matching distance is calculated for both the features individually. Then two distances are merged mutually to acquire the final matching distance. The proposed scheme gives the better performance in terms of equal error rate and correct recognition rate.

Keywords: Hamming distance, Instantaneous phase, Region of Interest, Recognition accuracy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2759
1289 Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine

Authors: Bingchun Liu, Pei-Chann Chang, Natasha Huang, Dun Li

Abstract:

Machine Learning and Data Mining are the two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a wildly used technique to predict qualitative variables and is generally preferred over regression from an operational point of view. Due to the enormous increase in air pollution in various countries especially China, Air Quality Classification has become one of the most important topics in air quality research and modelling. This study aims at introducing a hybrid classification model based on information theory and Support Vector Machine (SVM) using the air quality data of four cities in China namely Beijing, Guangzhou, Shanghai and Tianjin from Jan 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection has classified the daily air quality into 6 levels namely Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent based on their respective Air Quality Index (AQI) values. Using the information theory, information gain (IG) is calculated and feature selection is done for both categorical features and continuous numeric features. Then SVM Machine Learning algorithm is implemented on the selected features with cross-validation. The final evaluation reveals that the IG and SVM hybrid model performs better than SVM (alone), Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of accuracy as well as complexity.

Keywords: Machine learning, air quality classification, air quality index, information gain, support vector machine, cross-validation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 948
1288 Detecting Remote Protein Evolutionary Relationships via String Scoring Method

Authors: Nazar Zaki, Safaai Deris

Abstract:

The amount of the information being churned out by the field of biology has jumped manifold and now requires the extensive use of computer techniques for the management of this information. The predominance of biological information such as protein sequence similarity in the biological information sea is key information for detecting protein evolutionary relationship. Protein sequence similarity typically implies homology, which in turn may imply structural and functional similarities. In this work, we propose, a learning method for detecting remote protein homology. The proposed method uses a transformation that converts protein sequence into fixed-dimensional representative feature vectors. Each feature vector records the sensitivity of a protein sequence to a set of amino acids substrings generated from the protein sequences of interest. These features are then used in conjunction with support vector machines for the detection of the protein remote homology. The proposed method is tested and evaluated on two different benchmark protein datasets and it-s able to deliver improvements over most of the existing homology detection methods.

Keywords: Protein homology detection; support vectormachine; string kernel.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1392
1287 Emotion Classification using Adaptive SVMs

Authors: P. Visutsak

Abstract:

The study of the interaction between humans and computers has been emerging during the last few years. This interaction will be more powerful if computers are able to perceive and respond to human nonverbal communication such as emotions. In this study, we present the image-based approach to emotion classification through lower facial expression. We employ a set of feature points in the lower face image according to the particular face model used and consider their motion across each emotive expression of images. The vector of displacements of all feature points input to the Adaptive Support Vector Machines (A-SVMs) classifier that classify it into seven basic emotions scheme, namely neutral, angry, disgust, fear, happy, sad and surprise. The system was tested on the Japanese Female Facial Expression (JAFFE) dataset of frontal view facial expressions [7]. Our experiments on emotion classification through lower facial expressions demonstrate the robustness of Adaptive SVM classifier and verify the high efficiency of our approach.

Keywords: emotion classification, facial expression, adaptive support vector machines, facial expression classifier.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2224
1286 Generator of Hypotheses an Approach of Data Mining Based on Monotone Systems Theory

Authors: Rein Kuusik, Grete Lind

Abstract:

Generator of hypotheses is a new method for data mining. It makes possible to classify the source data automatically and produces a particular enumeration of patterns. Pattern is an expression (in a certain language) describing facts in a subset of facts. The goal is to describe the source data via patterns and/or IF...THEN rules. Used evaluation criteria are deterministic (not probabilistic). The search results are trees - form that is easy to comprehend and interpret. Generator of hypotheses uses very effective algorithm based on the theory of monotone systems (MS) named MONSA (MONotone System Algorithm).

Keywords: data mining, monotone systems, pattern, rule.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1256
1285 Combine a Population-based Incremental Learning with Artificial Immune System for Intrusion Detection System

Authors: Jheng-Long Wu, Pei-Chann Chang, Hsuan-Ming Chen

Abstract:

This research focus on the intrusion detection system (IDS) development which using artificial immune system (AIS) with population based incremental learning (PBIL). AIS have powerful distinguished capability to extirpate antigen when the antigen intrude into human body. The PBIL is based on past learning experience to adjust new learning. Therefore we propose an intrusion detection system call PBIL-AIS which combine two approaches of PBIL and AIS to evolution computing. In AIS part we design three mechanisms such as clonal selection, negative selection and antibody level to intensify AIS performance. In experimental result, our PBIL-AIS IDS can capture high accuracy when an intrusion connection attacks.

Keywords: Artificial immune system, intrusion detection, population-based incremental learning, evolution computing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1929
1284 Comparison of MFCC and Cepstral Coefficients as a Feature Set for PCG Biometric Systems

Authors: Justin Leo Cheang Loong, Khazaimatol S Subari, Muhammad Kamil Abdullah, Nurul Nadia Ahmad, RosliBesar

Abstract:

Heart sound is an acoustic signal and many techniques used nowadays for human recognition tasks borrow speech recognition techniques. One popular choice for feature extraction of accoustic signals is the Mel Frequency Cepstral Coefficients (MFCC) which maps the signal onto a non-linear Mel-Scale that mimics the human hearing. However the Mel-Scale is almost linear in the frequency region of heart sounds and thus should produce similar results with the standard cepstral coefficients (CC). In this paper, MFCC is investigated to see if it produces superior results for PCG based human identification system compared to CC. Results show that the MFCC system is still superior to CC despite linear filter-banks in the lower frequency range, giving up to 95% correct recognition rate for MFCC and 90% for CC. Further experiments show that the high recognition rate is due to the implementation of filter-banks and not from Mel-Scaling.

Keywords: Biometric, Phonocardiogram, Cepstral Coefficients, Mel Frequency

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3552
1283 Information Filtering using Index Word Selection based on the Topics

Authors: Takeru YOKOI, Hidekazu YANAGIMOTO, Sigeru OMATU

Abstract:

We have proposed an information filtering system using index word selection from a document set based on the topics included in a set of documents. This method narrows down the particularly characteristic words in a document set and the topics are obtained by Sparse Non-negative Matrix Factorization. In information filtering, a document is often represented with the vector in which the elements correspond to the weight of the index words, and the dimension of the vector becomes larger as the number of documents is increased. Therefore, it is possible that useless words as index words for the information filtering are included. In order to address the problem, the dimension needs to be reduced. Our proposal reduces the dimension by selecting index words based on the topics included in a document set. We have applied the Sparse Non-negative Matrix Factorization to the document set to obtain these topics. The filtering is carried out based on a centroid of the learning document set. The centroid is regarded as the user-s interest. In addition, the centroid is represented with a document vector whose elements consist of the weight of the selected index words. Using the English test collection MEDLINE, thus, we confirm the effectiveness of our proposal. Hence, our proposed selection can confirm the improvement of the recommendation accuracy from the other previous methods when selecting the appropriate number of index words. In addition, we discussed the selected index words by our proposal and we found our proposal was able to select the index words covered some minor topics included in the document set.

Keywords: Information Filtering, Sparse NMF, Index wordSelection, User Profile, Chi-squared Measure

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1456
1282 Improved Weighted Matching for Speaker Recognition

Authors: Ozan Mut, Mehmet Göktürk

Abstract:

Matching algorithms have significant importance in speaker recognition. Feature vectors of the unknown utterance are compared to feature vectors of the modeled speakers as a last step in speaker recognition. A similarity score is found for every model in the speaker database. Depending on the type of speaker recognition, these scores are used to determine the author of unknown speech samples. For speaker verification, similarity score is tested against a predefined threshold and either acceptance or rejection result is obtained. In the case of speaker identification, the result depends on whether the identification is open set or closed set. In closed set identification, the model that yields the best similarity score is accepted. In open set identification, the best score is tested against a threshold, so there is one more possible output satisfying the condition that the speaker is not one of the registered speakers in existing database. This paper focuses on closed set speaker identification using a modified version of a well known matching algorithm. The results of new matching algorithm indicated better performance on YOHO international speaker recognition database.

Keywords: Automatic Speaker Recognition, Voice Recognition, Pattern Recognition, Digital Audio Signal Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1732
1281 Feature Point Detection by Combining Advantages of Intensity-based Approach and Edge-based Approach

Authors: Sungho Kim, Chaehoon Park, Yukyung Choi, Soon Kwon, In So Kweon

Abstract:

In this paper, a novel corner detection method is presented to stably extract geometrically important corners. Intensity-based corner detectors such as the Harris corner can detect corners in noisy environments but has inaccurate corner position and misses the corners of obtuse angles. Edge-based corner detectors such as Curvature Scale Space can detect structural corners but show unstable corner detection due to incomplete edge detection in noisy environments. The proposed image-based direct curvature estimation can overcome limitations in both inaccurate structural corner detection of the Harris corner detector (intensity-based) and the unstable corner detection of Curvature Scale Space caused by incomplete edge detection. Various experimental results validate the robustness of the proposed method.

Keywords: Feature, intensity, contour, hybrid.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1831