Search results for: feature selection

1275 Breast Cancer Survivability Prediction via Classifier Ensemble

Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia

Abstract:

This paper presents a classifier ensemble approach for predicting the survivability of breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components: a feature selection component and a classifier ensemble component. The feature selection component divides the features in the SEER database into four groups and then searches for the most important features among the four groups, those that maximize the weighted average F-score of a given classification algorithm. The ensemble component uses three different classifiers, each of which models a different set of features from SEER selected by the feature selection module. On top of them, another classifier gives the final decision based on the output decisions and confidence scores of the underlying classifiers. Different classification algorithms have been examined; the best setup found uses decision tree, Bayesian network, and Naïve Bayes algorithms for the underlying classifiers and Naïve Bayes for the ensemble step. The system outperforms all published systems to date when evaluated on exactly the same SEER data (period 1973-2002). It gives an 87.39% weighted average F-score, compared to 85.82% and 81.34% for the other published systems. By increasing the data size to cover the whole database (period 1973-2014), the overall weighted average F-score jumps to 92.4% on a held-out unseen test set.
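
A minimal sketch of the two-level ensemble idea in scikit-learn, under stated assumptions: a synthetic dataset stands in for SEER, and GaussianNB stands in for the Bayesian-network base learner (scikit-learn has no Bayesian network); this is not the authors' implementation.

```python
# Sketch: base classifiers feed a Naive Bayes meta-classifier (stacking),
# evaluated by weighted F-score. Synthetic data replaces SEER; GaussianNB
# replaces the Bayesian-network base learner used in the paper.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

ensemble = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier(max_depth=5)),
                ("nb", GaussianNB())],
    final_estimator=GaussianNB(),       # meta-classifier combining base outputs
    stack_method="predict_proba",       # pass confidence scores, not hard labels
)
ensemble.fit(X_tr, y_tr)
print("weighted F-score:", f1_score(y_te, ensemble.predict(X_te), average="weighted"))
```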

Keywords: Classifier ensemble, breast cancer survivability, data mining, SEER.

1274 Microscopic Simulation of Toll Plaza Safety and Operations

Authors: Bekir O. Bartin, Kaan Ozbay, Sandeep Mudigonda, Hong Yang

Abstract:

The use of microscopic traffic simulation in evaluating the operational and safety conditions at toll plazas is demonstrated. Two toll plazas in New Jersey are selected as case studies, developed, and validated in the Paramics traffic simulation software. In order to simulate drivers' lane selection behavior in Paramics, a utility-based lane selection approach is implemented in the Paramics Application Programming Interface (API). For each vehicle approaching the toll plaza, a utility value is assigned to each toll lane by taking into account the factors that are likely to impact drivers' lane selection behavior, such as approach lane, exit lane and queue lengths. The results demonstrate that similar operational conditions, such as lane-by-lane toll plaza traffic volume, can be attained using this approach. In addition, safety at toll plazas is assessed via a surrogate safety measure. In particular, the crash index (CI), an improved surrogate measure of time-to-collision (TTC) that reflects the severity of a crash, is used in the simulation analyses. The results indicate that the spatial and temporal frequency of observed crashes can be reproduced using the proposed methodology. Further analyses can be conducted to evaluate and compare various operational decisions and safety measures using microscopic simulation models.
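
The actual study plugs this logic into the Paramics API (C/C++); the standalone Python sketch below only illustrates the utility-based lane choice with made-up weights and factor definitions, not the paper's calibrated model.

```python
# Illustrative utility-based toll-lane choice: weights and penalty terms are
# hypothetical, chosen only to show the structure of the decision.
def lane_utility(lane, approach_lane, exit_lane, queue_len,
                 w_approach=1.0, w_exit=1.0, w_queue=0.5):
    """Higher utility = more attractive lane for the approaching vehicle."""
    approach_penalty = abs(lane - approach_lane)   # lateral distance from current lane
    exit_penalty = abs(lane - exit_lane)           # lateral distance to desired exit lane
    return -(w_approach * approach_penalty + w_exit * exit_penalty + w_queue * queue_len)

def choose_lane(approach_lane, exit_lane, queue_lengths):
    lanes = range(len(queue_lengths))
    return max(lanes, key=lambda l: lane_utility(l, approach_lane, exit_lane, queue_lengths[l]))

# A vehicle in lane 2 heading toward exit lane 4, with current per-lane queues:
print(choose_lane(approach_lane=2, exit_lane=4, queue_lengths=[6, 3, 5, 1, 2, 8]))
```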

Keywords: Microscopic simulation, toll plaza, surrogate safety, application programming interface.

1273 Low Cost Chip Set Selection Algorithm for Multi-way Partitioning of Digital System

Authors: Jae Young Park, Soongyu Kwon, Kyu Han Kim, Hyeong Geon Lee, Jong Tae Kim

Abstract:

This paper considers the problem of finding a low-cost chip set for a minimum-cost partitioning of large logic circuits. Chip sets are selected from a given library; each chip in the library has a different price, area, and I/O pin count. We propose a low-cost chip set selection algorithm. Inputs to the algorithm are a netlist and the chip information in the library. The output is a list of chip sets that satisfy the area and maximum partition-count constraints, sorted by cost. The algorithm produces the sorted list of chip sets from minimum cost to maximum cost. We used MCNC benchmark circuits for experiments. The experimental results show that all of the chip sets found satisfy the multi-way partitioning constraints.
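
A toy sketch of the enumeration idea, under stated assumptions: a tiny made-up chip library, area as the only sizing constraint (the paper's I/O pin constraint is omitted for brevity), and no netlist handling.

```python
# Enumerate candidate chip sets from a hypothetical library, keep those that
# meet the area and partition-count constraints, and report them sorted by cost.
from itertools import combinations_with_replacement

library = [  # (name, price, area, io_pins) -- made-up example values
    ("A", 10, 120, 64),
    ("B", 25, 300, 128),
    ("C", 40, 500, 200),
]

def feasible_chip_sets(required_area, max_partitions):
    results = []
    for k in range(1, max_partitions + 1):                 # one chip per partition
        for combo in combinations_with_replacement(library, k):
            if sum(chip[2] for chip in combo) >= required_area:
                cost = sum(chip[1] for chip in combo)
                results.append((cost, [chip[0] for chip in combo]))
    return sorted(results)                                  # minimum to maximum cost

for cost, chips in feasible_chip_sets(required_area=600, max_partitions=3):
    print(cost, chips)
```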

Keywords: lowest cost chip set, MCNC benchmark, multi-way partitioning.

1272 Cardiac Disorder Classification Based On Extreme Learning Machine

Authors: Chul Kwak, Oh-Wook Kwon

Abstract:

In this paper, an extreme learning machine with an automatic segmentation algorithm is applied to heart disorder classification by heart sound signals. From continuous heart sound signals, the starting points of the first (S1) and the second heart pulses (S2) are extracted and corrected by utilizing an inter-pulse histogram. From the corrected pulse positions, a single period of heart sound signals is extracted and converted to a feature vector including the mel-scaled filter bank energy coefficients and the envelope coefficients of uniform-sized sub-segments. An extreme learning machine is used to classify the feature vector. In our cardiac disorder classification and detection experiments with 9 cardiac disorder categories, the proposed method shows significantly better performance than multi-layer perceptron, support vector machine, and hidden Markov model; it achieves the classification accuracy of 81.6% and the detection accuracy of 96.9%.
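
A minimal NumPy sketch of the extreme learning machine itself (random hidden layer, output weights solved analytically); the paper's heart-sound segmentation and feature pipeline are not reproduced, and random vectors stand in for the mel filter-bank and envelope features.

```python
# Extreme learning machine sketch: random, untrained input weights; output
# weights obtained by least squares. Stand-in data replaces heart-sound features.
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, y_onehot, n_hidden=200):
    W = rng.normal(size=(X.shape[1], n_hidden))              # random input weights
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                                    # hidden-layer activations
    beta, *_ = np.linalg.lstsq(H, y_onehot, rcond=None)       # analytic output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

X = rng.normal(size=(500, 40))                                # stand-in feature vectors
y = rng.integers(0, 9, size=500)                              # 9 cardiac disorder categories
W, b, beta = elm_train(X, np.eye(9)[y])
print("train accuracy:", np.mean(elm_predict(X, W, b, beta) == y))
```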

Keywords: Heart sound classification, extreme learning machine

1271 Project Portfolio Management Phases: A Technique for Strategy Alignment

Authors: António Amaral, Madalena Araújo

Abstract:

This paper seeks to give a general idea of the universe of project portfolio management, from its multidisciplinary nature to the many challenges it raises, passing through the different techniques, models and tools used to solve the known problems. It is intended to contribute to an in-depth clarification of the impacts and relationships involved in managing a portfolio of projects. It proposes a technique for aligning projects with the organisational strategy, in order to select the projects that will later be considered in the analysis and selection of the portfolio. We consider the development of a methodology for assessing the project alignment index very relevant in the global market scenario. It can help organisations to gain a greater awareness of market dynamics, speed up the decision process and increase its consistency, thus enabling strategic alignment and the improvement of organisational performance.

Keywords: Project Portfolio Management Cycle, Project Portfolio Selection, Resource Assignment, Strategy Alignment technique

1270 Evaluation of Robust Feature Descriptors for Texture Classification

Authors: Jia-Hong Lee, Mei-Yi Wu, Hsien-Tsung Kuo

Abstract:

Texture is an important characteristic of real and synthetic scenes. Texture analysis plays a critical role in inspecting surfaces and provides important techniques for a variety of applications. Although several descriptors have been presented to extract texture features, the development of object recognition is still a difficult task due to the complex aspects of texture. Recently, many robust and scale-invariant image features such as SIFT, SURF and ORB have been successfully used in image retrieval and object recognition. In this paper, we compare the performance of these feature descriptors for texture classification using k-means clustering. Different classifiers, including K-NN, Naive Bayes, back-propagation neural network, decision tree and KStar, were applied to three texture image sets: UIUCTex, KTH-TIPS and Brodatz. Experimental results reveal that SIFT achieves the best average accuracy on UIUCTex and KTH-TIPS, while SURF has the advantage on the Brodatz texture set. The back-propagation neural network performs best on test set classification among all the classifiers used.

Keywords: Texture classification, texture descriptor, SIFT, SURF, ORB.

1269 A Psychophysiological Evaluation of an Affective Recognition Technique Using Interactive Dynamic Virtual Environments

Authors: Mohammadhossein Moghimi, Robert Stone, Pia Rotshtein

Abstract:

Recording psychological and physiological correlates of human performance within virtual environments and interpreting their impact on human engagement, 'immersion' and related emotional or 'affective' states is both academically and technologically challenging. By exposing participants to an affective, real-time (game-like) virtual environment, designed and evaluated in an earlier study, a psychophysiological database containing the EEG, GSR and heart rate of 30 male and female gamers, exposed to 10 games, was constructed. Some 174 features were subsequently identified and extracted from a number of windows with 28 different lengths (e.g. 2, 3, 5, etc. seconds). After reducing the number of features to 30 using a feature selection technique, K-Nearest Neighbour (KNN) and Support Vector Machine (SVM) methods were subsequently employed for the classification process. The classifiers categorised the psychophysiological database into four affective clusters (defined in a 3-dimensional valence, arousal and dominance space) and eight emotion labels (relaxed, content, happy, excited, angry, afraid, sad, and bored). The KNN and SVM classifiers achieved average cross-validation accuracies of 97.01% (±1.3%) and 92.84% (±3.67%), respectively. However, no significant differences were found in the classification process based on affective clusters or emotion labels.

Keywords: Virtual reality, affective computing, affective VR, emotion-based affective physiological database.

1268 Reducing the Imbalance Penalty through Artificial Intelligence Methods in Geothermal Production Forecasting: A Case Study for Turkey

Authors: H. Anıl, G. Kar

Abstract:

In addition to being rich in renewable energy resources, Turkey is one of the countries with strong potential in geothermal energy production thanks to its high installed capacity, low cost, and sustainability. Increasing imbalance penalties become an economic burden for organizations, since geothermal generation plants cannot maintain the balance of supply and demand when the production forecasts submitted to the day-ahead market are inadequate. A better production forecast reduces the imbalance penalties of market participants and provides a better balance in the day-ahead market. In this study, using machine learning, deep learning and time series methods, the total generation of the power plants belonging to Zorlu Doğal Electricity Generation, which has a high installed geothermal capacity, was predicted for the first one week and the first two weeks of March; the imbalance penalties were then calculated from these estimates and compared with the real values. These modeling operations were carried out on two datasets: the basic dataset and a dataset created by extracting new features from it through feature engineering. According to the results, Support Vector Regression outperformed the other traditional machine learning models and exhibited the best performance. In addition, the estimates from the feature-engineered dataset showed lower error rates than those from the basic dataset. It is concluded that the estimated imbalance penalty calculated for the selected organization is lower than the actual imbalance penalty, making the forecasts both optimal and profitable.
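
A minimal sketch of the winning model family (Support Vector Regression) with a simple form of feature engineering, lagged generation values as predictors; the synthetic hourly series and the lag count are assumptions, not the plant's real data or the paper's engineered features.

```python
# SVR forecasting sketch with lag features over a synthetic generation series.
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
series = 100 + 10 * np.sin(np.arange(1000) / 24) + rng.normal(0, 2, 1000)  # stand-in MWh series

def make_lag_features(y, n_lags=24):
    # Each row holds the previous n_lags values; the target is the next value.
    X = np.column_stack([y[i:len(y) - n_lags + i] for i in range(n_lags)])
    return X, y[n_lags:]

X, y = make_lag_features(series)
split = len(X) - 7 * 24                                  # hold out the final week
model = make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=0.1))
model.fit(X[:split], y[:split])
pred = model.predict(X[split:])
print("mean absolute error (MWh):", np.mean(np.abs(pred - y[split:])))
```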

Keywords: Machine learning, deep learning, time series models, feature engineering, geothermal energy production forecasting.

1267 A Review of Genetic Algorithm Optimization: Operations and Applications to Water Pipeline Systems

Authors: I. Abuiziah, N. Shakarneh

Abstract:

Genetic Algorithm (GA) is a powerful technique for solving optimization problems. It follows the idea of survival of the fittest: better and better solutions evolve from previous generations until a near-optimal solution is obtained. GA uses three main operations, selection, crossover and mutation, to produce new generations from the old ones. GA has been widely used to solve optimization problems in many applications such as the traveling salesman problem, airport traffic control, information retrieval (IR), reactive power optimization, job shop scheduling, and hydraulic systems such as water pipeline systems. In water pipeline systems we need to achieve some goals optimally, such as minimum construction cost, minimum pipe lengths and diameters, and the placement of protection devices. GA shows high performance compared with other optimization techniques; moreover, it is easy to implement and use, and it needs to search only a limited number of candidate solutions.
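
A toy sketch of the three GA operations named above, selection, crossover and mutation; the one-max fitness function is a placeholder, and a pipeline-design problem would substitute its own cost model.

```python
# Minimal genetic algorithm: tournament selection, one-point crossover, bit mutation.
import random

def fitness(ind):
    return sum(ind)                       # placeholder: maximise the number of ones

def tournament_select(pop, k=3):
    return max(random.sample(pop, k), key=fitness)

def crossover(a, b):
    point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:]

def mutate(ind, rate=0.01):
    return [1 - g if random.random() < rate else g for g in ind]

pop = [[random.randint(0, 1) for _ in range(40)] for _ in range(60)]
for _ in range(100):                      # evolve new generations from old ones
    pop = [mutate(crossover(tournament_select(pop), tournament_select(pop)))
           for _ in range(len(pop))]
print("best fitness:", max(map(fitness, pop)))
```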

Keywords: Genetic algorithm, optimization, pipeline systems, selection, crossover.

1266 A Text Mining Technique Using Association Rules Extraction

Authors: Hany Mahgoub, Dietmar Rösner, Nabil Ismail, Fawzy Torkey

Abstract:

This paper describes a text mining technique for automatically extracting association rules from collections of textual documents. The technique, called Extracting Association Rules from Text (EART), relies on keyword features to discover association rules among the keywords labeling the documents. The EART system ignores the order in which the words occur, focusing instead on the words and their statistical distributions in documents. The main contributions of the technique are that it integrates XML technology with an information retrieval scheme (TF-IDF) for keyword/feature selection, which automatically selects the most discriminative keywords for use in association rule generation, and that it uses data mining techniques for association rule discovery. It consists of three phases: a text preprocessing phase (transformation, filtration, stemming and indexing of the documents), an association rule mining (ARM) phase (applying our algorithm for Generating Association Rules based on a Weighting scheme, GARW) and a visualization phase (visualization of results). Experiments were applied to web news documents related to the outbreak of the bird flu disease. The extracted association rules contain important features and describe the informative news included in the document collection. The performance of the EART system is compared with that of a system using the Apriori algorithm in terms of execution time and evaluation of the extracted association rules.
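
A small sketch of the two core steps, TF-IDF-based keyword selection followed by association-rule generation over the keyword sets; the toy documents and the support/confidence thresholds are assumptions, and the GARW weighting and XML layer of EART are not reproduced.

```python
# Select top TF-IDF keywords per document, then mine simple keyword -> keyword rules.
from itertools import permutations
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["bird flu outbreak spreads in asia",
        "new bird flu cases reported in vietnam",
        "vaccine research targets flu virus outbreak"]

vec = TfidfVectorizer()
tfidf = vec.fit_transform(docs).toarray()
terms = vec.get_feature_names_out()

# Keep the 3 most discriminative keywords of each document.
keyword_sets = [{terms[i] for i in row.argsort()[-3:]} for row in tfidf]

min_support, min_confidence = 0.3, 0.8
n = len(keyword_sets)
vocab = set().union(*keyword_sets)
for a, b in permutations(vocab, 2):
    support_ab = sum(1 for s in keyword_sets if a in s and b in s) / n
    support_a = sum(1 for s in keyword_sets if a in s) / n
    if support_ab >= min_support and support_ab / support_a >= min_confidence:
        print(f"{a} -> {b}  (support={support_ab:.2f})")
```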

Keywords: Text mining, data mining, association rule mining

1265 Evaluating and Selecting Optimization Software Packages: A Framework for Business Applications

Authors: Waleed Abohamad, Amr Arisha

Abstract:

Owing to the fact that optimization of business processes is a crucial requirement to navigate, survive and even thrive in today's volatile business environment, this paper presents a framework for selecting a best-fit optimization package for solving complex business problems. The complexity level of the problem and/or use of incorrect optimization software can lead to biased solutions of the optimization problem. Accordingly, the proposed framework identifies a number of relevant factors (e.g. decision variables, objective functions, and modeling approach) to be considered during the evaluation and selection process. Application domain, problem specifications, and available accredited optimization approaches are also taken into account. The output of the framework is a recommendation of one or two optimization packages that are believed to provide the best results for the underlying problem. In addition, a set of guidelines and recommendations on how managers can conduct an effective optimization exercise is discussed.

Keywords: Complex Business Problems, Optimization, Selection Criteria, Software Evaluation.

1264 A New Ridge Orientation based Method of Computation for Feature Extraction from Fingerprint Images

Authors: Jayadevan R., Jayant V. Kulkarni, Suresh N. Mali, Hemant K. Abhyankar

Abstract:

An important step in studying the statistics of fingerprint minutia features is to reliably extract minutia features from the fingerprint images. A new reliable method of computation for minutiae feature extraction from fingerprint images is presented. A fingerprint image is treated as a textured image. An orientation flow field of the ridges is computed for the fingerprint image. To accurately locate ridges, a new ridge orientation based computation method is proposed. After ridge segmentation a new method of computation is proposed for smoothing the ridges. The ridge skeleton image is obtained and then smoothed using morphological operators to detect the features. A post processing stage eliminates a large number of false features from the detected set of minutiae features. The detected features are observed to be reliable and accurate.

Keywords: Minutia, orientation field, ridge segmentation, textured image.

1263 Analysis of Relation between Unlabeled and Labeled Data to Self-Taught Learning Performance

Authors: Ekachai Phaisangittisagul, Rapeepol Chongprachawat

Abstract:

Obtaining labeled data in supervised learning is often difficult and expensive, and thus the trained learning algorithm tends to overfit due to the small amount of training data. As a result, some researchers have focused on using unlabeled data, which need not follow the same generative distribution as the labeled data, to construct high-level features for improving performance on supervised learning tasks. In this paper, we investigate the impact of the relationship between unlabeled and labeled data on classification performance. Specifically, we apply unlabeled datasets with different degrees of relation to the labeled data to a handwritten digit classification task based on the MNIST dataset. Our experimental results show that the higher the degree of relation between unlabeled and labeled data, the better the classification performance. Although unlabeled data drawn from a completely different generative distribution than the labeled data yields the lowest classification performance, we still achieve high classification performance. This expands the applicability of supervised learning algorithms aided by unsupervised learning.

Keywords: Autoencoder, high-level feature, MNIST dataset, self-taught learning, supervised learning.

1262 Strength Optimization of Induction Hardened Splined Shaft – Material and Geometric Aspects

Authors: I. Barsoum, F. Khan

Abstract:

The current study presents a modeling framework to determine the torsion strength of an induction hardened splined shaft by considering geometry and material aspects, with the aim of optimizing the static torsion strength through selection of spline geometry and hardness depth. Six different spline geometries and seven different hardness profiles, including non-hardened and through-hardened shafts, have been considered. The results reveal that the torque that causes initial yielding of the induction hardened splined shaft is strongly dependent on the hardness depth and the geometry of the spline teeth. Guidelines for selection of the appropriate hardness depth and spline geometry are given such that an optimum static torsion strength of the component can be achieved.

Keywords: Static strength, splined shaft, torsion, induction hardening, hardness profile, finite element, optimization, design.

1261 Personal Authentication Using FDOST in Finger Knuckle-Print Biometrics

Authors: N. B. Mahesh Kumar, K. Premalatha

Abstract:

The inherent skin patterns created at the finger joints on the exterior of the hand are referred to as the finger knuckle-print. It can be exploited to identify a person uniquely because the finger knuckle-print is rich in texture. In a biometric system, the region of interest is utilized for the feature extraction algorithm. In this paper, local and global features are extracted separately. The Fast Discrete Orthonormal Stockwell Transform is exploited to extract the local features, and the global feature is attained by escalating the size of the Fast Discrete Orthonormal Stockwell Transform to infinity. The two features are fused to increase the recognition accuracy. A matching distance is calculated for each of the features individually, and the two distances are then merged to acquire the final matching distance. The proposed scheme gives better performance in terms of equal error rate and correct recognition rate.

Keywords: Hamming distance, Instantaneous phase, Region of Interest, Recognition accuracy.

1260 Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine

Authors: Bingchun Liu, Pei-Chann Chang, Natasha Huang, Dun Li

Abstract:

Machine learning and data mining are two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a widely used technique to predict qualitative variables and is generally preferred over regression from an operational point of view. Due to the enormous increase in air pollution in various countries, especially China, air quality classification has become one of the most important topics in air quality research and modelling. This study introduces a hybrid classification model based on information theory and the Support Vector Machine (SVM), using the air quality data of four cities in China, namely Beijing, Guangzhou, Shanghai and Tianjin, from Jan 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection classifies daily air quality into six levels, namely Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent, based on the respective Air Quality Index (AQI) values. Using information theory, the information gain (IG) is calculated and feature selection is performed for both categorical features and continuous numeric features. The SVM algorithm is then applied to the selected features with cross-validation. The final evaluation reveals that the IG and SVM hybrid model performs better than SVM (alone), Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of accuracy as well as complexity.
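
A minimal sketch of the hybrid pipeline, information-gain-style feature ranking followed by an SVM with cross-validation; mutual information is used as the information-gain surrogate available in scikit-learn, and synthetic data stands in for the four cities' AQI records.

```python
# Feature selection by mutual information (an information-gain analogue),
# then SVM classification with 5-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=800, n_features=30, n_informative=8,
                           n_classes=6, n_clusters_per_class=1, random_state=0)

pipe = make_pipeline(
    StandardScaler(),
    SelectKBest(mutual_info_classif, k=10),   # keep the most informative features
    SVC(kernel="rbf", C=10.0),
)
print("cross-validated accuracy:", cross_val_score(pipe, X, y, cv=5).mean())
```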

Keywords: Machine learning, air quality classification, air quality index, information gain, support vector machine, cross-validation.

1259 Detecting Remote Protein Evolutionary Relationships via String Scoring Method

Authors: Nazar Zaki, Safaai Deris

Abstract:

The amount of information being churned out by the field of biology has increased manifold and now requires extensive use of computer techniques for its management. Biological information such as protein sequence similarity is key to detecting protein evolutionary relationships. Protein sequence similarity typically implies homology, which in turn may imply structural and functional similarities. In this work, we propose a learning method for detecting remote protein homology. The proposed method uses a transformation that converts a protein sequence into a fixed-dimensional representative feature vector. Each feature vector records the sensitivity of a protein sequence to a set of amino acid substrings generated from the protein sequences of interest. These features are then used in conjunction with support vector machines for the detection of remote protein homology. The proposed method is tested and evaluated on two different benchmark protein datasets and is able to deliver improvements over most of the existing homology detection methods.

Keywords: Protein homology detection, support vector machine, string kernel.

1258 Emotion Classification using Adaptive SVMs

Authors: P. Visutsak

Abstract:

The study of the interaction between humans and computers has been emerging during the last few years. This interaction will be more powerful if computers are able to perceive and respond to human nonverbal communication such as emotions. In this study, we present an image-based approach to emotion classification through lower facial expression. We employ a set of feature points in the lower face image according to the particular face model used and consider their motion across each emotive expression of images. The vector of displacements of all feature points is input to an Adaptive Support Vector Machine (A-SVM) classifier that classifies it into one of seven basic emotions, namely neutral, angry, disgust, fear, happy, sad and surprise. The system was tested on the Japanese Female Facial Expression (JAFFE) dataset of frontal-view facial expressions [7]. Our experiments on emotion classification through lower facial expressions demonstrate the robustness of the adaptive SVM classifier and verify the high efficiency of our approach.

Keywords: emotion classification, facial expression, adaptive support vector machines, facial expression classifier.

1257 Combine a Population-based Incremental Learning with Artificial Immune System for Intrusion Detection System

Authors: Jheng-Long Wu, Pei-Chann Chang, Hsuan-Ming Chen

Abstract:

This research focuses on the development of an intrusion detection system (IDS) that uses an artificial immune system (AIS) with population-based incremental learning (PBIL). AIS has a powerful capability to distinguish and eliminate antigens when they intrude into the human body. PBIL uses past learning experience to adjust new learning. We therefore propose an intrusion detection system called PBIL-AIS, which combines the two approaches of PBIL and AIS in an evolutionary computing framework. In the AIS part we design three mechanisms, clonal selection, negative selection and antibody level, to improve AIS performance. Experimental results show that our PBIL-AIS IDS achieves high accuracy when detecting intrusion connections.
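
A toy sketch of the PBIL half of the approach only: a probability vector is shifted toward the best candidate of each generation, which is the "past experience adjusts new learning" idea; the bit-string fitness is hypothetical, and the AIS mechanisms and real intrusion data are not shown.

```python
# Population-based incremental learning on a toy bit-string problem.
import numpy as np

rng = np.random.default_rng(0)
target = rng.integers(0, 2, size=32)              # hypothetical "good detector" pattern

def fitness(ind):
    return np.sum(ind == target)                  # stand-in for detection accuracy

prob = np.full(32, 0.5)                           # probability of each bit being 1
for generation in range(200):
    pop = (rng.random((50, 32)) < prob).astype(int)        # sample a population
    best = pop[np.argmax([fitness(ind) for ind in pop])]
    prob = 0.9 * prob + 0.1 * best                # shift probabilities toward the best sample
print("best fitness found:", fitness((prob > 0.5).astype(int)), "/ 32")
```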

Keywords: Artificial immune system, intrusion detection, population-based incremental learning, evolution computing.

1256 Comparison of MFCC and Cepstral Coefficients as a Feature Set for PCG Biometric Systems

Authors: Justin Leo Cheang Loong, Khazaimatol S Subari, Muhammad Kamil Abdullah, Nurul Nadia Ahmad, Rosli Besar

Abstract:

Heart sound is an acoustic signal, and many techniques used nowadays for human recognition tasks borrow from speech recognition. One popular choice for feature extraction of acoustic signals is the Mel Frequency Cepstral Coefficients (MFCC), which map the signal onto a non-linear Mel scale that mimics human hearing. However, the Mel scale is almost linear in the frequency region of heart sounds and thus should produce results similar to the standard cepstral coefficients (CC). In this paper, MFCC is investigated to see whether it produces superior results for a PCG-based human identification system compared to CC. Results show that the MFCC system is still superior to CC despite the linear filter banks in the lower frequency range, giving up to 95% correct recognition rate for MFCC and 90% for CC. Further experiments show that the high recognition rate is due to the implementation of the filter banks and not to Mel scaling.

Keywords: Biometric, Phonocardiogram, Cepstral Coefficients, Mel Frequency

1255 Information Filtering using Index Word Selection based on the Topics

Authors: Takeru Yokoi, Hidekazu Yanagimoto, Sigeru Omatu

Abstract:

We propose an information filtering system using index word selection from a document set based on the topics included in the set of documents. This method narrows down the particularly characteristic words in a document set; the topics are obtained by sparse non-negative matrix factorization. In information filtering, a document is often represented by a vector whose elements correspond to the weights of the index words, and the dimension of the vector becomes larger as the number of documents increases. Therefore, words that are useless as index words for information filtering may be included. In order to address this problem, the dimension needs to be reduced. Our proposal reduces the dimension by selecting index words based on the topics included in the document set, which are obtained by applying sparse non-negative matrix factorization to the document set. The filtering is carried out based on a centroid of the learning document set, which is regarded as the user's interest. The centroid is represented by a document vector whose elements consist of the weights of the selected index words. Using the English test collection MEDLINE, we confirm the effectiveness of our proposal: the proposed selection improves recommendation accuracy over previous methods when an appropriate number of index words is selected. In addition, we discuss the index words selected by our proposal and find that it is able to select index words covering some minor topics included in the document set.
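
A small sketch of topic-based index-word selection followed by centroid filtering, under stated assumptions: scikit-learn's standard NMF is used in place of the paper's sparse NMF, and a four-document toy corpus stands in for MEDLINE.

```python
# Select index words from NMF topics, then score a new document against the centroid.
import numpy as np
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

train_docs = ["gene expression in cancer cells",
              "protein folding and gene regulation",
              "heart disease risk and blood pressure",
              "blood pressure treatment outcomes"]

vec = TfidfVectorizer()
X = vec.fit_transform(train_docs)
terms = vec.get_feature_names_out()

nmf = NMF(n_components=2, random_state=0)
nmf.fit(X)

# Index words = the highest-weighted terms of each discovered topic.
index_idx = sorted({i for topic in nmf.components_ for i in topic.argsort()[-3:]})
print("selected index words:", [terms[i] for i in index_idx])

centroid = X[:, index_idx].toarray().mean(axis=0, keepdims=True)   # the user's interest
new_doc = vec.transform(["new gene therapy for cancer"])[:, index_idx].toarray()
print("relevance score:", cosine_similarity(new_doc, centroid)[0, 0])
```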

Keywords: Information Filtering, Sparse NMF, Index Word Selection, User Profile, Chi-squared Measure.

1254 Improved Weighted Matching for Speaker Recognition

Authors: Ozan Mut, Mehmet Göktürk

Abstract:

Matching algorithms have significant importance in speaker recognition. Feature vectors of the unknown utterance are compared to feature vectors of the modeled speakers as a last step in speaker recognition. A similarity score is found for every model in the speaker database. Depending on the type of speaker recognition, these scores are used to determine the author of unknown speech samples. For speaker verification, the similarity score is tested against a predefined threshold and either an acceptance or a rejection result is obtained. In the case of speaker identification, the result depends on whether the identification is open set or closed set. In closed set identification, the model that yields the best similarity score is accepted. In open set identification, the best score is tested against a threshold, so there is one more possible output satisfying the condition that the speaker is not one of the registered speakers in the existing database. This paper focuses on closed set speaker identification using a modified version of a well-known matching algorithm. The results of the new matching algorithm indicate better performance on the YOHO international speaker recognition database.
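
A minimal sketch of closed-set identification by weighted matching: score the unknown utterance against every speaker model and pick the best. The toy mean-vector models and the per-dimension weights are assumptions; the paper's specific modification of the matching algorithm is not reproduced.

```python
# Closed-set speaker identification sketch: lowest weighted distance wins.
import numpy as np

rng = np.random.default_rng(0)
speaker_models = {f"spk{i}": rng.normal(i, 1.0, size=12) for i in range(5)}  # toy model vectors
weights = rng.uniform(0.5, 1.5, size=12)                 # illustrative per-feature weights

def weighted_distance(x, model):
    return np.sum(weights * (x - model) ** 2)

unknown = speaker_models["spk3"] + rng.normal(0, 0.3, size=12)   # noisy sample of speaker 3
scores = {spk: weighted_distance(unknown, m) for spk, m in speaker_models.items()}
print("identified as:", min(scores, key=scores.get))
```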

Keywords: Automatic Speaker Recognition, Voice Recognition, Pattern Recognition, Digital Audio Signal Processing.

1253 Feature Point Detection by Combining Advantages of Intensity-based Approach and Edge-based Approach

Authors: Sungho Kim, Chaehoon Park, Yukyung Choi, Soon Kwon, In So Kweon

Abstract:

In this paper, a novel corner detection method is presented to stably extract geometrically important corners. Intensity-based corner detectors such as the Harris corner detector can detect corners in noisy environments but give inaccurate corner positions and miss corners with obtuse angles. Edge-based corner detectors such as Curvature Scale Space can detect structural corners but show unstable corner detection due to incomplete edge detection in noisy environments. The proposed image-based direct curvature estimation can overcome the limitations of both: the inaccurate structural corner detection of the Harris corner detector (intensity-based) and the unstable corner detection of Curvature Scale Space caused by incomplete edge detection. Various experimental results validate the robustness of the proposed method.

Keywords: Feature, intensity, contour, hybrid.

1252 Generalized d-q Model of n-Phase Induction Motor Drive

Authors: G. Renukadevi, K. Rajambal

Abstract:

This paper presents a generalized d-q model of an n-phase induction motor drive. Multi-phase (n-phase) induction motor drives (more than three phases) possess several advantages over conventional three-phase drives, such as reduced current per phase without increasing voltage per phase, lower torque pulsation, higher torque density, fault tolerance, stability, high efficiency and lower current ripple. When the number of phases increases, it is also possible to increase the power in the same frame. In this paper, a generalized d-q axis model is developed in Matlab/Simulink for an n-phase induction motor. Simulation results are presented for 5-, 6-, 7-, 9- and 12-phase induction motors under varying load conditions. The transient response of the multi-phase induction motors is given for different numbers of phases. The fault tolerance feature is also analyzed for a 5-phase induction motor drive.

Keywords: d-q model, dynamic Response, fault tolerant feature, Matlab/Simulink, multi-phase induction motor, transient response.

1251 An Automatic Bayesian Classification System for File Format Selection

Authors: Roman Graf, Sergiu Gordea, Heather M. Ryan

Abstract:

This paper presents an approach for the classification of an unstructured format description for identification of file formats. The main contribution of this work is the employment of data mining techniques to support file format selection with just the unstructured text description that comprises the most important format features for a particular organisation. Subsequently, the file format indentification method employs file format classifier and associated configurations to support digital preservation experts with an estimation of required file format. Our goal is to make use of a format specification knowledge base aggregated from a different Web sources in order to select file format for a particular institution. Using the naive Bayes method, the decision support system recommends to an expert, the file format for his institution. The proposed methods facilitate the selection of file format and the quality of a digital preservation process. The presented approach is meant to facilitate decision making for the preservation of digital content in libraries and archives using domain expert knowledge and specifications of file formats. To facilitate decision-making, the aggregated information about the file formats is presented as a file format vocabulary that comprises most common terms that are characteristic for all researched formats. The goal is to suggest a particular file format based on this vocabulary for analysis by an expert. The sample file format calculation and the calculation results including probabilities are presented in the evaluation section.
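
A minimal sketch of the Bayesian classification step: naive Bayes over bag-of-words features of unstructured format descriptions. The descriptions, labels, and query are toy examples, not the aggregated format knowledge base or vocabulary described above.

```python
# Naive Bayes recommendation of a file format from a free-text description.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

descriptions = [
    "lossless raster image compression widely supported by browsers",
    "container for lossy compressed photographic images with exif metadata",
    "page description language for fixed layout documents and long term archiving",
    "plain text markup for structured documents with tags and attributes",
]
formats = ["PNG", "JPEG", "PDF/A", "XML"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(descriptions, formats)

query = "archival fixed layout document format for long term preservation"
print("recommended format:", model.predict([query])[0])
print("class probabilities:",
      dict(zip(model.classes_, model.predict_proba([query])[0].round(2))))
```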

Keywords: Data mining, digital libraries, digital preservation, file format.

1250 Target Concept Selection by Property Overlap in Ontology Population

Authors: Seong-Bae Park, Sang-Soo Kim, Sewook Oh, Zooyl Zeong, Hojin Lee, Seong Rae Park

Abstract:

An ontology is widely used in many kinds of applications as a knowledge representation tool for domain knowledge. However, even though an ontology schema is well prepared by domain experts, it is tedious and cost-intensive to add instances into the ontology. The most confident and trustworthy way to add instances into the ontology is to gather instances from tables in related Web pages. In automatic population of instances, the primary task is to find the most appropriate concept among all possible concepts within the ontology for a given table. This paper proposes a novel method for this problem by defining the similarity between the table and each concept using the overlap of their properties. In a series of experiments, the proposed method achieves 76.98% accuracy. This implies that the proposed method is a plausible way to automatically populate an ontology from Web tables.
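
A toy sketch of the core idea: score each ontology concept by how much its property set overlaps the column headers of a Web table and pick the best match. The concepts, properties, and Jaccard-style overlap used here are illustrative assumptions, not the paper's exact similarity definition.

```python
# Target concept selection by property overlap (Jaccard-style score).
def overlap_score(table_headers, concept_properties):
    headers, props = set(table_headers), set(concept_properties)
    return len(headers & props) / len(headers | props)

ontology = {  # hypothetical concept -> property set
    "Person":  {"name", "birth_date", "nationality"},
    "Company": {"name", "founded", "headquarters", "revenue"},
    "City":    {"name", "population", "country", "area"},
}
table_headers = ["name", "population", "country"]

best = max(ontology, key=lambda c: overlap_score(table_headers, ontology[c]))
print("target concept:", best)   # -> City
```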

Keywords: Ontology population, domain knowledge consolidation, target concept selection, property overlap.

1249 The Customization of 3D Last Form Design Based On Weighted Blending

Authors: Shih-Wen Hsiao, Chu-Hsuan Lee, Rong-Qi Chen

Abstract:

The last is regarded as the critical foundation of shoe design and development. Not only does the last relate to the wearing comfort of shoes, but it also aids shoe styling and manufacturing. In order to enhance the efficiency and applicability of last development, a computer-aided methodology for customized last form design is proposed in this study. Reverse engineering is mainly applied to the scanning of the last form. Then, minimum energy is used for the revision of surface continuity, and the surface of the last is reconstructed from the feature curves of the scanned last. Once the surface of a last is reconstructed, based on the proposed last form reconstruction module, the weighted arithmetic mean method is applied to the calculation of the shape morphing, which differs from grading of the control mesh of the last, and a subdivision algorithm is used to create the surface of the last mesh; thus a foot-fitting 3D last form of different sizes is generated from its original form features while its functions are retained. Finally, the practicability of the proposed methodology is verified through case studies.

Keywords: 3D last design, Customization, Reverse engineering, Weighted morphing, Shape blending.

1248 A Scenario Oriented Supplier Selection by Considering a Multi Tier Supplier Network

Authors: Mohammad Najafi Nobar, Bahareh Pourmehr, Mehdi Hajimirarab

Abstract:

One of the main processes of supply chain management is supplier selection, and its accurate implementation can dramatically increase company competitiveness. In the presented article, a model is developed based on the features of second-tier suppliers, and four scenarios are defined in order to help the decision maker (DM) make up his/her mind. In addition, two tiers of suppliers have been considered as a chain of suppliers. The proposed approach is solved by a method combining concepts of fuzzy set theory (FST) and linear programming (LP), fed with real data extracted from an engineering design and parts supply company. In the end, the results reveal the high importance of considering second-tier supplier features as criteria for selecting the best supplier.

Keywords: Supply Chain Management (SCM), Supplier Selection, Second Tier Supplier, Scenario Planning, Green Factor, Linear Programming, Fuzzy Set Theory.

1247 Quantifying the Stability of Software Systems via Simulation in Dependency Networks

Authors: Weifeng Pan

Abstract:

The stability of a software system is one of the most important quality attributes affecting maintenance effort. Many techniques have been proposed to support the analysis of software stability at the architecture, file, and class level of software systems, but little effort has been made at the feature (i.e., method and attribute) level. Moreover, the assumptions the existing techniques are based on often do not match practice. Considering this, in this paper we present a novel metric, Stability of Software (SoS), to measure the stability of object-oriented software systems by analyzing software change propagation through simulation in software dependency networks at the feature level. The approach is evaluated by case studies on eight open source Java programs, including different software structures (one employing design patterns versus one that does not) for the same object-oriented program. The results of the case studies validate the effectiveness of the proposed metric. The approach has been fully automated by a tool written in Java.
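
A small sketch of stability-by-simulation: seed a change at a random node of a dependency network and propagate it along edges with some probability; the reported stability is the average fraction of the system left untouched. The random graph, propagation probability, and the exact stability formula are assumptions standing in for the paper's SoS metric over real Java programs.

```python
# Monte Carlo change-propagation simulation on a random dependency network.
import random
import networkx as nx

def simulate_stability(graph, p_propagate=0.3, trials=500):
    untouched = 0.0
    for _ in range(trials):
        changed = {random.choice(list(graph.nodes))}         # seed change at one feature
        frontier = list(changed)
        while frontier:
            node = frontier.pop()
            for dependant in graph.predecessors(node):        # whoever depends on a changed node
                if dependant not in changed and random.random() < p_propagate:
                    changed.add(dependant)
                    frontier.append(dependant)
        untouched += 1 - len(changed) / graph.number_of_nodes()
    return untouched / trials

dep = nx.gnp_random_graph(100, 0.05, directed=True, seed=1)   # edge u->v: u depends on v
print("simulated stability:", round(simulate_stability(dep), 3))
```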

Keywords: Software stability, change propagation, design pattern, software maintenance, object-oriented (OO) software.

1246 The Study on the Stationarity of Energy Consumption in US States: Considering Structural Breaks, Nonlinearity, and Cross- Sectional Dependency

Authors: Wen-Chi Liu

Abstract:

This study applies the sequential panel selection method (SPSM) procedure proposed by Chortareas and Kapetanios (2009) to investigate the time-series properties of energy consumption in 50 US states from 1963 to 2009. SPSM classifies the entire panel into a group of stationary series and a group of non-stationary series to identify how many, and which, series in the panel are stationary processes. Empirical results obtained through SPSM with the panel KSS unit root test developed by Ucar and Omay (2009), combined with a Fourier function, indicate that energy consumption in all 50 US states is stationary. The results of this study have important policy implications for the 50 US states.

Keywords: Energy Consumption, Panel Unit Root, Sequential Panel Selection Method, Fourier Function, US states.
