Search results for: feature library

1053 Decision Tree-based Feature Ranking using Manhattan Hierarchical Cluster Criterion

Authors: Yasmin Mohd Yacob, Harsa A. Mat Sakim, Nor Ashidi Mat Isa

Abstract:

Feature selection study is gaining importance due to its contribution to save classification cost in terms of time and computation load. In search of essential features, one of the methods to search the features is via the decision tree. Decision tree act as an intermediate feature space inducer in order to choose essential features. In decision tree-based feature selection, some studies used decision tree as a feature ranker with a direct threshold measure, while others remain the decision tree but utilized pruning condition that act as a threshold mechanism to choose features. This paper proposed threshold measure using Manhattan Hierarchical Cluster distance to be utilized in feature ranking in order to choose relevant features as part of the feature selection process. The result is promising, and this method can be improved in the future by including test cases of a higher number of attributes.

Keywords: Feature ranking, decision tree, hierarchical cluster, Manhattan distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1926

1052 Management Decision System for the Documentary Archives in the Library of a Public Moroccan Institution: Case of Sultan Moulay Slimane University, Beni Mellal

Authors: Jaouad Oukrich, Belaid Bouikhalene, Noureddine Askour

Abstract:

This paper deals with the problem of management of information resources in libraries of the public institution Sultan Moulay Slimane University (SMSU) in order to analyze the satisfaction of the readers, and allow university leaders to make better strategic and instant decisions. For this, the integration of an integrated management decision library system is a priority program of higher education, as part of the Digital Morocco, which has a proactive policy to develop the use of new technologies information and communication in higher institutions. This operational information system can provide better services to the students and for the leaders. Our approach is to integrate the tools of business intelligence (BI) in the library management by using power BI.

Keywords: PMB, integrated library management system, ILMS, document, SMSU, power BI, satisfaction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1861

1051 Triangular Geometric Feature for Offline Signature Verification

Authors: Zuraidasahana Zulkarnain, Mohd Shafry Mohd Rahim, Nor Anita Fairos Ismail, Mohd Azhar M. Arsad

Abstract:

Handwritten signature is accepted widely as a biometric characteristic for personal authentication. The use of appropriate features plays an important role in determining accuracy of signature verification; therefore, this paper presents a feature based on the geometrical concept. To achieve the aim, triangle attributes are exploited to design a new feature since the triangle possesses orientation, angle and transformation that would improve accuracy. The proposed feature uses triangulation geometric set comprising of sides, angles and perimeter of a triangle which is derived from the center of gravity of a signature image. For classification purpose, Euclidean classifier along with Voting-based classifier is used to verify the tendency of forgery signature. This classification process is experimented using triangular geometric feature and selected global features. Based on an experiment that was validated using Grupo de Senales 960 (GPDS-960) signature database, the proposed triangular geometric feature achieves a lower Average Error Rates (AER) value with a percentage of 34% as compared to 43% of the selected global feature. As a conclusion, the proposed triangular geometric feature proves to be a more reliable feature for accurate signature verification.

Keywords: biometrics, euclidean classifier, feature extraction, offline signature verification, VOTING-based classifier

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1912

1050 Product Feature Modelling for Integrating Product Design and Assembly Process Planning

Authors: Baha Hasan, Jan Wikander

Abstract:

This paper describes a part of the integrating work between assembly design and assembly process planning domains (APP). The work is based, in its first stage, on modelling assembly features to support APP. A multi-layer architecture, based on feature-based modelling, is proposed to establish a dynamic and adaptable link between product design using CAD tools and APP. The proposed approach is based on deriving “specific function” features from the “generic” assembly and form features extracted from the CAD tools. A hierarchal structure from “generic” to “specific” and from “high level geometrical entities” to “low level geometrical entities” is proposed in order to integrate geometrical and assembly data extracted from geometrical and assembly modelers to the required processes and resources in APP. The feature concept, feature-based modelling, and feature recognition techniques are reviewed.

Keywords: Assembly feature, assembly process planning, feature, feature-based modelling, form feature, ontology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1125

1049 Analysis of Feature Space for a 2d/3d Vision based Emotion Recognition Method

Authors: Robert Niese, Ayoub Al-Hamadi, Bernd Michaelis

Abstract:

In modern human computer interaction systems (HCI), emotion recognition is becoming an imperative characteristic. The quest for effective and reliable emotion recognition in HCI has resulted in a need for better face detection, feature extraction and classification. In this paper we present results of feature space analysis after briefly explaining our fully automatic vision based emotion recognition method. We demonstrate the compactness of the feature space and show how the 2d/3d based method achieves superior features for the purpose of emotion classification. Also it is exposed that through feature normalization a widely person independent feature space is created. As a consequence, the classifier architecture has only a minor influence on the classification result. This is particularly elucidated with the help of confusion matrices. For this purpose advanced classification algorithms, such as Support Vector Machines and Artificial Neural Networks are employed, as well as the simple k- Nearest Neighbor classifier.

Keywords: Facial expression analysis, Feature extraction, Image processing, Pattern Recognition, Application.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1871

1048 Analysis of the Physical Behavior of Library Users in Reading Rooms through GIS: A Case Study of the Central Library of Tehran University

Authors: R. Pournaghi

Abstract:

Taking into account the significance of measuring the daily use of the study space in the libraries in order to develop and reorganize the space for enhancing the efficiency of the study space, the current study aimed to apply GIS in analyzing the study halls of the Central Library and Document Center of Tehran University in order to determine how study desks and chairs were used by the students. The study used a combination of survey-descriptive and system design method. In order to gather the required data, surveydescriptive method was used. For implementing and entering data into ArcGIS and analyzing the data and displaying the results on the maps of the study halls of the library, system design method was utilized. The design of the spatial database of the use of the study halls was measured through the extent of occupancy of the space by the library users and the maps of the study halls of the central library of Tehran University as the case study. The results showed that Abooreyhan hall had the highest rate of occupancy of the desks and chairs compared to the other halls. The Hall of Science and Technology, with an average occupancy rate of 0.39 for the tables represented the lowest number of users and Rashid al-Dins hall, and Science and Technology hall with an average occupancy rate (0.40) had the lowest number of users for seats. In this study, the comparison of the space occupied at different periods in the morning, evenings, afternoons, and several months was performed through GIS. This system analyzed the space relationships effectively and efficiently. The output of this study would be used by administrators and librarians to determine the exact extent of use of the equipment of the study halls and librarians can use the output map to design the space more efficiently at the library.

Keywords: Geospatial Information System, Spatial analysis, Reading Room, Academic libraries, Library’s User, Central Library of Tehran University.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1768

1047 Optimized Facial Features-based Age Classification

Authors: Md. Zahangir Alom, Mei-Lan Piao, Md. Shariful Islam, Nam Kim, Jae-Hyeung Park

Abstract:

The evaluation and measurement of human body dimensions are achieved by physical anthropometry. This research was conducted in view of the importance of anthropometric indices of the face in forensic medicine, surgery, and medical imaging. The main goal of this research is to optimization of facial feature point by establishing a mathematical relationship among facial features and used optimize feature points for age classification. Since selected facial feature points are located to the area of mouth, nose, eyes and eyebrow on facial images, all desire facial feature points are extracted accurately. According this proposes method; sixteen Euclidean distances are calculated from the eighteen selected facial feature points vertically as well as horizontally. The mathematical relationships among horizontal and vertical distances are established. Moreover, it is also discovered that distances of the facial feature follows a constant ratio due to age progression. The distances between the specified features points increase with respect the age progression of a human from his or her childhood but the ratio of the distances does not change (d = 1 .618 ) . Finally, according to the proposed mathematical relationship four independent feature distances related to eight feature points are selected from sixteen distances and eighteen feature point-s respectively. These four feature distances are used for classification of age using Support Vector Machine (SVM)-Sequential Minimal Optimization (SMO) algorithm and shown around 96 % accuracy. Experiment result shows the proposed system is effective and accurate for age classification.

Keywords: 3D Face Model, Face Anthropometrics, Facial Features Extraction, Feature distances, SVM-SMO

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2007

1046 Feature Reduction of Nearest Neighbor Classifiers using Genetic Algorithm

Authors: M. Analoui, M. Fadavi Amiri

Abstract:

The design of a pattern classifier includes an attempt to select, among a set of possible features, a minimum subset of weakly correlated features that better discriminate the pattern classes. This is usually a difficult task in practice, normally requiring the application of heuristic knowledge about the specific problem domain. The selection and quality of the features representing each pattern have a considerable bearing on the success of subsequent pattern classification. Feature extraction is the process of deriving new features from the original features in order to reduce the cost of feature measurement, increase classifier efficiency, and allow higher classification accuracy. Many current feature extraction techniques involve linear transformations of the original pattern vectors to new vectors of lower dimensionality. While this is useful for data visualization and increasing classification efficiency, it does not necessarily reduce the number of features that must be measured since each new feature may be a linear combination of all of the features in the original pattern vector. In this paper a new approach is presented to feature extraction in which feature selection, feature extraction, and classifier training are performed simultaneously using a genetic algorithm. In this approach each feature value is first normalized by a linear equation, then scaled by the associated weight prior to training, testing, and classification. A knn classifier is used to evaluate each set of feature weights. The genetic algorithm optimizes a vector of feature weights, which are used to scale the individual features in the original pattern vectors in either a linear or a nonlinear fashion. By this approach, the number of features used in classifying can be finely reduced.

Keywords: Feature reduction, genetic algorithm, pattern classification, nearest neighbor rule classifiers (k-NNR).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1727

1045 A Java Based Discrete Event Simulation Library

Authors: Brahim Belattar, Abdelhabib Bourouis

Abstract:

This paper describes important features of JAPROSIM, a free and open source simulation library implemented in Java programming language. It provides a framework for building discrete event simulation models. The process interaction world view adopted by JAPROSIM is discussed. We present the architecture and major components of the simulation library. A pedagogical example is given in order to illustrate how to use JAPROSIM for building discrete event simulation models. Further motivations are discussed and suggestions for improving our work are given.

Keywords: Discrete Event Simulation, Object-Oriented Simulation, JAPROSIM, Process Interaction Worldview, Java-based modeling and simulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3755

1044 On-line Lao Handwritten Recognition with Proportional Invariant Feature

Authors: Khampheth Bounnady, Boontee Kruatrachue, Somkiat Wangsiripitak

Abstract:

This paper proposed high level feature for online Lao handwritten recognition. This feature must be high level enough so that the feature is not change when characters are written by different persons at different speed and different proportion (shorter or longer stroke, head, tail, loop, curve). In this high level feature, a character is divided in to sequence of curve segments where a segment start where curve reverse rotation (counter clockwise and clockwise). In each segment, following features are gathered cumulative change in direction of curve (- for clockwise), cumulative curve length, cumulative length of left to right, right to left, top to bottom and bottom to top ( cumulative change in X and Y axis of segment). This feature is simple yet robust for high accuracy recognition. The feature can be gather from parsing the original time sampling sequence X, Y point of the pen location without re-sampling. We also experiment on other segmentation point such as the maximum curvature point which was widely used by other researcher. Experiments results show that the recognition rates are at 94.62% in comparing to using maximum curvature point 75.07%. This is due to a lot of variations of turning points in handwritten.

Keywords: Handwritten feature, chain code, Lao handwritten recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1989

1043 Image Retrieval Based on Multi-Feature Fusion for Heterogeneous Image Databases

Authors: N. W. U. D. Chathurani, Shlomo Geva, Vinod Chandran, Proboda Rajapaksha

Abstract:

Selecting an appropriate image representation is the most important factor in implementing an effective Content-Based Image Retrieval (CBIR) system. This paper presents a multi-feature fusion approach for efficient CBIR, based on the distance distribution of features and relative feature weights at the time of query processing. It is a simple yet effective approach, which is free from the effect of features' dimensions, ranges, internal feature normalization and the distance measure. This approach can easily be adopted in any feature combination to improve retrieval quality. The proposed approach is empirically evaluated using two benchmark datasets for image classification (a subset of the Corel dataset and Oliva and Torralba) and compared with existing approaches. The performance of the proposed approach is confirmed with the significantly improved performance in comparison with the independently evaluated baseline of the previously proposed feature fusion approaches.

Keywords: Feature fusion, image retrieval, membership function, normalization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1304

1042 K-Means Based Matching Algorithm for Multi-Resolution Feature Descriptors

Authors: Shao-Tzu Huang, Chen-Chien Hsu, Wei-Yen Wang

Abstract:

Matching high dimensional features between images is computationally expensive for exhaustive search approaches in computer vision. Although the dimension of the feature can be degraded by simplifying the prior knowledge of homography, matching accuracy may degrade as a tradeoff. In this paper, we present a feature matching method based on k-means algorithm that reduces the matching cost and matches the features between images instead of using a simplified geometric assumption. Experimental results show that the proposed method outperforms the previous linear exhaustive search approaches in terms of the inlier ratio of matched pairs.

Keywords: Feature matching, k-means clustering, scale invariant feature transform, linear exhaustive search.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1035

1041 Feature Weighting and Selection - A Novel Genetic Evolutionary Approach

Authors: Serkawt Khola

Abstract:

A feature weighting and selection method is proposed which uses the structure of a weightless neuron and exploits the principles that govern the operation of Genetic Algorithms and Evolution. Features are coded onto chromosomes in a novel way which allows weighting information regarding the features to be directly inferred from the gene values. The proposed method is significant in that it addresses several problems concerned with algorithms for feature selection and weighting as well as providing significant advantages such as speed, simplicity and suitability for real-time systems.

Keywords: Feature weighting, genetic algorithm, pattern recognition, weightless neuron.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1809

1040 Comparison of Web Development Using Framework over Library

Authors: Syamsul Syafiq, Maslina Daud, Hafizah Hasan, Ahmad Zairi, Shazil Imri, Ezaini Akmar, Norbazilah Rahim

Abstract:

Over recent years, web development has changed significantly. Driven largely by the rise of trends like mobiles, the world of development is rapidly evolving. The rise of the Internet makes web applications crucial nowadays. The web application has been an interface for a company and one of the ways they present their portfolio to the client. On the other hand, the web has become part of the file management system which takes over the role of paper. Due to high demand in web applications, developers are required to develop a web application that are cost-effective, secure and well coded. A framework has been proposed to develop an application rather than using library style development. The framework is helping the developer in creating the structure of a web automatically. This paper will compare the advantages and disadvantages of web development using framework against library-style development. This comparison is based on a previous research paper focusing on two main indicators, which are the impact to management and impact to the developer.

Keywords: Framework, library Style development, web application development, traditional web, static web, dynamic web.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1238

1039 Assessing and Visualizing the Stability of Feature Selectors: A Case Study with Spectral Data

Authors: R.Guzman-Martinez, Oscar Garcia-Olalla, R.Alaiz-Rodriguez

Abstract:

Feature selection plays an important role in applications with high dimensional data. The assessment of the stability of feature selection/ranking algorithms becomes an important issue when the dataset is small and the aim is to gain insight into the underlying process by analyzing the most relevant features. In this work, we propose a graphical approach that enables to analyze the similarity between feature ranking techniques as well as their individual stability. Moreover, it works with whatever stability metric (Canberra distance, Spearman's rank correlation coefficient, Kuncheva's stability index,...). We illustrate this visualization technique evaluating the stability of several feature selection techniques on a spectral binary dataset. Experimental results with a neural-based classifier show that stability and ranking quality may not be linked together and both issues have to be studied jointly in order to offer answers to the domain experts.

Keywords: Feature Selection Stability, Spectral data, Data visualization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1480

1038 Study of Features for Hand-printed Recognition

Authors: Satish Kumar

Abstract:

The feature extraction method(s) used to recognize hand-printed characters play an important role in ICR applications. In order to achieve high recognition rate for a recognition system, the choice of a feature that suits for the given script is certainly an important task. Even if a new feature required to be designed for a given script, it is essential to know the recognition ability of the existing features for that script. Devanagari script is being used in various Indian languages besides Hindi the mother tongue of majority of Indians. This research examines a variety of feature extraction approaches, which have been used in various ICR/OCR applications, in context to Devanagari hand-printed script. The study is conducted theoretically and experimentally on more that 10 feature extraction methods. The various feature extraction methods have been evaluated on Devanagari hand-printed database comprising more than 25000 characters belonging to 43 alphabets. The recognition ability of the features have been evaluated using three classifiers i.e. k-NN, MLP and SVM.

Keywords: Features, Hand-printed, Devanagari, Classifier, Database

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1677

1037 Texture Feature-Based Language Identification Using Wavelet-Domain BDIP and BVLC Features and FFT Feature

Authors: Ick Hoon Jang, Hoon Jae Lee, Dae Hoon Kwon, Ui Young Pak

Abstract:

In this paper, we propose a texture feature-based language identification using wavelet-domain BDIP (block difference of inverse probabilities) and BVLC (block variance of local correlation coefficients) features and FFT (fast Fourier transform) feature. In the proposed method, wavelet subbands are first obtained by wavelet transform from a test image and denoised by Donoho-s soft-thresholding. BDIP and BVLC operators are next applied to the wavelet subbands. FFT blocks are also obtained by 2D (twodimensional) FFT from the blocks into which the test image is partitioned. Some significant FFT coefficients in each block are selected and magnitude operator is applied to them. Moments for each subband of BDIP and BVLC and for each magnitude of significant FFT coefficients are then computed and fused into a feature vector. In classification, a stabilized Bayesian classifier, which adopts variance thresholding, searches the training feature vector most similar to the test feature vector. Experimental results show that the proposed method with the three operations yields excellent language identification even with rather low feature dimension.

Keywords: BDIP, BVLC, FFT, language identification, texture feature, wavelet transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2109

1036 Towards Integrating Statistical Color Features for Human Skin Detection

Authors: Mohd Zamri Osman, Mohd Aizaini Maarof, Mohd Foad Rohani

Abstract:

Human skin detection recognized as the primary step in most of the applications such as face detection, illicit image filtering, hand recognition and video surveillance. The performance of any skin detection applications greatly relies on the two components: feature extraction and classification method. Skin color is the most vital information used for skin detection purpose. However, color feature alone sometimes could not handle images with having same color distribution with skin color. A color feature of pixel-based does not eliminate the skin-like color due to the intensity of skin and skin-like color fall under the same distribution. Hence, the statistical color analysis will be exploited such mean and standard deviation as an additional feature to increase the reliability of skin detector. In this paper, we studied the effectiveness of statistical color feature for human skin detection. Furthermore, the paper analyzed the integrated color and texture using eight classifiers with three color spaces of RGB, YCbCr, and HSV. The experimental results show that the integrating statistical feature using Random Forest classifier achieved a significant performance with an F1-score 0.969.

Keywords: Color space, neural network, random forest, skin detection, statistical feature.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1901

1035 Simulink Library for Reference Current Generation in Active DC Traction Substations

Authors: Mihaela Popescu, Alexandru Bitoleanu

Abstract:

This paper is focused on the reference current calculation in the compensation mode of the active DC traction substations. The so-called p-q theory of the instantaneous reactive power is used as theoretical foundation. The compensation goal of total compensation is taken into consideration for the operation under both sinusoidal and nonsinusoidal voltage conditions, through the two objectives of unity power factor and perfect harmonic cancelation. Four blocks of reference current generation implement the conceived algorithms and they are included in a specific Simulink library, which is useful in a DSP dSPACE-based platform working under Matlab/Simulink. The simulation results validate the correctness of the implementation and fulfillment of the compensation tasks.

Keywords: Active power filter, DC traction, p-q theory, Simulink library.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1684

1034 Gene Selection Guided by Feature Interdependence

Authors: Hung-Ming Lai, Andreas Albrecht, Kathleen Steinhöfel

Abstract:

Cancers could normally be marked by a number of differentially expressed genes which show enormous potential as biomarkers for a certain disease. Recent years, cancer classification based on the investigation of gene expression profiles derived by high-throughput microarrays has widely been used. The selection of discriminative genes is, therefore, an essential preprocess step in carcinogenesis studies. In this paper, we have proposed a novel gene selector using information-theoretic measures for biological discovery. This multivariate filter is a four-stage framework through the analyses of feature relevance, feature interdependence, feature redundancy-dependence and subset rankings, and having been examined on the colon cancer data set. Our experimental result show that the proposed method outperformed other information theorem based filters in all aspect of classification errors and classification performance.

Keywords: Colon cancer, feature interdependence, feature subset selection, gene selection, microarray data analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2096

1033 Parallelization and Optimization of SIFT Feature Extraction on Cluster System

Authors: Mingling Zheng, Zhenlong Song, Ke Xu, Hengzhu Liu

Abstract:

Scale Invariant Feature Transform (SIFT) has been widely applied, but extracting SIFT feature is complicated and time-consuming. In this paper, to meet the demand of the real-time applications, SIFT is parallelized and optimized on cluster system, which is named pSIFT. Redundancy storage and communication are used for boundary data to improve the performance, and before representation of feature descriptor, data reallocation is adopted to keep load balance in pSIFT. Experimental results show that pSIFT achieves good speedup and scalability.

Keywords: cluster, image matching, parallelization and optimization, SIFT.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1824

1032 A Hybrid Feature Selection by Resampling, Chi squared and Consistency Evaluation Techniques

Authors: Amir-Massoud Bidgoli, Mehdi Naseri Parsa

Abstract:

In this paper a combined feature selection method is proposed which takes advantages of sample domain filtering, resampling and feature subset evaluation methods to reduce dimensions of huge datasets and select reliable features. This method utilizes both feature space and sample domain to improve the process of feature selection and uses a combination of Chi squared with Consistency attribute evaluation methods to seek reliable features. This method consists of two phases. The first phase filters and resamples the sample domain and the second phase adopts a hybrid procedure to find the optimal feature space by applying Chi squared, Consistency subset evaluation methods and genetic search. Experiments on various sized datasets from UCI Repository of Machine Learning databases show that the performance of five classifiers (Naïve Bayes, Logistic, Multilayer Perceptron, Best First Decision Tree and JRIP) improves simultaneously and the classification error for these classifiers decreases considerably. The experiments also show that this method outperforms other feature selection methods.

Keywords: feature selection, resampling, reliable features, Consistency Subset Evaluation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2541

1031 A Proposed Hybrid Approach for Feature Selection in Text Document Categorization

Authors: M. F. Zaiyadi, B. Baharudin

Abstract:

Text document categorization involves large amount of data or features. The high dimensionality of features is a troublesome and can affect the performance of the classification. Therefore, feature selection is strongly considered as one of the crucial part in text document categorization. Selecting the best features to represent documents can reduce the dimensionality of feature space hence increase the performance. There were many approaches has been implemented by various researchers to overcome this problem. This paper proposed a novel hybrid approach for feature selection in text document categorization based on Ant Colony Optimization (ACO) and Information Gain (IG). We also presented state-of-the-art algorithms by several other researchers.

Keywords: Ant colony optimization, feature selection, information gain, text categorization, text representation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2022

1030 Automatic Threshold Search for Heat Map Based Feature Selection: A Cancer Dataset Analysis

Authors: Carlos Huertas, Reyes Juarez-Ramirez

Abstract:

Public health is one of the most critical issues today; therefore, there is great interest to improve technologies in the area of diseases detection. With machine learning and feature selection, it has been possible to aid the diagnosis of several diseases such as cancer. In this work, we present an extension to the Heat Map Based Feature Selection algorithm, this modification allows automatic threshold parameter selection that helps to improve the generalization performance of high dimensional data such as mass spectrometry. We have performed a comparison analysis using multiple cancer datasets and compare against the well known Recursive Feature Elimination algorithm and our original proposal, the results show improved classification performance that is very competitive against current techniques.

Keywords: Feature selection, mass spectrometry, biomarker discovery, cancer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1544

1029 Application of Genetic Algorithms to Feature Subset Selection in a Farsi OCR

Authors: M. Soryani, N. Rafat

Abstract:

Dealing with hundreds of features in character recognition systems is not unusual. This large number of features leads to the increase of computational workload of recognition process. There have been many methods which try to remove unnecessary or redundant features and reduce feature dimensionality. Besides because of the characteristics of Farsi scripts, it-s not possible to apply other languages algorithms to Farsi directly. In this paper some methods for feature subset selection using genetic algorithms are applied on a Farsi optical character recognition (OCR) system. Experimental results show that application of genetic algorithms (GA) to feature subset selection in a Farsi OCR results in lower computational complexity and enhanced recognition rate.

Keywords: Feature Subset Selection, Genetic Algorithms, Optical Character Recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1934

1028 Digital Library Evaluation by SWARA-WASPAS Method

Authors: Mehmet Yörükoğlu, Serhat Aydın

Abstract:

Since the discovery of the manuscript, mechanical methods for storing, transferring and using the information have evolved into digital methods over the time. In this process, libraries that are the center of the information have also become digitized and become accessible from anywhere and at any time in the world by taking on a structure that has no physical boundaries. In this context, some criteria for information obtained from digital libraries have become more important for users. This paper evaluates the user criteria from different perspectives that make a digital library more useful. The Step-Wise Weight Assessment Ratio Analysis-Weighted Aggregated Sum Product Assessment (SWARA-WASPAS) method is used with flexibility and easy calculation steps for the evaluation of digital library criteria. Three different digital libraries are evaluated by information technology experts according to five conflicting main criteria, ‘interface design’, ‘effects on users’, ‘services’, ‘user engagement’ and ‘context’. Finally, alternatives are ranked in descending order.

Keywords: Digital library, multi criteria decision making, SWARA-WASPAS method.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 806

1027 Feature-Based Machining using Macro

Authors: M. Razak, A. Jusoh, A. Zakaria

Abstract:

This paper presents an on-going research work on the implementation of feature-based machining via macro programming. Repetitive machining features such as holes, slots, pockets etc can readily be encapsulated in macros. Each macro consists of methods on how to machine the shape as defined by the feature. The macro programming technique comprises of a main program and subprograms. The main program allows user to select several subprograms that contain features and define their important parameters. With macros, complex machining routines can be implemented easily and no post processor is required. A case study on machining of a part that comprised of planar face, hole and pocket features using the macro programming technique was carried out. It is envisaged that the macro programming technique can be extended to other feature-based machining fields such as the newly developed STEP-NC domain.

Keywords: Feature-based machining, CNC, Macro, STEP-NC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2639

1026 Methodology of Personalizing Interior Spaces in Public Libraries

Authors: Baharak Mousapour

Abstract:

Creating public spaces which are tailored for the specific demands of the individuals is one of the challenges for the contemporary interior designers. Improving the general knowledge as well as providing a forum for all walks of life to exploit is one of the objectives of a public library. In this regard, interior design in consistent with the demands of the individuals is of paramount importance. Seemingly, study spaces, in particular, those in close relation to the personalized sector, have proven to be challenging, according to the literature. To address this challenge, attributes of individuals, namely, perception of people from public spaces and their interactions with the so-called spaces, should be analyzed to provide interior designers with something to work on. This paper follows the analytic-descriptive research methodology by outlining case study libraries which have personalized public libraries with the investigation of the type of personalization as its primary objective and (I) recognition of physical schedule and the know-how of the spatial connection in indoor design of a library and (II) analysis of each personalized space in relation to other spaces of the library as its secondary objectives. The significance of the current research lies in the concept of personalization as one of the most recent methods of attracting people to libraries. Previous research exists in this regard, but the lack of data concerning personalization makes this topic worth investigating. Hence, this study aims to put forward approaches through real-case studies for the designers to deal with this concept.

Keywords: interior design, library, library design, personalization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 523

1025 Imputation Technique for Feature Selection in Microarray Data Set

Authors: Younies Mahmoud, Mai Mabrouk, Elsayed Sallam

Abstract:

Analyzing DNA microarray data sets is a great challenge, which faces the bioinformaticians due to the complication of using statistical and machine learning techniques. The challenge will be doubled if the microarray data sets contain missing data, which happens regularly because these techniques cannot deal with missing data. One of the most important data analysis process on the microarray data set is feature selection. This process finds the most important genes that affect certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Keywords: DNA microarray, feature selection, missing data, bioinformatics.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2731

1024 Genetic Algorithms for Feature Generation in the Context of Audio Classification

Authors: José A. Menezes, Giordano Cabral, Bruno T. Gomes

Abstract:

Choosing good features is an essential part of machine learning. Recent techniques aim to automate this process. For instance, feature learning intends to learn the transformation of raw data into a useful representation to machine learning tasks. In automatic audio classification tasks, this is interesting since the audio, usually complex information, needs to be transformed into a computationally convenient input to process. Another technique tries to generate features by searching a feature space. Genetic algorithms, for instance, have being used to generate audio features by combining or modifying them. We find this approach particularly interesting and, despite the undeniable advances of feature learning approaches, we wanted to take a step forward in the use of genetic algorithms to find audio features, combining them with more conventional methods, like PCA, and inserting search control mechanisms, such as constraints over a confusion matrix. This work presents the results obtained on particular audio classification problems.

Keywords: Feature generation, feature learning, genetic algorithm, music information retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1020