Search results for: multi-class classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1133

443 Learning of Class Membership Values by Ellipsoidal Decision Regions

Authors: Leehter Yao, Chin-Chin Lin

Abstract:

A novel method of learning complex fuzzy decision regions in the n-dimensional feature space is proposed. Through the fuzzy decision regions, a given pattern's membership value for every class is determined, instead of the single crisp class to which the pattern belongs. The n-dimensional fuzzy decision region is approximated by a union of hyperellipsoids. By explicitly parameterizing these hyperellipsoids, the decision regions are determined by estimating the parameters of each hyperellipsoid. A Genetic Algorithm (GA) is applied to estimate the parameters of each region component. With the global optimization ability of the GA, the learned decision region can be arbitrarily complex.

Keywords: Ellipsoid, genetic algorithm, decision regions, classification.

Downloads: 1429
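As a rough illustration of the idea in entry 443, the sketch below computes a class membership value from a union of hyperellipsoids. The abstract does not give the membership function, so the axis-aligned ellipsoid form, the exponential decay and the use of `max` as the fuzzy union are all assumptions, and the ellipsoid parameters are made-up values rather than GA-estimated ones.

```python
import math

def ellipsoid_distance_sq(x, center, radii):
    """Squared normalised distance of x from the centre of an axis-aligned
    hyperellipsoid; values <= 1 mean x lies inside or on the ellipsoid."""
    return sum(((xi - ci) / ri) ** 2 for xi, ci, ri in zip(x, center, radii))

def class_membership(x, ellipsoids):
    """Membership of x in a region formed by a union of hyperellipsoids.
    Assumptions: each component's membership decays as exp(-d^2),
    and the fuzzy union is taken as the maximum over components."""
    return max(math.exp(-ellipsoid_distance_sq(x, c, r)) for c, r in ellipsoids)

# Hypothetical (centre, radii) pairs per class; a real system would
# estimate these parameters, e.g. with a genetic algorithm.
regions = {
    "A": [((0.0, 0.0), (1.0, 2.0)), ((3.0, 0.0), (1.0, 1.0))],
    "B": [((0.0, 5.0), (2.0, 1.0))],
}

point = (0.0, 0.0)
memberships = {label: class_membership(point, ells) for label, ells in regions.items()}
```

Every class receives a value in (0, 1], so a pattern is described by a membership vector rather than a single label.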
442 Categorical Missing Data Imputation Using Fuzzy Neural Networks with Numerical and Categorical Inputs

Authors: Pilar Rey-del-Castillo, Jesús Cardeñosa

Abstract:

There are many situations where input feature vectors are incomplete, and methods to tackle the problem have been studied for a long time. A commonly used procedure is to replace each missing value with an imputation. This paper presents a method to perform categorical missing data imputation from numerical and categorical variables. The imputations are based on Simpson's fuzzy min-max neural networks, where the input variables for learning and classification are purely numerical. The proposed method extends the input to categorical variables by introducing new fuzzy sets, a new operation and a new architecture. The procedure is tested and compared with others using opinion poll data.

Keywords: Classifier, imputation techniques, fuzzy systems, fuzzy min-max neural networks.

Downloads: 1785
441 A Classical Method of Optimizing Manufacturing Systems Using a Number of Industrial Engineering Techniques

Authors: John M. Ikome, Martha E. Ikome, Therese Van Wyk

Abstract:

Productivity optimization can significantly increase a company's output. It can take the form of corrective actions for ineffective activities, process simplification, reduction of variation, improved responsiveness, and reduction of set-up time, all of which address what is classified as waste within the manufacturing environment. Deriving a means to eliminate a number of these issues is of key importance for manufacturing organizations. This paper focuses on a number of industrial engineering techniques, including the cause-and-effect diagram, to identify and optimize the methods or systems being used. The results show that a number of variations within the production processes can significantly disrupt the expected output.

Keywords: Optimization, fishbone diagram, productivity.

Downloads: 1003
440 A Study of Classification Models to Predict Drill-Bit Breakage Using Degradation Signals

Authors: Bharatendra Rai

Abstract:

Cutting tools are widely used in manufacturing processes, and drilling is the most commonly used machining process. Although the drill-bits used in drilling may not be expensive, their breakage can damage the expensive workpiece being drilled and, at the same time, has a major impact on productivity. Predicting drill-bit breakage is therefore important for reducing cost and improving productivity. This study uses twenty features extracted from two degradation signals, viz., thrust force and torque. The methodology involves developing and comparing decision tree, random forest, and multinomial logistic regression models for classifying and predicting drill-bit breakage using degradation signals.

Keywords: Degradation signal, drill-bit breakage, random forest, multinomial logistic regression.

Downloads: 2245
439 Performance Appraisal System using Multifactorial Evaluation Model

Authors: C. C. Yee, Y. Y. Chen

Abstract:

Performance appraisal of employees is important in managing the human resources of an organization. With the change towards knowledge-based capitalism, retaining talented knowledge workers is critical. However, the management classification of performance as "outstanding", "poor" or "average" may not be an easy decision. In addition, superiors may tend to judge the work performance of their subordinates informally and arbitrarily, especially when no appraisal system exists. In this paper, we propose a performance appraisal system using a multifactorial evaluation model to deal with appraisal grades that are often expressed vaguely in linguistic terms. The proposed model evaluates staff performance based on specific performance appraisal criteria. The project was a collaboration with an Information and Communication Technology company in Malaysia, with reference to its performance appraisal process.

Keywords: Multifactorial Evaluation Model, performance appraisal system, decision support system.

Downloads: 4271
438 Scene Adaptive Shadow Detection Algorithm

Authors: Mohammed Ibrahim M, Anupama R.

Abstract:

Robustness is one of the primary performance criteria for an Intelligent Video Surveillance (IVS) system. One of the key factors in enhancing the robustness of dynamic video analysis is providing an accurate and reliable means of shadow detection. If left undetected, shadow pixels may result in incorrect object tracking and classification, as they tend to distort localization and measurement information. Most of the algorithms proposed in the literature are computationally expensive, some to the extent of equalling the computational requirement of motion detection. In this paper, the homogeneity property of shadows is explored in a novel way for shadow detection. The key novelty of our approach is an adaptive division image analysis (which highlights the homogeneity property of shadows) followed by a relatively simple projection histogram analysis for penumbra suppression.

Keywords: homogeneity, penumbra, projection histogram, shadow correction

Downloads: 1907
437 Vulnerability of Groundwater Resources Selected for Emergency Water Supply

Authors: Frantisek Bozek, Alena Bumbova, Eduard Bakos

Abstract:

This paper deals with the vulnerability of elements of hydrological structures and of elements of technological equipment that are applicable to groundwater resources. The vulnerability assessment stems from the application of a register of hazards and of the potential threat to individual water source elements within each type of hazard. The proposed procedure is a template for assessing the risks of disturbance, damage, or destruction of a water source by the identified natural or technological hazards, and consequently for classifying these risks in relation to emergency water supply. The procedure was verified on a selected groundwater resource in a particular region and appears potentially useful for crisis planning systems.

Keywords: Hazard, Hydrogeological Structure, Elements, Index, Sensitivity, Water Source, Vulnerability

Downloads: 1445
436 Torque Based Selection of ANN for Fault Diagnosis of Wound Rotor Asynchronous Motor-Converter Association

Authors: Djalal Eddine Khodja, Boukhemis Chetate

Abstract:

In this paper, an automatic diagnosis system was developed to detect and locate, in real time, the defects of a wound rotor asynchronous machine associated with an electronic converter. For this purpose, we processed the signals of the measured parameters (current and speed) to use them, firstly, as variables indicating the defects of the machine under study and, secondly, as inputs to the Artificial Neuron Network (ANN) for their classification in order to detect the type of defect in progress. Once a defect is detected, the information interpretation system gives the type of the defect and its location.

Keywords: Artificial Neuron Networks (ANN), Effective Value (RMS), Experimental results, Failure detection, Indicating values, Motor-converter unit.

Downloads: 1502
435 Hand Written Digit Recognition by Multiple Classifier Fusion based on Decision Templates Approach

Authors: Reza Ebrahimpour, Samaneh Hamedi

Abstract:

Classifier fusion may generate more accurate classification than each of the basic classifiers. Fusion is often based on fixed combination rules such as the product and the average. This paper presents decision templates as a classifier fusion method for the recognition of handwritten English and Farsi numerals (1-9). The process involves extracting a feature vector on well-known image databases. The extracted feature vector is fed to the multiple classifier fusion. A set of experiments was conducted to compare decision templates (DTs) with some combination rules. Decision templates yield 97.99% and 97.28% for Farsi and English handwritten digits, respectively.

Keywords: Decision templates, multi-layer perceptron, characteristics Loci, principal component analysis (PCA).

Downloads: 1961
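The decision-template fusion named in entry 435 can be sketched roughly as follows. This follows the commonly described formulation (a template is the mean decision profile of a class, and a new sample goes to the nearest template); the squared Euclidean distance and the toy support values are assumptions, not details taken from the paper.

```python
def mean_profile(profiles):
    """Element-wise mean of equally shaped decision profiles.
    A profile is a list of rows, one row of class supports per base classifier."""
    n = len(profiles)
    rows, cols = len(profiles[0]), len(profiles[0][0])
    return [[sum(p[i][j] for p in profiles) / n for j in range(cols)]
            for i in range(rows)]

def fit_templates(profiles, labels):
    """Build one decision template per class from training decision profiles."""
    by_class = {}
    for profile, label in zip(profiles, labels):
        by_class.setdefault(label, []).append(profile)
    return {label: mean_profile(group) for label, group in by_class.items()}

def sq_distance(a, b):
    """Squared Euclidean distance between two profiles (assumed similarity measure)."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def classify(profile, templates):
    """Assign the class whose template is closest to the sample's profile."""
    return min(templates, key=lambda label: sq_distance(profile, templates[label]))

# Toy data: two base classifiers (rows), two classes (columns).
train_profiles = [
    [[0.9, 0.1], [0.8, 0.2]],
    [[0.7, 0.3], [0.6, 0.4]],
    [[0.2, 0.8], [0.3, 0.7]],
    [[0.1, 0.9], [0.4, 0.6]],
]
train_labels = [0, 0, 1, 1]
templates = fit_templates(train_profiles, train_labels)
```

Unlike a fixed rule such as the product or average, the templates are learned from the base classifiers' outputs on training data.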
434 Deficiencies of Lung Segmentation Techniques using CT Scan Images for CAD

Authors: Nisar Ahmed Memon, Anwar Majid Mirza, S.A.M. Gilani

Abstract:

Segmentation is an important step in medical image analysis and classification for radiological evaluation or computer aided diagnosis. This paper presents the problem of inaccurate lung segmentation as observed in algorithms presented by researchers working in the area of medical image analysis. The different lung segmentation techniques have been tested using the dataset of 19 patients consisting of a total of 917 images. We obtained datasets of 11 patients from Ackron University, USA and of 8 patients from AGA Khan Medical University, Pakistan. After testing the algorithms against datasets, the deficiencies of each algorithm have been highlighted.

Keywords: Computer Aided Diagnosis (CAD), Mathematical Morphology, Medical Image Analysis, Region Growing, Segmentation, Thresholding

Downloads: 2342
433 Improving Academic Performance Prediction using Voting Technique in Data Mining

Authors: Ikmal Hisyam Mohamad Paris, Lilly Suriani Affendey, Norwati Mustapha

Abstract:

In this paper we compare the accuracy of data mining methods for classifying students in order to predict a student's class grade. These predictions are useful for identifying weak students and for assisting management in taking remedial measures at an early stage, so as to produce excellent graduates who graduate with at least a second class upper. First, we examine the accuracy of single classifiers on our data set and choose the best one; we then ensemble it with a weak classifier to produce a simple voting method. Our results show that combining different classifiers outperforms the single classifiers for predicting student performance.

Keywords: Classification, Data Mining, Prediction, Combination of Multiple Classifiers.

Downloads: 2760
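A simple voting scheme of the kind mentioned in entry 433 might look like the sketch below. The grade labels and the three classifiers' predictions are invented for illustration, and an odd number of voters is used so that no tie-breaking rule has to be assumed.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(predictions).most_common(1)[0][0]

def vote_all(per_classifier_predictions):
    """Combine several classifiers' prediction lists sample by sample."""
    return [majority_vote(sample) for sample in zip(*per_classifier_predictions)]

# Hypothetical predictions of three classifiers for four students.
classifier_a = ["A", "B", "C", "B"]
classifier_b = ["A", "C", "C", "B"]
classifier_c = ["B", "B", "C", "A"]

combined = vote_all([classifier_a, classifier_b, classifier_c])
```

Each student's final grade is the one most classifiers agree on, which can correct an individual classifier's isolated mistakes.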
432 Using Fractional Factorial Designs for Variable Importance in Random Forest Models

Authors: Ewa M. Sztendur, Neil T. Diamond

Abstract:

Random Forests are a powerful classification technique, consisting of a collection of decision trees. One useful feature of Random Forests is the ability to determine the importance of each variable in predicting the outcome. This is done by permuting each variable and computing the change in prediction accuracy before and after the permutation. This variable importance calculation is similar to a one-factor-at-a-time experiment and is therefore inefficient. In this paper, we use a regular fractional factorial design to determine which variables to permute. Based on the results of the trials in the experiment, we calculate the individual importance of the variables, with improved precision over the standard method. The method is illustrated with a study of student attrition at Monash University.

Keywords: Random Forests, Variable Importance, Fractional Factorial Designs, Student Attrition.

Downloads: 2000
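The standard one-variable-at-a-time permutation importance that entry 432 takes as its starting point can be sketched as below. This is not the fractional factorial variant the paper proposes, and a hand-written stand-in predictor replaces the random forest so that the example stays self-contained; the data and the `n_repeats` and `seed` parameters are illustrative assumptions.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy when each column is shuffled on its own."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this one column permuted.
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances

# Stand-in model (assumption): uses only the first variable.
def predict(row):
    return int(row[0] > 0.5)

rows = [[0.1, 7], [0.9, 3], [0.2, 5], [0.8, 1], [0.3, 9], [0.7, 2]]
labels = [0, 1, 0, 1, 0, 1]
importances = permutation_importance(predict, rows, labels)
```

A variable the model ignores gets an importance of exactly zero, while shuffling a variable the model relies on can only lower accuracy from the perfect baseline here.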
431 Framework and Characterization of Physical Internet

Authors: Charifa Fergani, Adiba El Bouzekri El Idrissi, Suzanne Marcotte, Abdelowahed Hajjaji

Abstract:

In recent years, a new paradigm known as the Physical Internet has been developed and studied in logistics management. The purpose of this global and open system is to deal with the logistics grand challenge by setting up an efficient and sustainable Logistics Web. The purpose of this paper is to review scientific articles dedicated to the Physical Internet and to provide a clustering strategy that makes it possible to classify the literature on the Physical Internet, to follow its evolution, and to criticize it. The classification is based on three factors: Logistics Web, organization, and resources. Several papers about the Physical Internet have been classified and analyzed along the Logistics Web, resources and organization views at a strategic, tactical and operational level, respectively. A cluster analysis shows which Physical Internet topics are currently the least covered. Future research directions are outlined for these topics.

Keywords: Logistics web, Physical Internet, PI characterization, taxonomy.

Downloads: 862
430 Joint Use of Factor Analysis (FA) and Data Envelopment Analysis (DEA) for Ranking of Data Envelopment Analysis

Authors: Reza Nadimi, Fariborz Jolai

Abstract:

This article combines two techniques, data envelopment analysis (DEA) and factor analysis (FA), for data reduction in decision making units (DMUs). DEA, a popular linear programming technique, is useful for comparatively rating the operational efficiency of DMUs based on their deterministic (not necessarily stochastic) input-output data. Factor analysis techniques have been proposed as data reduction and classification techniques, and can be applied within DEA to reduce the input-output data. Numerical results reveal that the new approach shows good consistency in ranking with DEA.

Keywords: Effectiveness, Decision Making, Data Envelopment Analysis, Factor Analysis

Downloads: 2428
429 EHW from Consumer Point of View: Consumer-Triggered Evolution

Authors: Yerbol Sapargaliyev, Tatiana Kalganova

Abstract:

Evolvable Hardware (EHW) has been regarded as an adaptive system adopted by a wide application market. The consumer market for any good requires diversity to satisfy consumers' preferences. Adaptation of EHW is a key technology that could provide an individual approach to every particular user. This situation raises a question: how should the target for the evolutionary algorithm be set? The existing techniques do not allow the consumer to influence the evolutionary process; at the moment, only the designer is able to influence the evolution. The proposed consumer-triggered evolution overcomes this problem by introducing new features to EHW that help the adaptive system to obtain targets during the consumer stage. A classification of EHW is given according to responsiveness, imitation of human behavior and target circuit response. A home intelligent water heating system is considered as an example.

Keywords: Actuators, consumer-triggered evolution, evolvable hardware, sensors.

Downloads: 1489
428 A New Face Recognition Method using PCA, LDA and Neural Network

Authors: A. Hossein Sahoolizadeh, B. Zargham Heidari, C. Hamid Dehghani

Abstract:

In this paper, a new face recognition method based on PCA (Principal Component Analysis), LDA (Linear Discriminant Analysis) and neural networks is proposed. This method consists of four steps: i) preprocessing, ii) dimension reduction using PCA, iii) feature extraction using LDA and iv) classification using a neural network. The combination of PCA and LDA is used to improve the capability of LDA when only a few image samples are available, and a neural classifier is used to reduce the number of misclassifications caused by classes that are not linearly separable. The proposed method was tested on the Yale face database. Experimental results on this database demonstrated the effectiveness of the proposed method for face recognition, with fewer misclassifications than previous methods.

Keywords: Face recognition, Principal component analysis, Linear discriminant analysis, Neural networks.

Downloads: 3219
427 Target Detection with Improved Image Texture Feature Coding Method and Support Vector Machine

Authors: R. Xu, X. Zhao, X. Li, C. Kwan, C.-I Chang

Abstract:

An image texture analysis and target recognition approach using an improved image texture feature coding method (TFCM) and a Support Vector Machine (SVM) for target detection is presented. With our proposed target detection framework, targets of interest can be detected accurately. A Cascade-Sliding-Window technique was also developed for automated target localization. Application to mammograms showed that over 88% of normal mammograms and 80% of abnormal mammograms can be correctly identified. The approach was also successfully applied to Synthetic Aperture Radar (SAR) and Ground Penetrating Radar (GPR) images for target detection.

Keywords: Image texture analysis, feature extraction, target detection, pattern classification.

Downloads: 1781
426 Data Analysis Techniques for Predictive Maintenance on Fleet of Heavy-Duty Vehicles

Authors: Antonis Sideris, Elias Chlis Kalogeropoulos, Konstantia Moirogiorgou

Abstract:

The present study proposes a methodology for the efficient daily management of fleet vehicles and construction machinery. The application covers the area of remote monitoring of heavy-duty vehicles operation parameters, where specific sensor data are stored and examined in order to provide information about the vehicle’s health. The vehicle diagnostics allow the user to inspect whether maintenance tasks need to be performed before a fault occurs. A properly designed machine learning model is proposed for the detection of two different types of faults through classification. Cross validation is used and the accuracy of the trained model is checked with the confusion matrix.

Keywords: Fault detection, feature selection, machine learning, predictive maintenance.

Downloads: 785
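Entry 426 mentions checking a trained model's accuracy with the confusion matrix. A minimal sketch of that check is given below; the class names (`ok`, `fault_a`, `fault_b`) and the example predictions are hypothetical, and the cross-validation loop that would produce the predictions is left out.

```python
def confusion_matrix(actual, predicted, labels):
    """Count matrix with actual classes as rows and predicted classes as columns."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def accuracy_from_matrix(matrix):
    """Accuracy is the sum of the diagonal divided by the total count."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# Hypothetical labels: normal operation plus two fault types.
labels = ["ok", "fault_a", "fault_b"]
actual = ["ok", "ok", "fault_a", "fault_a", "fault_b", "fault_b", "ok", "fault_b"]
predicted = ["ok", "fault_a", "fault_a", "fault_a", "fault_b", "ok", "ok", "fault_b"]

matrix = confusion_matrix(actual, predicted, labels)
acc = accuracy_from_matrix(matrix)
```

Off-diagonal cells show which fault types are confused with each other or with normal operation, which a single accuracy figure hides.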
425 Risk Classification of SMEs by Early Warning Model Based on Data Mining

Authors: Nermin Ozgulbas, Ali Serhan Koyuncugil

Abstract:

One of the biggest problems of SMEs is their tendency toward financial distress because of an insufficient financial background. In this study, an Early Warning System (EWS) model based on data mining for financial risk detection is presented. The CHAID algorithm was used to develop the EWS. With its automated nature, the developed EWS can serve as a tailor-made financial advisor in the decision-making process of firms whose financial background is inadequate. In addition, the model was applied to 7,853 SMEs based on Turkish Central Bank (TCB) 2007 data. Using the EWS model, 31 risk profiles, 15 risk indicators, 2 early warning signals, and 4 financial road maps were determined for financial risk mitigation.

Keywords: Early Warning Systems, Data Mining, Financial Risk, SMEs.

Downloads: 3390
424 Temporary Housing Respond to Disasters in Developing Countries- Case Study: Iran-Ardabil and Lorestan Province Earthquakes

Authors: Farzaneh Hadafi, Alireza Fallahi

Abstract:

Natural disasters have always occurred throughout the earth's history. As human life developed on earth, people faced different disasters. Since disasters would destroy their living areas and ruin their lives, they learned how to respond to and overcome them. Nowadays, in the era of industrialization and informatics, mankind seeks stages and a classification of the pre- and post-disaster process in order to identify a framework for these circumstances. Because too many parameters complicate these frameworks and procedures, it seems that this goal has not yet been properly achieved, and the only resource is the UNDRO guidelines (1982) [1]. This paper discusses temporary housing as one of the approved stages in the disaster management field and investigates the effects of its disapproval or dismissal in two earthquakes that took place in Iran.

Keywords: Temporary Housing, Temporary Sheltering, Disaster Management, Iran

Downloads: 2302
423 Evolutionary Feature Selection for Text Documents using the SVM

Authors: Daniel I. Morariu, Lucian N. Vintan, Volker Tresp

Abstract:

Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, we present three feature selection methods: Information Gain, Support Vector Machine feature selection (called SVM_FS) and Genetic Algorithm with SVM (called GA_SVM). We show that the best results were obtained with the GA_SVM method for a relatively small dimension of the feature vector.

Keywords: Feature Selection, Learning with Kernels, Support Vector Machine, Genetic Algorithm, and Classification.

Downloads: 1708
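Of the three methods listed in entry 423, Information Gain is the simplest to illustrate. The sketch below scores a term by how much knowing its presence reduces class entropy, then keeps the top-scoring terms; the tiny document set and the binary presence/absence representation are assumptions made for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(docs, labels, term):
    """Reduction in class entropy from splitting documents on term presence."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part)
                      for part in (with_term, without) if part)
    return entropy(labels) - conditional

def top_terms(docs, labels, k):
    """Keep the k terms with the highest information gain."""
    vocab = sorted(set().union(*docs))
    return sorted(vocab, key=lambda t: information_gain(docs, labels, t),
                  reverse=True)[:k]

# Toy corpus: each document is a set of tokens.
docs = [{"goal", "match"}, {"goal", "team"}, {"vote", "team"}, {"vote", "match"}]
labels = ["sport", "sport", "politics", "politics"]
selected = top_terms(docs, labels, 2)
```

Terms that occur equally often in every class score zero and are dropped, which shrinks the document vector before a classifier such as an SVM is trained.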
422 Feature Selection Methods for an Improved SVM Classifier

Authors: Daniel Morariu, Lucian N. Vintan, Volker Tresp

Abstract:

Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, three feature selection methods are evaluated: Random Selection, Information Gain (IG) and Support Vector Machine feature selection (called SVM_FS). We show that the best results were obtained with the SVM_FS method for a relatively small dimension of the feature vector. We also present a novel method to better correlate the SVM kernel's parameters (Polynomial or Gaussian kernel).

Keywords: Feature Selection, Learning with Kernels, Support Vector Machine, and Classification.

Downloads: 1832
421 Meta Random Forests

Authors: Praveen Boinee, Alessandro De Angelis, Gian Luca Foresti

Abstract:

Leo Breiman's Random Forests (RF) is a recent development in tree-based classifiers and has quickly proven to be one of the most important algorithms in the machine learning literature. It has shown robust and improved classification results on standard data sets. Ensemble learning algorithms such as AdaBoost and Bagging have been actively researched and have shown improvements in classification results for several benchmark data sets, mainly with decision trees as their base classifiers. In this paper we experiment with applying these meta-learning techniques to random forests. We examine how ensembles of random forests perform on the standard data sets available in the UCI data sets. We compare the original random forest algorithm with its ensemble counterparts and discuss the results.

Keywords: Random Forests [RF], ensembles, UCI.

Downloads: 2716
420 The Presence of Enterobacters (E.Coli and Salmonella spp.) in Industrial Growing Poultry in Albania

Authors: Boci J., Çabeli P., Shtylla T., Kumbe I.

Abstract:

The development of the poultry industry in Albania is mainly based on the existence of intensive modern farms with huge capacities, which are often mixed with other forms. Colibacillosis commonly appears regardless of the type of breeding, causing high mortality in the poultry industry. The mechanisms by which pathogenic enterobacters are able to cause infection in poultry are not yet clear. Routine diagnosis in the field, followed by isolation of E. coli and species of the Salmonella genus in reference laboratories, cannot lead to classification or full recognition of the strains circulating in a territory unless the microorganisms present in intensive farms are differentiated from those in rural areas. In this study, 1,496 strains of E. coli and 378 of Salmonella spp. were isolated. This study presents the distribution of the poultry pathogenicity of E. coli and Salmonella spp., based on the use of innovative diagnostic methods.

Keywords: poultry, E.coli, Salmonella spp., Enterobacter

Downloads: 2073
419 Technology and Its Social Implications: Myths and Realities in the Interpretation of the Concept

Authors: E. V. Veraszto, J. T. F. Camargo, D. Silva, N. A. Miranda, F. O. Simon, S. F. Amaral, L. V. Freitas

Abstract:

The concept of technology, like technology itself, has evolved continuously over time, and today the concept is still marked by myths and realities. Even the concept of science is frequently confused with technology. This paper presents different interpretations of the concept of technology in the course of history, as well as the social and cultural aspects associated with it, through an analysis based on insights from sociological studies of science and technology and its multiple relations with society. Through content analysis, the paper presents a classification of how technology is interpreted in the social sphere and seeks to channel efforts to show how a broader understanding can contribute to better interpretations of how scientific and technological development influences the environment in which we operate. The text also presents a particular point of view for the interpretation of the concept, drawn from the analysis made throughout the work.

Keywords: Technology, conceptions of technology, technological myths, definition of technology.

Downloads: 1543
418 Pervasive Differentiated Services: A QoS Model for Pervasive Systems

Authors: Sherif G. Aly

Abstract:

In this article, we introduce a mechanism by which the same concept of differentiated services used in network transmission can be applied to provide quality of service levels to pervasive systems applications. The elements of the classical DiffServ model, including marking and classification, assured forwarding, and expedited forwarding, are all utilized to create quality of service guarantees for various pervasive applications requiring different levels of quality of service. Through a collection of various sensors, personal devices, and data sources, the transmission of context-sensitive data can automatically occur within a pervasive system with a given quality of service level. Triggers, initiators, sources, and receivers are the four entities labeled in our mechanism. An explanation of the role of each is provided, along with how quality of service is guaranteed.

Keywords: Pervasive systems, quality of service, differentiated services, mobile devices.

Downloads: 1501
417 A Proposed Hybrid Approach for Feature Selection in Text Document Categorization

Authors: M. F. Zaiyadi, B. Baharudin

Abstract:

Text document categorization involves a large amount of data or features. The high dimensionality of the features is troublesome and can affect classification performance. Therefore, feature selection is considered one of the crucial parts of text document categorization. Selecting the best features to represent documents can reduce the dimensionality of the feature space and hence increase performance. Many approaches have been implemented by various researchers to overcome this problem. This paper proposes a novel hybrid approach for feature selection in text document categorization based on Ant Colony Optimization (ACO) and Information Gain (IG). We also present state-of-the-art algorithms by several other researchers.

Keywords: Ant colony optimization, feature selection, information gain, text categorization, text representation.

Downloads: 2071
416 Automatic Threshold Search for Heat Map Based Feature Selection: A Cancer Dataset Analysis

Authors: Carlos Huertas, Reyes Juarez-Ramirez

Abstract:

Public health is one of the most critical issues today; therefore, there is great interest in improving technologies in the area of disease detection. With machine learning and feature selection, it has been possible to aid the diagnosis of several diseases such as cancer. In this work, we present an extension to the Heat Map Based Feature Selection algorithm. This modification allows automatic selection of the threshold parameter, which helps to improve generalization performance on high-dimensional data such as mass spectrometry. We performed a comparative analysis using multiple cancer datasets, comparing against the well-known Recursive Feature Elimination algorithm and our original proposal. The results show improved classification performance that is very competitive with current techniques.

Keywords: Feature selection, mass spectrometry, biomarker discovery, cancer.

Downloads: 1594
415 A Proposed Approach for Emotion Lexicon Enrichment

Authors: Amr Mansour Mohsen, Hesham Ahmed Hassan, Amira M. Idrees

Abstract:

Document analysis is an important research field that aims to gather information by analyzing the data in documents. Since an important goal in many fields is to understand what people actually want, sentimental analysis has become one of the vital fields closely related to document analysis. This research focuses on analyzing text documents to classify each document according to its opinion. The aim of this research is to detect emotions in text documents by enriching the lexicon, adapting its content based on the extraction of semantic patterns. The proposed approach is presented, and experiments from different perspectives are conducted to reveal the positive impact of the proposed approach on the classification results.

Keywords: Document analysis, sentimental analysis, emotion detection, WEKA tool, NRC Lexicon.

Downloads: 1458
414 Prediction of Cardiovascular Disease by Applying Feature Extraction

Authors: Nebi Gedik

Abstract:

Heart disease threatens the lives of a great number of people around the world every year. Heart problems account for many deaths; therefore, early diagnosis and treatment are critical. The diagnosis of heart disease is complicated by several factors affecting health, such as high blood pressure, raised cholesterol, an irregular pulse rhythm, and more. Artificial intelligence has the potential to assist in the early detection and treatment of diseases. Improving heart failure prediction is one of the primary goals of research on heart disease risk assessment. This study aims to determine the features that provide the most successful classification prediction in detecting cardiovascular disease. The performance of each feature is compared using the K-Nearest Neighbor machine learning method. The feature that gives the most successful performance is identified.

Keywords: Cardiovascular disease, feature extraction, supervised learning, k-NN.

Downloads: 142