Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 617

Search results for: Decision Tree Classifiers

617 Empirical and Indian Automotive Equity Portfolio Decision Support

Authors: P. Sankar, P. James Daniel Paul, Siddhant Sahu

Abstract:

A brief review of the empirical studies on the methodology of the stock market decision support would indicate that they are at a threshold of validating the accuracy of the traditional and the fuzzy, artificial neural network and the decision trees. Many researchers have been attempting to compare these models using various data sets worldwide. However, the research community is on the way to the conclusive confidence in the emerged models. This paper attempts to use the automotive sector stock prices from National Stock Exchange (NSE), India and analyze them for the intra-sectorial support for stock market decisions. The study identifies the significant variables and their lags which affect the price of the stocks using OLS analysis and decision tree classifiers.

Keywords: Indian Automotive Sector, Stock Market Decisions, Equity Portfolio Analysis, Decision Tree Classifiers, Statistical Data Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
616 A Comparison of Single of Decision Tree, Decision Tree Forest and Group Method of Data Handling to Evaluate the Surface Roughness in Machining Process

Authors: S. Ghorbani, N. I. Polushin

Abstract:

The machinability of workpieces (AISI 1045 Steel, AA2024 aluminum alloy, A48-class30 gray cast iron) in turning operation has been carried out using different types of cutting tool (conventional, cutting tool with holes in toolholder and cutting tool filled up with composite material) under dry conditions on a turning machine at different stages of spindle speed (630-1000 rpm), feed rate (0.05-0.075 mm/rev), depth of cut (0.05-0.15 mm) and tool overhang (41-65 mm). Experimentation was performed as per Taguchi’s orthogonal array. To evaluate the relative importance of factors affecting surface roughness the single decision tree (SDT), Decision tree forest (DTF) and Group method of data handling (GMDH) were applied.

Keywords: Decision Tree Forest, GMDH, surface roughness, taguchi method, turning process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
615 An Attribute-Centre Based Decision Tree Classification Algorithm

Authors: Gökhan Silahtaroğlu

Abstract:

Decision tree algorithms have very important place at classification model of data mining. In literature, algorithms use entropy concept or gini index to form the tree. The shape of the classes and their closeness to each other some of the factors that affect the performance of the algorithm. In this paper we introduce a new decision tree algorithm which employs data (attribute) folding method and variation of the class variables over the branches to be created. A comparative performance analysis has been held between the proposed algorithm and C4.5.

Keywords: Classification, decision tree, split, pruning, entropy, gini.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
614 Predicting Protein Function using Decision Tree

Authors: Manpreet Singh, Parminder Kaur Wadhwa, Surinder Kaur

Abstract:

The drug discovery process starts with protein identification because proteins are responsible for many functions required for maintenance of life. Protein identification further needs determination of protein function. Proposed method develops a classifier for human protein function prediction. The model uses decision tree for classification process. The protein function is predicted on the basis of matched sequence derived features per each protein function. The research work includes the development of a tool which determines sequence derived features by analyzing different parameters. The other sequence derived features are determined using various web based tools.

Keywords: Sequence Derived Features, decision tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
613 Defect Cause Modeling with Decision Tree and Regression Analysis

Authors: B. Bakır, İ. Batmaz, F. A. Güntürkün, İ. A. İpekçi, G. Köksal, N. E. Özdemirel

Abstract:

The main aim of this study is to identify the most influential variables that cause defects on the items produced by a casting company located in Turkey. To this end, one of the items produced by the company with high defective percentage rates is selected. Two approaches-the regression analysis and decision treesare used to model the relationship between process parameters and defect types. Although logistic regression models failed, decision tree model gives meaningful results. Based on these results, it can be claimed that the decision tree approach is a promising technique for determining the most important process variables.

Keywords: Casting industry, decision tree algorithm C5.0, logistic regression, quality improvement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
612 Comparison between Associative Classification and Decision Tree for HCV Treatment Response Prediction

Authors: Enas M. F. El Houby, Marwa S. Hassan

Abstract:

Combined therapy using Interferon and Ribavirin is the standard treatment in patients with chronic hepatitis C. However, the number of responders to this treatment is low, whereas its cost and side effects are high. Therefore, there is a clear need to predict patient’s response to the treatment based on clinical information to protect the patients from the bad drawbacks, Intolerable side effects and waste of money. Different machine learning techniques have been developed to fulfill this purpose. From these techniques are Associative Classification (AC) and Decision Tree (DT). The aim of this research is to compare the performance of these two techniques in the prediction of virological response to the standard treatment of HCV from clinical information. 200 patients treated with Interferon and Ribavirin; were analyzed using AC and DT. 150 cases had been used to train the classifiers and 50 cases had been used to test the classifiers. The experiment results showed that the two techniques had given acceptable results however the best accuracy for the AC reached 92% whereas for DT reached 80%.

Keywords: Associative Classification, Data mining, Decision tree, HCV, interferon.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
611 Using Single Decision Tree to Assess the Impact of Cutting Conditions on Vibration

Authors: S. Ghorbani, N. I. Polushin

Abstract:

Vibration during machining process is crucial since it affects cutting tool, machine, and workpiece leading to a tool wear, tool breakage, and an unacceptable surface roughness. This paper applies a nonparametric statistical method, single decision tree (SDT), to identify factors affecting on vibration in machining process. Workpiece material (AISI 1045 Steel, AA2024 Aluminum alloy, A48-class30 Gray Cast Iron), cutting tool (conventional, cutting tool with holes in toolholder, cutting tool filled up with epoxy-granite), tool overhang (41-65 mm), spindle speed (630-1000 rpm), feed rate (0.05-0.075 mm/rev) and depth of cut (0.05-0.15 mm) were used as input variables, while vibration was the output parameter. It is concluded that workpiece material is the most important parameters for natural frequency followed by cutting tool and overhang.

Keywords: Cutting condition, vibration, natural frequency, decision tree, CART algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
610 Decision Tree Based Scheduling for Flexible Job Shops with Multiple Process Plans

Authors: H.-H. Doh, J.-M. Yu, Y.-J. Kwon, J.-H. Shin, H.-W. Kim, S.-H. Nam, D.-H. Lee

Abstract:

This paper suggests a decision tree based approach for flexible job shop scheduling with multiple process plans, i.e. each job can be processed through alternative operations, each of which can be processed on alternative machines. The main decision variables are: (a) selecting operation/machine pair; and (b) sequencing the jobs assigned to each machine. As an extension of the priority scheduling approach that selects the best priority rule combination after many simulation runs, this study suggests a decision tree based approach in which a decision tree is used to select a priority rule combination adequate for a specific system state and hence the burdens required for developing simulation models and carrying out simulation runs can be eliminated. The decision tree based scheduling approach consists of construction and scheduling modules. In the construction module, a decision tree is constructed using a four-stage algorithm, and in the scheduling module, a priority rule combination is selected using the decision tree. To show the performance of the decision tree based approach suggested in this study, a case study was done on a flexible job shop with reconfigurable manufacturing cells and a conventional job shop, and the results are reported by comparing it with individual priority rule combinations for the objectives of minimizing total flow time and total tardiness.

Keywords: Flexible job shop scheduling, Decision tree, Priority rules, Case study.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
609 Decision Tree for Competing Risks Survival Probability in Breast Cancer Study

Authors: N. A. Ibrahim, A. Kudus, I. Daud, M. R. Abu Bakar

Abstract:

Competing risks survival data that comprises of more than one type of event has been used in many applications, and one of these is in clinical study (e.g. in breast cancer study). The decision tree method can be extended to competing risks survival data by modifying the split function so as to accommodate two or more risks which might be dependent on each other. Recently, researchers have constructed some decision trees for recurrent survival time data using frailty and marginal modelling. We further extended the method for the case of competing risks. In this paper, we developed the decision tree method for competing risks survival time data based on proportional hazards for subdistribution of competing risks. In particular, we grow a tree by using deviance statistic. The application of breast cancer data is presented. Finally, to investigate the performance of the proposed method, simulation studies on identification of true group of observations were executed.

Keywords: Competing risks, Decision tree, Simulation, Subdistribution Proportional Hazard.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
608 Decision Tree Modeling in Emergency Logistics Planning

Authors: Yousef Abu Nahleh, Arun Kumar, Fugen Daver, Reham Al-Hindawi

Abstract:

Despite the availability of natural disaster related time series data for last 110 years, there is no forecasting tool available to humanitarian relief organizations to determine forecasts for emergency logistics planning. This study develops a forecasting tool based on identifying probability of disaster for each country in the world by using decision tree modeling. Further, the determination of aggregate forecasts leads to efficient pre-disaster planning. Based on the research findings, the relief agencies can optimize the various resources allocation in emergency logistics planning.

Keywords: Decision tree modeling, Forecasting, Humanitarian relief, emergency supply chain.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
607 Improved C-Fuzzy Decision Tree for Intrusion Detection

Authors: Krishnamoorthi Makkithaya, N. V. Subba Reddy, U. Dinesh Acharya

Abstract:

As the number of networked computers grows, intrusion detection is an essential component in keeping networks secure. Various approaches for intrusion detection are currently being in use with each one has its own merits and demerits. This paper presents our work to test and improve the performance of a new class of decision tree c-fuzzy decision tree to detect intrusion. The work also includes identifying best candidate feature sub set to build the efficient c-fuzzy decision tree based Intrusion Detection System (IDS). We investigated the usefulness of c-fuzzy decision tree for developing IDS with a data partition based on horizontal fragmentation. Empirical results indicate the usefulness of our approach in developing the efficient IDS.

Keywords: Data mining, Decision tree, Feature selection, Fuzzyc- means clustering, Intrusion detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
606 Comparison among Various Question Generations for Decision Tree Based State Tying in Persian Language

Authors: Nasibeh Nasiri, Dawood Talebi Khanmiri

Abstract:

Performance of any continuous speech recognition system is highly dependent on performance of the acoustic models. Generally, development of the robust spoken language technology relies on the availability of large amounts of data. Common way to cope with little data for training each state of Markov models is treebased state tying. This tying method applies contextual questions to tie states. Manual procedure for question generation suffers from human errors and is time consuming. Various automatically generated questions are used to construct decision tree. There are three approaches to generate questions to construct HMMs based on decision tree. One approach is based on misrecognized phonemes, another approach basically uses feature table and the other is based on state distributions corresponding to context-independent subword units. In this paper, all these methods of automatic question generation are applied to the decision tree on FARSDAT corpus in Persian language and their results are compared with those of manually generated questions. The results show that automatically generated questions yield much better results and can replace manually generated questions in Persian language.

Keywords: Decision Tree, Markov Models, Speech Recognition, State Tying.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
605 Optimizing Mobile Agents Migration Based on Decision Tree Learning

Authors: Yasser k. Ali, Hesham N. Elmahdy, Sanaa El Olla Hanfy Ahmed

Abstract:

Mobile agents are a powerful approach to develop distributed systems since they migrate to hosts on which they have the resources to execute individual tasks. In a dynamic environment like a peer-to-peer network, Agents have to be generated frequently and dispatched to the network. Thus they will certainly consume a certain amount of bandwidth of each link in the network if there are too many agents migration through one or several links at the same time, they will introduce too much transferring overhead to the links eventually, these links will be busy and indirectly block the network traffic, therefore, there is a need of developing routing algorithms that consider about traffic load. In this paper we seek to create cooperation between a probabilistic manner according to the quality measure of the network traffic situation and the agent's migration decision making to the next hop based on decision tree learning algorithms.

Keywords: Agent Migration, Decision Tree learning, ID3 algorithm, Naive Bayes Classifier

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
604 Lipschitz Classifiers Ensembles: Usage for Classification of Target Events in C-OTDR Monitoring Systems

Authors: Andrey V. Timofeev

Abstract:

This paper introduces an original method for guaranteed estimation of the accuracy for an ensemble of Lipschitz classifiers. The solution was obtained as a finite closed set of alternative hypotheses, which contains an object of classification with probability of not less than the specified value. Thus, the classification is represented by a set of hypothetical classes. In this case, the smaller the cardinality of the discrete set of hypothetical classes is, the higher is the classification accuracy. Experiments have shown that if cardinality of the classifiers ensemble is increased then the cardinality of this set of hypothetical classes is reduced. The problem of the guaranteed estimation of the accuracy for an ensemble of Lipschitz classifiers is relevant in multichannel classification of target events in C-OTDR monitoring systems. Results of suggested approach practical usage to accuracy control in C-OTDR monitoring systems are present.

Keywords: Lipschitz classifiers, confidence set, C-OTDR monitoring, classifiers accuracy, classifiers ensemble.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
603 An Alternative Approach for Assessing the Impact of Cutting Conditions on Surface Roughness Using Single Decision Tree

Authors: S. Ghorbani, N. I. Polushin

Abstract:

In this study, an approach to identify factors affecting on surface roughness in a machining process is presented. This study is based on 81 data about surface roughness over a wide range of cutting tools (conventional, cutting tool with holes, cutting tool with composite material), workpiece materials (AISI 1045 Steel, AA2024 aluminum alloy, A48-class30 gray cast iron), spindle speed (630-1000 rpm), feed rate (0.05-0.075 mm/rev), depth of cut (0.05-0.15 mm) and tool overhang (41-65 mm). A single decision tree (SDT) analysis was done to identify factors for predicting a model of surface roughness, and the CART algorithm was employed for building and evaluating regression tree. Results show that a single decision tree is better than traditional regression models with higher rate and forecast accuracy and strong value.

Keywords: Cutting condition, surface roughness, decision tree, CART algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
602 Attacks Classification in Adaptive Intrusion Detection using Decision Tree

Authors: Dewan Md. Farid, Nouria Harbi, Emna Bahri, Mohammad Zahidur Rahman, Chowdhury Mofizur Rahman

Abstract:

Recently, information security has become a key issue in information technology as the number of computer security breaches are exposed to an increasing number of security threats. A variety of intrusion detection systems (IDS) have been employed for protecting computers and networks from malicious network-based or host-based attacks by using traditional statistical methods to new data mining approaches in last decades. However, today's commercially available intrusion detection systems are signature-based that are not capable of detecting unknown attacks. In this paper, we present a new learning algorithm for anomaly based network intrusion detection system using decision tree algorithm that distinguishes attacks from normal behaviors and identifies different types of intrusions. Experimental results on the KDD99 benchmark network intrusion detection dataset demonstrate that the proposed learning algorithm achieved 98% detection rate (DR) in comparison with other existing methods.

Keywords: Detection rate, decision tree, intrusion detectionsystem, network security.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
601 The Use of Classifiers in Image Analysis of Oil Wells Profiling Process and the Automatic Identification of Events

Authors: Jaqueline M. R. Vieira

Abstract:

Different strategies and tools are available at the oil and gas industry for detecting and analyzing tension and possible fractures in borehole walls. Most of these techniques are based on manual observation of the captured borehole images. While this strategy may be possible and convenient with small images and few data, it may become difficult and suitable to errors when big databases of images must be treated. While the patterns may differ among the image area, depending on many characteristics (drilling strategy, rock components, rock strength, etc.). In this work we propose the inclusion of data-mining classification strategies in order to create a knowledge database of the segmented curves. These classifiers allow that, after some time using and manually pointing parts of borehole images that correspond to tension regions and breakout areas, the system will indicate and suggest automatically new candidate regions, with higher accuracy. We suggest the use of different classifiers methods, in order to achieve different knowledge dataset configurations.

Keywords: Brazil, classifiers, data-mining, Image Segmentation, oil well visualization, classifiers.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
600 Hybrid Anomaly Detection Using Decision Tree and Support Vector Machine

Authors: Elham Serkani, Hossein Gharaee Garakani, Naser Mohammadzadeh, Elaheh Vaezpour

Abstract:

Intrusion detection systems (IDS) are the main components of network security. These systems analyze the network events for intrusion detection. The design of an IDS is through the training of normal traffic data or attack. The methods of machine learning are the best ways to design IDSs. In the method presented in this article, the pruning algorithm of C5.0 decision tree is being used to reduce the features of traffic data used and training IDS by the least square vector algorithm (LS-SVM). Then, the remaining features are arranged according to the predictor importance criterion. The least important features are eliminated in the order. The remaining features of this stage, which have created the highest level of accuracy in LS-SVM, are selected as the final features. The features obtained, compared to other similar articles which have examined the selected features in the least squared support vector machine model, are better in the accuracy, true positive rate, and false positive. The results are tested by the UNSW-NB15 dataset.

Keywords: Intrusion detection system, decision tree, support vector machine, feature selection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
599 Ensemble Learning with Decision Tree for Remote Sensing Classification

Authors: Mahesh Pal

Abstract:

In recent years, a number of works proposing the combination of multiple classifiers to produce a single classification have been reported in remote sensing literature. The resulting classifier, referred to as an ensemble classifier, is generally found to be more accurate than any of the individual classifiers making up the ensemble. As accuracy is the primary concern, much of the research in the field of land cover classification is focused on improving classification accuracy. This study compares the performance of four ensemble approaches (boosting, bagging, DECORATE and random subspace) with a univariate decision tree as base classifier. Two training datasets, one without ant noise and other with 20 percent noise was used to judge the performance of different ensemble approaches. Results with noise free data set suggest an improvement of about 4% in classification accuracy with all ensemble approaches in comparison to the results provided by univariate decision tree classifier. Highest classification accuracy of 87.43% was achieved by boosted decision tree. A comparison of results with noisy data set suggests that bagging, DECORATE and random subspace approaches works well with this data whereas the performance of boosted decision tree degrades and a classification accuracy of 79.7% is achieved which is even lower than that is achieved (i.e. 80.02%) by using unboosted decision tree classifier.

Keywords: Ensemble learning, decision tree, remote sensingclassification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
598 Decision Tree-based Feature Ranking using Manhattan Hierarchical Cluster Criterion

Authors: Yasmin Mohd Yacob, Harsa A. Mat Sakim, Nor Ashidi Mat Isa

Abstract:

Feature selection study is gaining importance due to its contribution to save classification cost in terms of time and computation load. In search of essential features, one of the methods to search the features is via the decision tree. Decision tree act as an intermediate feature space inducer in order to choose essential features. In decision tree-based feature selection, some studies used decision tree as a feature ranker with a direct threshold measure, while others remain the decision tree but utilized pruning condition that act as a threshold mechanism to choose features. This paper proposed threshold measure using Manhattan Hierarchical Cluster distance to be utilized in feature ranking in order to choose relevant features as part of the feature selection process. The result is promising, and this method can be improved in the future by including test cases of a higher number of attributes.

Keywords: Feature ranking, decision tree, hierarchical cluster, Manhattan distance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
597 Performance Analysis of Artificial Neural Network with Decision Tree in Prediction of Diabetes Mellitus

Authors: J. K. Alhassan, B. Attah, S. Misra

Abstract:

Human beings have the ability to make logical decisions. Although human decision - making is often optimal, it is insufficient when huge amount of data is to be classified. Medical dataset is a vital ingredient used in predicting patient’s health condition. In other to have the best prediction, there calls for most suitable machine learning algorithms. This work compared the performance of Artificial Neural Network (ANN) and Decision Tree Algorithms (DTA) as regards to some performance metrics using diabetes data. WEKA software was used for the implementation of the algorithms. Multilayer Perceptron (MLP) and Radial Basis Function (RBF) were the two algorithms used for ANN, while RegTree and LADTree algorithms were the DTA models used. From the results obtained, DTA performed better than ANN. The Root Mean Squared Error (RMSE) of MLP is 0.3913 that of RBF is 0.3625, that of RepTree is 0.3174 and that of LADTree is 0.3206 respectively.

Keywords: Artificial neural network, classification, decision tree, diabetes mellitus.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
596 A Case-Based Reasoning-Decision Tree Hybrid System for Stock Selection

Authors: Yaojun Wang, Yaoqing Wang

Abstract:

Stock selection is an important decision-making problem. Many machine learning and data mining technologies are employed to build automatic stock-selection system. A profitable stock-selection system should consider the stock’s investment value and the market timing. In this paper, we present a hybrid system including both engage for stock selection. This system uses a case-based reasoning (CBR) model to execute the stock classification, uses a decision-tree model to help with market timing and stock selection. The experiments show that the performance of this hybrid system is better than that of other techniques regarding to the classification accuracy, the average return and the Sharpe ratio.

Keywords: Case-based reasoning, decision tree, stock selection, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
595 A Hybrid Classification Method using Artificial Neural Network Based Decision Tree for Automatic Sleep Scoring

Authors: Haoyu Ma, Bin Hu, Mike Jackson, Jingzhi Yan, Wen Zhao

Abstract:

In this paper we propose a new classification method for automatic sleep scoring using an artificial neural network based decision tree. It attempts to treat sleep scoring progress as a series of two-class problems and solves them with a decision tree made up of a group of neural network classifiers, each of which uses a special feature set and is aimed at only one specific sleep stage in order to maximize the classification effect. A single electroencephalogram (EEG) signal is used for our analysis rather than depending on multiple biological signals, which makes greatly simplifies the data acquisition process. Experimental results demonstrate that the average epoch by epoch agreement between the visual and the proposed method in separating 30s wakefulness+S1, REM, S2 and SWS epochs was 88.83%. This study shows that the proposed method performed well in all the four stages, and can effectively limit error propagation at the same time. It could, therefore, be an efficient method for automatic sleep scoring. Additionally, since it requires only a small volume of data it could be suited to pervasive applications.

Keywords: Sleep, Sleep stage, Automatic sleep scoring, Electroencephalography, Decision tree, Artificial neural network

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
594 Integrating Context Priors into a Decision Tree Classification Scheme

Authors: Kasim Terzic, Bernd Neumann

Abstract:

Scene interpretation systems need to match (often ambiguous) low-level input data to concepts from a high-level ontology. In many domains, these decisions are uncertain and benefit greatly from proper context. This paper demonstrates the use of decision trees for estimating class probabilities for regions described by feature vectors, and shows how context can be introduced in order to improve the matching performance.

Keywords: Classification, Decision Trees, Interpretation, Vision

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
593 Judges System for Classifiers Specialization

Authors: Abdel Rodríguez, Isis Bonet, Ricardo Grau, María M. García

Abstract:

In this paper we designed and implemented a new ensemble of classifiers based on a sequence of classifiers which were specialized in regions of the training dataset where errors of its trained homologous are concentrated. In order to separate this regions, and to determine the aptitude of each classifier to properly respond to a new case, it was used another set of classifiers built hierarchically. We explored a selection based variant to combine the base classifiers. We validated this model with different base classifiers using 37 training datasets. It was carried out a statistical comparison of these models with the well known Bagging and Boosting, obtaining significantly superior results with the hierarchical ensemble using Multilayer Perceptron as base classifier. Therefore, we demonstrated the efficacy of the proposed ensemble, as well as its applicability to general problems.

Keywords: classifiers, delegation, ensemble

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
592 Distributed Data-Mining by Probability-Based Patterns

Authors: M. Kargar, F. Gharbalchi

Abstract:

In this paper a new method is suggested for distributed data-mining by the probability patterns. These patterns use decision trees and decision graphs. The patterns are cared to be valid, novel, useful, and understandable. Considering a set of functions, the system reaches to a good pattern or better objectives. By using the suggested method we will be able to extract the useful information from massive and multi-relational data bases.

Keywords: Data-mining, Decision tree, Decision graph, Pattern, Relationship.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
591 Pruning Method of Belief Decision Trees

Authors: Salsabil Trabelsi, Zied Elouedi, Khaled Mellouli

Abstract:

The belief decision tree (BDT) approach is a decision tree in an uncertain environment where the uncertainty is represented through the Transferable Belief Model (TBM), one interpretation of the belief function theory. The uncertainty can appear either in the actual class of training objects or attribute values of objects to classify. In this paper, we develop a post-pruning method of belief decision trees in order to reduce size and improve classification accuracy on unseen cases. The pruning of decision tree has a considerable intention in the areas of machine learning.

Keywords: machine learning, uncertainty, belief function theory, belief decision tree, pruning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
590 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in studies of populations are treated by statistical methods, although they are robust, they are not sufficient in themselves to harness the potential wealth of data. For that you used in step two learning algorithms: the Decision Trees and Support Vector Machine (SVM). These supervised classification methods are used to make the diagnosis of thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: A classifier, Algorithms decision tree, knowledge extraction, Support Vector Machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
589 Sequence-based Prediction of Gamma-turn Types using a Physicochemical Property-based Decision Tree Method

Authors: Chyn Liaw, Chun-Wei Tung, Shinn-Jang Ho, Shinn-Ying Ho

Abstract:

The γ-turns play important roles in protein folding and molecular recognition. The prediction and analysis of γ-turn types are important for both protein structure predictions and better understanding the characteristics of different γ-turn types. This study proposed a physicochemical property-based decision tree (PPDT) method to interpretably predict γ-turn types. In addition to the good prediction performance of PPDT, three simple and human interpretable IF-THEN rules are extracted from the decision tree constructed by PPDT. The identified informative physicochemical properties and concise rules provide a simple way for discriminating and understanding γ-turn types.

Keywords: Classification and regression tree (CART), γ-turn, Physicochemical properties, Protein secondary structure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
588 Multilevel Classifiers in Recognition of Handwritten Kannada Numerals

Authors: Dinesh Acharya U., N. V. Subba Reddy, Krishnamoorthi Makkithaya

Abstract:

The recognition of handwritten numeral is an important area of research for its applications in post office, banks and other organizations. This paper presents automatic recognition of handwritten Kannada numerals based on structural features. Five different types of features, namely, profile based 10-segment string, water reservoir; vertical and horizontal strokes, end points and average boundary length from the minimal bounding box are used in the recognition of numeral. The effect of each feature and their combination in the numeral classification is analyzed using nearest neighbor classifiers. It is common to combine multiple categories of features into a single feature vector for the classification. Instead, separate classifiers can be used to classify based on each visual feature individually and the final classification can be obtained based on the combination of separate base classification results. One popular approach is to combine the classifier results into a feature vector and leaving the decision to next level classifier. This method is extended to extract a better information, possibility distribution, from the base classifiers in resolving the conflicts among the classification results. Here, we use fuzzy k Nearest Neighbor (fuzzy k-NN) as base classifier for individual feature sets, the results of which together forms the feature vector for the final k Nearest Neighbor (k-NN) classifier. Testing is done, using different features, individually and in combination, on a database containing 1600 samples of different numerals and the results are compared with the results of different existing methods.

Keywords: Fuzzy k Nearest Neighbor, Multiple Classifiers, Numeral Recognition, Structural features.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF