Search results for: decision tree classifiers
4826 A Ratio-Weighted Decision Tree Algorithm for Imbalance Dataset Classification
Authors: Doyin Afolabi, Phillip Adewole, Oladipupo Sennaike
Abstract:
Most well-known classifiers, including the decision tree algorithm, can make predictions on balanced datasets efficiently. However, the decision tree algorithm tends to be biased towards imbalanced datasets because of the skewness of the distribution of such datasets. To overcome this problem, this study proposes a weighted decision tree algorithm that aims to remove the bias toward the majority class and prevents the reduction of majority observations in imbalance datasets classification. The proposed weighted decision tree algorithm was tested on three imbalanced datasets- cancer dataset, german credit dataset, and banknote dataset. The specificity, sensitivity, and accuracy metrics were used to evaluate the performance of the proposed decision tree algorithm on the datasets. The evaluation results show that for some of the weights of our proposed decision tree, the specificity, sensitivity, and accuracy metrics gave better results compared to that of the ID3 decision tree and decision tree induced with minority entropy for all three datasets.Keywords: data mining, decision tree, classification, imbalance dataset
Procedia PDF Downloads 1374825 Heart Ailment Prediction Using Machine Learning Methods
Authors: Abhigyan Hedau, Priya Shelke, Riddhi Mirajkar, Shreyash Chaple, Mrunali Gadekar, Himanshu Akula
Abstract:
The heart is the coordinating centre of the major endocrine glandular structure of the body, which produces hormones that profoundly affect the operations of the body, and diagnosing cardiovascular disease is a difficult but critical task. By extracting knowledge and information about the disease from patient data, data mining is a more practical technique to help doctors detect disorders. We use a variety of machine learning methods here, including logistic regression and support vector classifiers (SVC), K-nearest neighbours Classifiers (KNN), Decision Tree Classifiers, Random Forest classifiers and Gradient Boosting classifiers. These algorithms are applied to patient data containing 13 different factors to build a system that predicts heart disease in less time with more accuracy.Keywords: logistic regression, support vector classifier, k-nearest neighbour, decision tree, random forest and gradient boosting
Procedia PDF Downloads 514824 Empirical and Indian Automotive Equity Portfolio Decision Support
Authors: P. Sankar, P. James Daniel Paul, Siddhant Sahu
Abstract:
A brief review of the empirical studies on the methodology of the stock market decision support would indicate that they are at a threshold of validating the accuracy of the traditional and the fuzzy, artificial neural network and the decision trees. Many researchers have been attempting to compare these models using various data sets worldwide. However, the research community is on the way to the conclusive confidence in the emerged models. This paper attempts to use the automotive sector stock prices from National Stock Exchange (NSE), India and analyze them for the intra-sectorial support for stock market decisions. The study identifies the significant variables and their lags which affect the price of the stocks using OLS analysis and decision tree classifiers.Keywords: Indian automotive sector, stock market decisions, equity portfolio analysis, decision tree classifiers, statistical data analysis
Procedia PDF Downloads 4854823 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach
Authors: Rajvir Kaur, Jeewani Anupama Ginige
Abstract:
With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is at the stage of its transition from clinician oriented to technology oriented. Many people around the world die of cancer because the diagnosis of disease was not done at an early stage. Nowadays, the computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out the comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out based on standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results based on the evaluation metrics show that ANN showed the highest-level accuracy (99.4%) when tested with breast cancer dataset. On the other hand, when these ML classifiers are tested with the cervical cancer dataset, Ensemble (Bagged Tree) technique gave better accuracy (93.1%) in comparison to other classifiers.Keywords: artificial neural networks, breast cancer, classifiers, cervical cancer, f-score, machine learning, precision, recall
Procedia PDF Downloads 2774822 Complex Decision Rules in the Form of Decision Trees
Authors: Avinash S. Jagtap, Sharad D. Gore, Rajendra G. Gurao
Abstract:
Decision rules become more and more complex as the number of conditions increase. As a consequence, the complexity of the decision rule also influences the time complexity of computer implementation of such a rule. Consider, for example, a decision that depends on four conditions A, B, C and D. For simplicity, suppose each of these four conditions is binary. Even then the decision rule will consist of 16 lines, where each line will be of the form: If A and B and C and D, then action 1. If A and B and C but not D, then action 2 and so on. While executing this decision rule, each of the four conditions will be checked every time until all the four conditions in a line are satisfied. The minimum number of logical comparisons is 4 whereas the maximum number is 64. This paper proposes to present a complex decision rule in the form of a decision tree. A decision tree divides the cases into branches every time a condition is checked. In the form of a decision tree, every branching eliminates half of the cases that do not satisfy the related conditions. As a result, every branch of the decision tree involves only four logical comparisons and hence is significantly simpler than the corresponding complex decision rule. The conclusion of this paper is that every complex decision rule can be represented as a decision tree and the decision tree is mathematically equivalent but computationally much simpler than the original complex decision ruleKeywords: strategic, tactical, operational, adaptive, innovative
Procedia PDF Downloads 2884821 A Novel PSO Based Decision Tree Classification
Authors: Ali Farzan
Abstract:
Classification of data objects or patterns is a major part in most of Decision making systems. One of the popular and commonly used classification methods is Decision Tree (DT). It is a hierarchical decision making system by which a binary tree is constructed and starting from root, at each node some of the classes is rejected until reaching the leaf nods. Each leaf node is a representative of one specific class. Finding the splitting criteria in each node for constructing or training the tree is a major problem. Particle Swarm Optimization (PSO) has been adopted as a metaheuristic searching method for finding the best splitting criteria. Result of evaluating the proposed method over benchmark datasets indicates the higher accuracy of the new PSO based decision tree.Keywords: decision tree, particle swarm optimization, splitting criteria, metaheuristic
Procedia PDF Downloads 4064820 Decision Tree Based Scheduling for Flexible Job Shops with Multiple Process Plans
Authors: H.-H. Doh, J.-M. Yu, Y.-J. Kwon, J.-H. Shin, H.-W. Kim, S.-H. Nam, D.-H. Lee
Abstract:
This paper suggests a decision tree based approach for flexible job shop scheduling with multiple process plans, i. e. each job can be processed through alternative operations, each of which can be processed on alternative machines. The main decision variables are: (a) selecting operation/machine pair; and (b) sequencing the jobs assigned to each machine. As an extension of the priority scheduling approach that selects the best priority rule combination after many simulation runs, this study suggests a decision tree based approach in which a decision tree is used to select a priority rule combination adequate for a specific system state and hence the burdens required for developing simulation models and carrying out simulation runs can be eliminated. The decision tree based scheduling approach consists of construction and scheduling modules. In the construction module, a decision tree is constructed using a four-stage algorithm, and in the scheduling module, a priority rule combination is selected using the decision tree. To show the performance of the decision tree based approach suggested in this study, a case study was done on a flexible job shop with reconfigurable manufacturing cells and a conventional job shop, and the results are reported by comparing it with individual priority rule combinations for the objectives of minimizing total flow time and total tardiness.Keywords: flexible job shop scheduling, decision tree, priority rules, case study
Procedia PDF Downloads 3584819 Model for Introducing Products to New Customers through Decision Tree Using Algorithm C4.5 (J-48)
Authors: Komol Phaisarn, Anuphan Suttimarn, Vitchanan Keawtong, Kittisak Thongyoun, Chaiyos Jamsawang
Abstract:
This article is intended to analyze insurance information which contains information on the customer decision when purchasing life insurance pay package. The data were analyzed in order to present new customers with Life Insurance Perfect Pay package to meet new customers’ needs as much as possible. The basic data of insurance pay package were collect to get data mining; thus, reducing the scattering of information. The data were then classified in order to get decision model or decision tree using Algorithm C4.5 (J-48). In the classification, WEKA tools are used to form the model and testing datasets are used to test the decision tree for the accurate decision. The validation of this model in classifying showed that the accurate prediction was 68.43% while 31.25% were errors. The same set of data were then tested with other models, i.e. Naive Bayes and Zero R. The results showed that J-48 method could predict more accurately. So, the researcher applied the decision tree in writing the program used to introduce the product to new customers to persuade customers’ decision making in purchasing the insurance package that meets the new customers’ needs as much as possible.Keywords: decision tree, data mining, customers, life insurance pay package
Procedia PDF Downloads 4284818 Model Development for Real-Time Human Sitting Posture Detection Using a Camera
Authors: Jheanel E. Estrada, Larry A. Vea
Abstract:
This study developed model to detect proper/improper sitting posture using the built in web camera which detects the upper body points’ location and distances (chin, manubrium and acromion process). It also established relationships of human body frames and proper sitting posture. The models were developed by training some well-known classifiers such as KNN, SVM, MLP, and Decision Tree using the data collected from 60 students of different body frames. Decision Tree classifier demonstrated the most promising model performance with an accuracy of 95.35% and a kappa of 0.907 for head and shoulder posture. Results also showed that there were relationships between body frame and posture through Body Mass Index.Keywords: posture, spinal points, gyroscope, image processing, ergonomics
Procedia PDF Downloads 3294817 The Effect of Feature Selection on Pattern Classification
Authors: Chih-Fong Tsai, Ya-Han Hu
Abstract:
The aim of feature selection (or dimensionality reduction) is to filter out unrepresentative features (or variables) making the classifier perform better than the one without feature selection. Since there are many well-known feature selection algorithms, and different classifiers based on different selection results may perform differently, very few studies consider examining the effect of performing different feature selection algorithms on the classification performances by different classifiers over different types of datasets. In this paper, two widely used algorithms, which are the genetic algorithm (GA) and information gain (IG), are used to perform feature selection. On the other hand, three well-known classifiers are constructed, which are the CART decision tree (DT), multi-layer perceptron (MLP) neural network, and support vector machine (SVM). Based on 14 different types of datasets, the experimental results show that in most cases IG is a better feature selection algorithm than GA. In addition, the combinations of IG with DT and IG with SVM perform best and second best for small and large scale datasets.Keywords: data mining, feature selection, pattern classification, dimensionality reduction
Procedia PDF Downloads 6694816 Artificial Neural Networks with Decision Trees for Diagnosis Issues
Authors: Y. Kourd, D. Lefebvre, N. Guersi
Abstract:
This paper presents a new idea for fault detection and isolation (FDI) technique which is applied to industrial system. This technique is based on Neural Networks fault-free and Faulty behaviors Models (NNFM's). NNFM's are used for residual generation, while decision tree architecture is used for residual evaluation. The decision tree is realized with data collected from the NNFM’s outputs and is used to isolate detectable faults depending on computed threshold. Each part of the tree corresponds to specific residual. With the decision tree, it becomes possible to take the appropriate decision regarding the actual process behavior by evaluating few numbers of residuals. In comparison to usual systematic evaluation of all residuals, the proposed technique requires less computational effort and can be used for on line diagnosis. An application example is presented to illustrate and confirm the effectiveness and the accuracy of the proposed approach.Keywords: neural networks, decision trees, diagnosis, behaviors
Procedia PDF Downloads 5054815 A Comparison of Single of Decision Tree, Decision Tree Forest and Group Method of Data Handling to Evaluate the Surface Roughness in Machining Process
Authors: S. Ghorbani, N. I. Polushin
Abstract:
The machinability of workpieces (AISI 1045 Steel, AA2024 aluminum alloy, A48-class30 gray cast iron) in turning operation has been carried out using different types of cutting tool (conventional, cutting tool with holes in toolholder and cutting tool filled up with composite material) under dry conditions on a turning machine at different stages of spindle speed (630-1000 rpm), feed rate (0.05-0.075 mm/rev), depth of cut (0.05-0.15 mm) and tool overhang (41-65 mm). Experimentation was performed as per Taguchi’s orthogonal array. To evaluate the relative importance of factors affecting surface roughness the single decision tree (SDT), Decision tree forest (DTF) and Group method of data handling (GMDH) were applied.Keywords: decision tree forest, GMDH, surface roughness, Taguchi method, turning process
Procedia PDF Downloads 4444814 Comparison Study of Machine Learning Classifiers for Speech Emotion Recognition
Authors: Aishwarya Ravindra Fursule, Shruti Kshirsagar
Abstract:
In the intersection of artificial intelligence and human-centered computing, this paper delves into speech emotion recognition (SER). It presents a comparative analysis of machine learning models such as K-Nearest Neighbors (KNN),logistic regression, support vector machines (SVM), decision trees, ensemble classifiers, and random forests, applied to SER. The research employs four datasets: Crema D, SAVEE, TESS, and RAVDESS. It focuses on extracting salient audio signal features like Zero Crossing Rate (ZCR), Chroma_stft, Mel Frequency Cepstral Coefficients (MFCC), root mean square (RMS) value, and MelSpectogram. These features are used to train and evaluate the models’ ability to recognize eight types of emotions from speech: happy, sad, neutral, angry, calm, disgust, fear, and surprise. Among the models, the Random Forest algorithm demonstrated superior performance, achieving approximately 79% accuracy. This suggests its suitability for SER within the parameters of this study. The research contributes to SER by showcasing the effectiveness of various machine learning algorithms and feature extraction techniques. The findings hold promise for the development of more precise emotion recognition systems in the future. This abstract provides a succinct overview of the paper’s content, methods, and results.Keywords: comparison, ML classifiers, KNN, decision tree, SVM, random forest, logistic regression, ensemble classifiers
Procedia PDF Downloads 454813 Using Risk Management Indicators in Decision Tree Analysis
Authors: Adel Ali Elshaibani
Abstract:
Risk management indicators augment the reporting infrastructure, particularly for the board and senior management, to identify, monitor, and manage risks. This enhancement facilitates improved decision-making throughout the banking organization. Decision tree analysis is a tool that visually outlines potential outcomes, costs, and consequences of complex decisions. It is particularly beneficial for analyzing quantitative data and making decisions based on numerical values. By calculating the expected value of each outcome, decision tree analysis can help assess the best course of action. In the context of banking, decision tree analysis can assist lenders in evaluating a customer’s creditworthiness, thereby preventing losses. However, applying these tools in developing countries may face several limitations, such as data availability, lack of technological infrastructure and resources, lack of skilled professionals, cultural factors, and cost. Moreover, decision trees can create overly complex models that do not generalize well to new data, known as overfitting. They can also be sensitive to small changes in the data, which can result in different tree structures and can become computationally expensive when dealing with large datasets. In conclusion, while risk management indicators and decision tree analysis are beneficial for decision-making in banks, their effectiveness is contingent upon how they are implemented and utilized by the board of directors, especially in the context of developing countries. It’s important to consider these limitations when planning to implement these tools in developing countries.Keywords: risk management indicators, decision tree analysis, developing countries, board of directors, bank performance, risk management strategy, banking institutions
Procedia PDF Downloads 604812 An Alternative Approach for Assessing the Impact of Cutting Conditions on Surface Roughness Using Single Decision Tree
Authors: S. Ghorbani, N. I. Polushin
Abstract:
In this study, an approach to identify factors affecting on surface roughness in a machining process is presented. This study is based on 81 data about surface roughness over a wide range of cutting tools (conventional, cutting tool with holes, cutting tool with composite material), workpiece materials (AISI 1045 Steel, AA2024 aluminum alloy, A48-class30 gray cast iron), spindle speed (630-1000 rpm), feed rate (0.05-0.075 mm/rev), depth of cut (0.05-0.15 mm) and tool overhang (41-65 mm). A single decision tree (SDT) analysis was done to identify factors for predicting a model of surface roughness, and the CART algorithm was employed for building and evaluating regression tree. Results show that a single decision tree is better than traditional regression models with higher rate and forecast accuracy and strong value.Keywords: cutting condition, surface roughness, decision tree, CART algorithm
Procedia PDF Downloads 3754811 The Use of Boosted Multivariate Trees in Medical Decision-Making for Repeated Measurements
Authors: Ebru Turgal, Beyza Doganay Erdogan
Abstract:
Machine learning aims to model the relationship between the response and features. Medical decision-making researchers would like to make decisions about patients’ course and treatment, by examining the repeated measurements over time. Boosting approach is now being used in machine learning area for these aims as an influential tool. The aim of this study is to show the usage of multivariate tree boosting in this field. The main reason for utilizing this approach in the field of decision-making is the ease solutions of complex relationships. To show how multivariate tree boosting method can be used to identify important features and feature-time interaction, we used the data, which was collected retrospectively from Ankara University Chest Diseases Department records. Dataset includes repeated PF ratio measurements. The follow-up time is planned for 120 hours. A set of different models is tested. In conclusion, main idea of classification with weighed combination of classifiers is a reliable method which was shown with simulations several times. Furthermore, time varying variables will be taken into consideration within this concept and it could be possible to make accurate decisions about regression and survival problems.Keywords: boosted multivariate trees, longitudinal data, multivariate regression tree, panel data
Procedia PDF Downloads 2034810 Parkinson’s Disease Detection Analysis through Machine Learning Approaches
Authors: Muhtasim Shafi Kader, Fizar Ahmed, Annesha Acharjee
Abstract:
Machine learning and data mining are crucial in health care, as well as medical information and detection. Machine learning approaches are now being utilized to improve awareness of a variety of critical health issues, including diabetes detection, neuron cell tumor diagnosis, COVID 19 identification, and so on. Parkinson’s disease is basically a disease for our senior citizens in Bangladesh. Parkinson's Disease indications often seem progressive and get worst with time. People got affected trouble walking and communicating with the condition advances. Patients can also have psychological and social vagaries, nap problems, hopelessness, reminiscence loss, and weariness. Parkinson's disease can happen in both men and women. Though men are affected by the illness at a proportion that is around partial of them are women. In this research, we have to get out the accurate ML algorithm to find out the disease with a predictable dataset and the model of the following machine learning classifiers. Therefore, nine ML classifiers are secondhand to portion study to use machine learning approaches like as follows, Naive Bayes, Adaptive Boosting, Bagging Classifier, Decision Tree Classifier, Random Forest classifier, XBG Classifier, K Nearest Neighbor Classifier, Support Vector Machine Classifier, and Gradient Boosting Classifier are used.Keywords: naive bayes, adaptive boosting, bagging classifier, decision tree classifier, random forest classifier, XBG classifier, k nearest neighbor classifier, support vector classifier, gradient boosting classifier
Procedia PDF Downloads 1294809 Real-Time Classification of Marbles with Decision-Tree Method
Authors: K. S. Parlak, E. Turan
Abstract:
The separation of marbles according to the pattern quality is a process made according to expert decision. The classification phase is the most critical part in terms of economic value. In this study, a self-learning system is proposed which performs the classification of marbles quickly and with high success. This system performs ten feature extraction by taking ten marble images from the camera. The marbles are classified by decision tree method using the obtained properties. The user forms the training set by training the system at the marble classification stage. The system evolves itself in every marble image that is classified. The aim of the proposed system is to minimize the error caused by the person performing the classification and achieve it quickly.Keywords: decision tree, feature extraction, k-means clustering, marble classification
Procedia PDF Downloads 3824808 Breast Cancer Survivability Prediction via Classifier Ensemble
Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia
Abstract:
This paper presents a classifier ensemble approach for predicting the survivability of the breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components; features selection and classifier ensemble components. The features selection component divides the features in SEER database into four groups. After that it tries to find the most important features among the four groups that maximizes the weighted average F-score of a certain classification algorithm. The ensemble component uses three different classifiers, each of which models different set of features from SEER through the features selection module. On top of them, another classifier is used to give the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found is by using the decision tree, Bayesian network, and Na¨ıve Bayes algorithms for the underlying classifiers and Na¨ıve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same data of SEER (period of 1973-2002). It gives 87.39% weighted average F-score compared to 85.82% and 81.34% of the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held out unseen test set.Keywords: classifier ensemble, breast cancer survivability, data mining, SEER
Procedia PDF Downloads 3284807 Understanding Farmers’ Perceptions Towards Agrivoltaics Using Decision Tree Algorithms
Authors: Mayuri Roy Choudhury
Abstract:
In recent times the concept of agrivoltaics has gained popularity due to the dual use of land and the added value provided by photovoltaics in terms of renewable energy and crop production on farms. However, the transition towards agrivoltaics has been slow, and our research tries to investigate the obstacles leading towards the slow progress of agrivoltaics. We applied data science decision tree algorithms to quantify qualitative perceptions of farmers in the United States for agrivoltaics. To date, there has not been much research that mentions farmers' perceptions, as most of the research focuses on the benefits of agrivoltaics. Our study adds value by putting forward the voices of farmers, which play a crucial towards the transition to agrivoltaics in the future. Our results show a mixture of responses in favor of agrivoltaics. Furthermore, it also portrays significant concerns of farmers, which is useful for decision-makers when it comes to formulating policies for agrivoltaics.Keywords: agrivoltaics, decision-tree algorithms, farmers perception, transition
Procedia PDF Downloads 1904806 Survey on Big Data Stream Classification by Decision Tree
Authors: Mansoureh Ghiasabadi Farahani, Samira Kalantary, Sara Taghi-Pour, Mahboubeh Shamsi
Abstract:
Nowadays, the development of computers technology and its recent applications provide access to new types of data, which have not been considered by the traditional data analysts. Two particularly interesting characteristics of such data sets include their huge size and streaming nature .Incremental learning techniques have been used extensively to address the data stream classification problem. This paper presents a concise survey on the obstacles and the requirements issues classifying data streams with using decision tree. The most important issue is to maintain a balance between accuracy and efficiency, the algorithm should provide good classification performance with a reasonable time response.Keywords: big data, data streams, classification, decision tree
Procedia PDF Downloads 5214805 Performance Comparison of ADTree and Naive Bayes Algorithms for Spam Filtering
Authors: Thanh Nguyen, Andrei Doncescu, Pierre Siegel
Abstract:
Classification is an important data mining technique and could be used as data filtering in artificial intelligence. The broad application of classification for all kind of data leads to be used in nearly every field of our modern life. Classification helps us to put together different items according to the feature items decided as interesting and useful. In this paper, we compare two classification methods Naïve Bayes and ADTree use to detect spam e-mail. This choice is motivated by the fact that Naive Bayes algorithm is based on probability calculus while ADTree algorithm is based on decision tree. The parameter settings of the above classifiers use the maximization of true positive rate and minimization of false positive rate. The experiment results present classification accuracy and cost analysis in view of optimal classifier choice for Spam Detection. It is point out the number of attributes to obtain a tradeoff between number of them and the classification accuracy.Keywords: classification, data mining, spam filtering, naive bayes, decision tree
Procedia PDF Downloads 4114804 A Reliable Multi-Type Vehicle Classification System
Authors: Ghada S. Moussa
Abstract:
Vehicle classification is an important task in traffic surveillance and intelligent transportation systems. Classification of vehicle images is facing several problems such as: high intra-class vehicle variations, occlusion, shadow, illumination. These problems and others must be considered to develop a reliable vehicle classification system. In this study, a reliable multi-type vehicle classification system based on Bag-of-Words (BoW) paradigm is developed. Our proposed system used and compared four well-known classifiers; Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), k-Nearest Neighbour (KNN), and Decision Tree to classify vehicles into four categories: motorcycles, small, medium and large. Experiments on a large dataset show that our approach is efficient and reliable in classifying vehicles with accuracy of 95.7%. The SVM outperforms other classification algorithms in terms of both accuracy and robustness alongside considerable reduction in execution time. The innovativeness of developed system is it can serve as a framework for many vehicle classification systems.Keywords: vehicle classification, bag-of-words technique, SVM classifier, LDA classifier, KNN classifier, decision tree classifier, SIFT algorithm
Procedia PDF Downloads 3584803 Decision Tree Modeling in Emergency Logistics Planning
Authors: Yousef Abu Nahleh, Arun Kumar, Fugen Daver, Reham Al-Hindawi
Abstract:
Despite the availability of natural disaster related time series data for last 110 years, there is no forecasting tool available to humanitarian relief organizations to determine forecasts for emergency logistics planning. This study develops a forecasting tool based on identifying probability of disaster for each country in the world by using decision tree modeling. Further, the determination of aggregate forecasts leads to efficient pre-disaster planning. Based on the research findings, the relief agencies can optimize the various resources allocation in emergency logistics planning.Keywords: decision tree modeling, forecasting, humanitarian relief, emergency supply chain
Procedia PDF Downloads 4834802 Performance Analysis of Artificial Neural Network with Decision Tree in Prediction of Diabetes Mellitus
Authors: J. K. Alhassan, B. Attah, S. Misra
Abstract:
Human beings have the ability to make logical decisions. Although human decision - making is often optimal, it is insufficient when huge amount of data is to be classified. medical dataset is a vital ingredient used in predicting patients health condition. In other to have the best prediction, there calls for most suitable machine learning algorithms. This work compared the performance of Artificial Neural Network (ANN) and Decision Tree Algorithms (DTA) as regards to some performance metrics using diabetes data. The evaluations was done using weka software and found out that DTA performed better than ANN. Multilayer Perceptron (MLP) and Radial Basis Function (RBF) were the two algorithms used for ANN, while RegTree and LADTree algorithms were the DTA models used. The Root Mean Squared Error (RMSE) of MLP is 0.3913,that of RBF is 0.3625, that of RepTree is 0.3174 and that of LADTree is 0.3206 respectively.Keywords: artificial neural network, classification, decision tree algorithms, diabetes mellitus
Procedia PDF Downloads 4084801 A Case-Based Reasoning-Decision Tree Hybrid System for Stock Selection
Authors: Yaojun Wang, Yaoqing Wang
Abstract:
Stock selection is an important decision-making problem. Many machine learning and data mining technologies are employed to build automatic stock-selection system. A profitable stock-selection system should consider the stock’s investment value and the market timing. In this paper, we present a hybrid system including both engage for stock selection. This system uses a case-based reasoning (CBR) model to execute the stock classification, uses a decision-tree model to help with market timing and stock selection. The experiments show that the performance of this hybrid system is better than that of other techniques regarding to the classification accuracy, the average return and the Sharpe ratio.Keywords: case-based reasoning, decision tree, stock selection, machine learning
Procedia PDF Downloads 4204800 Faults Diagnosis by Thresholding and Decision tree with Neuro-Fuzzy System
Authors: Y. Kourd, D. Lefebvre
Abstract:
The monitoring of industrial processes is required to ensure operating conditions of industrial systems through automatic detection and isolation of faults. This paper proposes a method of fault diagnosis based on a neuro-fuzzy hybrid structure. This hybrid structure combines the selection of threshold and decision tree. The validation of this method is obtained with the DAMADICS benchmark. In the first phase of the method, a model will be constructed that represents the normal state of the system to fault detection. Signatures of the faults are obtained with residuals analysis and selection of appropriate thresholds. These signatures provide groups of non-separable faults. In the second phase, we build faulty models to see the flaws in the system that cannot be isolated in the first phase. In the latest phase we construct the tree that isolates these faults.Keywords: decision tree, residuals analysis, ANFIS, fault diagnosis
Procedia PDF Downloads 6264799 Amharic Text News Classification Using Supervised Learning
Authors: Misrak Assefa
Abstract:
The Amharic language is the second most widely spoken Semitic language in the world. There are several new overloaded on the web. Searching some useful documents from the web on a specific topic, which is written in the Amharic language, is a challenging task. Hence, document categorization is required for managing and filtering important information. In the classification of Amharic text news, there is still a gap in the domain of information that needs to be launch. This study attempts to design an automatic Amharic news classification using a supervised learning mechanism on four un-touch classes. To achieve this research, 4,182 news articles were used. Naive Bayes (NB) and Decision tree (j48) algorithms were used to classify the given Amharic dataset. In this paper, k-fold cross-validation is used to estimate the accuracy of the classifier. As a result, it shows those algorithms can be applicable in Amharic news categorization. The best average accuracy result is achieved by j48 decision tree and naïve Bayes is 95.2345 %, and 94.6245 % respectively using three categories. This research indicated that a typical decision tree algorithm is more applicable to Amharic news categorization.Keywords: text categorization, supervised machine learning, naive Bayes, decision tree
Procedia PDF Downloads 2114798 Using Data Mining Technique for Scholarship Disbursement
Authors: J. K. Alhassan, S. A. Lawal
Abstract:
This work is on decision tree-based classification for the disbursement of scholarship. Tree-based data mining classification technique is used in other to determine the generic rule to be used to disburse the scholarship. The system based on the defined rules from the tree is able to determine the class (status) to which an applicant shall belong whether Granted or Not Granted. The applicants that fall to the class of granted denote a successful acquirement of scholarship while those in not granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from tree-based classification was also developed. The tree-based classification is adopted because of its efficiency, effectiveness, and easy to comprehend features. The system was tested with the data of National Information Technology Development Agency (NITDA) Abuja, a Parastatal of Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found working according to the specification. It is therefore recommended for all scholarship disbursement organizations.Keywords: classification, data mining, decision tree, scholarship
Procedia PDF Downloads 3764797 Comparison of Deep Learning and Machine Learning Algorithms to Diagnose and Predict Breast Cancer
Authors: F. Ghazalnaz Sharifonnasabi, Iman Makhdoom
Abstract:
Breast cancer is a serious health concern that affects many people around the world. According to a study published in the Breast journal, the global burden of breast cancer is expected to increase significantly over the next few decades. The number of deaths from breast cancer has been increasing over the years, but the age-standardized mortality rate has decreased in some countries. It’s important to be aware of the risk factors for breast cancer and to get regular check- ups to catch it early if it does occur. Machin learning techniques have been used to aid in the early detection and diagnosis of breast cancer. These techniques, that have been shown to be effective in predicting and diagnosing the disease, have become a research hotspot. In this study, we consider two deep learning approaches including: Multi-Layer Perceptron (MLP), and Convolutional Neural Network (CNN). We also considered the five-machine learning algorithm titled: Decision Tree (C4.5), Naïve Bayesian (NB), Support Vector Machine (SVM), K-Nearest Neighbors (KNN) Algorithm and XGBoost (eXtreme Gradient Boosting) on the Breast Cancer Wisconsin Diagnostic dataset. We have carried out the process of evaluating and comparing classifiers involving selecting appropriate metrics to evaluate classifier performance and selecting an appropriate tool to quantify this performance. The main purpose of the study is predicting and diagnosis breast cancer, applying the mentioned algorithms and also discovering of the most effective with respect to confusion matrix, accuracy and precision. It is realized that CNN outperformed all other classifiers and achieved the highest accuracy (0.982456). The work is implemented in the Anaconda environment based on Python programing language.Keywords: breast cancer, multi-layer perceptron, Naïve Bayesian, SVM, decision tree, convolutional neural network, XGBoost, KNN
Procedia PDF Downloads 75