Search results for: support vector regression.
2945 On the Efficient Implementation of a Serial and Parallel Decomposition Algorithm for Fast Support Vector Machine Training Including a Multi-Parameter Kernel
Authors: Tatjana Eitrich, Bruno Lang
Abstract:
This work deals with aspects of support vector machine learning for large-scale data mining tasks. Based on a decomposition algorithm for support vector machine training that can be run in serial as well as shared memory parallel mode we introduce a transformation of the training data that allows for the usage of an expensive generalized kernel without additional costs. We present experiments for the Gaussian kernel, but usage of other kernel functions is possible, too. In order to further speed up the decomposition algorithm we analyze the critical problem of working set selection for large training data sets. In addition, we analyze the influence of the working set sizes onto the scalability of the parallel decomposition scheme. Our tests and conclusions led to several modifications of the algorithm and the improvement of overall support vector machine learning performance. Our method allows for using extensive parameter search methods to optimize classification accuracy.
Keywords: Support Vector Machine Training, Multi-ParameterKernels, Shared Memory Parallel Computing, Large Data
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14432944 Voltage Problem Location Classification Using Performance of Least Squares Support Vector Machine LS-SVM and Learning Vector Quantization LVQ
Authors: Khaled Abduesslam. M, Mohammed Ali, Basher H Alsdai, Muhammad Nizam, Inayati
Abstract:
This paper presents the voltage problem location classification using performance of Least Squares Support Vector Machine (LS-SVM) and Learning Vector Quantization (LVQ) in electrical power system for proper voltage problem location implemented by IEEE 39 bus New- England. The data was collected from the time domain simulation by using Power System Analysis Toolbox (PSAT). Outputs from simulation data such as voltage, phase angle, real power and reactive power were taken as input to estimate voltage stability at particular buses based on Power Transfer Stability Index (PTSI).The simulation data was carried out on the IEEE 39 bus test system by considering load bus increased on the system. To verify of the proposed LS-SVM its performance was compared to Learning Vector Quantization (LVQ). The results showed that LS-SVM is faster and better as compared to LVQ. The results also demonstrated that the LS-SVM was estimated by 0% misclassification whereas LVQ had 7.69% misclassification.
Keywords: IEEE 39 bus, Least Squares Support Vector Machine, Learning Vector Quantization, Voltage Collapse.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 24052943 Intrusion Detection Using a New Particle Swarm Method and Support Vector Machines
Authors: Essam Al Daoud
Abstract:
Intrusion detection is a mechanism used to protect a system and analyse and predict the behaviours of system users. An ideal intrusion detection system is hard to achieve due to nonlinearity, and irrelevant or redundant features. This study introduces a new anomaly-based intrusion detection model. The suggested model is based on particle swarm optimisation and nonlinear, multi-class and multi-kernel support vector machines. Particle swarm optimisation is used for feature selection by applying a new formula to update the position and the velocity of a particle; the support vector machine is used as a classifier. The proposed model is tested and compared with the other methods using the KDD CUP 1999 dataset. The results indicate that this new method achieves better accuracy rates than previous methods.Keywords: Feature selection, Intrusion detection, Support vector machine, Particle swarm.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19902942 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine
Authors: Djamila Benhaddouche, Abdelkader Benyettou
Abstract:
In this paper, we used data mining to extract biomedical knowledge. In general, complex biomedical data collected in studies of populations are treated by statistical methods, although they are robust, they are not sufficient in themselves to harness the potential wealth of data. For that you used in step two learning algorithms: the Decision Trees and Support Vector Machine (SVM). These supervised classification methods are used to make the diagnosis of thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.
Keywords: A classifier, Algorithms decision tree, knowledge extraction, Support Vector Machine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18702941 Relationship between Sums of Squares in Linear Regression and Semi-parametric Regression
Authors: Dursun Aydın, Bilgin Senel
Abstract:
In this paper, the sum of squares in linear regression is reduced to sum of squares in semi-parametric regression. We indicated that different sums of squares in the linear regression are similar to various deviance statements in semi-parametric regression. In addition to, coefficient of the determination derived in linear regression model is easily generalized to coefficient of the determination of the semi-parametric regression model. Then, it is made an application in order to support the theory of the linear regression and semi-parametric regression. In this way, study is supported with a simulated data example.Keywords: Semi-parametric regression, Penalized LeastSquares, Residuals, Deviance, Smoothing Spline.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18562940 A Machine Learning Approach for Earthquake Prediction in Various Zones Based on Solar Activity
Authors: Viacheslav Shkuratskyy, Aminu Bello Usman, Michael O’Dea, Mujeeb Ur Rehman, Saifur Rahman Sabuj
Abstract:
This paper examines relationships between solar activity and earthquakes, it applied machine learning techniques: K-nearest neighbour, support vector regression, random forest regression, and long short-term memory network. Data from the SILSO World Data Center, the NOAA National Center, the GOES satellite, NASA OMNIWeb, and the United States Geological Survey were used for the experiment. The 23rd and 24th solar cycles, daily sunspot number, solar wind velocity, proton density, and proton temperature were all included in the dataset. The study also examined sunspots, solar wind, and solar flares, which all reflect solar activity, and earthquake frequency distribution by magnitude and depth. The findings showed that the long short-term memory network model predicts earthquakes more correctly than the other models applied in the study, and solar activity is more likely to effect earthquakes of lower magnitude and shallow depth than earthquakes of magnitude 5.5 or larger with intermediate depth and deep depth
.Keywords: K-Nearest Neighbour, Support Vector Regression, Random Forest Regression, Long Short-Term Memory Network, earthquakes, solar activity, sunspot number, solar wind, solar flares.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2052939 A Medical Resource Forecasting Model for Emergency Room Patients with Acute Hepatitis
Authors: R. J. Kuo, W. C. Cheng, W. C. Lien, T. J. Yang
Abstract:
Taiwan is a hyper endemic area for the Hepatitis B virus (HBV). The estimated total number of HBsAg carriers in the general population who are more than 20 years old is more than 3 million. Therefore, a case record review is conducted from January 2003 to June 2007 for all patients with a diagnosis of acute hepatitis who were admitted to the Emergency Department (ED) of a well-known teaching hospital. The cost for the use of medical resources is defined as the total medical fee. In this study, principal component analysis (PCA) is firstly employed to reduce the number of dimensions. Support vector regression (SVR) and artificial neural network (ANN) are then used to develop the forecasting model. A total of 117 patients meet the inclusion criteria. 61% patients involved in this study are hepatitis B related. The computational result shows that the proposed PCA-SVR model has superior performance than other compared algorithms. In conclusion, the Child-Pugh score and echogram can both be used to predict the cost of medical resources for patients with acute hepatitis in the ED.
Keywords: Acute hepatitis, Medical resource cost, Artificial neural network, Support vector regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19172938 Support Vector Machines Approach for Detecting the Mean Shifts in Hotelling-s T2 Control Chart with Sensitizing Rules
Authors: Tai-Yue Wang, Hui-Min Chiang, Su-Ni Hsieh, Yu-Min Chiang
Abstract:
In many industries, control charts is one of the most frequently used tools for quality management. Hotelling-s T2 is used widely in multivariate control chart. However, it has little defect when detecting small or medium process shifts. The use of supplementary sensitizing rules can improve the performance of detection. This study applied sensitizing rules for Hotelling-s T2 control chart to improve the performance of detection. Support vector machines (SVM) classifier to identify the characteristic or group of characteristics that are responsible for the signal and to classify the magnitude of the mean shifts. The experimental results demonstrate that the support vector machines (SVM) classifier can effectively identify the characteristic or group of characteristics that caused the process mean shifts and the magnitude of the shifts.Keywords: Hotelling's T2 control chart, Neural networks, Sensitizing rules, Support vector machines.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18722937 Glass Bottle Inspector Based on Machine Vision
Authors: Huanjun Liu, Yaonan Wang, Feng Duan
Abstract:
This text studies glass bottle intelligent inspector based machine vision instead of manual inspection. The system structure is illustrated in detail in this paper. The text presents the method based on watershed transform methods to segment the possible defective regions and extract features of bottle wall by rules. Then wavelet transform are used to exact features of bottle finish from images. After extracting features, the fuzzy support vector machine ensemble is putted forward as classifier. For ensuring that the fuzzy support vector machines have good classification ability, the GA based ensemble method is used to combining the several fuzzy support vector machines. The experiments demonstrate that using this inspector to inspect glass bottles, the accuracy rate may reach above 97.5%.Keywords: Intelligent Inspection, Support Vector Machines, Ensemble Methods, watershed transform, Wavelet Transform
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 38972936 Pattern Recognition as an Internalized Motor Programme
Authors: M. Jändel
Abstract:
A new conceptual architecture for low-level neural pattern recognition is presented. The key ideas are that the brain implements support vector machines and that support vectors are represented as memory patterns in competitive queuing memories. A binary classifier is built from two competitive queuing memories holding positive and negative valence training examples respectively. The support vector machine classification function is calculated in synchronized evaluation cycles. The kernel is computed by bisymmetric feed-forward networks feed by sensory input and by competitive queuing memories traversing the complete sequence of support vectors. Temporary summation generates the output classification. It is speculated that perception apparatus in the brain reuses structures that have evolved for enabling fluent execution of prepared action sequences so that pattern recognition is built on internalized motor programmes.Keywords: Competitive queuing model, Olfactory system, Pattern recognition, Support vector machine, Thalamus
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13692935 Evolutionary Feature Selection for Text Documents using the SVM
Authors: Daniel I. Morariu, Lucian N. Vintan, Volker Tresp
Abstract:
Text categorization is the problem of classifying text documents into a set of predefined classes. After a preprocessing step, the documents are typically represented as large sparse vectors. When training classifiers on large collections of documents, both the time and memory restrictions can be quite prohibitive. This justifies the application of feature selection methods to reduce the dimensionality of the document-representation vector. In this paper, we present three feature selection methods: Information Gain, Support Vector Machine feature selection called (SVM_FS) and Genetic Algorithm with SVM (called GA_SVM). We show that the best results were obtained with GA_SVM method for a relatively small dimension of the feature vector.Keywords: Feature Selection, Learning with Kernels, Support Vector Machine, Genetic Algorithm, and Classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17062934 Support Vector Machine Prediction Model of Early-stage Lung Cancer Based on Curvelet Transform to Extract Texture Features of CT Image
Authors: Guo Xiuhua, Sun Tao, Wu Haifeng, He Wen, Liang Zhigang, Zhang Mengxia, Guo Aimin, Wang Wei
Abstract:
Purpose: To explore the use of Curvelet transform to extract texture features of pulmonary nodules in CT image and support vector machine to establish prediction model of small solitary pulmonary nodules in order to promote the ratio of detection and diagnosis of early-stage lung cancer. Methods: 2461 benign or malignant small solitary pulmonary nodules in CT image from 129 patients were collected. Fourteen Curvelet transform textural features were as parameters to establish support vector machine prediction model. Results: Compared with other methods, using 252 texture features as parameters to establish prediction model is more proper. And the classification consistency, sensitivity and specificity for the model are 81.5%, 93.8% and 38.0% respectively. Conclusion: Based on texture features extracted from Curvelet transform, support vector machine prediction model is sensitive to lung cancer, which can promote the rate of diagnosis for early-stage lung cancer to some extent.Keywords: CT image, Curvelet transform, Small pulmonary nodules, Support vector machines, Texture extraction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 27672933 Integration of Support Vector Machine and Bayesian Neural Network for Data Mining and Classification
Authors: Essam Al-Daoud
Abstract:
Several combinations of the preprocessing algorithms, feature selection techniques and classifiers can be applied to the data classification tasks. This study introduces a new accurate classifier, the proposed classifier consist from four components: Signal-to- Noise as a feature selection technique, support vector machine, Bayesian neural network and AdaBoost as an ensemble algorithm. To verify the effectiveness of the proposed classifier, seven well known classifiers are applied to four datasets. The experiments show that using the suggested classifier enhances the classification rates for all datasets.Keywords: AdaBoost, Bayesian neural network, Signal-to-Noise, support vector machine, MCMC.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20202932 Extrapolation of Clinical Data from an Oral Glucose Tolerance Test Using a Support Vector Machine
Authors: Jianyin Lu, Masayoshi Seike, Wei Liu, Peihong Wu, Lihua Wang, Yihua Wu, Yasuhiro Naito, Hiromu Nakajima, Yasuhiro Kouchi
Abstract:
To extract the important physiological factors related to diabetes from an oral glucose tolerance test (OGTT) by mathematical modeling, highly informative but convenient protocols are required. Current models require a large number of samples and extended period of testing, which is not practical for daily use. The purpose of this study is to make model assessments possible even from a reduced number of samples taken over a relatively short period. For this purpose, test values were extrapolated using a support vector machine. A good correlation was found between reference and extrapolated values in evaluated 741 OGTTs. This result indicates that a reduction in the number of clinical test is possible through a computational approach.Keywords: SVM regression, OGTT, diabetes, mathematical model
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16142931 Feature Subset Selection approach based on Maximizing Margin of Support Vector Classifier
Authors: Khin May Win, Nan Sai Moon Kham
Abstract:
Identification of cancer genes that might anticipate the clinical behaviors from different types of cancer disease is challenging due to the huge number of genes and small number of patients samples. The new method is being proposed based on supervised learning of classification like support vector machines (SVMs).A new solution is described by the introduction of the Maximized Margin (MM) in the subset criterion, which permits to get near the least generalization error rate. In class prediction problem, gene selection is essential to improve the accuracy and to identify genes for cancer disease. The performance of the new method was evaluated with real-world data experiment. It can give the better accuracy for classification.Keywords: Microarray data, feature selection, recursive featureelimination, support vector machines.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15422930 Meta Model Based EA for Complex Optimization
Authors: Maumita Bhattacharya
Abstract:
Evolutionary Algorithms are population-based, stochastic search techniques, widely used as efficient global optimizers. However, many real life optimization problems often require finding optimal solution to complex high dimensional, multimodal problems involving computationally very expensive fitness function evaluations. Use of evolutionary algorithms in such problem domains is thus practically prohibitive. An attractive alternative is to build meta models or use an approximation of the actual fitness functions to be evaluated. These meta models are order of magnitude cheaper to evaluate compared to the actual function evaluation. Many regression and interpolation tools are available to build such meta models. This paper briefly discusses the architectures and use of such meta-modeling tools in an evolutionary optimization context. We further present two evolutionary algorithm frameworks which involve use of meta models for fitness function evaluation. The first framework, namely the Dynamic Approximate Fitness based Hybrid EA (DAFHEA) model [14] reduces computation time by controlled use of meta-models (in this case approximate model generated by Support Vector Machine regression) to partially replace the actual function evaluation by approximate function evaluation. However, the underlying assumption in DAFHEA is that the training samples for the metamodel are generated from a single uniform model. This does not take into account uncertain scenarios involving noisy fitness functions. The second model, DAFHEA-II, an enhanced version of the original DAFHEA framework, incorporates a multiple-model based learning approach for the support vector machine approximator to handle noisy functions [15]. Empirical results obtained by evaluating the frameworks using several benchmark functions demonstrate their efficiencyKeywords: Meta model, Evolutionary algorithm, Stochastictechnique, Fitness function, Optimization, Support vector machine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20672929 Tongue Diagnosis System Based on PCA and SVM
Authors: Jin-Woong Park, Sun-Kyung Kang, Sung-Tae Jung
Abstract:
In this study, we propose a tongue diagnosis method which detects the tongue from face image and divides the tongue area into six areas, and finally generates tongue coating ratio of each area. To detect the tongue area from face image, we use ASM as one of the active shape models. Detected tongue area is divided into six areas widely used in the Korean traditional medicine and the distribution of tongue coating of the six areas is examined by SVM(Support Vector Machine). For SVM, we use a 3-dimensional vector calculated by PCA(Principal Component Analysis) from a 12-dimentional vector consisting of RGB, HIS, Lab, and Luv. As a result, we detected the tongue area stably using ASM and found that PCA and SVM helped raise the ratio of tongue coating detection.Keywords: Active Shape Model, Principal Component Analysis, Support Vector Machine, Tongue diagnosis
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18672928 Resolving Dependency Ambiguity of Subordinate Clauses using Support Vector Machines
Authors: Sang-Soo Kim, Seong-Bae Park, Sang-Jo Lee
Abstract:
In this paper, we propose a method of resolving dependency ambiguities of Korean subordinate clauses based on Support Vector Machines (SVMs). Dependency analysis of clauses is well known to be one of the most difficult tasks in parsing sentences, especially in Korean. In order to solve this problem, we assume that the dependency relation of Korean subordinate clauses is the dependency relation among verb phrase, verb and endings in the clauses. As a result, this problem is represented as a binary classification task. In order to apply SVMs to this problem, we selected two kinds of features: static and dynamic features. The experimental results on STEP2000 corpus show that our system achieves the accuracy of 73.5%.Keywords: Dependency analysis, subordinate clauses, binaryclassification, support vector machines.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15972927 Fuzzy Rules Generation and Extraction from Support Vector Machine Based on Kernel Function Firing Signals
Authors: Prasan Pitiranggon, Nunthika Benjathepanun, Somsri Banditvilai, Veera Boonjing
Abstract:
Our study proposes an alternative method in building Fuzzy Rule-Based System (FRB) from Support Vector Machine (SVM). The first set of fuzzy IF-THEN rules is obtained through an equivalence of the SVM decision network and the zero-ordered Sugeno FRB type of the Adaptive Network Fuzzy Inference System (ANFIS). The second set of rules is generated by combining the first set based on strength of firing signals of support vectors using Gaussian kernel. The final set of rules is then obtained from the second set through input scatter partitioning. A distinctive advantage of our method is the guarantee that the number of final fuzzy IFTHEN rules is not more than the number of support vectors in the trained SVM. The final FRB system obtained is capable of performing classification with results comparable to its SVM counterpart, but it has an advantage over the black-boxed SVM in that it may reveal human comprehensible patterns.Keywords: Fuzzy Rule Base, Rule Extraction, Rule Generation, Support Vector Machine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19022926 Estimation of Time -Varying Linear Regression with Unknown Time -Volatility via Continuous Generalization of the Akaike Information Criterion
Authors: Elena Ezhova, Vadim Mottl, Olga Krasotkina
Abstract:
The problem of estimating time-varying regression is inevitably concerned with the necessity to choose the appropriate level of model volatility - ranging from the full stationarity of instant regression models to their absolute independence of each other. In the stationary case the number of regression coefficients to be estimated equals that of regressors, whereas the absence of any smoothness assumptions augments the dimension of the unknown vector by the factor of the time-series length. The Akaike Information Criterion is a commonly adopted means of adjusting a model to the given data set within a succession of nested parametric model classes, but its crucial restriction is that the classes are rigidly defined by the growing integer-valued dimension of the unknown vector. To make the Kullback information maximization principle underlying the classical AIC applicable to the problem of time-varying regression estimation, we extend it onto a wider class of data models in which the dimension of the parameter is fixed, but the freedom of its values is softly constrained by a family of continuously nested a priori probability distributions.Keywords: Time varying regression, time-volatility of regression coefficients, Akaike Information Criterion (AIC), Kullback information maximization principle.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15342925 Indonesian News Classification using Support Vector Machine
Authors: Dewi Y. Liliana, Agung Hardianto, M. Ridok
Abstract:
Digital news with a variety topics is abundant on the internet. The problem is to classify news based on its appropriate category to facilitate user to find relevant news rapidly. Classifier engine is used to split any news automatically into the respective category. This research employs Support Vector Machine (SVM) to classify Indonesian news. SVM is a robust method to classify binary classes. The core processing of SVM is in the formation of an optimum separating plane to separate the different classes. For multiclass problem, a mechanism called one against one is used to combine the binary classification result. Documents were taken from the Indonesian digital news site, www.kompas.com. The experiment showed a promising result with the accuracy rate of 85%. This system is feasible to be implemented on Indonesian news classification.Keywords: classification, Indonesian news, text processing, support vector machine
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 34892924 Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine
Authors: Bingchun Liu, Pei-Chann Chang, Natasha Huang, Dun Li
Abstract:
Machine Learning and Data Mining are the two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a wildly used technique to predict qualitative variables and is generally preferred over regression from an operational point of view. Due to the enormous increase in air pollution in various countries especially China, Air Quality Classification has become one of the most important topics in air quality research and modelling. This study aims at introducing a hybrid classification model based on information theory and Support Vector Machine (SVM) using the air quality data of four cities in China namely Beijing, Guangzhou, Shanghai and Tianjin from Jan 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection has classified the daily air quality into 6 levels namely Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent based on their respective Air Quality Index (AQI) values. Using the information theory, information gain (IG) is calculated and feature selection is done for both categorical features and continuous numeric features. Then SVM Machine Learning algorithm is implemented on the selected features with cross-validation. The final evaluation reveals that the IG and SVM hybrid model performs better than SVM (alone), Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of accuracy as well as complexity.
Keywords: Machine learning, air quality classification, air quality index, information gain, support vector machine, cross-validation.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 9482923 A Hybrid GMM/SVM System for Text Independent Speaker Identification
Authors: Rafik Djemili, Mouldi Bedda, Hocine Bourouba
Abstract:
This paper proposes a novel approach that combines statistical models and support vector machines. A hybrid scheme which appropriately incorporates the advantages of both the generative and discriminant model paradigms is described and evaluated. Support vector machines (SVMs) are trained to divide the whole speakers' space into small subsets of speakers within a hierarchical tree structure. During testing a speech token is assigned to its corresponding group and evaluation using gaussian mixture models (GMMs) is then processed. Experimental results show that the proposed method can significantly improve the performance of text independent speaker identification task. We report improvements of up to 50% reduction in identification error rate compared to the baseline statistical model.Keywords: Speaker identification, Gaussian mixture model (GMM), support vector machine (SVM), hybrid GMM/SVM.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22372922 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach
Authors: Rajvir Kaur, Jeewani Anupama Ginige
Abstract:
With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is at the stage of its transition from clinician oriented to technology oriented. Many people around the world die of cancer because the diagnosis of disease was not done at an early stage. Nowadays, the computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out the comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out based on standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results based on the evaluation metrics show that ANN showed the highest-level accuracy (99.4%) when tested with breast cancer dataset. On the other hand, when these ML classifiers are tested with the cervical cancer dataset, Ensemble (Bagged Tree) technique gave better accuracy (93.1%) in comparison to other classifiers.Keywords: Artificial neural networks, breast cancer, cancer dataset, classifiers, cervical cancer, F-score, logistic regression, machine learning, precision, recall, support vector machine.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15532921 Support Vector Machine Approach for Classification of Cancerous Prostate Regions
Authors: Metehan Makinacı
Abstract:
The objective of this paper, is to apply support vector machine (SVM) approach for the classification of cancerous and normal regions of prostate images. Three kinds of textural features are extracted and used for the analysis: parameters of the Gauss- Markov random field (GMRF), correlation function and relative entropy. Prostate images are acquired by the system consisting of a microscope, video camera and a digitizing board. Cross-validated classification over a database of 46 images is implemented to evaluate the performance. In SVM classification, sensitivity and specificity of 96.2% and 97.0% are achieved for the 32x32 pixel block sized data, respectively, with an overall accuracy of 96.6%. Classification performance is compared with artificial neural network and k-nearest neighbor classifiers. Experimental results demonstrate that the SVM approach gives the best performance.
Keywords: Computer-aided diagnosis, support vector machines, Gauss-Markov random fields, texture classification.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17932920 Consumer Product Demand Forecasting based on Artificial Neural Network and Support Vector Machine
Authors: Karin Kandananond
Abstract:
The nature of consumer products causes the difficulty in forecasting the future demands and the accuracy of the forecasts significantly affects the overall performance of the supply chain system. In this study, two data mining methods, artificial neural network (ANN) and support vector machine (SVM), were utilized to predict the demand of consumer products. The training data used was the actual demand of six different products from a consumer product company in Thailand. The results indicated that SVM had a better forecast quality (in term of MAPE) than ANN in every category of products. Moreover, another important finding was the margin difference of MAPE from these two methods was significantly high when the data was highly correlated.Keywords: Artificial neural network (ANN), Bullwhip effect, Consumer products, Demand forecasting, Supply chain, Support vector machine (SVM).
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 30092919 Modeling of Reinforcement in Concrete Beams Using Machine Learning Tools
Authors: Yogesh Aggarwal
Abstract:
The paper discusses the results obtained to predict reinforcement in singly reinforced beam using Neural Net (NN), Support Vector Machines (SVM-s) and Tree Based Models. Major advantage of SVM-s over NN is of minimizing a bound on the generalization error of model rather than minimizing a bound on mean square error over the data set as done in NN. Tree Based approach divides the problem into a small number of sub problems to reach at a conclusion. Number of data was created for different parameters of beam to calculate the reinforcement using limit state method for creation of models and validation. The results from this study suggest a remarkably good performance of tree based and SVM-s models. Further, this study found that these two techniques work well and even better than Neural Network methods. A comparison of predicted values with actual values suggests a very good correlation coefficient with all four techniques.Keywords: Linear Regression, M5 Model Tree, Neural Network, Support Vector Machines.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 20362918 Offline Signature Recognition using Radon Transform
Authors: M.Radmehr, S.M.Anisheh, I.Yousefian
Abstract:
In this work a new offline signature recognition system based on Radon Transform, Fractal Dimension (FD) and Support Vector Machine (SVM) is presented. In the first step, projections of original signatures along four specified directions have been performed using radon transform. Then, FDs of four obtained vectors are calculated to construct a feature vector for each signature. These vectors are then fed into SVM classifier for recognition of signatures. In order to evaluate the effectiveness of the system several experiments are carried out. Offline signature database from signature verification competition (SVC) 2004 is used during all of the tests. Experimental result indicates that the proposed method achieved high accuracy rate in signature recognition.Keywords: Fractal Dimension, Offline Signature Recognition, Radon Transform, Support Vector Machine
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 26012917 Hybrid Anomaly Detection Using Decision Tree and Support Vector Machine
Authors: Elham Serkani, Hossein Gharaee Garakani, Naser Mohammadzadeh, Elaheh Vaezpour
Abstract:
Intrusion detection systems (IDS) are the main components of network security. These systems analyze the network events for intrusion detection. The design of an IDS is through the training of normal traffic data or attack. The methods of machine learning are the best ways to design IDSs. In the method presented in this article, the pruning algorithm of C5.0 decision tree is being used to reduce the features of traffic data used and training IDS by the least square vector algorithm (LS-SVM). Then, the remaining features are arranged according to the predictor importance criterion. The least important features are eliminated in the order. The remaining features of this stage, which have created the highest level of accuracy in LS-SVM, are selected as the final features. The features obtained, compared to other similar articles which have examined the selected features in the least squared support vector machine model, are better in the accuracy, true positive rate, and false positive. The results are tested by the UNSW-NB15 dataset.
Keywords: Intrusion detection system, decision tree, support vector machine, feature selection.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12412916 Autonomously Determining the Parameters for SVDD with RBF Kernel from a One-Class Training Set
Authors: Andreas Theissler, Ian Dear
Abstract:
The one-class support vector machine “support vector data description” (SVDD) is an ideal approach for anomaly or outlier detection. However, for the applicability of SVDD in real-world applications, the ease of use is crucial. The results of SVDD are massively determined by the choice of the regularisation parameter C and the kernel parameter of the widely used RBF kernel. While for two-class SVMs the parameters can be tuned using cross-validation based on the confusion matrix, for a one-class SVM this is not possible, because only true positives and false negatives can occur during training. This paper proposes an approach to find the optimal set of parameters for SVDD solely based on a training set from one class and without any user parameterisation. Results on artificial and real data sets are presented, underpinning the usefulness of the approach.
Keywords: Support vector data description, anomaly detection, one-class classification, parameter tuning.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2935