Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 53

support vector machine Related Abstracts

53 A New Approach of Preprocessing with SVM Optimization Based on PSO for Bearing Fault Diagnosis

Authors: Tawfik Thelaidjia, Salah Chenikher

Abstract:

Bearing fault diagnosis has attracted significant attention over the past few decades. It consists of two major parts: vibration signal feature extraction and condition classification for the extracted features. In this paper, feature extraction from faulty bearing vibration signals is performed by a combination of the signal’s Kurtosis and features obtained through the preprocessing of the vibration signal samples using Db2 discrete wavelet transform at the fifth level of decomposition. In this way, a 7-dimensional vector of the vibration signal feature is obtained. After feature extraction from vibration signal, the support vector machine (SVM) was applied to automate the fault diagnosis procedure. To improve the classification accuracy for bearing fault prediction, particle swarm optimization (PSO) is employed to simultaneously optimize the SVM kernel function parameter and the penalty parameter. The results have shown feasibility and effectiveness of the proposed approach

Keywords: Machine Learning, Condition Monitoring, Fault diagnosis, Particle Swarm Optimization, Rotating Machines, discrete wavelet transform, kurtosis, roller bearing, support vector machine, vibration measurement

Procedia PDF Downloads 280
52 Vehicle Type Classification with Geometric and Appearance Attributes

Authors: Ghada S. Moussa

Abstract:

With the increase in population along with economic prosperity, an enormous increase in the number and types of vehicles on the roads occurred. This fact brings a growing need for efficiently yet effectively classifying vehicles into their corresponding categories, which play a crucial role in many areas of infrastructure planning and traffic management. This paper presents two vehicle-type classification approaches; 1) geometric-based and 2) appearance-based. The two classification approaches are used for two tasks: multi-class and intra-class vehicle classifications. For the evaluation purpose of the proposed classification approaches’ performance and the identification of the most effective yet efficient one, 10-fold cross-validation technique is used with a large dataset. The proposed approaches are distinguishable from previous research on vehicle classification in which: i) they consider both geometric and appearance attributes of vehicles, and ii) they perform remarkably well in both multi-class and intra-class vehicle classification. Experimental results exhibit promising potentials implementations of the proposed vehicle classification approaches into real-world applications.

Keywords: support vector machine, appearance attributes, geometric attributes, vehicle classification

Procedia PDF Downloads 197
51 Dissolved Oxygen Prediction Using Support Vector Machine

Authors: Mogeeb Mosleh, Sorayya Malek, Sharifah M. Syed

Abstract:

In this study, Support Vector Machine (SVM) technique was applied to predict the dichotomized value of Dissolved oxygen (DO) from two freshwater lakes namely Chini and Bera Lake (Malaysia). Data sample contained 11 parameters for water quality features from year 2005 until 2009. All data parameters were used to predicate the dissolved oxygen concentration which was dichotomized into 3 different levels (High, Medium, and Low). The input parameters were ranked, and forward selection method was applied to determine the optimum parameters that yield the lowest errors, and highest accuracy. Initial results showed that pH, water temperature, and conductivity are the most important parameters that significantly affect the predication of DO. Then, SVM model was applied using the Anova kernel with those parameters yielded 74% accuracy rate. We concluded that using SVM models to predicate the DO is feasible, and using dichotomized value of DO yields higher prediction accuracy than using precise DO value.

Keywords: Water Quality, support vector machine, dissolved oxygen, predication DO

Procedia PDF Downloads 157
50 Comparative Analysis of Spectral Estimation Methods for Brain-Computer Interfaces

Authors: Rafik Djemili, Hocine Bourouba, M. C. Amara Korba

Abstract:

In this paper, we present a method in order to classify EEG signals for Brain-Computer Interfaces (BCI). EEG signals are first processed by means of spectral estimation methods to derive reliable features before classification step. Spectral estimation methods used are standard periodogram and the periodogram calculated by the Welch method; both methods are compared with Logarithm of Band Power (logBP) features. In the method proposed, we apply Linear Discriminant Analysis (LDA) followed by Support Vector Machine (SVM). Classification accuracy reached could be as high as 85%, which proves the effectiveness of classification of EEG signals based BCI using spectral methods.

Keywords: Motor imagery, Brain-Computer Interface, support vector machine, electroencephalogram, linear discriminant analysis

Procedia PDF Downloads 270
49 Classifier for Liver Ultrasound Images

Authors: Soumya Sajjan

Abstract:

Liver cancer is the most common cancer disease worldwide in men and women, and is one of the few cancers still on the rise. Liver disease is the 4th leading cause of death. According to new NHS (National Health Service) figures, deaths from liver diseases have reached record levels, rising by 25% in less than a decade; heavy drinking, obesity, and hepatitis are believed to be behind the rise. In this study, we focus on Development of Diagnostic Classifier for Ultrasound liver lesion. Ultrasound (US) Sonography is an easy-to-use and widely popular imaging modality because of its ability to visualize many human soft tissues/organs without any harmful effect. This paper will provide an overview of underlying concepts, along with algorithms for processing of liver ultrasound images Naturaly, Ultrasound liver lesion images are having more spackle noise. Developing classifier for ultrasound liver lesion image is a challenging task. We approach fully automatic machine learning system for developing this classifier. First, we segment the liver image by calculating the textural features from co-occurrence matrix and run length method. For classification, Support Vector Machine is used based on the risk bounds of statistical learning theory. The textural features for different features methods are given as input to the SVM individually. Performance analysis train and test datasets carried out separately using SVM Model. Whenever an ultrasonic liver lesion image is given to the SVM classifier system, the features are calculated, classified, as normal and diseased liver lesion. We hope the result will be helpful to the physician to identify the liver cancer in non-invasive method.

Keywords: Segmentation, support vector machine, ultrasound liver lesion, co-occurance Matrix

Procedia PDF Downloads 252
48 Comparative Study Using WEKA for Red Blood Cells Classification

Authors: Hamid A. Jalab, Loay E. George, Azizah Suliman, Abdul Rahim Ahmad, Karim Al-Jashamy, Jameela Ali

Abstract:

Red blood cells (RBC) are the most common types of blood cells and are the most intensively studied in cell biology. The lack of RBCs is a condition in which the amount of hemoglobin level is lower than normal and is referred to as “anemia”. Abnormalities in RBCs will affect the exchange of oxygen. This paper presents a comparative study for various techniques for classifying the RBCs as normal, or abnormal (anemic) using WEKA. WEKA is an open source consists of different machine learning algorithms for data mining applications. The algorithm tested are Radial Basis Function neural network, Support vector machine, and K-Nearest Neighbors algorithm. Two sets of combined features were utilized for classification of blood cells images. The first set, exclusively consist of geometrical features, was used to identify whether the tested blood cell has a spherical shape or non-spherical cells. While the second set, consist mainly of textural features was used to recognize the types of the spherical cells. We have provided an evaluation based on applying these classification methods to our RBCs image dataset which were obtained from Serdang Hospital-alaysia, and measuring the accuracy of test results. The best achieved classification rates are 97%, 98%, and 79% for Support vector machines, Radial Basis Function neural network, and K-Nearest Neighbors algorithm respectively.

Keywords: support vector machine, red blood cells, radial basis function neural network, K-nearest neighbors algorithm

Procedia PDF Downloads 241
47 The Classification of Parkinson Tremor and Essential Tremor Based on Frequency Alteration of Different Activities

Authors: Chusak Thanawattano, Roongroj Bhidayasiri

Abstract:

This paper proposes a novel feature set utilized for classifying the Parkinson tremor and essential tremor. Ten ET and ten PD subjects are asked to perform kinetic, postural and resting tests. The empirical mode decomposition (EMD) is used to decompose collected tremor signal to a set of intrinsic mode functions (IMF). The IMFs are used for reconstructing representative signals. The feature set is composed of peak frequencies of IMFs and reconstructed signals. Hypothesize that the dominant frequency components of subjects with PD and ET change in different directions for different tests, difference of peak frequencies of IMFs and reconstructed signals of pairwise based tests (kinetic-resting, kinetic-postural and postural-resting) are considered as potential features. Sets of features are used to train and test by classifier including the quadratic discriminant classifier (QLC) and the support vector machine (SVM). The best accuracy, the best sensitivity and the best specificity are 90%, 87.5%, and 92.86%, respectively.

Keywords: Parkinson, Tremor, Spectrum Estimation, support vector machine, essential tremor, empirical mode decomposition, quadratic discriminant, peak frequency, auto-regressive

Procedia PDF Downloads 307
46 Using Probe Person Data for Travel Mode Detection

Authors: Muhammad Awais Shafique, Eiji Hato, Hideki Yaginuma

Abstract:

Recently GPS data is used in a lot of studies to automatically reconstruct travel patterns for trip survey. The aim is to minimize the use of questionnaire surveys and travel diaries so as to reduce their negative effects. In this paper data acquired from GPS and accelerometer embedded in smart phones is utilized to predict the mode of transportation used by the phone carrier. For prediction, Support Vector Machine (SVM) and Adaptive boosting (AdaBoost) are employed. Moreover a unique method to improve the prediction results from these algorithms is also proposed. Results suggest that the prediction accuracy of AdaBoost after improvement is relatively better than the rest.

Keywords: GPS, support vector machine, adaboost, accelerometer, mode prediction

Procedia PDF Downloads 218
45 A Unique Multi-Class Support Vector Machine Algorithm Using MapReduce

Authors: Aditi Viswanathan, Shree Ranjani, Aruna Govada

Abstract:

With data sizes constantly expanding, and with classical machine learning algorithms that analyze such data requiring larger and larger amounts of computation time and storage space, the need to distribute computation and memory requirements among several computers has become apparent. Although substantial work has been done in developing distributed binary SVM algorithms and multi-class SVM algorithms individually, the field of multi-class distributed SVMs remains largely unexplored. This research seeks to develop an algorithm that implements the Support Vector Machine over a multi-class data set and is efficient in a distributed environment. For this, we recursively choose the best binary split of a set of classes using a greedy technique. Much like the divide and conquer approach. Our algorithm has shown better computation time during the testing phase than the traditional sequential SVM methods (One vs. One, One vs. Rest) and out-performs them as the size of the data set grows. This approach also classifies the data with higher accuracy than the traditional multi-class algorithms.

Keywords: support vector machine, distributed algorithm, MapReduce, multi-class

Procedia PDF Downloads 258
44 Statistical Wavelet Features, PCA, and SVM-Based Approach for EEG Signals Classification

Authors: R. K. Chaurasiya, N. D. Londhe, S. Ghosh

Abstract:

The study of the electrical signals produced by neural activities of human brain is called Electroencephalography. In this paper, we propose an automatic and efficient EEG signal classification approach. The proposed approach is used to classify the EEG signal into two classes: epileptic seizure or not. In the proposed approach, we start with extracting the features by applying Discrete Wavelet Transform (DWT) in order to decompose the EEG signals into sub-bands. These features, extracted from details and approximation coefficients of DWT sub-bands, are used as input to Principal Component Analysis (PCA). The classification is based on reducing the feature dimension using PCA and deriving the support-vectors using Support Vector Machine (SVM). The experimental are performed on real and standard dataset. A very high level of classification accuracy is obtained in the result of classification.

Keywords: Pattern Recognition, Principal Component Analysis, discrete wavelet transform, support vector machine, electroencephalogram

Procedia PDF Downloads 387
43 Identification of Hepatocellular Carcinoma Using Supervised Learning Algorithms

Authors: Sagri Sharma

Abstract:

Analysis of diseases integrating multi-factors increases the complexity of the problem and therefore, development of frameworks for the analysis of diseases is an issue that is currently a topic of intense research. Due to the inter-dependence of the various parameters, the use of traditional methodologies has not been very effective. Consequently, newer methodologies are being sought to deal with the problem. Supervised Learning Algorithms are commonly used for performing the prediction on previously unseen data. These algorithms are commonly used for applications in fields ranging from image analysis to protein structure and function prediction and they get trained using a known dataset to come up with a predictor model that generates reasonable predictions for the response to new data. Gene expression profiles generated by DNA analysis experiments can be quite complex since these experiments can involve hypotheses involving entire genomes. The application of well-known machine learning algorithm - Support Vector Machine - to analyze the expression levels of thousands of genes simultaneously in a timely, automated and cost effective way is thus used. The objectives to undertake the presented work are development of a methodology to identify genes relevant to Hepatocellular Carcinoma (HCC) from gene expression dataset utilizing supervised learning algorithms and statistical evaluations along with development of a predictive framework that can perform classification tasks on new, unseen data.

Keywords: Artificial Intelligence, Machine Learning, Biomarker, Hepatocellular Carcinoma, support vector machine, gene expression datasets, supervised learning algorithms

Procedia PDF Downloads 326
42 A System to Detect Inappropriate Messages in Online Social Networks

Authors: Shivani Singh, Shantanu Nakhare, Kalyani Nair, Rohan Shetty

Abstract:

As social networking is growing at a rapid pace today it is vital that we work on improving its management. Research has shown that the content present in online social networks may have significant influence on impressionable minds. If such platforms are misused, it will lead to negative consequences. Detecting insults or inappropriate messages continues to be one of the most challenging aspects of Online Social Networks (OSNs) today. We address this problem through a Machine Learning Based Soft Text Classifier approach using Support Vector Machine algorithm. The proposed system acts as a screening mechanism the alerts the user about such messages. The messages are classified according to their subject matter and each comment is labeled for the presence of profanity and insults.

Keywords: Machine Learning, online social networks, support vector machine, soft text classifier

Procedia PDF Downloads 346
41 Support Vector Regression with Weighted Least Absolute Deviations

Authors: Kang-Mo Jung

Abstract:

Least squares support vector machine (LS-SVM) is a penalized regression which considers both fitting and generalization ability of a model. However, the squared loss function is very sensitive to even single outlier. We proposed a weighted absolute deviation loss function for the robustness of the estimates in least absolute deviation support vector machine. The proposed estimates can be obtained by a quadratic programming algorithm. Numerical experiments on simulated datasets show that the proposed algorithm is competitive in view of robustness to outliers.

Keywords: Robustness, weight, support vector machine, quadratic programming, least absolute deviation

Procedia PDF Downloads 335
40 A Data-Mining Model for Protection of FACTS-Based Transmission Line

Authors: Ashok Kalagura

Abstract:

This paper presents a data-mining model for fault-zone identification of flexible AC transmission systems (FACTS)-based transmission line including a thyristor-controlled series compensator (TCSC) and unified power-flow controller (UPFC), using ensemble decision trees. Given the randomness in the ensemble of decision trees stacked inside the random forests model, it provides an effective decision on the fault-zone identification. Half-cycle post-fault current and voltage samples from the fault inception are used as an input vector against target output ‘1’ for the fault after TCSC/UPFC and ‘1’ for the fault before TCSC/UPFC for fault-zone identification. The algorithm is tested on simulated fault data with wide variations in operating parameters of the power system network, including noisy environment providing a reliability measure of 99% with faster response time (3/4th cycle from fault inception). The results of the presented approach using the RF model indicate the reliable identification of the fault zone in FACTS-based transmission lines.

Keywords: UPFC, support vector machine, SVM, distance relaying, fault-zone identification, random forests, RFs, thyristor-controlled series compensator, TCSC, unified power-flow controller

Procedia PDF Downloads 310
39 Towards a Balancing Medical Database by Using the Least Mean Square Algorithm

Authors: Kamel Belammi, Houria Fatrim

Abstract:

imbalanced data set, a problem often found in real world application, can cause seriously negative effect on classification performance of machine learning algorithms. There have been many attempts at dealing with classification of imbalanced data sets. In medical diagnosis classification, we often face the imbalanced number of data samples between the classes in which there are not enough samples in rare classes. In this paper, we proposed a learning method based on a cost sensitive extension of Least Mean Square (LMS) algorithm that penalizes errors of different samples with different weight and some rules of thumb to determine those weights. After the balancing phase, we applythe different classifiers (support vector machine (SVM), k- nearest neighbor (KNN) and multilayer neuronal networks (MNN)) for balanced data set. We have also compared the obtained results before and after balancing method.

Keywords: Diabetes, support vector machine, least mean square algorithm, multilayer neural networks, k- nearest neighbor, imbalanced medical data

Procedia PDF Downloads 396
38 A Supervised Approach for Detection of Singleton Spam Reviews

Authors: Naomie Salim, Mohammadali Tavakoli, Atefeh Heydari

Abstract:

In recent years, we have witnessed that online reviews are the most important source of customers’ opinion. They are progressively more used by individuals and organisations to make purchase and business decisions. Unfortunately, for the reason of profit or fame, frauds produce deceptive reviews to hoodwink potential customers. Their activities mislead not only potential customers to make appropriate purchasing decisions and organisations to reshape their business, but also opinion mining techniques by preventing them from reaching accurate results. Spam reviews could be divided into two main groups, i.e. multiple and singleton spam reviews. Detecting a singleton spam review that is the only review written by a user ID is extremely challenging due to lack of clue for detection purposes. Singleton spam reviews are very harmful and various features and proofs used in multiple spam reviews detection are not applicable in this case. Current research aims to propose a novel supervised technique to detect singleton spam reviews. To achieve this, various features are proposed in this study and are to be combined with the most appropriate features extracted from literature and employed in a classifier. In order to compare the performance of different classifiers, SVM and naive Bayes classification algorithms were used for model building. The results revealed that SVM was more accurate than naive Bayes and our proposed technique is capable to detect singleton spam reviews effectively.

Keywords: Classification Algorithms, naive Bayes, support vector machine, opinion review spam detection, singleton review spam detection

Procedia PDF Downloads 173
37 One-Class Support Vector Machine for Sentiment Analysis of Movie Review Documents

Authors: Chothmal , Basant Agarwal

Abstract:

Sentiment analysis means to classify a given review document into positive or negative polar document. Sentiment analysis research has been increased tremendously in recent times due to its large number of applications in the industry and academia. Sentiment analysis models can be used to determine the opinion of the user towards any entity or product. E-commerce companies can use sentiment analysis model to improve their products on the basis of users’ opinion. In this paper, we propose a new One-class Support Vector Machine (One-class SVM) based sentiment analysis model for movie review documents. In the proposed approach, we initially extract features from one class of documents, and further test the given documents with the one-class SVM model if a given new test document lies in the model or it is an outlier. Experimental results show the effectiveness of the proposed sentiment analysis model.

Keywords: Machine Learning, sentiment analysis, support vector machine, feature selection methods, one-class SVM

Procedia PDF Downloads 265
36 A Computer-Aided System for Detection and Classification of Liver Cirrhosis

Authors: Abdel Hadi N. Ebraheim, Eman Azomi, Nefisa A. Fahmy

Abstract:

This paper designs and implements a computer-aided system (CAS) to help detect and diagnose liver cirrhosis in patients with Chronic Hepatitis C. Our system reduces the required features (tests) the patient is asked to do to tests to their minimal best most informative subset of tests, with a diagnostic accuracy above 99%, and hence saving both time and costs. We use the Support Vector Machine (SVM) with cross-validation, a Multilayer Perceptron Neural Network (MLP), and a Generalized Regression Neural Network (GRNN) that employs a base of radial functions for functional approximation, as classifiers. Our system is tested on 199 subjects, of them 99 Chronic Hepatitis C.The subjects were selected from among the outpatient clinic in National Herpetology and Tropical Medicine Research Institute (NHTMRI).

Keywords: classification, Liver Cirrhosis, Accuracy, Artificial Neural Network, multi-layer perceptron, support vector machine

Procedia PDF Downloads 305
35 Tool for Maxillary Sinus Quantification in Computed Tomography Exams

Authors: Guilherme Giacomini, Marcela de Oliveira, Fernando Antonio Bacchim Neto, Allan Felipe Fattori Alves, Diana Rodrigues De Pina, Ana Luiza Menegatti Pavan, José Ricardo de Arruda Miranda, Seizo Yamashita

Abstract:

The maxillary sinus (MS), part of the paranasal sinus complex, is one of the most enigmatic structures in modern humans. The literature has suggested that MSs function as olfaction accessories, to heat or humidify inspired air, for thermoregulation, to impart resonance to the voice and others. Thus, the real function of the MS is still uncertain. Furthermore, the MS anatomy is complex and varies from person to person. Many diseases may affect the development process of sinuses. The incidence of rhinosinusitis and other pathoses in the MS is comparatively high, so, volume analysis has clinical value. Providing volume values for MS could be helpful in evaluating the presence of any abnormality and could be used for treatment planning and evaluation of the outcome. The computed tomography (CT) has allowed a more exact assessment of this structure, which enables a quantitative analysis. However, this is not always possible in the clinical routine, and if possible, it involves much effort and/or time. Therefore, it is necessary to have a convenient, robust, and practical tool correlated with the MS volume, allowing clinical applicability. Nowadays, the available methods for MS segmentation are manual or semi-automatic. Additionally, manual methods present inter and intraindividual variability. Thus, the aim of this study was to develop an automatic tool to quantity the MS volume in CT scans of paranasal sinuses. This study was developed with ethical approval from the authors’ institutions and national review panels. The research involved 30 retrospective exams of University Hospital, Botucatu Medical School, São Paulo State University, Brazil. The tool for automatic MS quantification, developed in Matlab®, uses a hybrid method, combining different image processing techniques. For MS detection, the algorithm uses a Support Vector Machine (SVM), by features such as pixel value, spatial distribution, shape and others. The detected pixels are used as seed point for a region growing (RG) segmentation. Then, morphological operators are applied to reduce false-positive pixels, improving the segmentation accuracy. These steps are applied in all slices of CT exam, obtaining the MS volume. To evaluate the accuracy of the developed tool, the automatic method was compared with manual segmentation realized by an experienced radiologist. For comparison, we used Bland-Altman statistics, linear regression, and Jaccard similarity coefficient. From the statistical analyses for the comparison between both methods, the linear regression showed a strong association and low dispersion between variables. The Bland–Altman analyses showed no significant differences between the analyzed methods. The Jaccard similarity coefficient was > 0.90 in all exams. In conclusion, the developed tool to quantify MS volume proved to be robust, fast, and efficient, when compared with manual segmentation. Furthermore, it avoids the intra and inter-observer variations caused by manual and semi-automatic methods. As future work, the tool will be applied in clinical practice. Thus, it may be useful in the diagnosis and treatment determination of MS diseases. Providing volume values for MS could be helpful in evaluating the presence of any abnormality and could be used for treatment planning and evaluation of the outcome. The computed tomography (CT) has allowed a more exact assessment of this structure which enables a quantitative analysis. However, this is not always possible in the clinical routine, and if possible, it involves much effort and/or time. Therefore, it is necessary to have a convenient, robust and practical tool correlated with the MS volume, allowing clinical applicability. Nowadays, the available methods for MS segmentation are manual or semi-automatic. Additionally, manual methods present inter and intraindividual variability. Thus, the aim of this study was to develop an automatic tool to quantity the MS volume in CT scans of paranasal sinuses. This study was developed with ethical approval from the authors’ institutions and national review panels. The research involved 30 retrospective exams of University Hospital, Botucatu Medical School, São Paulo State University, Brazil. The tool for automatic MS quantification, developed in Matlab®, uses a hybrid method, combining different image processing techniques. For MS detection, the algorithm uses a Support Vector Machine (SVM), by features such as pixel value, spatial distribution, shape and others. The detected pixels are used as seed point for a region growing (RG) segmentation. Then, morphological operators are applied to reduce false-positive pixels, improving the segmentation accuracy. These steps are applied in all slices of CT exam, obtaining the MS volume. To evaluate the accuracy of the developed tool, the automatic method was compared with manual segmentation realized by an experienced radiologist. For comparison, we used Bland-Altman statistics, linear regression and Jaccard similarity coefficient. From the statistical analyses for the comparison between both methods, the linear regression showed a strong association and low dispersion between variables. The Bland–Altman analyses showed no significant differences between the analyzed methods. The Jaccard similarity coefficient was > 0.90 in all exams. In conclusion, the developed tool to automatically quantify MS volume proved to be robust, fast and efficient, when compared with manual segmentation. Furthermore, it avoids the intra and inter-observer variations caused by manual and semi-automatic methods. As future work, the tool will be applied in clinical practice. Thus, it may be useful in the diagnosis and treatment determination of MS diseases.

Keywords: support vector machine, maxillary sinus, region growing, volume quantification

Procedia PDF Downloads 376
34 Annual Water Level Simulation Using Support Vector Machine

Authors: Maryam Khalilzadeh Poshtegal, Seyed Ahmad Mirbagheri, Mojtaba Noury

Abstract:

In this paper, by application of the input yearly data of rainfall, temperature and flow to the Urmia Lake, the simulation of water level fluctuation were applied by means of three models. According to the climate change investigation the fluctuation of lakes water level are of high interest. This study investigate data-driven models, support vector machines (SVM), SVM method which is a new regression procedure in water resources are applied to the yearly level data of Lake Urmia that is the biggest and the hyper saline lake in Iran. The evaluated lake levels are found to be in good correlation with the observed values. The results of SVM simulation show better accuracy and implementation. The mean square errors, mean absolute relative errors and determination coefficient statistics are used as comparison criteria.

Keywords: Simulation, support vector machine, Urmia Lake, water level fluctuation

Procedia PDF Downloads 219
33 Data Mining Approach for Commercial Data Classification and Migration in Hybrid Storage Systems

Authors: Mais Haj Qasem, Maen M. Al Assaf, Ali Rodan

Abstract:

Parallel hybrid storage systems consist of a hierarchy of different storage devices that vary in terms of data reading speed performance. As we ascend in the hierarchy, data reading speed becomes faster. Thus, migrating the application’ important data that will be accessed in the near future to the uppermost level will reduce the application I/O waiting time; hence, reducing its execution elapsed time. In this research, we implement trace-driven two-levels parallel hybrid storage system prototype that consists of HDDs and SSDs. The prototype uses data mining techniques to classify application’ data in order to determine its near future data accesses in parallel with the its on-demand request. The important data (i.e. the data that the application will access in the near future) are continuously migrated to the uppermost level of the hierarchy. Our simulation results show that our data migration approach integrated with data mining techniques reduces the application execution elapsed time when using variety of traces in at least to 22%.

Keywords: Data Mining, support vector machine, hybrid storage system, recurrent neural network

Procedia PDF Downloads 174
32 Evaluation of Ensemble Classifiers for Intrusion Detection

Authors: M. Govindarajan

Abstract:

One of the major developments in machine learning in the past decade is the ensemble method, which finds highly accurate classifier by combining many moderately accurate component classifiers. In this research work, new ensemble classification methods are proposed with homogeneous ensemble classifier using bagging and heterogeneous ensemble classifier using arcing and their performances are analyzed in terms of accuracy. A Classifier ensemble is designed using Radial Basis Function (RBF) and Support Vector Machine (SVM) as base classifiers. The feasibility and the benefits of the proposed approaches are demonstrated by the means of standard datasets of intrusion detection. The main originality of the proposed approach is based on three main parts: preprocessing phase, classification phase, and combining phase. A wide range of comparative experiments is conducted for standard datasets of intrusion detection. The performance of the proposed homogeneous and heterogeneous ensemble classifiers are compared to the performance of other standard homogeneous and heterogeneous ensemble methods. The standard homogeneous ensemble methods include Error correcting output codes, Dagging and heterogeneous ensemble methods include majority voting, stacking. The proposed ensemble methods provide significant improvement of accuracy compared to individual classifiers and the proposed bagged RBF and SVM performs significantly better than ECOC and Dagging and the proposed hybrid RBF-SVM performs significantly better than voting and stacking. Also heterogeneous models exhibit better results than homogeneous models for standard datasets of intrusion detection. 

Keywords: Data Mining, Accuracy, Ensemble, support vector machine, radial basis function

Procedia PDF Downloads 126
31 Decision Support System for Diagnosis of Breast Cancer

Authors: Oluwaponmile D. Alao

Abstract:

In this paper, two models have been developed to ascertain the best network needed for diagnosis of breast cancer. Breast cancer has been a disease that required the attention of the medical practitioner. Experience has shown that misdiagnose of the disease has been a major challenge in the medical field. Therefore, designing a system with adequate performance for will help in making diagnosis of the disease faster and accurate. In this paper, two models: backpropagation neural network and support vector machine has been developed. The performance obtained is also compared with other previously obtained algorithms to ascertain the best algorithms.

Keywords: Data Mining, Neural Network, Breast Cancer, support vector machine

Procedia PDF Downloads 217
30 Performance of On-site Earthquake Early Warning Systems for Different Sensor Locations

Authors: Ting-Yu Hsu, Shyu-Yu Wu, Shieh-Kung Huang, Hung-Wei Chiang, Kung-Chun Lu, Pei-Yang Lin, Kuo-Liang Wen

Abstract:

Regional earthquake early warning (EEW) systems are not suitable for Taiwan, as most destructive seismic hazards arise due to in-land earthquakes. These likely cause the lead-time provided by regional EEW systems before a destructive earthquake wave arrives to become null. On the other hand, an on-site EEW system can provide more lead-time at a region closer to an epicenter, since only seismic information of the target site is required. Instead of leveraging the information of several stations, the on-site system extracts some P-wave features from the first few seconds of vertical ground acceleration of a single station and performs a prediction of the oncoming earthquake intensity at the same station according to these features. Since seismometers could be triggered by non-earthquake events such as a passing of a truck or other human activities, to reduce the likelihood of false alarms, a seismometer was installed at three different locations on the same site and the performance of the EEW system for these three sensor locations were discussed. The results show that the location on the ground of the first floor of a school building maybe a good choice, since the false alarms could be reduced and the cost for installation and maintenance is the lowest.

Keywords: support vector machine, earthquake early warning, on-site, seismometer location

Procedia PDF Downloads 119
29 An Analysis of Classification of Imbalanced Datasets by Using Synthetic Minority Over-Sampling Technique

Authors: Ghada A. Alfattni

Abstract:

Analysing unbalanced datasets is one of the challenges that practitioners in machine learning field face. However, many researches have been carried out to determine the effectiveness of the use of the synthetic minority over-sampling technique (SMOTE) to address this issue. The aim of this study was therefore to compare the effectiveness of the SMOTE over different models on unbalanced datasets. Three classification models (Logistic Regression, Support Vector Machine and Nearest Neighbour) were tested with multiple datasets, then the same datasets were oversampled by using SMOTE and applied again to the three models to compare the differences in the performances. Results of experiments show that the highest number of nearest neighbours gives lower values of error rates. 

Keywords: Machine Learning, Logistic Regression, support vector machine, nearest neighbour, SMOTE, imbalanced datasets

Procedia PDF Downloads 201
28 Automatic Detection and Classification of Diabetic Retinopathy Using Retinal Fundus Images

Authors: A. Biran, P. Sobhe Bidari, K. Raahemifar, A. Almazroe, V. Lakshminarayanan

Abstract:

Diabetic Retinopathy (DR) is a severe retinal disease which is caused by diabetes mellitus. It leads to blindness when it progress to proliferative level. Early indications of DR are the appearance of microaneurysms, hemorrhages and hard exudates. In this paper, an automatic algorithm for detection of DR has been proposed. The algorithm is based on combination of several image processing techniques including Circular Hough Transform (CHT), Contrast Limited Adaptive Histogram Equalization (CLAHE), Gabor filter and thresholding. Also, Support Vector Machine (SVM) Classifier is used to classify retinal images to normal or abnormal cases including non-proliferative or proliferative DR. The proposed method has been tested on images selected from Structured Analysis of the Retinal (STARE) database using MATLAB code. The method is perfectly able to detect DR. The sensitivity specificity and accuracy of this approach are 90%, 87.5%, and 91.4% respectively.

Keywords: Diabetic retinopathy, support vector machine, fundus images, Gabor filter, STARE

Procedia PDF Downloads 164
27 Improved Classification Procedure for Imbalanced and Overlapped Situations

Authors: Seoung Bum Kim, Hankyu Lee

Abstract:

The issue with imbalance and overlapping in the class distribution becomes important in various applications of data mining. The imbalanced dataset is a special case in classification problems in which the number of observations of one class (i.e., major class) heavily exceeds the number of observations of the other class (i.e., minor class). Overlapped dataset is the case where many observations are shared together between the two classes. Imbalanced and overlapped data can be frequently found in many real examples including fraud and abuse patients in healthcare, quality prediction in manufacturing, text classification, oil spill detection, remote sensing, and so on. The class imbalance and overlap problem is the challenging issue because this situation degrades the performance of most of the standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting data space into three parts: nonoverlapping, light overlapping, and severe overlapping and applying the classification algorithm in each part. These three parts were determined based on the Hausdorff distance and the margin of the modified support vector machine. An experiments study was conducted to examine the properties of the proposed method and compared it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through the experiment with real data.

Keywords: classification, support vector machine, imbalanced data with class overlap, split data space

Procedia PDF Downloads 172
26 A Nonlinear Feature Selection Method for Hyperspectral Image Classification

Authors: Pei-Jyun Hsieh, Cheng-Hsuan Li, Bor-Chen Kuo

Abstract:

For hyperspectral image classification, feature reduction is an important pre-processing for avoiding the Hughes phenomena due to the difficulty for collecting training samples. Hence, lots of researches developed feature selection methods such as F-score, HSIC (Hilbert-Schmidt Independence Criterion), and etc., to improve hyperspectral image classification. However, most of them only consider the class separability in the original space, i.e., a linear class separability. In this study, we proposed a nonlinear class separability measure based on kernel trick for selecting an appropriate feature subset. The proposed nonlinear class separability was formed by a generalized RBF kernel with different bandwidths with respect to different features. Moreover, it considered the within-class separability and the between-class separability. A genetic algorithm was applied to tune these bandwidths such that the smallest with-class separability and the largest between-class separability simultaneously. This indicates the corresponding feature space is more suitable for classification. In addition, the corresponding nonlinear classification boundary can separate classes very well. These optimal bandwidths also show the importance of bands for hyperspectral image classification. The reciprocals of these bandwidths can be viewed as weights of bands. The smaller bandwidth, the larger weight of the band, and the more importance for classification. Hence, the descending order of the reciprocals of the bands gives an order for selecting the appropriate feature subsets. In the experiments, three hyperspectral image data sets, the Indian Pine Site data set, the PAVIA data set, and the Salinas A data set, were used to demonstrate the selected feature subsets by the proposed nonlinear feature selection method are more appropriate for hyperspectral image classification. Only ten percent of samples were randomly selected to form the training dataset. All non-background samples were used to form the testing dataset. The support vector machine was applied to classify these testing samples based on selected feature subsets. According to the experiments on the Indian Pine Site data set with 220 bands, the highest accuracies by applying the proposed method, F-score, and HSIC are 0.8795, 0.8795, and 0.87404, respectively. However, the proposed method selects 158 features. F-score and HSIC select 168 features and 217 features, respectively. Moreover, the classification accuracies increase dramatically only using first few features. The classification accuracies with respect to feature subsets of 10 features, 20 features, 50 features, and 110 features are 0.69587, 0.7348, 0.79217, and 0.84164, respectively. Furthermore, only using half selected features (110 features) of the proposed method, the corresponding classification accuracy (0.84168) is approximate to the highest classification accuracy, 0.8795. For other two hyperspectral image data sets, the PAVIA data set and Salinas A data set, we can obtain the similar results. These results illustrate our proposed method can efficiently find feature subsets to improve hyperspectral image classification. One can apply the proposed method to determine the suitable feature subset first according to specific purposes. Then researchers can only use the corresponding sensors to obtain the hyperspectral image and classify the samples. This can not only improve the classification performance but also reduce the cost for obtaining hyperspectral images.

Keywords: support vector machine, hyperspectral image classification, nonlinear feature selection, kernel trick

Procedia PDF Downloads 142
25 Alternator Fault Detection Using Wigner-Ville Distribution

Authors: Amir Abolfazl Suratgar, Amin Ranjbar, Amir Arsalan Jalili Zolfaghari, Mehrdad Khajavi

Abstract:

This paper describes two stages of learning-based fault detection procedure in alternators. The procedure consists of three states of machine condition namely shortened brush, high impedance relay and maintaining a healthy condition in the alternator. The fault detection algorithm uses Wigner-Ville distribution as a feature extractor and also appropriate feature classifier. In this work, ANN (Artificial Neural Network) and also SVM (support vector machine) were compared to determine more suitable performance evaluated by the mean squared of errors criteria. Modules work together to detect possible faulty conditions of machines working. To test the method performance, a signal database is prepared by making different conditions on a laboratory setup. Therefore, it seems by implementing this method, satisfactory results are achieved.

Keywords: Time-Frequency Analysis, Artificial Neural Network, support vector machine, alternator, Wigner-Ville distribution

Procedia PDF Downloads 196
24 Pyramid Binary Pattern for Age Invariant Face Verification

Authors: Saroj Bijarnia, Preety Singh

Abstract:

We propose a simple and effective biometrics system based on face verification across aging using a new variant of texture feature, Pyramid Binary Pattern. This employs Local Binary Pattern along with its hierarchical information. Dimension reduction of generated texture feature vector is done using Principal Component Analysis. Support Vector Machine is used for classification. Our proposed method achieves an accuracy of 92:24% and can be used in an automated age-invariant face verification system.

Keywords: Biometrics, Verification, support vector machine, age invariant

Procedia PDF Downloads 207