Search results for: fake health news classification model.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9233

Search results for: fake health news classification model.

9143 Authenticity of Ecuadorian Commercial Honeys

Authors: Elisabetta Schievano, Valentina Zuccato, Claudia Finotello, Patricia Vit

Abstract:

Control of honey frauds is needed in Ecuador to protect bee keepers and consumers because simple syrups and new syrups with eucalyptus are sold as genuine honeys. Authenticity of Ecuadorian commercial honeys was tested with a vortex emulsion consisting on one volume of honey:water (1:1) dilution, and two volumes of diethyl ether. This method allows a separation of phases in one minute to discriminate genuine honeys that form three phase and fake honeys that form two phases; 34 of the 42 honeys analyzed from five provinces of Ecuador were genuine. This was confirmed with 1H NMR spectra of honey dilutions in deuterated water with an enhanced amino acid region with signals for proline, phenylalanine and tyrosine. Classic quality indicators were also tested with this method (sugars, HMF), indicators of fermentation (ethanol, acetic acid), and residues of citric acid used in the syrup manufacture. One of the honeys gave a false positive for genuine, being an admixture of genuine honey with added syrup, evident for the high sucrose. Sensory analysis was the final confirmation to recognize the honey groups studied here, namely honey produced in combs by Apis mellifera, fake honey, and honey produced in cerumen pots by Geotrigona, Melipona, and Scaptotrigona. Chloroform extractions of honey were also done to search lipophilic additives in NMR spectra. This is a valuable contribution to protect honey consumers, and to develop the beekeeping industry in Ecuador.

Keywords: Fake, genuine, honey, 1H NMR, Ecuador.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2665
9142 Data Mining Classification Methods Applied in Drug Design

Authors: Mária Stachová, Lukáš Sobíšek

Abstract:

Data mining incorporates a group of statistical methods used to analyze a set of information, or a data set. It operates with models and algorithms, which are powerful tools with the great potential. They can help people to understand the patterns in certain chunk of information so it is obvious that the data mining tools have a wide area of applications. For example in the theoretical chemistry data mining tools can be used to predict moleculeproperties or improve computer-assisted drug design. Classification analysis is one of the major data mining methodologies. The aim of thecontribution is to create a classification model, which would be able to deal with a huge data set with high accuracy. For this purpose logistic regression, Bayesian logistic regression and random forest models were built using R software. TheBayesian logistic regression in Latent GOLD software was created as well. These classification methods belong to supervised learning methods. It was necessary to reduce data matrix dimension before construct models and thus the factor analysis (FA) was used. Those models were applied to predict the biological activity of molecules, potential new drug candidates.

Keywords: data mining, classification, drug design, QSAR

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2817
9141 Classification of Health Risk Factors to Predict the Risk of Falling in Older Adults

Authors: L. Lindsay, S. A. Coleman, D. Kerr, B. J. Taylor, A. Moorhead

Abstract:

Cognitive decline and frailty is apparent in older adults leading to an increased likelihood of the risk of falling. Currently health care professionals have to make professional decisions regarding such risks, and hence make difficult decisions regarding the future welfare of the ageing population. This study uses health data from The Irish Longitudinal Study on Ageing (TILDA), focusing on adults over the age of 50 years, in order to analyse health risk factors and predict the likelihood of falls. This prediction is based on the use of machine learning algorithms whereby health risk factors are used as inputs to predict the likelihood of falling. Initial results show that health risk factors such as long-term health issues contribute to the number of falls. The identification of such health risk factors has the potential to inform health and social care professionals, older people and their family members in order to mitigate daily living risks.

Keywords: Classification, falls, health risk factors, machine learning, older adults.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1003
9140 Vehicle Type Classification with Geometric and Appearance Attributes

Authors: Ghada S. Moussa

Abstract:

With the increase in population along with economic prosperity, an enormous increase in the number and types of vehicles on the roads occurred. This fact brings a growing need for efficiently yet effectively classifying vehicles into their corresponding categories, which play a crucial role in many areas of infrastructure planning and traffic management.

This paper presents two vehicle-type classification approaches; 1) geometric-based and 2) appearance-based. The two classification approaches are used for two tasks: multi-class and intra-class vehicle classifications. For the evaluation purpose of the proposed classification approaches’ performance and the identification of the most effective yet efficient one, 10-fold cross-validation technique is used with a large dataset. The proposed approaches are distinguishable from previous research on vehicle classification in which: i) they consider both geometric and appearance attributes of vehicles, and ii) they perform remarkably well in both multi-class and intra-class vehicle classification. Experimental results exhibit promising potentials implementations of the proposed vehicle classification approaches into real-world applications.

Keywords: Appearance attributes, Geometric attributes, Support vector machine, Vehicle classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4233
9139 Wavelet and K-L Seperability Based Feature Extraction Method for Functional Data Classification

Authors: Jun Wan, Zehua Chen, Yingwu Chen, Zhidong Bai

Abstract:

This paper proposes a novel feature extraction method, based on Discrete Wavelet Transform (DWT) and K-L Seperability (KLS), for the classification of Functional Data (FD). This method combines the decorrelation and reduction property of DWT and the additive independence property of KLS, which is helpful to extraction classification features of FD. It is an advanced approach of the popular wavelet based shrinkage method for functional data reduction and classification. A theory analysis is given in the paper to prove the consistent convergence property, and a simulation study is also done to compare the proposed method with the former shrinkage ones. The experiment results show that this method has advantages in improving classification efficiency, precision and robustness.

Keywords: classification, functional data, feature extraction, K-Lseperability, wavelet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1424
9138 Classification of Germinatable Mung Bean by Near Infrared Hyperspectral Imaging

Authors: Kaewkarn Phuangsombat, Arthit Phuangsombat, Anupun Terdwongworakul

Abstract:

Hard seeds will not grow and can cause mold in sprouting process. Thus, the hard seeds need to be separated from the normal seeds. Near infrared hyperspectral imaging in a range of 900 to 1700 nm was implemented to develop a model by partial least squares discriminant analysis to discriminate the hard seeds from the normal seeds. The orientation of the seeds was also studied to compare the performance of the models. The model based on hilum-up orientation achieved the best result giving the coefficient of determination of 0.98, and root mean square error of prediction of 0.07 with classification accuracy was equal to 100%.

Keywords: Mung bean, near infrared, germinatability, hard seed.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1125
9137 Graph-Based Text Similarity Measurement by Exploiting Wikipedia as Background Knowledge

Authors: Lu Zhang, Chunping Li, Jun Liu, Hui Wang

Abstract:

Text similarity measurement is a fundamental issue in many textual applications such as document clustering, classification, summarization and question answering. However, prevailing approaches based on Vector Space Model (VSM) more or less suffer from the limitation of Bag of Words (BOW), which ignores the semantic relationship among words. Enriching document representation with background knowledge from Wikipedia is proven to be an effective way to solve this problem, but most existing methods still cannot avoid similar flaws of BOW in a new vector space. In this paper, we propose a novel text similarity measurement which goes beyond VSM and can find semantic affinity between documents. Specifically, it is a unified graph model that exploits Wikipedia as background knowledge and synthesizes both document representation and similarity computation. The experimental results on two different datasets show that our approach significantly improves VSM-based methods in both text clustering and classification.

Keywords: Text classification, Text clustering, Text similarity, Wikipedia

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2078
9136 Classification of Non Stationary Signals Using Ben Wavelet and Artificial Neural Networks

Authors: Mohammed Benbrahim, Khalid Benjelloun, Aomar Ibenbrahim, Adil Daoudi

Abstract:

The automatic classification of non stationary signals is an important practical goal in several domains. An essential classification task is to allocate the incoming signal to a group associated with the kind of physical phenomena producing it. In this paper, we present a modular system composed by three blocs: 1) Representation, 2) Dimensionality reduction and 3) Classification. The originality of our work consists in the use of a new wavelet called "Ben wavelet" in the representation stage. For the dimensionality reduction, we propose a new algorithm based on the random projection and the principal component analysis.

Keywords: Seismic signals, Ben Wavelet, Dimensionality reduction, Artificial neural networks, Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1409
9135 The Portrayal of Muslim Militants "Southern Bandits" in Thai Newspapers

Authors: Treepon Kirdnark

Abstract:

This paper examines the depiction of Muslim militants in Thai newspapers in 2004. Stuart Hall-s “representation" and “public idioms" are used as theoretical frameworks. Critical Discourse Analysis is employed as a methodology to examine 240 news articles from two leading Thai language newspapers. The results show that the militants are usually labeled as “southern bandits." This suggests that they are just a culprit of the violence in the deep south of Thailand. They are usually described as people who cause turbulence. Consequently, the military have to get rid of them. However, other aspects of the groups such as their political agenda or the failures of the Thai state in dealing with the Malay Muslims were not mention in the news stories. In the time of violence, the researcher argues that this kind of newspaper coverage may help perpetuate the discourse of Malay Muslim, instead of providing fuller picture of the ongoing conflicts.

Keywords: News Discourse, Newspapers, Thailand, Thai Muslims.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1770
9134 Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine

Authors: Bingchun Liu, Pei-Chann Chang, Natasha Huang, Dun Li

Abstract:

Machine Learning and Data Mining are the two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a wildly used technique to predict qualitative variables and is generally preferred over regression from an operational point of view. Due to the enormous increase in air pollution in various countries especially China, Air Quality Classification has become one of the most important topics in air quality research and modelling. This study aims at introducing a hybrid classification model based on information theory and Support Vector Machine (SVM) using the air quality data of four cities in China namely Beijing, Guangzhou, Shanghai and Tianjin from Jan 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection has classified the daily air quality into 6 levels namely Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent based on their respective Air Quality Index (AQI) values. Using the information theory, information gain (IG) is calculated and feature selection is done for both categorical features and continuous numeric features. Then SVM Machine Learning algorithm is implemented on the selected features with cross-validation. The final evaluation reveals that the IG and SVM hybrid model performs better than SVM (alone), Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of accuracy as well as complexity.

Keywords: Machine learning, air quality classification, air quality index, information gain, support vector machine, cross-validation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 906
9133 An Amalgam Approach for DICOM Image Classification and Recognition

Authors: J. Umamaheswari, G. Radhamani

Abstract:

This paper describes about the process of recognition and classification of brain images such as normal and abnormal based on PSO-SVM. Image Classification is becoming more important for medical diagnosis process. In medical area especially for diagnosis the abnormality of the patient is classified, which plays a great role for the doctors to diagnosis the patient according to the severeness of the diseases. In case of DICOM images it is very tough for optimal recognition and early detection of diseases. Our work focuses on recognition and classification of DICOM image based on collective approach of digital image processing. For optimal recognition and classification Particle Swarm Optimization (PSO), Genetic Algorithm (GA) and Support Vector Machine (SVM) are used. The collective approach by using PSO-SVM gives high approximation capability and much faster convergence.

Keywords: Recognition, classification, Relaxed Median Filter, Adaptive thresholding, clustering and Neural Networks

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2233
9132 Improving Classification Accuracy with Discretization on Datasets Including Continuous Valued Features

Authors: Mehmet Hacibeyoglu, Ahmet Arslan, Sirzat Kahramanli

Abstract:

This study analyzes the effect of discretization on classification of datasets including continuous valued features. Six datasets from UCI which containing continuous valued features are discretized with entropy-based discretization method. The performance improvement between the dataset with original features and the dataset with discretized features is compared with k-nearest neighbors, Naive Bayes, C4.5 and CN2 data mining classification algorithms. As the result the classification accuracies of the six datasets are improved averagely by 1.71% to 12.31%.

Keywords: Data mining classification algorithms, entropy-baseddiscretization method

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2431
9131 Person Identification by Using AR Model for EEG Signals

Authors: Gelareh Mohammadi, Parisa Shoushtari, Behnam Molaee Ardekani, Mohammad B. Shamsollahi

Abstract:

A direct connection between ElectroEncephaloGram (EEG) and the genetic information of individuals has been investigated by neurophysiologists and psychiatrists since 1960-s; and it opens a new research area in the science. This paper focuses on the person identification based on feature extracted from the EEG which can show a direct connection between EEG and the genetic information of subjects. In this work the full EO EEG signal of healthy individuals are estimated by an autoregressive (AR) model and the AR parameters are extracted as features. Here for feature vector constitution, two methods have been proposed; in the first method the extracted parameters of each channel are used as a feature vector in the classification step which employs a competitive neural network and in the second method a combination of different channel parameters are used as a feature vector. Correct classification scores at the range of 80% to 100% reveal the potential of our approach for person classification/identification and are in agreement to the previous researches showing evidence that the EEG signal carries genetic information. The novelty of this work is in the combination of AR parameters and the network type (competitive network) that we have used. A comparison between the first and the second approach imply preference of the second one.

Keywords: Person Identification, Autoregressive Model, EEG, Neural Network

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1716
9130 Dataset Analysis Using Membership-Deviation Graph

Authors: Itgel Bayarsaikhan, Jimin Lee, Sejong Oh

Abstract:

Classification is one of the primary themes in computational biology. The accuracy of classification strongly depends on quality of a dataset, and we need some method to evaluate this quality. In this paper, we propose a new graphical analysis method using 'Membership-Deviation Graph (MDG)' for analyzing quality of a dataset. MDG represents degree of membership and deviations for instances of a class in the dataset. The result of MDG analysis is used for understanding specific feature and for selecting best feature for classification.

Keywords: feature, classification, machine learning algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1414
9129 Genetic Algorithms and Kernel Matrix-based Criteria Combined Approach to Perform Feature and Model Selection for Support Vector Machines

Authors: A. Perolini

Abstract:

Feature and model selection are in the center of attention of many researches because of their impact on classifiers- performance. Both selections are usually performed separately but recent developments suggest using a combined GA-SVM approach to perform them simultaneously. This approach improves the performance of the classifier identifying the best subset of variables and the optimal parameters- values. Although GA-SVM is an effective method it is computationally expensive, thus a rough method can be considered. The paper investigates a joined approach of Genetic Algorithm and kernel matrix criteria to perform simultaneously feature and model selection for SVM classification problem. The purpose of this research is to improve the classification performance of SVM through an efficient approach, the Kernel Matrix Genetic Algorithm method (KMGA).

Keywords: Feature and model selection, Genetic Algorithms, Support Vector Machines, kernel matrix.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1559
9128 A Pattern Recognition Neural Network Model for Detection and Classification of SQL Injection Attacks

Authors: Naghmeh Moradpoor Sheykhkanloo

Abstract:

Thousands of organisations store important and confidential information related to them, their customers, and their business partners in databases all across the world. The stored data ranges from less sensitive (e.g. first name, last name, date of birth) to more sensitive data (e.g. password, pin code, and credit card information). Losing data, disclosing confidential information or even changing the value of data are the severe damages that Structured Query Language injection (SQLi) attack can cause on a given database. It is a code injection technique where malicious SQL statements are inserted into a given SQL database by simply using a web browser. In this paper, we propose an effective pattern recognition neural network model for detection and classification of SQLi attacks. The proposed model is built from three main elements of: a Uniform Resource Locator (URL) generator in order to generate thousands of malicious and benign URLs, a URL classifier in order to: 1) classify each generated URL to either a benign URL or a malicious URL and 2) classify the malicious URLs into different SQLi attack categories, and a NN model in order to: 1) detect either a given URL is a malicious URL or a benign URL and 2) identify the type of SQLi attack for each malicious URL. The model is first trained and then evaluated by employing thousands of benign and malicious URLs. The results of the experiments are presented in order to demonstrate the effectiveness of the proposed approach.

Keywords: Neural Networks, pattern recognition, SQL injection attacks, SQL injection attack classification, SQL injection attack detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2797
9127 Multiple Targets Classification and Fuzzy Logic Decision Fusion in Wireless Sensor Networks

Authors: Ahmad Aljaafreh

Abstract:

This paper proposes a hierarchical hidden Markov model (HHMM) to model the detection of M vehicles in a wireless sensor network (WSN). The HHMM model contains an extra level of hidden Markov model to model the temporal transitions of each state of the first HMM. By modeling the temporal transitions, only those hypothesis with nonzero transition probabilities needs to be tested. Thus, this method efficiently reduces the computation load, which is preferable in WSN applications.This paper integrates several techniques to optimize the detection performance. The output of the states of the first HMM is modeled as Gaussian Mixture Model (GMM), where the number of states and the number of Gaussians are experimentally determined, while the other parameters are estimated using Expectation Maximization (EM). HHMM is used to model the sequence of the local decisions which are based on multiple hypothesis testing with maximum likelihood approach. The states in the HHMM represent various combinations of vehicles of different types. Due to the statistical advantages of multisensor data fusion, we propose a heuristic based on fuzzy weighted majority voting to enhance cooperative classification of moving vehicles within a region that is monitored by a wireless sensor network. A fuzzy inference system weighs each local decision based on the signal to noise ratio of the acoustic signal for target detection and the signal to noise ratio of the radio signal for sensor communication. The spatial correlation among the observations of neighboring sensor nodes is efficiently utilized as well as the temporal correlation. Simulation results demonstrate the efficiency of this scheme.

Keywords: Classification, decision fusion, fuzzy logic, hidden Markov model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6219
9126 Information Transmission between Large and Small Stocks in the Korean Stock Market

Authors: Sang Hoon Kang, Seong-Min Yoon

Abstract:

Little attention has been paid to information transmission between the portfolios of large stocks and small stocks in the Korean stock market. This study investigates the return and volatility transmission mechanisms between large and small stocks in the Korea Exchange (KRX). This study also explores whether bad news in the large stock market leads to a volatility of the small stock market that is larger than the good news volatility of the large stock market. By employing the Granger causality test, we found unidirectional return transmissions from the large stocks to medium and small stocks. This evidence indicates that pat information about the large stocks has a better ability to predict the returns of the medium and small stocks in the Korean stock market. Moreover, by using the asymmetric GARCH-BEKK model, we observed the unidirectional relationship of asymmetric volatility transmission from large stocks to the medium and small stocks. This finding suggests that volatility in the medium and small stocks following a negative shock in the large stocks is larger than that following a positive shock in the large stocks.

Keywords: Asymmetric GARCH-BEKK model, Asymmetric volatility transmission, Causality, Korean stock market, Spillover effect

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1645
9125 Emotion Classification for Students with Autism in Mathematics E-learning using Physiological and Facial Expression Measures

Authors: Hui-Chuan Chu, Min-Ju Liao, Wei-Kai Cheng, William Wei-Jen Tsai, Yuh-Min Chen

Abstract:

Avoiding learning failures in mathematics e-learning environments caused by emotional problems in students with autism has become an important topic for combining of special education with information and communications technology. This study presents an adaptive emotional adjustment model in mathematics e-learning for students with autism, emphasizing the lack of emotional perception in mathematics e-learning systems. In addition, an emotion classification for students with autism was developed by inducing emotions in mathematical learning environments to record changes in the physiological signals and facial expressions of students. Using these methods, 58 emotional features were obtained. These features were then processed using one-way ANOVA and information gain (IG). After reducing the feature dimension, methods of support vector machines (SVM), k-nearest neighbors (KNN), and classification and regression trees (CART) were used to classify four emotional categories: baseline, happy, angry, and anxious. After testing and comparisons, in a situation without feature selection, the accuracy rate of the SVM classification can reach as high as 79.3-%. After using IG to reduce the feature dimension, with only 28 features remaining, SVM still has a classification accuracy of 78.2-%. The results of this research could enhance the effectiveness of eLearning in special education.

Keywords: Emotion classification, Physiological and facial Expression measures, Students with autism, Mathematics e-learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1744
9124 Classification of Political Affiliations by Reduced Number of Features

Authors: Vesile Evrim, Aliyu Awwal

Abstract:

By the evolvement in technology, the way of expressing opinions switched direction to the digital world. The domain of politics, as one of the hottest topics of opinion mining research, merged together with the behavior analysis for affiliation determination in texts, which constitutes the subject of this paper. This study aims to classify the text in news/blogs either as Republican or Democrat with the minimum number of features. As an initial set, 68 features which 64 were constituted by Linguistic Inquiry and Word Count (LIWC) features were tested against 14 benchmark classification algorithms. In the later experiments, the dimensions of the feature vector reduced based on the 7 feature selection algorithms. The results show that the “Decision Tree”, “Rule Induction” and “M5 Rule” classifiers when used with “SVM” and “IGR” feature selection algorithms performed the best up to 82.5% accuracy on a given dataset. Further tests on a single feature and the linguistic based feature sets showed the similar results. The feature “Function”, as an aggregate feature of the linguistic category, was found as the most differentiating feature among the 68 features with the accuracy of 81% in classifying articles either as Republican or Democrat.

Keywords: Politics, machine learning, feature selection, LIWC.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2333
9123 Machine Learning for Music Aesthetic Annotation Using MIDI Format: A Harmony-Based Classification Approach

Authors: Lin Yang, Zhian Mi, Jiacheng Xiao, Rong Li

Abstract:

Swimming with the tide of deep learning, the field of music information retrieval (MIR) experiences parallel development and a sheer variety of feature-learning models has been applied to music classification and tagging tasks. Among those learning techniques, the deep convolutional neural networks (CNNs) have been widespreadly used with better performance than the traditional approach especially in music genre classification and prediction. However, regarding the music recommendation, there is a large semantic gap between the corresponding audio genres and the various aspects of a song that influence user preference. In our study, aiming to bridge the gap, we strive to construct an automatic music aesthetic annotation model with MIDI format for better comparison and measurement of the similarity between music pieces in the way of harmonic analysis. We use the matrix of qualification converted from MIDI files as input to train two different classifiers, support vector machine (SVM) and Decision Tree (DT). Experimental results in performance of a tag prediction task have shown that both learning algorithms are capable of extracting high-level properties in an end-to end manner from music information. The proposed model is helpful to learn the audience taste and then the resulting recommendations are likely to appeal to a niche consumer.

Keywords: Harmonic analysis, machine learning, music classification and tagging, MIDI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 705
9122 Performance Analysis of Artificial Neural Network Based Land Cover Classification

Authors: Najam Aziz, Nasru Minallah, Ahmad Junaid, Kashaf Gul

Abstract:

Landcover classification using automated classification techniques, while employing remotely sensed multi-spectral imagery, is one of the promising areas of research. Different land conditions at different time are captured through satellite and monitored by applying different classification algorithms in specific environment. In this paper, a SPOT-5 image provided by SUPARCO has been studied and classified in Environment for Visual Interpretation (ENVI), a tool widely used in remote sensing. Then, Artificial Neural Network (ANN) classification technique is used to detect the land cover changes in Abbottabad district. Obtained results are compared with a pixel based Distance classifier. The results show that ANN gives the better overall accuracy of 99.20% and Kappa coefficient value of 0.98 over the Mahalanobis Distance Classifier.

Keywords: Landcover classification, artificial neural network, remote sensing, SPOT-5.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1561
9121 Genetic Programming Approach for Multi-Category Pattern Classification Appliedto Network Intrusions Detection

Authors: K.M. Faraoun, A. Boukelif

Abstract:

This paper describes a new approach of classification using genetic programming. The proposed technique consists of genetically coevolving a population of non-linear transformations on the input data to be classified, and map them to a new space with a reduced dimension, in order to get a maximum inter-classes discrimination. The classification of new samples is then performed on the transformed data, and so become much easier. Contrary to the existing GP-classification techniques, the proposed one use a dynamic repartition of the transformed data in separated intervals, the efficacy of a given intervals repartition is handled by the fitness criterion, with a maximum classes discrimination. Experiments were first performed using the Fisher-s Iris dataset, and then, the KDD-99 Cup dataset was used to study the intrusion detection and classification problem. Obtained results demonstrate that the proposed genetic approach outperform the existing GP-classification methods [1],[2] and [3], and give a very accepted results compared to other existing techniques proposed in [4],[5],[6],[7] and [8].

Keywords: Genetic programming, patterns classification, intrusion detection

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1684
9120 Feature Selection for Web Page Classification Using Swarm Optimization

Authors: B. Leela Devi, A. Sankar

Abstract:

The web’s increased popularity has included a huge amount of information, due to which automated web page classification systems are essential to improve search engines’ performance. Web pages have many features like HTML or XML tags, hyperlinks, URLs and text contents which can be considered during an automated classification process. It is known that Webpage classification is enhanced by hyperlinks as it reflects Web page linkages. The aim of this study is to reduce the number of features to be used to improve the accuracy of the classification of web pages. In this paper, a novel feature selection method using an improved Particle Swarm Optimization (PSO) using principle of evolution is proposed. The extracted features were tested on the WebKB dataset using a parallel Neural Network to reduce the computational cost.

Keywords: Web page classification, WebKB Dataset, Term Frequency-Inverse Document Frequency (TF-IDF), Particle Swarm Optimization (PSO).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3228
9119 Lithofacies Classification from Well Log Data Using Neural Networks, Interval Neutrosophic Sets and Quantification of Uncertainty

Authors: Pawalai Kraipeerapun, Chun Che Fung, Kok Wai Wong

Abstract:

This paper proposes a novel approach to the question of lithofacies classification based on an assessment of the uncertainty in the classification results. The proposed approach has multiple neural networks (NN), and interval neutrosophic sets (INS) are used to classify the input well log data into outputs of multiple classes of lithofacies. A pair of n-class neural networks are used to predict n-degree of truth memberships and n-degree of false memberships. Indeterminacy memberships or uncertainties in the predictions are estimated using a multidimensional interpolation method. These three memberships form the INS used to support the confidence in results of multiclass classification. Based on the experimental data, our approach improves the classification performance as compared to an existing technique applied only to the truth membership. In addition, our approach has the capability to provide a measure of uncertainty in the problem of multiclass classification.

Keywords: Multiclass classification, feed-forward backpropagation neural network, interval neutrosophic sets, uncertainty.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1607
9118 A Framework for Review Spam Detection Research

Authors: Mohammadali Tavakoli, Atefeh Heydari, Zuriati Ismail, Naomie Salim

Abstract:

With the increasing number of people reviewing products online in recent years, opinion sharing websites has become the most important source of customers’ opinions. Unfortunately, spammers generate and post fake reviews in order to promote or demote brands and mislead potential customers. These are notably destructive not only for potential customers, but also for business holders and manufacturers. However, research in this area is not adequate, and many critical problems related to spam detection have not been solved to date. To provide green researchers in the domain with a great aid, in this paper, we have attempted to create a highquality framework to make a clear vision on review spam-detection methods. In addition, this report contains a comprehensive collection of detection metrics used in proposed spam-detection approaches. These metrics are extremely applicable for developing novel detection methods.

Keywords: Fake reviews, Feature collection, Opinion spam, Spam detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2483
9117 Support Vector Machine Approach for Classification of Cancerous Prostate Regions

Authors: Metehan Makinacı

Abstract:

The objective of this paper, is to apply support vector machine (SVM) approach for the classification of cancerous and normal regions of prostate images. Three kinds of textural features are extracted and used for the analysis: parameters of the Gauss- Markov random field (GMRF), correlation function and relative entropy. Prostate images are acquired by the system consisting of a microscope, video camera and a digitizing board. Cross-validated classification over a database of 46 images is implemented to evaluate the performance. In SVM classification, sensitivity and specificity of 96.2% and 97.0% are achieved for the 32x32 pixel block sized data, respectively, with an overall accuracy of 96.6%. Classification performance is compared with artificial neural network and k-nearest neighbor classifiers. Experimental results demonstrate that the SVM approach gives the best performance.

Keywords: Computer-aided diagnosis, support vector machines, Gauss-Markov random fields, texture classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1769
9116 Mining Implicit Knowledge to Predict Political Risk by Providing Novel Framework with Using Bayesian Network

Authors: Siavash Asadi Ghajarloo

Abstract:

Nowadays predicting political risk level of country has become a critical issue for investors who intend to achieve accurate information concerning stability of the business environments. Since, most of the times investors are layman and nonprofessional IT personnel; this paper aims to propose a framework named GECR in order to help nonexpert persons to discover political risk stability across time based on the political news and events. To achieve this goal, the Bayesian Networks approach was utilized for 186 political news of Pakistan as sample dataset. Bayesian Networks as an artificial intelligence approach has been employed in presented framework, since this is a powerful technique that can be applied to model uncertain domains. The results showed that our framework along with Bayesian Networks as decision support tool, predicted the political risk level with a high degree of accuracy.

Keywords: Bayesian Networks, Data mining, GECRframework, Predicting political risk.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2145
9115 Text Summarization for Oil and Gas News Article

Authors: L. H. Chong, Y. Y. Chen

Abstract:

Information is increasing in volumes; companies are overloaded with information that they may lose track in getting the intended information. It is a time consuming task to scan through each of the lengthy document. A shorter version of the document which contains only the gist information is more favourable for most information seekers. Therefore, in this paper, we implement a text summarization system to produce a summary that contains gist information of oil and gas news articles. The summarization is intended to provide important information for oil and gas companies to monitor their competitor-s behaviour in enhancing them in formulating business strategies. The system integrated statistical approach with three underlying concepts: keyword occurrences, title of the news article and location of the sentence. The generated summaries were compared with human generated summaries from an oil and gas company. Precision and recall ratio are used to evaluate the accuracy of the generated summary. Based on the experimental results, the system is able to produce an effective summary with the average recall value of 83% at the compression rate of 25%.

Keywords: Information retrieval, text summarization, statistical approach.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1578
9114 Automatic Building an Extensive Arabic FA Terms Dictionary

Authors: El-Sayed Atlam, Masao Fuketa, Kazuhiro Morita, Jun-ichi Aoe

Abstract:

Field Association (FA) terms are a limited set of discriminating terms that give us the knowledge to identify document fields which are effective in document classification, similar file retrieval and passage retrieval. But the problem lies in the lack of an effective method to extract automatically relevant Arabic FA Terms to build a comprehensive dictionary. Moreover, all previous studies are based on FA terms in English and Japanese, and the extension of FA terms to other language such Arabic could be definitely strengthen further researches. This paper presents a new method to extract, Arabic FA Terms from domain-specific corpora using part-of-speech (POS) pattern rules and corpora comparison. Experimental evaluation is carried out for 14 different fields using 251 MB of domain-specific corpora obtained from Arabic Wikipedia dumps and Alhyah news selected average of 2,825 FA Terms (single and compound) per field. From the experimental results, recall and precision are 84% and 79% respectively. Therefore, this method selects higher number of relevant Arabic FA Terms at high precision and recall.

Keywords: Arabic Field Association Terms, information extraction, document classification, information retrieval.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1706