Search results for: Classification rule mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1820

Search results for: Classification rule mining

1100 Texture Based Weed Detection Using Multi Resolution Combined Statistical and Spatial Frequency (MRCSF)

Authors: R.S.Sabeenian, V.Palanisamy

Abstract:

Texture classification is a trendy and a catchy technology in the field of texture analysis. Textures, the repeated patterns, have different frequency components along different orientations. Our work is based on Texture Classification and its applications. It finds its applications in various fields like Medical Image Classification, Computer Vision, Remote Sensing, Agricultural Field, and Textile Industry. Weed control has a major effect on agriculture. A large amount of herbicide has been used for controlling weeds in agriculture fields, lawns, golf courses, sport fields, etc. Random spraying of herbicides does not meet the exact requirement of the field. Certain areas in field have more weed patches than estimated. So, we need a visual system that can discriminate weeds from the field image which will reduce or even eliminate the amount of herbicide used. This would allow farmers to not use any herbicides or only apply them where they are needed. A machine vision precision automated weed control system could reduce the usage of chemicals in crop fields. In this paper, an intelligent system for automatic weeding strategy Multi Resolution Combined Statistical & spatial Frequency is used to discriminate the weeds from the crops and to classify them as narrow, little and broad weeds.

Keywords: crop weed discrimination, MRCSF, MRFM, Weeddetection, Spatial Frequency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1807
1099 Feature Based Unsupervised Intrusion Detection

Authors: Deeman Yousif Mahmood, Mohammed Abdullah Hussein

Abstract:

The goal of a network-based intrusion detection system is to classify activities of network traffics into two major categories: normal and attack (intrusive) activities. Nowadays, data mining and machine learning plays an important role in many sciences; including intrusion detection system (IDS) using both supervised and unsupervised techniques. However, one of the essential steps of data mining is feature selection that helps in improving the efficiency, performance and prediction rate of proposed approach. This paper applies unsupervised K-means clustering algorithm with information gain (IG) for feature selection and reduction to build a network intrusion detection system. For our experimental analysis, we have used the new NSL-KDD dataset, which is a modified dataset for KDDCup 1999 intrusion detection benchmark dataset. With a split of 60.0% for the training set and the remainder for the testing set, a 2 class classifications have been implemented (Normal, Attack). Weka framework which is a java based open source software consists of a collection of machine learning algorithms for data mining tasks has been used in the testing process. The experimental results show that the proposed approach is very accurate with low false positive rate and high true positive rate and it takes less learning time in comparison with using the full features of the dataset with the same algorithm.

Keywords: Information Gain (IG), Intrusion Detection System (IDS), K-means Clustering, Weka.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2746
1098 Pattern Recognition Based Prosthesis Control for Movement of Forearms Using Surface and Intramuscular EMG Signals

Authors: Anjana Goen, D. C. Tiwari

Abstract:

Myoelectric control system is the fundamental component of modern prostheses, which uses the myoelectric signals from an individual’s muscles to control the prosthesis movements. The surface electromyogram signal (sEMG) being noninvasive has been used as an input to prostheses controllers for many years. Recent technological advances has led to the development of implantable myoelectric sensors which enable the internal myoelectric signal (MES) to be used as input to these prostheses controllers. The intramuscular measurement can provide focal recordings from deep muscles of the forearm and independent signals relatively free of crosstalk thus allowing for more independent control sites. However, little work has been done to compare the two inputs. In this paper we have compared the classification accuracy of six pattern recognition based myoelectric controllers which use surface myoelectric signals recorded using untargeted (symmetric) surface electrode arrays to the same controllers with multichannel intramuscular myolectric signals from targeted intramuscular electrodes as inputs. There was no significant enhancement in the classification accuracy as a result of using the intramuscular EMG measurement technique when compared to the results acquired using the surface EMG measurement technique. Impressive classification accuracy (99%) could be achieved by optimally selecting only five channels of surface EMG.

Keywords: Discriminant Locality Preserving Projections (DLPP), myoelectric signal (MES), Sparse Principal Component Analysis (SPCA), Time Frequency Representations (TFRs).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1382
1097 Post Mining- Discovering Valid Rules from Different Sized Data Sources

Authors: R. Nedunchezhian, K. Anbumani

Abstract:

A big organization may have multiple branches spread across different locations. Processing of data from these branches becomes a huge task when innumerable transactions take place. Also, branches may be reluctant to forward their data for centralized processing but are ready to pass their association rules. Local mining may also generate a large amount of rules. Further, it is not practically possible for all local data sources to be of the same size. A model is proposed for discovering valid rules from different sized data sources where the valid rules are high weighted rules. These rules can be obtained from the high frequency rules generated from each of the data sources. A data source selection procedure is considered in order to efficiently synthesize rules. Support Equalization is another method proposed which focuses on eliminating low frequency rules at the local sites itself thus reducing the rules by a significant amount.

Keywords: Association rules, multiple data stores, synthesizing, valid rules.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1380
1096 Detection, Tracking and Classification of Vehicles and Aircraft based on Magnetic Sensing Technology

Authors: K. Dimitropoulos, N. Grammalidis, I. Gragopoulos, H. Gao, Th. Heuer, M. Weinmann, S. Voit, C. Stockhammer, U. Hartmann, N. Pavlidou

Abstract:

Existing ground movement surveillance technologies at airports are subjected to limitations due to shadowing effects or multiple reflections. Therefore, there is a strong demand for a new sensing technology, which will be cost effective and will provide detection of non-cooperative targets under any weather conditions. This paper aims to present a new intelligent system, developed within the framework of the EC-funded ISMAEL project, which is based on a new magnetic sensing technology and provides detection, tracking and automatic classification of targets moving on the airport surface. The system is currently being installed at two European airports. Initial experimental results under real airport traffic demonstrate the great potential of the proposed system.

Keywords: Air traffic management, magnetic sensors, multitracking, A-SMGCS.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1912
1095 Remarks on Some Properties of Decision Rules

Authors: Songlin Yang, Ying Ge

Abstract:

This paper shows that some properties of the decision rules in the literature do not hold by presenting a counterexample. We give sufficient and necessary conditions under which these properties are valid. These results will be helpful when one tries to choose the right decision rules in the research of rough set theory.

Keywords: set, Decision table, Decision rule, coverage factor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1386
1094 Spectral Mixture Model Applied to Cannabis Parcel Determination

Authors: Levent Basayigit, Sinan Demir, Yusuf Ucar, Burhan Kara

Abstract:

Many research projects require accurate delineation of the different land cover type of the agricultural area. Especially it is critically important for the definition of specific plants like cannabis. However, the complexity of vegetation stands structure, abundant vegetation species, and the smooth transition between different seconder section stages make vegetation classification difficult when using traditional approaches such as the maximum likelihood classifier. Most of the time, classification distinguishes only between trees/annual or grain. It has been difficult to accurately determine the cannabis mixed with other plants. In this paper, a mixed distribution models approach is applied to classify pure and mix cannabis parcels using Worldview-2 imagery in the Lakes region of Turkey. Five different land use types (i.e. sunflower, maize, bare soil, and cannabis) were identified in the image. A constrained Gaussian mixture discriminant analysis (GMDA) was used to unmix the image. In the study, 255 reflectance ratios derived from spectral signatures of seven bands (Blue-Green-Yellow-Red-Rededge-NIR1-NIR2) were randomly arranged as 80% for training and 20% for test data. Gaussian mixed distribution model approach is proved to be an effective and convenient way to combine very high spatial resolution imagery for distinguishing cannabis vegetation. Based on the overall accuracies of the classification, the Gaussian mixed distribution model was found to be very successful to achieve image classification tasks. This approach is sensitive to capture the illegal cannabis planting areas in the large plain. This approach can also be used for monitoring and determination with spectral reflections in illegal cannabis planting areas.

Keywords: Gaussian mixture discriminant analysis, spectral mixture model, World View-2, land parcels.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 774
1093 Detection and Classification of Faults on Parallel Transmission Lines Using Wavelet Transform and Neural Network

Authors: V.S.Kale, S.R.Bhide, P.P.Bedekar, G.V.K.Mohan

Abstract:

The protection of parallel transmission lines has been a challenging task due to mutual coupling between the adjacent circuits of the line. This paper presents a novel scheme for detection and classification of faults on parallel transmission lines. The proposed approach uses combination of wavelet transform and neural network, to solve the problem. While wavelet transform is a powerful mathematical tool which can be employed as a fast and very effective means of analyzing power system transient signals, artificial neural network has a ability to classify non-linear relationship between measured signals by identifying different patterns of the associated signals. The proposed algorithm consists of time-frequency analysis of fault generated transients using wavelet transform, followed by pattern recognition using artificial neural network to identify the type of the fault. MATLAB/Simulink is used to generate fault signals and verify the correctness of the algorithm. The adaptive discrimination scheme is tested by simulating different types of fault and varying fault resistance, fault location and fault inception time, on a given power system model. The simulation results show that the proposed scheme for fault diagnosis is able to classify all the faults on the parallel transmission line rapidly and correctly.

Keywords: Artificial neural network, fault detection and classification, parallel transmission lines, wavelet transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2983
1092 Feature Subset Selection approach based on Maximizing Margin of Support Vector Classifier

Authors: Khin May Win, Nan Sai Moon Kham

Abstract:

Identification of cancer genes that might anticipate the clinical behaviors from different types of cancer disease is challenging due to the huge number of genes and small number of patients samples. The new method is being proposed based on supervised learning of classification like support vector machines (SVMs).A new solution is described by the introduction of the Maximized Margin (MM) in the subset criterion, which permits to get near the least generalization error rate. In class prediction problem, gene selection is essential to improve the accuracy and to identify genes for cancer disease. The performance of the new method was evaluated with real-world data experiment. It can give the better accuracy for classification.

Keywords: Microarray data, feature selection, recursive featureelimination, support vector machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1520
1091 Combining ILP with Semi-supervised Learning for Web Page Categorization

Authors: Nuanwan Soonthornphisaj, Boonserm Kijsirikul

Abstract:

This paper presents a semi-supervised learning algorithm called Iterative-Cross Training (ICT) to solve the Web pages classification problems. We apply Inductive logic programming (ILP) as a strong learner in ICT. The objective of this research is to evaluate the potential of the strong learner in order to boost the performance of the weak learner of ICT. We compare the result with the supervised Naive Bayes, which is the well-known algorithm for the text classification problem. The performance of our learning algorithm is also compare with other semi-supervised learning algorithms which are Co-Training and EM. The experimental results show that ICT algorithm outperforms those algorithms and the performance of the weak learner can be enhanced by ILP system.

Keywords: Inductive Logic Programming, Semi-supervisedLearning, Web Page Categorization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1624
1090 Automatic Motion Trajectory Analysis for Dual Human Interaction Using Video Sequences

Authors: Yuan-Hsiang Chang, Pin-Chi Lin, Li-Der Jeng

Abstract:

Advance in techniques of image and video processing has enabled the development of intelligent video surveillance systems. This study was aimed to automatically detect moving human objects and to analyze events of dual human interaction in a surveillance scene. Our system was developed in four major steps: image preprocessing, human object detection, human object tracking, and motion trajectory analysis. The adaptive background subtraction and image processing techniques were used to detect and track moving human objects. To solve the occlusion problem during the interaction, the Kalman filter was used to retain a complete trajectory for each human object. Finally, the motion trajectory analysis was developed to distinguish between the interaction and non-interaction events based on derivatives of trajectories related to the speed of the moving objects. Using a database of 60 video sequences, our system could achieve the classification accuracy of 80% in interaction events and 95% in non-interaction events, respectively. In summary, we have explored the idea to investigate a system for the automatic classification of events for interaction and non-interaction events using surveillance cameras. Ultimately, this system could be incorporated in an intelligent surveillance system for the detection and/or classification of abnormal or criminal events (e.g., theft, snatch, fighting, etc.). 

Keywords: Motion detection, motion tracking, trajectory analysis, video surveillance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1701
1089 Rigorous Electromagnetic Model of Fourier Transform Infrared (FT-IR) Spectroscopic Imaging Applied to Automated Histology of Prostate Tissue Specimens

Authors: Rohith K Reddy, David Mayerich, Michael Walsh, P Scott Carney, Rohit Bhargava

Abstract:

Fourier transform infrared (FT-IR) spectroscopic imaging is an emerging technique that provides both chemically and spatially resolved information. The rich chemical content of data may be utilized for computer-aided determinations of structure and pathologic state (cancer diagnosis) in histological tissue sections for prostate cancer. FT-IR spectroscopic imaging of prostate tissue has shown that tissue type (histological) classification can be performed to a high degree of accuracy [1] and cancer diagnosis can be performed with an accuracy of about 80% [2] on a microscopic (≈ 6μm) length scale. In performing these analyses, it has been observed that there is large variability (more than 60%) between spectra from different points on tissue that is expected to consist of the same essential chemical constituents. Spectra at the edges of tissues are characteristically and consistently different from chemically similar tissue in the middle of the same sample. Here, we explain these differences using a rigorous electromagnetic model for light-sample interaction. Spectra from FT-IR spectroscopic imaging of chemically heterogeneous samples are different from bulk spectra of individual chemical constituents of the sample. This is because spectra not only depend on chemistry, but also on the shape of the sample. Using coupled wave analysis, we characterize and quantify the nature of spectral distortions at the edges of tissues. Furthermore, we present a method of performing histological classification of tissue samples. Since the mid-infrared spectrum is typically assumed to be a quantitative measure of chemical composition, classification results can vary widely due to spectral distortions. However, we demonstrate that the selection of localized metrics based on chemical information can make our data robust to the spectral distortions caused by scattering at the tissue boundary.

Keywords: Infrared, Spectroscopy, Imaging, Tissue classification

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1613
1088 Personal Information Classification Based on Deep Learning in Automatic Form Filling System

Authors: Shunzuo Wu, Xudong Luo, Yuanxiu Liao

Abstract:

Recently, the rapid development of deep learning makes artificial intelligence (AI) penetrate into many fields, replacing manual work there. In particular, AI systems also become a research focus in the field of automatic office. To meet real needs in automatic officiating, in this paper we develop an automatic form filling system. Specifically, it uses two classical neural network models and several word embedding models to classify various relevant information elicited from the Internet. When training the neural network models, we use less noisy and balanced data for training. We conduct a series of experiments to test my systems and the results show that our system can achieve better classification results.

Keywords: Personal information, deep learning, auto fill, NLP, document analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 818
1087 Evaluation of Classifiers Based On I2C Distance for Action Recognition

Authors: Lei Zhang, Tao Wang, Xiantong Zhen

Abstract:

Naive Bayes Nearest Neighbor (NBNN) and its variants, i,e., local NBNN and the NBNN kernels, are local feature-based classifiers that have achieved impressive performance in image classification. By exploiting instance-to-class (I2C) distances (instance means image/video in image/video classification), they avoid quantization errors of local image descriptors in the bag of words (BoW) model. However, the performances of NBNN, local NBNN and the NBNN kernels have not been validated on video analysis. In this paper, we introduce these three classifiers into human action recognition and conduct comprehensive experiments on the benchmark KTH and the realistic HMDB datasets. The results shows that those I2C based classifiers consistently outperform the SVM classifier with the BoW model.

Keywords: Instance-to-class distance, NBNN, Local NBNN, NBNN kernel.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1635
1086 Content Based Sampling over Transactional Data Streams

Authors: Mansour Tarafdar, Mohammad Saniee Abade

Abstract:

This paper investigates the problem of sampling from transactional data streams. We introduce CFISDS as a content based sampling algorithm that works on a landmark window model of data streams and preserve more informed sample in sample space. This algorithm that work based on closed frequent itemset mining tasks, first initiate a concept lattice using initial data, then update lattice structure using an incremental mechanism.Incremental mechanism insert, update and delete nodes in/from concept lattice in batch manner. Presented algorithm extracts the final samples on demand of user. Experimental results show the accuracy of CFISDS on synthetic and real datasets, despite on CFISDS algorithm is not faster than exist sampling algorithms such as Z and DSS.

Keywords: Sampling, data streams, closed frequent item set mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1687
1085 A Novel Technique for Ferroresonance Identification in Distribution Networks

Authors: G. Mokryani, M. R. Haghifam, J. Esmaeilpoor

Abstract:

Happening of Ferroresonance phenomenon is one of the reasons of consuming and ruining transformers, so recognition of Ferroresonance phenomenon has a special importance. A novel method for classification of Ferroresonance presented in this paper. Using this method Ferroresonance can be discriminate from other transients such as capacitor switching, load switching, transformer switching. Wavelet transform is used for decomposition of signals and Competitive Neural Network used for classification. Ferroresonance data and other transients was obtained by simulation using EMTP program. Using Daubechies wavelet transform signals has been decomposed till six levels. The energy of six detailed signals that obtained by wavelet transform are used for training and trailing Competitive Neural Network. Results show that the proposed procedure is efficient in identifying Ferroresonance from other events.

Keywords: Competitive Neural Network, Ferroresonance, EMTP program, Wavelet transform.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1400
1084 Slice Bispectrogram Analysis-Based Classification of Environmental Sounds Using Convolutional Neural Network

Authors: Katsumi Hirata

Abstract:

Certain systems can function well only if they recognize the sound environment as humans do. In this research, we focus on sound classification by adopting a convolutional neural network and aim to develop a method that automatically classifies various environmental sounds. Although the neural network is a powerful technique, the performance depends on the type of input data. Therefore, we propose an approach via a slice bispectrogram, which is a third-order spectrogram and is a slice version of the amplitude for the short-time bispectrum. This paper explains the slice bispectrogram and discusses the effectiveness of the derived method by evaluating the experimental results using the ESC‑50 sound dataset. As a result, the proposed scheme gives high accuracy and stability. Furthermore, some relationship between the accuracy and non-Gaussianity of sound signals was confirmed.

Keywords: Bispectrum, convolutional neural network, environmental sound, slice bispectrogram, spectrogram.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 587
1083 Selection of Best Band Combination for Soil Salinity Studies using ETM+ Satellite Images (A Case study: Nyshaboor Region,Iran)

Authors: Sanaeinejad, S. H.; A. Astaraei, . P. Mirhoseini.Mousavi, M. Ghaemi,

Abstract:

One of the main environmental problems which affect extensive areas in the world is soil salinity. Traditional data collection methods are neither enough for considering this important environmental problem nor accurate for soil studies. Remote sensing data could overcome most of these problems. Although satellite images are commonly used for these studies, however there are still needs to find the best calibration between the data and real situations in each specified area. Neyshaboor area, North East of Iran was selected as a field study of this research. Landsat satellite images for this area were used in order to prepare suitable learning samples for processing and classifying the images. 300 locations were selected randomly in the area to collect soil samples and finally 273 locations were reselected for further laboratory works and image processing analysis. Electrical conductivity of all samples was measured. Six reflective bands of ETM+ satellite images taken from the study area in 2002 were used for soil salinity classification. The classification was carried out using common algorithms based on the best composition bands. The results showed that the reflective bands 7, 3, 4 and 1 are the best band composition for preparing the color composite images. We also found out, that hybrid classification is a suitable method for identifying and delineation of different salinity classes in the area.

Keywords: Soil salinity, Remote sensing, Image processing, ETM+, Nyshaboor

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2001
1082 Further Investigations on Higher Mathematics Scores for Chinese University Students

Authors: Xun Ge

Abstract:

Recently, X. Ge and J. Qian investigated some relations between higher mathematics scores and calculus scores (resp. linear algebra scores, probability statistics scores) for Chinese university students. Based on rough-set theory, they established an information system S = (U,CuD,V, f). In this information system, higher mathematics score was taken as a decision attribute and calculus score, linear algebra score, probability statistics score were taken as condition attributes. They investigated importance of each condition attribute with respective to decision attribute and strength of each condition attribute supporting decision attribute. In this paper, we give further investigations for this issue. Based on the above information system S = (U, CU D, V, f), we analyze the decision rules between condition and decision granules. For each x E U, we obtain support (resp. strength, certainty factor, coverage factor) of the decision rule C —>x D, where C —>x D is the decision rule induced by x in S = (U, CU D, V, f). Results of this paper gives new analysis of on higher mathematics scores for Chinese university students, which can further lead Chinese university students to raise higher mathematics scores in Chinese graduate student entrance examination.

Keywords: Rough set, support, strength, certainty factor, coverage factor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1343
1081 Wavelet-Based ECG Signal Analysis and Classification

Authors: Madina Hamiane, May Hashim Ali

Abstract:

This paper presents the processing and analysis of ECG signals. The study is based on wavelet transform and uses exclusively the MATLAB environment. This study includes removing Baseline wander and further de-noising through wavelet transform and metrics such as signal-to noise ratio (SNR), Peak signal-to-noise ratio (PSNR) and the mean squared error (MSE) are used to assess the efficiency of the de-noising techniques. Feature extraction is subsequently performed whereby signal features such as heart rate, rise and fall levels are extracted and the QRS complex was detected which helped in classifying the ECG signal. The classification is the last step in the analysis of the ECG signals and it is shown that these are successfully classified as Normal rhythm or Abnormal rhythm.  The final result proved the adequacy of using wavelet transform for the analysis of ECG signals.

Keywords: ECG Signal, QRS detection, thresholding, wavelet decomposition, feature extraction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1233
1080 Automated Particle Picking based on Correlation Peak Shape Analysis and Iterative Classification

Authors: Hrabe Thomas, Beck Florian, Nickell Stephan

Abstract:

Cryo-electron microscopy (CEM) in combination with single particle analysis (SPA) is a widely used technique for elucidating structural details of macromolecular assemblies at closeto- atomic resolutions. However, development of automated software for SPA processing is still vital since thousands to millions of individual particle images need to be processed. Here, we present our workflow for automated particle picking. Our approach integrates peak shape analysis to the classical correlation and an iterative approach to separate macromolecules and background by classification. This particle selection workflow furthermore provides a robust means for SPA with little user interaction. Processing simulated and experimental data assesses performance of the presented tools.

Keywords: Cryo-electron Microscopy, Single Particle Analysis, Image Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1645
1079 Development of Genetic-based Machine Learning for Network Intrusion Detection (GBML-NID)

Authors: Wafa' S.Al-Sharafat, Reyadh Naoum

Abstract:

Society has grown to rely on Internet services, and the number of Internet users increases every day. As more and more users become connected to the network, the window of opportunity for malicious users to do their damage becomes very great and lucrative. The objective of this paper is to incorporate different techniques into classier system to detect and classify intrusion from normal network packet. Among several techniques, Steady State Genetic-based Machine Leaning Algorithm (SSGBML) will be used to detect intrusions. Where Steady State Genetic Algorithm (SSGA), Simple Genetic Algorithm (SGA), Modified Genetic Algorithm and Zeroth Level Classifier system are investigated in this research. SSGA is used as a discovery mechanism instead of SGA. SGA replaces all old rules with new produced rule preventing old good rules from participating in the next rule generation. Zeroth Level Classifier System is used to play the role of detector by matching incoming environment message with classifiers to determine whether the current message is normal or intrusion and receiving feedback from environment. Finally, in order to attain the best results, Modified SSGA will enhance our discovery engine by using Fuzzy Logic to optimize crossover and mutation probability. The experiments and evaluations of the proposed method were performed with the KDD 99 intrusion detection dataset.

Keywords: MSSGBML, Network Intrusion Detection, SGA, SSGA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1650
1078 A Methodology for Automatic Diversification of Document Categories

Authors: Dasom Kim, Chen Liu, Myungsu Lim, Soo-Hyeon Jeon, Byeoung Kug Jeon, Kee-Young Kwahk, Namgyu Kim

Abstract:

Recently, numerous documents including large volumes of unstructured data and text have been created because of the rapid increase in the use of social media and the Internet. Usually, these documents are categorized for the convenience of users. Because the accuracy of manual categorization is not guaranteed, and such categorization requires a large amount of time and incurs huge costs. Many studies on automatic categorization have been conducted to help mitigate the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorize complex documents with multiple topics because they work on the assumption that individual documents can be categorized into single categories only. Therefore, to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, the learning process employed in these studies involves training using a multi-categorized document set. These methods therefore cannot be applied to the multi-categorization of most documents unless multi-categorized training sets using traditional multi-categorization algorithms are provided. To overcome this limitation, in this study, we review our novel methodology for extending the category of a single-categorized document to multiple categorizes, and then introduce a survey-based verification scenario for estimating the accuracy of our automatic categorization methodology.

Keywords: Big Data Analysis, Document Classification, Text Mining, Topic Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1725
1077 A New Model for Question Answering Systems

Authors: Mohammad Reza Kangavari, Samira Ghandchi, Manak Golpour

Abstract:

Most of the Question Answering systems composed of three main modules: question processing, document processing and answer processing. Question processing module plays an important role in QA systems. If this module doesn't work properly, it will make problems for other sections. Moreover answer processing module is an emerging topic in Question Answering, where these systems are often required to rank and validate candidate answers. These techniques aiming at finding short and precise answers are often based on the semantic classification. This paper discussed about a new model for question answering which improved two main modules, question processing and answer processing. There are two important components which are the bases of the question processing. First component is question classification that specifies types of question and answer. Second one is reformulation which converts the user's question into an understandable question by QA system in a specific domain. Answer processing module, consists of candidate answer filtering, candidate answer ordering components and also it has a validation section for interacting with user. This module makes it more suitable to find exact answer. In this paper we have described question and answer processing modules with modeling, implementing and evaluating the system. System implemented in two versions. Results show that 'Version No.1' gave correct answer to 70% of questions (30 correct answers to 50 asked questions) and 'version No.2' gave correct answers to 94% of questions (47 correct answers to 50 asked questions).

Keywords: Answer Processing, Classification, QuestionAnswering and Query Reformulation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2100
1076 Heritage Tree Expert Assessment and Classification: Malaysian Perspective

Authors: B.-Y.-S. Lau, Y.-C.-T. Jonathan, M.-S. Alias

Abstract:

Heritage trees are natural large, individual trees with exceptionally value due to association with age or event or distinguished people. In Malaysia, there is an abundance of tropical heritage trees throughout the country. It is essential to set up a repository of heritage trees to prevent valuable trees from being cut down. In this cross domain study, a web-based online expert system namely the Heritage Tree Expert Assessment and Classification (HTEAC) is developed and deployed for public to nominate potential heritage trees. Based on the nomination, tree care experts or arborists would evaluate and verify the nominated trees as heritage trees. The expert system automatically rates the approved heritage trees according to pre-defined grades via Delphi technique. Features and usability test of the expert system are presented. Preliminary result is promising for the system to be used as a full scale public system.

Keywords: Arboriculture, Delphi, expert system, heritage tree, urban forestry.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1399
1075 Mining Implicit Knowledge to Predict Political Risk by Providing Novel Framework with Using Bayesian Network

Authors: Siavash Asadi Ghajarloo

Abstract:

Nowadays predicting political risk level of country has become a critical issue for investors who intend to achieve accurate information concerning stability of the business environments. Since, most of the times investors are layman and nonprofessional IT personnel; this paper aims to propose a framework named GECR in order to help nonexpert persons to discover political risk stability across time based on the political news and events. To achieve this goal, the Bayesian Networks approach was utilized for 186 political news of Pakistan as sample dataset. Bayesian Networks as an artificial intelligence approach has been employed in presented framework, since this is a powerful technique that can be applied to model uncertain domains. The results showed that our framework along with Bayesian Networks as decision support tool, predicted the political risk level with a high degree of accuracy.

Keywords: Bayesian Networks, Data mining, GECRframework, Predicting political risk.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2156
1074 Appraisal of Methods for Identifying, Mapping, and Modelling of Fluvial Erosion in a Mining Environment

Authors: F. F. Howard, I. Yakubu, C. B. Boye, J. S. Y. Kuma

Abstract:

Natural and human activities, such as mining operations, expose the natural soil to adverse environmental conditions, leading to contamination of soil, groundwater, and surface water, which has negative effects on humans, flora, and fauna. Bare or partly exposed soil is most liable to fluvial erosion. This paper enumerates various methods used to identify, map, and model fluvial erosion in a mining environment. Classical, Artificial Intelligence (AI), and GIS methods have been reviewed. One of the many classical methods used to estimate river erosion is the Revised Universal Soil Loss Equation (RUSLE) model. The RUSLE model is easy to use. Its reliance on empirical relationships that may not always be applicable to specific circumstances or locations is a flaw. Other classical models for estimating fluvial erosion are the Soil and Water Assessment Tool (SWAT) and the Universal Soil Loss Equation (USLE). These models offer a more complete understanding of the underlying physical processes and encompass a wider range of situations. Although more difficult to utilise, they depend on the availability and dependability of input data for correctness. AI can help deal with multivariate and complex difficulties and predict soil loss with higher accuracy than traditional methods, and also be used to build unique models for identifying degraded areas. AI techniques have become popular as an alternative predictor for degraded environments. However, this research proposed a hybrid of classical, AI, and GIS methods for efficient and effective modelling of fluvial erosion.

Keywords: Fluvial erosion, classical methods, Artificial Intelligence, Geographic Information System.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 140
1073 A Pattern Recognition Neural Network Model for Detection and Classification of SQL Injection Attacks

Authors: Naghmeh Moradpoor Sheykhkanloo

Abstract:

Thousands of organisations store important and confidential information related to them, their customers, and their business partners in databases all across the world. The stored data ranges from less sensitive (e.g. first name, last name, date of birth) to more sensitive data (e.g. password, pin code, and credit card information). Losing data, disclosing confidential information or even changing the value of data are the severe damages that Structured Query Language injection (SQLi) attack can cause on a given database. It is a code injection technique where malicious SQL statements are inserted into a given SQL database by simply using a web browser. In this paper, we propose an effective pattern recognition neural network model for detection and classification of SQLi attacks. The proposed model is built from three main elements of: a Uniform Resource Locator (URL) generator in order to generate thousands of malicious and benign URLs, a URL classifier in order to: 1) classify each generated URL to either a benign URL or a malicious URL and 2) classify the malicious URLs into different SQLi attack categories, and a NN model in order to: 1) detect either a given URL is a malicious URL or a benign URL and 2) identify the type of SQLi attack for each malicious URL. The model is first trained and then evaluated by employing thousands of benign and malicious URLs. The results of the experiments are presented in order to demonstrate the effectiveness of the proposed approach.

Keywords: Neural Networks, pattern recognition, SQL injection attacks, SQL injection attack classification, SQL injection attack detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2804
1072 Very-high-Precision Normalized Eigenfunctions for a Class of Schrödinger Type Equations

Authors: Amna Noreen , Kare Olaussen

Abstract:

We demonstrate that it is possible to compute wave function normalization constants for a class of Schr¨odinger type equations by an algorithm which scales linearly (in the number of eigenfunction evaluations) with the desired precision P in decimals.

Keywords: Eigenvalue problems, bound states, trapezoidal rule, poisson resummation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2834
1071 A New History Based Method to Handle the Recurring Concept Shifts in Data Streams

Authors: Hossein Morshedlou, Ahmad Abdollahzade Barforoush

Abstract:

Recent developments in storage technology and networking architectures have made it possible for broad areas of applications to rely on data streams for quick response and accurate decision making. Data streams are generated from events of real world so existence of associations, which are among the occurrence of these events in real world, among concepts of data streams is logical. Extraction of these hidden associations can be useful for prediction of subsequent concepts in concept shifting data streams. In this paper we present a new method for learning association among concepts of data stream and prediction of what the next concept will be. Knowing the next concept, an informed update of data model will be possible. The results of conducted experiments show that the proposed method is proper for classification of concept shifting data streams.

Keywords: Data Stream, Classification, Concept Shift, History.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1262