Search results for: naïve Bayesian classifier
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 291

201 Adaptive Network Intrusion Detection Learning: Attribute Selection and Classification

Authors: Dewan Md. Farid, Jerome Darmont, Nouria Harbi, Nguyen Huu Hoa, Mohammad Zahidur Rahman

Abstract:

In this paper, a new learning approach for network intrusion detection using the naïve Bayesian classifier and the ID3 algorithm is presented, which identifies effective attributes from the training dataset, calculates the conditional probabilities for the best attribute values, and then correctly classifies all the examples of the training and testing datasets. Most current intrusion detection datasets are dynamic, complex, and contain a large number of attributes. Some of the attributes may be redundant or contribute little to detection. Tests have shown that selecting significant attributes is important for designing real-world intrusion detection systems (IDS). The purpose of this study is to identify effective attributes from the training dataset to build a classifier for network intrusion detection using data mining algorithms. The experimental results on the KDD99 benchmark intrusion detection dataset demonstrate that this new approach achieves high classification rates and reduces false positives using limited computational resources.

Keywords: Attributes selection, Conditional probabilities, information gain, network intrusion detection.

Downloads: 2697
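
The attribute-selection step in the abstract above rests on information gain (named in its keywords) and on conditional probabilities for the selected attribute values. The following is a minimal sketch of those two calculations, not the authors' implementation. The attribute names and records are made up for illustration and are not taken from KDD99, and the probability estimate uses plain relative frequencies without smoothing, which is an assumption.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr_index):
    """Reduction in label entropy after splitting on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def conditional_probabilities(rows, labels, attr_index):
    """Estimate P(attribute value | class) by relative frequency (no smoothing)."""
    class_counts = Counter(labels)
    joint = Counter((row[attr_index], label) for row, label in zip(rows, labels))
    return {(value, cls): n / class_counts[cls] for (value, cls), n in joint.items()}


# Hypothetical toy records: (protocol, flag). Not real KDD99 data.
rows = [("tcp", "SF"), ("tcp", "S0"), ("udp", "SF"), ("udp", "S0")]
labels = ["normal", "attack", "normal", "attack"]

gains = [information_gain(rows, labels, i) for i in range(2)]
best_attr = max(range(2), key=lambda i: gains[i])  # index of the most informative attribute
probs = conditional_probabilities(rows, labels, best_attr)
```

In this toy data the second attribute separates the classes perfectly while the first carries no information, so only the second would be kept for the classifier.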
200 Hierarchical PSO-Adaboost Based Classifiers for Fast and Robust Face Detection

Authors: Hong Pan, Yaping Zhu, Liang Zheng Xia

Abstract:

We propose a fast and robust hierarchical face detection system which finds and localizes face images with a cascade of classifiers. Three modules contribute to the efficiency of our detector. First, heterogeneous feature descriptors are exploited to enrich the types and number of features for face representation. Second, a PSO-Adaboost algorithm is proposed to efficiently select discriminative features from a large pool of available features and reinforce them into the final ensemble classifier. Compared with the standard exhaustive Adaboost for feature selection, the new PSO-Adaboost algorithm reduces the training time by up to 20 times. Finally, a three-stage hierarchical classifier framework is developed for rapid background removal. In particular, candidate face regions are detected more quickly by using a large window in the first stage. Nonlinear SVM classifiers are used instead of decision stump functions in the last stage to remove the remaining complex nonface patterns that cannot be rejected in the previous two stages. Experimental results show our detector achieves superior performance on the CMU+MIT frontal face dataset.

Keywords: Adaboost, Face detection, Feature selection, PSO

Downloads: 2198
199 Electromyography Pattern Classification with Laplacian Eigenmaps in Human Running

Authors: Elnaz Lashgari, Emel Demircan

Abstract:

Electromyography (EMG) is one of the most important interfaces between humans and robots for rehabilitation. Decoding this signal helps to recognize muscle activation and converts it into smooth motion for the robots. Detecting each muscle’s pattern during walking and running is vital for improving the quality of a patient’s life. In this study, EMG data from 10 muscles in 10 subjects at 4 different speeds were analyzed. EMG signals are nonlinear with high dimensionality. To deal with this challenge, we extracted some features in time-frequency domain and used manifold learning and Laplacian Eigenmaps algorithm to find the intrinsic features that represent data in low-dimensional space. We then used the Bayesian classifier to identify various patterns of EMG signals for different muscles across a range of running speeds. The best result for vastus medialis muscle corresponds to 97.87±0.69 for sensitivity and 88.37±0.79 for specificity with 97.07±0.29 accuracy using Bayesian classifier. The results of this study provide important insight into human movement and its application for robotics research.

Keywords: Electrocardiogram, manifold learning, Laplacian Eigenmaps, running pattern.

Downloads: 1119
198 Machine Learning Approach for Identifying Dementia from MRI Images

Authors: S. K. Aruna, S. Chitra

Abstract:

This research paper presents a framework for classifying Magnetic Resonance Imaging (MRI) images for dementia. Dementia, an age-related cognitive decline, is indicated by degeneration of cortical and sub-cortical structures. Characterizing morphological changes helps in understanding disease development and contributes to early prediction and prevention of the disease. Modelling that captures the brain’s structural variability and is valid for disease classification and interpretation is very challenging. Features are extracted using a Gabor filter at orientations of 0°, 30°, 60°, and 90° and the Gray Level Co-occurrence Matrix (GLCM). The features are then normalized and fused, and Independent Component Analysis (ICA) selects features. Support Vector Machine (SVM) classifiers with different kernels are evaluated for their efficiency in classifying dementia. This study evaluates the presented framework using MRI images from the OASIS dataset for identifying dementia. Results showed that the proposed feature fusion classifier achieves higher classification accuracy.

Keywords: Magnetic resonance imaging, dementia, Gabor filter, gray level co-occurrence matrix, support vector machine.

Downloads: 2114
197 Predication Model for Leukemia Diseases Based on Data Mining Classification Algorithms with Best Accuracy

Authors: Fahd Sabry Esmail, M. Badr Senousy, Mohamed Ragaie

Abstract:

In recent years, there has been an explosion in the use of technology that helps discover diseases. For example, DNA microarrays allow us for the first time to obtain a "global" view of the cell. They have great potential to provide accurate medical diagnosis and to help find the right treatment and cure for many diseases. Various classification algorithms can be applied to such microarray datasets to devise methods that can predict the occurrence of leukemia. In this study, we compared the classification accuracy and response time of eleven decision tree methods and six rule classifier methods using five performance criteria. The experimental results show that, among the tree classifiers, Random Tree produces the best results and also takes the least time to build a model. Among the classification rule algorithms, the nearest-neighbor-like algorithm (NNge) is the best, owing to its high accuracy and the least time needed to build a model.

Keywords: Data mining, classification techniques, decision tree, classification rule, leukemia diseases, microarray data.

Downloads: 2557
196 Scaling up Detection Rates and Reducing False Positives in Intrusion Detection using NBTree

Authors: Dewan Md. Farid, Nguyen Huu Hoa, Jerome Darmont, Nouria Harbi, Mohammad Zahidur Rahman

Abstract:

In this paper, we present a new learning algorithm for anomaly-based network intrusion detection using an improved self-adaptive naïve Bayesian tree (NBTree), which induces a hybrid of a decision tree and a naïve Bayesian classifier. The proposed approach scales up balanced detection for different attack types and keeps false positives at an acceptable level in intrusion detection. In complex, dynamic, and large intrusion detection datasets, the detection accuracy of the naïve Bayesian classifier does not scale up as well as that of a decision tree. Tests in other problem domains have shown that the naïve Bayesian tree improves classification rates on large datasets. In a naïve Bayesian tree, the internal nodes split as in regular decision trees, but the leaves contain naïve Bayesian classifiers. The experimental results on the KDD99 benchmark network intrusion detection dataset demonstrate that this new approach scales up the detection rates for different attack types and reduces false positives in network intrusion detection.

Keywords: Detection rates, false positives, network intrusion detection, naïve Bayesian tree.

Downloads: 2280
195 Multi-Layer Perceptron Neural Network Classifier with Binary Particle Swarm Optimization Based Feature Selection for Brain-Computer Interfaces

Authors: K. Akilandeswari, G. M. Nasira

Abstract:

Brain-Computer Interfaces (BCIs) measure brain signal activity, intentionally and unintentionally induced by users, and provide a communication channel that does not depend on the brain’s normal output pathway of peripheral nerves and muscles. Feature Selection (FS) is a global optimization problem in machine learning that reduces the number of features and removes irrelevant and noisy data while retaining acceptable recognition accuracy. It is a vital step affecting the performance of a pattern recognition system. This study presents a new Binary Particle Swarm Optimization (BPSO) based feature selection algorithm. A Multi-layer Perceptron Neural Network (MLPNN) classifier, trained with the backpropagation algorithm and the Levenberg-Marquardt algorithm, classifies the selected features.

Keywords: Brain-Computer Interfaces (BCI), Feature Selection (FS), Walsh–Hadamard Transform (WHT), Binary Particle Swarm Optimization (BPSO), Multi-Layer Perceptron (MLP), Levenberg–Marquardt algorithm.

Downloads: 2184
194 New Features for Specific JPEG Steganalysis

Authors: Johann Barbier, Eric Filiol, Kichenakoumar Mayoura

Abstract:

We present in this paper a new approach for specific JPEG steganalysis and propose studying statistics of the compressed DCT coefficients. Traditionally, steganographic algorithms try to preserve statistics of the DCT and of the spatial domain, but they cannot preserve both and also control the alteration of the compressed data. We have noticed a deviation of the entropy of the compressed data after a first embedding. This deviation is greater when the image is a cover medium than when the image is a stego image. To observe this deviation, we identified new statistical features and combined them with the Multiple Embedding Method. This approach is motivated by the Avalanche Criterion of the JPEG lossless compression step. This criterion makes possible the design of detectors whose detection rates are independent of the payload. Finally, we designed a Fisher discriminant based classifier for the well-known steganographic algorithms Outguess, F5, and Hide and Seek. The experimental results we obtained show the efficiency of our classifier for these algorithms. Moreover, it is also designed to work with low embedding rates (< 10^-5) and, according to the avalanche criterion of the RLE and Huffman compression step, its efficiency is independent of the quantity of hidden information.

Keywords: Compressed frequency domain, Fisher discriminant, specific JPEG steganalysis.

Downloads: 2161
193 Improving Subjective Bias Detection Using Bidirectional Encoder Representations from Transformers and Bidirectional Long Short-Term Memory

Authors: Ebipatei Victoria Tunyan, T. A. Cao, Cheol Young Ock

Abstract:

Detecting subjectively biased statements is a vital task. This is because this kind of bias, when present in the text or other forms of information dissemination media such as news, social media, scientific texts, and encyclopedias, can weaken trust in the information and stir conflicts amongst consumers. Subjective bias detection is also critical for many Natural Language Processing (NLP) tasks like sentiment analysis, opinion identification, and bias neutralization. Having a system that can adequately detect subjectivity in text will boost research in the above-mentioned areas significantly. It can also come in handy for platforms like Wikipedia, where the use of neutral language is of importance. The goal of this work is to identify the subjectively biased language in text on a sentence level. With machine learning, we can solve complex AI problems, making it a good fit for the problem of subjective bias detection. A key step in this approach is to train a classifier based on BERT (Bidirectional Encoder Representations from Transformers) as upstream model. BERT by itself can be used as a classifier; however, in this study, we use BERT as data preprocessor as well as an embedding generator for a Bi-LSTM (Bidirectional Long Short-Term Memory) network incorporated with attention mechanism. This approach produces a deeper and better classifier. We evaluate the effectiveness of our model using the Wiki Neutrality Corpus (WNC), which was compiled from Wikipedia edits that removed various biased instances from sentences as a benchmark dataset, with which we also compare our model to existing approaches. Experimental analysis indicates an improved performance, as our model achieved state-of-the-art accuracy in detecting subjective bias. This study focuses on the English language, but the model can be fine-tuned to accommodate other languages.

Keywords: Subjective bias detection, machine learning, BERT–BiLSTM–Attention, text classification, natural language processing.

Downloads: 830
192 A Computer Aided Detection (CAD) System for Microcalcifications in Mammograms - MammoScan μCaD

Authors: Kjersti Engan, Thor Ole Gulsrud, Karl Fredrik Fretheim, Barbro Furebotten Iversen, Liv Eriksen

Abstract:

Clusters of microcalcifications in mammograms are an important sign of breast cancer. This paper presents a complete Computer Aided Detection (CAD) scheme for automatic detection of clustered microcalcifications in digital mammograms. The proposed system, MammoScan μCaD, consists of three main steps. First, all potential microcalcifications are detected using a method for feature extraction, VarMet, and adaptive thresholding. This step also produces a number of false detections. The goal of the second step, Classifier level 1, is to remove everything but microcalcifications. The last step, Classifier level 2, uses learned dictionaries and sparse representations as a texture classification technique to distinguish single, benign microcalcifications from clustered microcalcifications, in addition to removing some remaining false detections. The system is trained and tested on true digital data from Stavanger University Hospital, and the results are evaluated by radiologists. The overall results are promising, with a sensitivity > 90 % and a low false detection rate (approx. 1 unwanted per image, or 0.3 false per image).

Keywords: mammogram, microcalcifications, detection, CAD, MammoScan μCaD, VarMet, dictionary learning, texture, FTCM, classification, adaptive thresholding

Downloads: 1806
191 A Novel Approach for Protein Classification Using Fourier Transform

Authors: A. F. Ali, D. M. Shawky

Abstract:

Discovering new biological knowledge from high-throughput biological data is a major challenge in bioinformatics today. To address this challenge, we developed a new approach for protein classification. Proteins that are evolutionarily, and thereby functionally, related are said to belong to the same classification. Identifying protein classification is of fundamental importance to document the diversity of the known protein universe. It also provides a means to determine the functional roles of newly discovered protein sequences. Our goal is to predict the functional classification of novel protein sequences based on a set of features extracted from each protein sequence. The proposed technique uses datasets extracted from the Structural Classification of Proteins (SCOP) database. A set of spectral domain features based on the Fast Fourier Transform (FFT) is used. The proposed classifier uses a multilayer back propagation (MLBP) neural network for protein classification. The maximum classification accuracy is about 91% when applying the classifier to the full four levels of the SCOP database. However, it reaches a maximum of 96% when limiting the classification to the family level. The classification results reveal that the spectral domain contains information that can be used for classification with high accuracy. In addition, the results emphasize that sequence similarity measures are of great importance, especially at the family level.

Keywords: Bioinformatics, Artificial Neural Networks, Protein Sequence Analysis, Feature Extraction.

Downloads: 2359
190 Diagnosis of the Abdominal Aorta Aneurysm in Magnetic Resonance Imaging Images

Authors: W. Kultangwattana, K. Somkantha, P. Phuangsuwan

Abstract:

This paper presents a technique for diagnosis of the abdominal aorta aneurysm in magnetic resonance imaging (MRI) images. First, our technique segments the aorta in MRI images. This is a required step to determine the volume of the aorta, which is important for diagnosis of the abdominal aorta aneurysm. Our proposed technique can detect the volume of the aorta in MRI images using a new external energy for the snakes model. The new external energy is calculated from Laws' texture. It increases the capture range of the snakes model more effectively than the external energy of earlier snakes models. Second, our technique diagnoses the abdominal aorta aneurysm with a Bayesian classifier, a classification model based on statistical theory. The features for classification of abdominal aorta aneurysm, i.e., area, perimeter and compactness, were derived from the contour of the aorta images produced by segmentation with our snakes model. We also compare the proposed technique with the traditional snakes model. In our experiments, 30 images are used for training, and 20 images are tested and compared with expert opinion. The experimental results show that our technique achieves an accuracy of more than 95%.

Keywords: Abdominal Aorta Aneurysm, Bayesian Classifier, Snakes Model, Texture Feature.

Downloads: 1591
189 Statistical Measures and Optimization Algorithms for Gene Selection in Lung and Ovarian Tumor

Authors: C. Gunavathi, K. Premalatha

Abstract:

Microarray technology is widely used in the study of disease diagnosis using gene expression levels. The main shortcoming of gene expression data is that it includes thousands of genes and a small number of samples. Many methods and techniques have been proposed for tumor classification using microarray gene expression data. Feature or gene selection methods can be used to mine the genes that are directly involved in the classification and to eliminate irrelevant genes. In this paper, statistical measures such as T-Statistics, Signal-to-Noise Ratio (SNR) and F-Statistics are used to rank the genes. The ranked genes are used for further classification. The Particle Swarm Optimization (PSO) algorithm and the Shuffled Frog Leaping (SFL) algorithm are used to find the significant genes from the top-m ranked genes. The Naïve Bayes Classifier (NBC) is used to classify the samples based on the significant genes. The proposed work is applied to Lung and Ovarian datasets. The experimental results show that the proposed method achieves 100% accuracy in all the three datasets and the results are compared with previous works.

Keywords: Microarray, T-Statistics, Signal-to-Noise Ratio, F-Statistics, Particle Swarm Optimization, Shuffled Frog Leaping, Naïve Bayes Classifier.

Downloads: 1945
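
The abstract above ranks genes by statistical measures such as the Signal-to-Noise Ratio but does not state the formula. The sketch below assumes the definition commonly used in the microarray literature, SNR = (mean_A - mean_B) / (std_A + std_B), computed per gene across the two sample classes. The gene names and expression values are hypothetical, and the use of the sample standard deviation is also an assumption.

```python
from statistics import mean, stdev


def snr(class_a, class_b):
    """Signal-to-noise ratio of one gene between two sample classes.

    Assumed definition: difference of class means divided by the sum of
    class (sample) standard deviations. Raises ZeroDivisionError if both
    classes have zero spread.
    """
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


def rank_genes(expr_a, expr_b):
    """Return gene names sorted by decreasing absolute SNR."""
    scores = {gene: abs(snr(expr_a[gene], expr_b[gene])) for gene in expr_a}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical expression levels for two genes in two classes of samples.
tumor = {"g1": [1.0, 1.2, 0.8], "g2": [2.0, 3.0, 4.0]}
normal = {"g1": [5.0, 5.2, 4.8], "g2": [2.5, 3.5, 4.5]}

ranking = rank_genes(tumor, normal)
top_m = ranking[:1]  # the top-m genes would then go to the search step (PSO or SFL)
```

Here `g1` ranks first because its class means differ widely relative to its spread, while `g2` overlaps heavily between the classes.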
188 Trajectory Guided Recognition of Hand Gestures having only Global Motions

Authors: M. K. Bhuyan, P. K. Bora, D. Ghosh

Abstract:

One very interesting field of research in Pattern Recognition that has gained much attention in recent times is Gesture Recognition. In this paper, we consider a form of dynamic hand gestures that are characterized by total movement of the hand (arm) in space. For these types of gestures, the shape of the hand (palm) during gesturing does not bear any significance. In our work, we propose a model-based method for tracking hand motion in space, thereby estimating the hand motion trajectory. We employ the dynamic time warping (DTW) algorithm for time alignment and normalization of spatio-temporal variations that exist among samples belonging to the same gesture class. During training, one template trajectory and one prototype feature vector are generated for every gesture class. Features used in our work include some static and dynamic motion trajectory features. Recognition is accomplished in two stages. In the first stage, all unlikely gesture classes are eliminated by comparing the input gesture trajectory to all the template trajectories. In the next stage, feature vector extracted from the input gesture is compared to all the class prototype feature vectors using a distance classifier. Experimental results demonstrate that our proposed trajectory estimator and classifier is suitable for Human Computer Interaction (HCI) platform.

Keywords: Hand gesture, human computer interaction, key video object plane, dynamic time warping.

Downloads: 2741
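
The abstract above uses dynamic time warping (DTW) to align gesture trajectories that differ in speed. As a rough illustration of the textbook DTW recurrence, and not of the authors' tracker or feature set, the sketch below compares two trajectories given as lists of (x, y) points. The function name, the Euclidean point cost, and the sample trajectories are all assumptions.

```python
from math import dist, inf


def dtw_distance(traj_a, traj_b):
    """Classic DTW cost between two sequences of points.

    cost[i][j] holds the minimum accumulated cost of aligning the first i
    points of traj_a with the first j points of traj_b.
    """
    n, m = len(traj_a), len(traj_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(traj_a[i - 1], traj_b[j - 1])  # Euclidean point distance
            cost[i][j] = step + min(
                cost[i - 1][j],      # repeat a point of traj_b
                cost[i][j - 1],      # repeat a point of traj_a
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


# Hypothetical trajectories: the second lingers on the middle point.
template = [(0, 0), (1, 1), (2, 2)]
slower = [(0, 0), (1, 1), (1, 1), (2, 2)]
shifted = [(0, 1), (1, 2), (2, 3)]

same_shape = dtw_distance(template, slower)
different = dtw_distance(template, shifted)
```

A gesture performed more slowly aligns with its template at zero cost, whereas a spatially shifted trajectory accumulates a positive cost at every step.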
187 Automatic Sleep Stage Scoring with Wavelet Packets Based on Single EEG Recording

Authors: Luay A. Fraiwan, Natheer Y. Khaswaneh, Khaldon Y. Lweesy

Abstract:

Sleep stage scoring is the process of classifying the sleep stage the subject is in. Sleep is classified into two states based on the constellation of physiological parameters. The two states are non-rapid eye movement (NREM) and rapid eye movement (REM). NREM sleep is further classified into four stages (1-4). These states and the state of wakefulness are distinguished from each other based on brain activity. In this work, a classification method for automated sleep stage scoring based on a single EEG recording using wavelet packet decomposition was implemented. Thirty-two polysomnographic recordings from the MIT-BIH database were used for training and validation of the proposed method. A single EEG recording was extracted and smoothed using a Savitzky-Golay filter. Wavelet packet decomposition up to the fourth level based on a 20th order Daubechies filter was used to extract features from the EEG signal. A feature vector of 54 features was formed. It was reduced to a size of 25 using the gain ratio method and fed into a regression tree classifier. The regression trees were trained using 67% of the available records. The records for training were selected based on cross validation of the records. The remaining records were used for testing the classifier. The overall correct rate of the proposed method was found to be around 75%, which is acceptable compared to the techniques in the literature.

Keywords: Features selection, regression trees, sleep stage scoring, wavelet packets.

Downloads: 2328
186 An Approach for Vocal Register Recognition Based on Spectral Analysis of Singing

Authors: Aleksandra Zysk, Pawel Badura

Abstract:

Recognizing and controlling vocal registers during singing is a difficult task for beginner vocalists. It requires, among other things, identifying which part of the natural resonators is being used when a sound propagates through the body. Thus, an application has been designed that provides sound recording, automatic vocal register recognition (VRR), and a graphical user interface with real-time visualization of the signal and recognition results. Six spectral features are determined for each time frame and passed to the support vector machine classifier, yielding a binary decision on the head or chest register assignment of the segment. The classification training and testing data have been recorded by ten professional female singers (soprano, aged 19-29) performing sounds for both chest and head register. The classification accuracy exceeded 93% in each of the various validation schemes. Apart from a hard two-class decision, the support vector classifier also returns information on the distance between a particular feature vector and the discrimination hyperplane in the feature space. Such information reflects the level of certainty of the vocal register classification in a fuzzy way. Thus, the designed recognition and training application is able to assess and visualize the continuous trend in singing in a user-friendly graphical mode, providing an easy way to control the vocal emission.

Keywords: Classification, singing, spectral analysis, vocal emission, vocal register.

Downloads: 1313
185 Distances over Incomplete Diabetes and Breast Cancer Data Based on Bhattacharyya Distance

Authors: Loai AbdAllah, Mahmoud Kaiyal

Abstract:

Missing values in real-world datasets are a common problem. Many algorithms have been developed to deal with this problem; most of them replace the missing values with a fixed value computed from the observed values. In our work, we used a distance function based on the Bhattacharyya distance, which measures the similarity of two probability distributions, to measure the distance between objects with missing values. The proposed distance distinguishes between known and unknown values: the distance between two known values is the Mahalanobis distance, whereas when one of them is missing, the distance is computed based on the distribution of the known values for the coordinate that contains the missing value. This method was integrated with Wikaya, a digital health company developing a platform that helps to improve prevention of chronic diseases such as diabetes and cancer. For Wikaya’s recommendation system to work, the distance between users needs to be measured. Since there are missing values in the collected data, there is a need for a distance function that measures distances between incomplete user profiles. To evaluate the accuracy of the proposed distance function in reflecting the actual similarity between different objects when some of them contain missing values, we integrated it within the framework of the k nearest neighbors (kNN) classifier, since its computation is based only on the similarity between objects. To validate this, we ran the algorithm over diabetes and breast cancer datasets, standard benchmark datasets from the UCI repository. Our experiments show that the kNN classifier using our proposed distance function outperforms kNN using other existing methods.

Keywords: Missing values, distance metric, Bhattacharyya distance.

Downloads: 781
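
The abstract above builds on the Bhattacharyya distance between probability distributions. For orientation only, the sketch below implements the standard definition for two discrete distributions, D_B = -ln(sum_i sqrt(p_i * q_i)). It does not reproduce the authors' distance function for missing values, whose details the abstract does not give. The function name and the example distributions are assumptions.

```python
from math import inf, log, sqrt


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions.

    p and q are sequences of probabilities over the same outcomes, each
    summing to 1. Returns 0 for identical distributions and infinity when
    the distributions share no outcome with positive probability.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of outcomes")
    # Bhattacharyya coefficient: overlap between the two distributions.
    coefficient = sum(sqrt(pi * qi) for pi, qi in zip(p, q))
    if coefficient == 0.0:
        return inf
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return -log(min(coefficient, 1.0))


# Hypothetical distributions over two outcomes.
identical = bhattacharyya_distance([0.5, 0.5], [0.5, 0.5])
overlapping = bhattacharyya_distance([0.5, 0.5], [0.9, 0.1])
disjoint = bhattacharyya_distance([1.0, 0.0], [0.0, 1.0])
```

The distance grows as the overlap between the two distributions shrinks, which is what makes it usable as a dissimilarity score inside a nearest-neighbor classifier.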
184 Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting

Authors: Kemal Polat

Abstract:

In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) has been used for classification of a diabetes disease dataset. The aims of this study are to reduce the variance within the attributes of the diabetes dataset and to improve the classification accuracy of the classifier algorithm by transforming a non-linearly separable dataset into a linearly separable one. The Pima Indians Diabetes dataset has two classes: normal subjects (500 instances) and diabetes subjects (268 instances). Fuzzy C-means clustering is an improved version of the K-means clustering method and is one of the most used clustering methods in data mining and machine learning applications. In this study, as the first stage, the fuzzy C-means clustering process has been used to find the centers of the attributes in the Pima Indians diabetes dataset, and the dataset has then been weighted according to the ratios of the means of the attributes to their centers. Second, after the weighting process, classifier algorithms including support vector machine (SVM) and k-NN (k-nearest neighbor) classifiers have been used for classifying the weighted Pima Indians diabetes dataset. Experimental results show that the proposed attribute weighting method (FCMAW) has obtained very promising results in the classification of the Pima Indians diabetes dataset.

Keywords: Fuzzy C-means clustering, Fuzzy C-means clustering based attribute weighting, Pima Indians diabetes dataset, SVM.

Downloads: 1762
183 Development of Genetic-based Machine Learning for Network Intrusion Detection (GBML-NID)

Authors: Wafa' S. Al-Sharafat, Reyadh Naoum

Abstract:

Society has grown to rely on Internet services, and the number of Internet users increases every day. As more and more users become connected to the network, the window of opportunity for malicious users to do damage becomes very large and lucrative. The objective of this paper is to incorporate different techniques into a classifier system to detect and distinguish intrusions from normal network packets. Among several techniques, the Steady State Genetic-based Machine Learning Algorithm (SSGBML) will be used to detect intrusions. The Steady State Genetic Algorithm (SSGA), Simple Genetic Algorithm (SGA), Modified Genetic Algorithm and Zeroth Level Classifier System are investigated in this research. SSGA is used as a discovery mechanism instead of SGA, because SGA replaces all old rules with newly produced rules, preventing good old rules from participating in the next rule generation. The Zeroth Level Classifier System plays the role of detector by matching incoming environment messages with classifiers to determine whether the current message is normal or an intrusion, and by receiving feedback from the environment. Finally, in order to attain the best results, Modified SSGA will enhance our discovery engine by using Fuzzy Logic to optimize crossover and mutation probability. The experiments and evaluations of the proposed method were performed with the KDD 99 intrusion detection dataset.

Keywords: MSSGBML, Network Intrusion Detection, SGA, SSGA.

Downloads: 1669
182 Classification of Potential Biomarkers in Breast Cancer Using Artificial Intelligence Algorithms and Anthropometric Datasets

Authors: Aref Aasi, Sahar Ebrahimi Bajgani, Erfan Aasi

Abstract:

Breast cancer (BC) continues to be the most frequent cancer in females and causes the highest number of cancer-related deaths in women worldwide. Inspired by recent advances in studying the relationship between different patient attributes and features and the disease, in this paper, we have tried to investigate the different classification methods for better diagnosis of BC in the early stages. In this regard, datasets from the University Hospital Centre of Coimbra were chosen, and different machine learning (ML)-based and neural network (NN) classifiers have been studied. For this purpose, we have selected favorable features among the nine provided attributes from the clinical dataset by using a random forest algorithm. This dataset consists of both healthy controls and BC patients, and it was noted that glucose, BMI, resistin, and age have the most importance, respectively. Moreover, we have analyzed these features with various ML-based classifier methods, including Decision Tree (DT), K-Nearest Neighbors (KNN), eXtreme Gradient Boosting (XGBoost), Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machine (SVM) along with NN-based Multi-Layer Perceptron (MLP) classifier. The results revealed that among different techniques, the SVM and MLP classifiers have the most accuracy, with amounts of 96% and 92%, respectively. These results divulged that the adopted procedure could be used effectively for the classification of cancer cells, and also it encourages further experimental investigations with more collected data for other types of cancers.

Keywords: Breast cancer, health diagnosis, Machine Learning, biomarker classification, Neural Network.

Downloads 319
181 AI-Based Techniques for Online Social Media Network Sentiment Analysis: A Methodical Review

Authors: A. M. John-Otumu, M. M. Rahman, O. C. Nwokonkwo, M. C. Onuoha

Abstract:

Online social media networks have long served as a primary arena for group conversations, gossip, and text-based information sharing and distribution. The use of natural language processing techniques for text classification and unbiased decision making has not been far-fetched. Proper classification of this textual information in a given context has also been very difficult. As a result, a systematic review of previous literature on sentiment classification and AI-based techniques was conducted. The study was done in order to gain a better understanding of the process of designing and developing a robust and more accurate sentiment classifier that could correctly classify social media textual information of a given context between hate speech and inverted compliments with a high level of accuracy, using the knowledge gained from evaluating the different artificial intelligence techniques reviewed. The study evaluated over 250 articles from digital sources such as the ACM Digital Library, Google Scholar, and IEEE Xplore, and whittled the number down to 52 articles. Findings revealed that deep learning approaches such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Bidirectional Encoder Representations from Transformers (BERT), and Long Short-Term Memory (LSTM) outperformed various machine learning techniques in terms of accuracy. A large dataset is also required to develop a robust sentiment classifier. Results also revealed that data can be obtained from sources such as Twitter, movie reviews, Kaggle, the Stanford Sentiment Treebank (SST), and SemEval Task4, depending on the required domain. Hybrid deep learning techniques such as CNN+LSTM, CNN+Gated Recurrent Unit (GRU), and CNN+BERT outperformed single deep learning techniques and machine learning techniques. The Python programming language outperformed the Java programming language in terms of development simplicity and AI-based library functionality. Finally, the study recommended the findings obtained for building robust sentiment classifiers in the future.

Keywords: Artificial Intelligence, Natural Language Processing, Sentiment Analysis, Social Network, Text.

Downloads 593
180 ANN based Multi Classifier System for Prediction of High Energy Shower Primary Energy and Core Location

Authors: Gitanjali Devi, Kandarpa Kumar Sarma, Pranayee Datta, Anjana Kakoti Mahanta

Abstract:

Cosmic showers, during their transit through space, produce sub-products as a result of interactions with the intergalactic or interstellar medium which, after entering Earth, generate secondary particles called an Extensive Air Shower (EAS). Detection and analysis of High Energy Particle Showers involve a plethora of theoretical and experimental works with a host of constraints resulting in inaccuracies in measurements. Therefore, there exists a need to develop a readily available system based on soft-computational approaches which can be used for EAS analysis. This is because soft-computational tools such as Artificial Neural Networks (ANNs) can be trained as classifiers to adapt to and learn the surrounding variations. However, single classifiers fail to reach optimal decision making in many situations, for which Multiple Classifier Systems (MCS) are preferred to enhance the ability of the system to make decisions adjusting to finer variations. This work describes the formation of an MCS using Multi Layer Perceptron (MLP), Recurrent Neural Network (RNN) and Probabilistic Neural Network (PNN) with data inputs from correlation mapping Self Organizing Map (SOM) blocks and the output optimized by another SOM. The results show that the setup can be adopted for real-time practical applications for prediction of primary energy and location of EAS from density values captured using detectors in a circular grid.

Keywords: EAS, Shower, Core, ANN, Location.

Downloads 1301
179 Reduction of False Positives in Head-Shoulder Detection Based on Multi-Part Color Segmentation

Authors: Lae-Jeong Park

Abstract:

The paper presents a method that utilizes figure-ground color segmentation to extract an effective global feature for false positive reduction in head-shoulder detection. Conventional detectors that rely on local features such as HOG to achieve real-time operation suffer from false positives. The color cue in an input image provides salient information on a global characteristic, which is necessary to alleviate the false positives of local-feature-based detectors. An effective approach that uses figure-ground color segmentation has previously been presented in an effort to reduce false positives in object detection. In this paper, an extended version of the approach is presented that adopts separate multipart foregrounds instead of a single prior foreground and performs the figure-ground color segmentation with each of the foregrounds. The multipart foregrounds include the parts of the head-shoulder shape and additional auxiliary foregrounds optimized by a search algorithm. A classifier is constructed with a feature that consists of the set of resulting segmentations. Experimental results show that the presented method can reject more false positives than the single prior shape-based classifier as well as detectors with local features. The improvement is possible because the presented approach can reduce the false positives that have the same colors in the head and shoulder foregrounds.

Keywords: Pedestrian detection, color segmentation, false positives, feature extraction.

Downloads 1144
178 sEMG Interface Design for Locomotion Identification

Authors: Rohit Gupta, Ravinder Agarwal

Abstract:

The surface electromyographic (sEMG) signal has the potential to identify human activities and intention. This potential is further exploited to control artificial limbs using the sEMG signal from the residual limbs of amputees. The paper deals with the development of a multichannel, cost-efficient sEMG signal interface for research applications, along with the evaluation of a proposed class-dependent statistical approach to feature selection. The sEMG signal acquisition interface was developed using the ADS1298 from Texas Instruments, a front-end interface integrated circuit for ECG applications. Further, the sEMG signal is recorded from two lower limb muscles for three locomotion modes, namely Plane Walk (PW), Stair Ascending (SA), and Stair Descending (SD). A class-dependent statistical approach is proposed for feature selection, and its performance is compared with 12 preexisting feature vectors. To make the study more extensive, the performance of five different types of classifiers is compared. The outcome of the current work demonstrates the suitability of the proposed feature selection algorithm for locomotion recognition, as compared to other existing feature vectors. The SVM classifier outperformed the other classifiers compared, with an average recognition accuracy of 97.40%. Feature vector selection emerges as the most dominant factor affecting classification performance, as it accounts for 51.51% of the total variance in classification accuracy. The results demonstrate the potential of the developed sEMG signal acquisition interface along with the proposed feature selection algorithm.

Keywords: Classifiers, feature selection, locomotion, sEMG.

Downloads 1491
177 Machine Learning Techniques in Bank Credit Analysis

Authors: Fernanda M. Assef, Maria Teresinha A. Steiner

Abstract:

The aim of this paper is to compare and discuss better classifier algorithm options for credit risk assessment by applying different Machine Learning techniques. Using records from a Brazilian financial institution, this study uses a database of 5,432 companies that are clients of the bank, where 2,600 clients are classified as non-defaulters, 1,551 are classified as defaulters and 1,281 are temporary defaulters, meaning that the clients are overdue on their payments for up to 180 days. For each case, a total of 15 attributes was considered for a one-against-all assessment using four different techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and finally Support Vector Machines (SVM). For each method, different parameters were analyzed in order to obtain different results, and the best result of each technique was compared. Initially, the data were coded in thermometer code (for numerical attributes) or dummy coding (for nominal attributes). The methods were then evaluated for each parameter, and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method, in terms of accuracy, was ANN-RBF (79.20% for non-defaulter classification, 97.74% for defaulters and 75.37% for the temporary defaulter classification). However, the best accuracy does not always represent the best technique. For instance, on the classification of temporary defaulters, this technique, in terms of false positives, was surpassed by SVM, which had the lowest rate (0.07%) of false positive classifications. All these intrinsic details are discussed considering the results found, and an overview of what was presented is shown in the conclusion of this study.
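
The two input encodings mentioned in this abstract can be illustrated with a minimal Python sketch. It is an assumption about the general techniques, not the authors' preprocessing: the thresholds, the category names and the function names are hypothetical. Thermometer coding turns a numerical value into a run of 1-bits, one per threshold reached, and dummy coding here uses one indicator per category (some definitions drop one reference category instead).

```python
def thermometer_encode(value, thresholds):
    # One bit per threshold, in ascending order: 1 if the value reaches it, else 0.
    # Larger values therefore light up a longer prefix of 1-bits.
    return [1 if value >= t else 0 for t in thresholds]

def dummy_encode(value, categories):
    # One indicator per category: 1 for the matching category, 0 elsewhere.
    return [1 if value == c else 0 for c in categories]

# Hypothetical attributes, for illustration only.
age_bits = thermometer_encode(35, [10, 20, 30, 40])
sector_bits = dummy_encode("retail", ["industry", "retail", "services"])
encoded_row = age_bits + sector_bits
```

With these hypothetical inputs, `age_bits` is `[1, 1, 1, 0]` and `sector_bits` is `[0, 1, 0]`. The thermometer code preserves the ordering of numerical values, which a plain one-indicator-per-bin code would lose.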

Keywords: Artificial Neural Networks, ANNs, classifier algorithms, credit risk assessment, logistic regression, machine learning, support vector machines.

Downloads 1281
176 Constructing of Classifier for Face Recognition on the Basis of the Conjugation Indexes

Authors: Vladimir A. Fursov, Nikita E. Kozin

Abstract:

In this work, the possibility of constructing classifiers for face-recognition systems based on conjugation criteria is investigated. The linkage between the bipartite conjugation, the conjugation with a subspace and the conjugation with the null-space is shown. A unified decision rule is investigated, which decides on the assignment of a face to a class by considering the linkage between conjugation values. The described recognition method can be successfully applied to distributed systems of video control and video observation.

Keywords: Conjugation, Eigenfaces, Recognition.

Downloads 1467
175 The Optimization of Decision Rules in Multimodal Decision-Level Fusion Scheme

Authors: Andrey V. Timofeev, Dmitry V. Egorov

Abstract:

This paper introduces an original method of parametric optimization of the structure of a multimodal decision-level fusion scheme, which combines the results of the partial solutions of the classification task obtained from an assembly of mono-modal classifiers. As a result, a multimodal fusion classifier with the minimum total error rate has been obtained.
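
The abstract does not specify the fusion rule or the optimization procedure, so the following Python sketch is only one plausible reading, under stated assumptions: each mono-modal classifier emits a binary decision, the fusion rule is a weighted vote compared against a threshold, and the parameters are chosen by exhaustive search over a small hypothetical grid to minimize the total error rate (the fraction of misclassified samples). All function names, grids and data are illustrative.

```python
from itertools import product

def fuse(decisions, weights, threshold):
    # Weighted vote over binary decisions (0 or 1) from the mono-modal classifiers.
    score = sum(w * d for w, d in zip(weights, decisions))
    return 1 if score >= threshold else 0

def total_error_rate(samples, labels, weights, threshold):
    # Fraction of samples whose fused decision differs from the true label.
    errors = sum(fuse(d, weights, threshold) != y for d, y in zip(samples, labels))
    return errors / len(labels)

def optimize(samples, labels, grid=(0.0, 0.5, 1.0), thresholds=(0.5, 1.0, 1.5)):
    # Exhaustive search over the (assumed) weight grid and threshold candidates.
    # Returns (error rate, weights, threshold) for the first best setting found.
    best = None
    for weights in product(grid, repeat=len(samples[0])):
        for threshold in thresholds:
            err = total_error_rate(samples, labels, weights, threshold)
            if best is None or err < best[0]:
                best = (err, weights, threshold)
    return best

# Toy decisions from three hypothetical mono-modal classifiers.
samples = [(1, 0, 1), (0, 1, 0), (1, 1, 0), (0, 0, 1), (1, 0, 0), (0, 1, 1)]
labels = [1, 0, 1, 0, 1, 0]
majority_error = total_error_rate(samples, labels, (1.0, 1.0, 1.0), 1.5)
best_error, best_weights, best_threshold = optimize(samples, labels)
```

On this toy data a plain majority vote misclassifies two of the six samples, while the search finds a weighting with no errors, because the first classifier alone agrees with every label. A grid search scales exponentially with the number of classifiers, so it is suitable only for illustration.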

Keywords: Classification accuracy, fusion solution, total error rate.

Downloads 1975
174 Automatic Discrimination of the Modes of Permanent Flow of a Liquid Simulating Blood

Authors: Malika.D Kedir-Talha, Mohamed Mehenni

Abstract:

In order to be able to automatically differentiate between two modes of permanent flow of a liquid simulating blood, it was imperative to put together a data bank. Thus, the acquisition of the various amplitude spectra of the Doppler signal of this liquid in laminar flow and other spectra in turbulent flow enabled us to establish an automatic difference between the two modes. According to the number of parameters and their nature, a comparative study allowed us to choose the best classifier.

Keywords: Doppler spectrum, flow mode, pattern recognition, permanent flow.

Downloads 1202
173 Multiple Mental Thought Parametric Classification: A New Approach for Individual Identification

Authors: Ramaswamy Palaniappan

Abstract:

This paper reports a new approach to identifying individuals by using parametric classification of multiple mental thoughts. In this approach, electroencephalogram (EEG) signals were recorded while the subjects were thinking of one or more (up to five) mental thoughts. Autoregressive features were computed from these EEG signals and classified by a Linear Discriminant classifier. The results indicate that near-perfect identification of 400 test EEG patterns from four subjects was possible, thereby opening up a new avenue in biometrics.

Keywords: Autoregressive, Biometrics, Electroencephalogram, Linear discrimination, Mental thoughts.

Downloads 1397
172 Multi-Label Hierarchical Classification for Protein Function Prediction

Authors: Helyane B. Borges, Julio Cesar Nievola

Abstract:

Hierarchical classification is a problem with applications in many areas, such as protein function prediction, where the data are hierarchically structured. Therefore, it is necessary to develop algorithms able to induce hierarchical classification models. This paper presents experiments using the hierarchical classification algorithm called Multi-label Hierarchical Classification using a Competitive Neural Network (MHC-CNN). It was tested on ten datasets from the Gene Ontology (GO) Cellular Component domain. The results are compared with those of Clus-HMC and Clus-HSC using the hF-Measure.
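
For readers unfamiliar with the hF-Measure named in this abstract, here is a minimal Python sketch assuming its common definition: true and predicted label sets are each extended with all their ancestors, hierarchical precision and recall are computed from the overlap summed over all examples, and the hF-Measure is their harmonic mean. The toy hierarchy, the `parent` map and the function names are hypothetical, and the sketch assumes a tree in which each class has a single parent, whereas the Gene Ontology is a directed acyclic graph.

```python
def with_ancestors(labels, parent):
    # Extend a label set with every ancestor; top-level classes map to None,
    # so the (implicit) root is never included.
    closed = set()
    for label in labels:
        while label is not None:
            closed.add(label)
            label = parent.get(label)
    return closed

def hf_measure(true_sets, pred_sets, parent):
    # Micro-averaged hierarchical precision (hP), recall (hR) and their harmonic mean.
    overlap = pred_total = true_total = 0
    for true, pred in zip(true_sets, pred_sets):
        t = with_ancestors(true, parent)
        p = with_ancestors(pred, parent)
        overlap += len(t & p)
        pred_total += len(p)
        true_total += len(t)
    if overlap == 0:
        return 0.0
    hp = overlap / pred_total
    hr = overlap / true_total
    return 2 * hp * hr / (hp + hr)

# Hypothetical two-level hierarchy: A -> {A.1, A.2}, B -> {B.1}.
parent = {"A": None, "A.1": "A", "A.2": "A", "B": None, "B.1": "B"}
sibling_score = hf_measure([{"A.1"}], [{"A.2"}], parent)
```

Predicting the sibling `A.2` instead of `A.1` still earns partial credit (`sibling_score` is 0.5) because the shared ancestor `A` is counted, which is what distinguishes the measure from flat precision and recall.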

Keywords: Hierarchical Classification, Competitive Neural Network, Global Classifier.

Downloads 2380