Search results for: classifiers ensembles
196 Detecting Hate Speech And Cyberbullying Using Natural Language Processing
Authors: Nádia Pereira, Paula Ferreira, Sofia Francisco, Sofia Oliveira, Sidclay Souza, Paula Paulino, Ana Margarida Veiga Simão
Abstract:
Social media has progressed into a platform for hate speech among its users, and thus, there is an increasing need to develop automatic detection classifiers of offense and conflicts to help decrease the prevalence of such incidents. Online communication can be used to intentionally harm someone, which is why such classifiers could be essential in social networks. A possible application of these classifiers is the automatic detection of cyberbullying. Even though identifying the aggressive language used in online interactions could be important to build cyberbullying datasets, there are other criteria that must be considered. Being able to capture the language, which is indicative of the intent to harm others in a specific context of online interaction is fundamental. Offense and hate speech may be the foundation of online conflicts, which have become commonly used in social media and are an emergent research focus in machine learning and natural language processing. This study presents two Portuguese language offense-related datasets which serve as examples for future research and extend the study of the topic. The first is similar to other offense detection related datasets and is entitled Aggressiveness dataset. The second is a novelty because of the use of the history of the interaction between users and is entitled the Conflicts/Attacks dataset. Both datasets were developed in different phases. Firstly, we performed a content analysis of verbal aggression witnessed by adolescents in situations of cyberbullying. Secondly, we computed frequency analyses from the previous phase to gather lexical and linguistic cues used to identify potentially aggressive conflicts and attacks which were posted on Twitter. Thirdly, thorough annotation of real tweets was performed byindependent postgraduate educational psychologists with experience in cyberbullying research. Lastly, we benchmarked these datasets with other machine learning classifiers.Keywords: aggression, classifiers, cyberbullying, datasets, hate speech, machine learning
Procedia PDF Downloads 229195 Evaluate the Changes in Stress Level Using Facial Thermal Imaging
Authors: Amin Derakhshan, Mohammad Mikaili, Mohammad Ali Khalilzadeh, Amin Mohammadian
Abstract:
This paper proposes a stress recognition system from multi-modal bio-potential signals. For stress recognition, Support Vector Machines (SVM) and LDA are applied to design the stress classifiers and its characteristics are investigated. Using gathered data under psychological polygraph experiments, the classifiers are trained and tested. The pattern recognition method classifies stressful from non-stressful subjects based on labels which come from polygraph data. The successful classification rate is 96% for 12 subjects. It means that facial thermal imaging due to its non-contact advantage could be a remarkable alternative for psycho-physiological methods.Keywords: stress, thermal imaging, face, SVM, polygraph
Procedia PDF Downloads 487194 Comparing Emotion Recognition from Voice and Facial Data Using Time Invariant Features
Authors: Vesna Kirandziska, Nevena Ackovska, Ana Madevska Bogdanova
Abstract:
The problem of emotion recognition is a challenging problem. It is still an open problem from the aspect of both intelligent systems and psychology. In this paper, both voice features and facial features are used for building an emotion recognition system. A Support Vector Machine classifiers are built by using raw data from video recordings. In this paper, the results obtained for the emotion recognition are given, and a discussion about the validity and the expressiveness of different emotions is presented. A comparison between the classifiers build from facial data only, voice data only and from the combination of both data is made here. The need for a better combination of the information from facial expression and voice data is argued.Keywords: emotion recognition, facial recognition, signal processing, machine learning
Procedia PDF Downloads 317193 Music for Peace, a Model for Socialization
Authors: Mina Fenercioglu
Abstract:
This study discusses a Turkish music education model similar to El Sistema. The Music for Peace (Baris icin Muzik) program, founded in 2005 by an idealist humanitarian in Istanbul, started as a pilot project with accordion and then with flute in ensembles at the Ulubatlı Hasan Primary School where mostly underprivileged children attend. The program gives complimentary music lessons particularly to deprived children, who at the beginning were prone to crime. With music education, the attitudes of the children turn to a positive aspect. The aim of this initiative provides social and cultural awareness, which serves the same mission as the world known El Sistema. In 2009, the Music for Peace project received Deutsche Bank Urban Age Award, which is a prize presented to enterprises that improve the quality of life in urban environment. Since 2010, the Music for Peace continues the symphonic music education at its own place. In 2011, Music for Peace gained foundation status, and started to accept donations as musical instruments for children who attend the courses. On July 2013, IKSV (Istanbul Culture and Arts Foundation) became the institutional partner of Music for Peace Foundation and in June 2014, the foundation signed up to join El Sistema’s global program. Now in 2015, the foundation has three ensembles: the Music for Peace Orchestra, which consists of two orchestras practicing and performing in different levels; the Music for Peace Chorus, which has joined Istanbul International Polyphonic Choruses Festival; and the recently established Music for Peace Brass Ensemble.Keywords: El Sistema, music education, music for peace, socialization
Procedia PDF Downloads 421192 Linguistic Features for Sentence Difficulty Prediction in Aspect-Based Sentiment Analysis
Authors: Adrian-Gabriel Chifu, Sebastien Fournier
Abstract:
One of the challenges of natural language understanding is to deal with the subjectivity of sentences, which may express opinions and emotions that add layers of complexity and nuance. Sentiment analysis is a field that aims to extract and analyze these subjective elements from text, and it can be applied at different levels of granularity, such as document, paragraph, sentence, or aspect. Aspect-based sentiment analysis is a well-studied topic with many available data sets and models. However, there is no clear definition of what makes a sentence difficult for aspect-based sentiment analysis. In this paper, we explore this question by conducting an experiment with three data sets: ”Laptops”, ”Restaurants”, and ”MTSC” (Multi-Target-dependent Sentiment Classification), and a merged version of these three datasets. We study the impact of domain diversity and syntactic diversity on difficulty. We use a combination of classifiers to identify the most difficult sentences and analyze their characteristics. We employ two ways of defining sentence difficulty. The first one is binary and labels a sentence as difficult if the classifiers fail to correctly predict the sentiment polarity. The second one is a six-level scale based on how many of the top five best-performing classifiers can correctly predict the sentiment polarity. We also define 9 linguistic features that, combined, aim at estimating the difficulty at sentence level.Keywords: sentiment analysis, difficulty, classification, machine learning
Procedia PDF Downloads 92191 Performance Comparison of Situation-Aware Models for Activating Robot Vacuum Cleaner in a Smart Home
Authors: Seongcheol Kwon, Jeongmin Kim, Kwang Ryel Ryu
Abstract:
We assume an IoT-based smart-home environment where the on-off status of each of the electrical appliances including the room lights can be recognized in a real time by monitoring and analyzing the smart meter data. At any moment in such an environment, we can recognize what the household or the user is doing by referring to the status data of the appliances. In this paper, we focus on a smart-home service that is to activate a robot vacuum cleaner at right time by recognizing the user situation, which requires a situation-aware model that can distinguish the situations that allow vacuum cleaning (Yes) from those that do not (No). We learn as our candidate models a few classifiers such as naïve Bayes, decision tree, and logistic regression that can map the appliance-status data into Yes and No situations. Our training and test data are obtained from simulations of user behaviors, in which a sequence of user situations such as cooking, eating, dish washing, and so on is generated with the status of the relevant appliances changed in accordance with the situation changes. During the simulation, both the situation transition and the resulting appliance status are determined stochastically. To compare the performances of the aforementioned classifiers we obtain their learning curves for different types of users through simulations. The result of our empirical study reveals that naïve Bayes achieves a slightly better classification accuracy than the other compared classifiers.Keywords: situation-awareness, smart home, IoT, machine learning, classifier
Procedia PDF Downloads 422190 Breast Cancer Survivability Prediction via Classifier Ensemble
Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia
Abstract:
This paper presents a classifier ensemble approach for predicting the survivability of the breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components; features selection and classifier ensemble components. The features selection component divides the features in SEER database into four groups. After that it tries to find the most important features among the four groups that maximizes the weighted average F-score of a certain classification algorithm. The ensemble component uses three different classifiers, each of which models different set of features from SEER through the features selection module. On top of them, another classifier is used to give the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found is by using the decision tree, Bayesian network, and Na¨ıve Bayes algorithms for the underlying classifiers and Na¨ıve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same data of SEER (period of 1973-2002). It gives 87.39% weighted average F-score compared to 85.82% and 81.34% of the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held out unseen test set.Keywords: classifier ensemble, breast cancer survivability, data mining, SEER
Procedia PDF Downloads 329189 Empirical and Indian Automotive Equity Portfolio Decision Support
Authors: P. Sankar, P. James Daniel Paul, Siddhant Sahu
Abstract:
A brief review of the empirical studies on the methodology of the stock market decision support would indicate that they are at a threshold of validating the accuracy of the traditional and the fuzzy, artificial neural network and the decision trees. Many researchers have been attempting to compare these models using various data sets worldwide. However, the research community is on the way to the conclusive confidence in the emerged models. This paper attempts to use the automotive sector stock prices from National Stock Exchange (NSE), India and analyze them for the intra-sectorial support for stock market decisions. The study identifies the significant variables and their lags which affect the price of the stocks using OLS analysis and decision tree classifiers.Keywords: Indian automotive sector, stock market decisions, equity portfolio analysis, decision tree classifiers, statistical data analysis
Procedia PDF Downloads 486188 Diagnosis of the Heart Rhythm Disorders by Using Hybrid Classifiers
Authors: Sule Yucelbas, Gulay Tezel, Cuneyt Yucelbas, Seral Ozsen
Abstract:
In this study, it was tried to identify some heart rhythm disorders by electrocardiography (ECG) data that is taken from MIT-BIH arrhythmia database by subtracting the required features, presenting to artificial neural networks (ANN), artificial immune systems (AIS), artificial neural network based on artificial immune system (AIS-ANN) and particle swarm optimization based artificial neural network (PSO-NN) classifier systems. The main purpose of this study is to evaluate the performance of hybrid AIS-ANN and PSO-ANN classifiers with regard to the ANN and AIS. For this purpose, the normal sinus rhythm (NSR), atrial premature contraction (APC), sinus arrhythmia (SA), ventricular trigeminy (VTI), ventricular tachycardia (VTK) and atrial fibrillation (AF) data for each of the RR intervals were found. Then these data in the form of pairs (NSR-APC, NSR-SA, NSR-VTI, NSR-VTK and NSR-AF) is created by combining discrete wavelet transform which is applied to each of these two groups of data and two different data sets with 9 and 27 features were obtained from each of them after data reduction. Afterwards, the data randomly was firstly mixed within themselves, and then 4-fold cross validation method was applied to create the training and testing data. The training and testing accuracy rates and training time are compared with each other. As a result, performances of the hybrid classification systems, AIS-ANN and PSO-ANN were seen to be close to the performance of the ANN system. Also, the results of the hybrid systems were much better than AIS, too. However, ANN had much shorter period of training time than other systems. In terms of training times, ANN was followed by PSO-ANN, AIS-ANN and AIS systems respectively. Also, the features that extracted from the data affected the classification results significantly.Keywords: AIS, ANN, ECG, hybrid classifiers, PSO
Procedia PDF Downloads 445187 [Keynote Talk]: Evidence Fusion in Decision Making
Authors: Mohammad Abdullah-Al-Wadud
Abstract:
In the current era of automation and artificial intelligence, different systems have been increasingly keeping on depending on decision-making capabilities of machines. Such systems/applications may range from simple classifiers to sophisticated surveillance systems based on traditional sensors and related equipment which are becoming more common in the internet of things (IoT) paradigm. However, the available data for such problems are usually imprecise and incomplete, which leads to uncertainty in decisions made based on traditional probability-based classifiers. This requires a robust fusion framework to combine the available information sources with some degree of certainty. The theory of evidence can provide with such a method for combining evidence from different (may be unreliable) sources/observers. This talk will address the employment of the Dempster-Shafer Theory of evidence in some practical applications.Keywords: decision making, dempster-shafer theory, evidence fusion, incomplete data, uncertainty
Procedia PDF Downloads 427186 Fake News Detection for Korean News Using Machine Learning Techniques
Authors: Tae-Uk Yun, Pullip Chung, Kee-Young Kwahk, Hyunchul Ahn
Abstract:
Fake news is defined as the news articles that are intentionally and verifiably false, and could mislead readers. Spread of fake news may provoke anxiety, chaos, fear, or irrational decisions of the public. Thus, detecting fake news and preventing its spread has become very important issue in our society. However, due to the huge amount of fake news produced every day, it is almost impossible to identify it by a human. Under this context, researchers have tried to develop automated fake news detection using machine learning techniques over the past years. But, there have been no prior studies proposed an automated fake news detection method for Korean news to our best knowledge. In this study, we aim to detect Korean fake news using text mining and machine learning techniques. Our proposed method consists of two steps. In the first step, the news contents to be analyzed is convert to quantified values using various text mining techniques (topic modeling, TF-IDF, and so on). After that, in step 2, classifiers are trained using the values produced in step 1. As the classifiers, machine learning techniques such as logistic regression, backpropagation network, support vector machine, and deep neural network can be applied. To validate the effectiveness of the proposed method, we collected about 200 short Korean news from Seoul National University’s FactCheck. which provides with detailed analysis reports from 20 media outlets and links to source documents for each case. Using this dataset, we will identify which text features are important as well as which classifiers are effective in detecting Korean fake news.Keywords: fake news detection, Korean news, machine learning, text mining
Procedia PDF Downloads 276185 Investigation of Topic Modeling-Based Semi-Supervised Interpretable Document Classifier
Authors: Dasom Kim, William Xiu Shun Wong, Yoonjin Hyun, Donghoon Lee, Minji Paek, Sungho Byun, Namgyu Kim
Abstract:
There have been many researches on document classification for classifying voluminous documents automatically. Through document classification, we can assign a specific category to each unlabeled document on the basis of various machine learning algorithms. However, providing labeled documents manually requires considerable time and effort. To overcome the limitations, the semi-supervised learning which uses unlabeled document as well as labeled documents has been invented. However, traditional document classifiers, regardless of supervised or semi-supervised ones, cannot sufficiently explain the reason or the process of the classification. Thus, in this paper, we proposed a methodology to visualize major topics and class components of each document. We believe that our methodology for visualizing topics and classes of each document can enhance the reliability and explanatory power of document classifiers.Keywords: data mining, document classifier, text mining, topic modeling
Procedia PDF Downloads 403184 [Keynote Talk]: sEMG Interface Design for Locomotion Identification
Authors: Rohit Gupta, Ravinder Agarwal
Abstract:
Surface electromyographic (sEMG) signal has the potential to identify the human activities and intention. This potential is further exploited to control the artificial limbs using the sEMG signal from residual limbs of amputees. The paper deals with the development of multichannel cost efficient sEMG signal interface for research application, along with evaluation of proposed class dependent statistical approach of the feature selection method. The sEMG signal acquisition interface was developed using ADS1298 of Texas Instruments, which is a front-end interface integrated circuit for ECG application. Further, the sEMG signal is recorded from two lower limb muscles for three locomotions namely: Plane Walk (PW), Stair Ascending (SA), Stair Descending (SD). A class dependent statistical approach is proposed for feature selection and also its performance is compared with 12 preexisting feature vectors. To make the study more extensive, performance of five different types of classifiers are compared. The outcome of the current piece of work proves the suitability of the proposed feature selection algorithm for locomotion recognition, as compared to other existing feature vectors. The SVM Classifier is found as the outperformed classifier among compared classifiers with an average recognition accuracy of 97.40%. Feature vector selection emerges as the most dominant factor affecting the classification performance as it holds 51.51% of the total variance in classification accuracy. The results demonstrate the potentials of the developed sEMG signal acquisition interface along with the proposed feature selection algorithm.Keywords: classifiers, feature selection, locomotion, sEMG
Procedia PDF Downloads 293183 Use of Interpretable Evolved Search Query Classifiers for Sinhala Documents
Authors: Prasanna Haddela
Abstract:
Document analysis is a well matured yet still active research field, partly as a result of the intricate nature of building computational tools but also due to the inherent problems arising from the variety and complexity of human languages. Breaking down language barriers is vital in enabling access to a number of recent technologies. This paper investigates the application of document classification methods to new Sinhalese datasets. This language is geographically isolated and rich with many of its own unique features. We will examine the interpretability of the classification models with a particular focus on the use of evolved Lucene search queries generated using a Genetic Algorithm (GA) as a method of document classification. We will compare the accuracy and interpretability of these search queries with other popular classifiers. The results are promising and are roughly in line with previous work on English language datasets.Keywords: evolved search queries, Sinhala document classification, Lucene Sinhala analyzer, interpretable text classification, genetic algorithm
Procedia PDF Downloads 114182 Ensemble-Based SVM Classification Approach for miRNA Prediction
Authors: Sondos M. Hammad, Sherin M. ElGokhy, Mahmoud M. Fahmy, Elsayed A. Sallam
Abstract:
In this paper, an ensemble-based Support Vector Machine (SVM) classification approach is proposed. It is used for miRNA prediction. Three problems, commonly associated with previous approaches, are alleviated. These problems arise due to impose assumptions on the secondary structural of premiRNA, imbalance between the numbers of the laboratory checked miRNAs and the pseudo-hairpins, and finally using a training data set that does not consider all the varieties of samples in different species. We aggregate the predicted outputs of three well-known SVM classifiers; namely, Triplet-SVM, Virgo and Mirident, weighted by their variant features without any structural assumptions. An additional SVM layer is used in aggregating the final output. The proposed approach is trained and then tested with balanced data sets. The results of the proposed approach outperform the three base classifiers. Improved values for the metrics of 88.88% f-score, 92.73% accuracy, 90.64% precision, 96.64% specificity, 87.2% sensitivity, and the area under the ROC curve is 0.91 are achieved.Keywords: MiRNAs, SVM classification, ensemble algorithm, assumption problem, imbalance data
Procedia PDF Downloads 349181 Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques
Authors: Sannikumar Patel, Brian Nolan, Markus Hofmann, Philip Owende, Kunjan Patel
Abstract:
Sentiment analysis and opinion mining have become emerging topics of research in recent years but most of the work is focused on data in the English language. A comprehensive research and analysis are essential which considers multiple languages, machine translation techniques, and different classifiers. This paper presents, a comparative analysis of different approaches for multilingual sentiment analysis. These approaches are divided into two parts: one using classification of text without language translation and second using the translation of testing data to a target language, such as English, before classification. The presented research and results are useful for understanding whether machine translation should be used for multilingual sentiment analysis or building language specific sentiment classification systems is a better approach. The effects of language translation techniques, features, and accuracy of various classifiers for multilingual sentiment analysis is also discussed in this study.Keywords: cross-language analysis, machine learning, machine translation, sentiment analysis
Procedia PDF Downloads 715180 Fem Models of Glued Laminated Timber Beams Enhanced by Bayesian Updating of Elastic Moduli
Authors: L. Melzerová, T. Janda, M. Šejnoha, J. Šejnoha
Abstract:
Two finite element (FEM) models are presented in this paper to address the random nature of the response of glued timber structures made of wood segments with variable elastic moduli evaluated from 3600 indentation measurements. This total database served to create the same number of ensembles as was the number of segments in the tested beam. Statistics of these ensembles were then assigned to given segments of beams and the Latin Hypercube Sampling (LHS) method was called to perform 100 simulations resulting into the ensemble of 100 deflections subjected to statistical evaluation. Here, a detailed geometrical arrangement of individual segments in the laminated beam was considered in the construction of two-dimensional FEM model subjected to in four-point bending to comply with the laboratory tests. Since laboratory measurements of local elastic moduli may in general suffer from a significant experimental error, it appears advantageous to exploit the full scale measurements of timber beams, i.e. deflections, to improve their prior distributions with the help of the Bayesian statistical method. This, however, requires an efficient computational model when simulating the laboratory tests numerically. To this end, a simplified model based on Mindlin’s beam theory was established. The improved posterior distributions show that the most significant change of the Young’s modulus distribution takes place in laminae in the most strained zones, i.e. in the top and bottom layers within the beam center region. Posterior distributions of moduli of elasticity were subsequently utilized in the 2D FEM model and compared with the original simulations.Keywords: Bayesian inference, FEM, four point bending test, laminated timber, parameter estimation, prior and posterior distribution, Young’s modulus
Procedia PDF Downloads 284179 Using Classifiers to Predict Student Outcome at Higher Institute of Telecommunication
Authors: Fuad M. Alkoot
Abstract:
We aim at highlighting the benefits of classifier systems especially in supporting educational management decisions. The paper aims at using classifiers in an educational application where an outcome is predicted based on given input parameters that represent various conditions at the institute. We present a classifier system that is designed using a limited training set with data for only one semester. The achieved system is able to reach at previously known outcomes accurately. It is also tested on new input parameters representing variations of input conditions to see its prediction on the possible outcome value. Given the supervised expectation of the outcome for the new input we find the system is able to predict the correct outcome. Experiments were conducted on one semester data from two departments only, Switching and Mathematics. Future work on other departments with larger training sets and wider input variations will show additional benefits of classifier systems in supporting the management decisions at an educational institute.Keywords: machine learning, pattern recognition, classifier design, educational management, outcome estimation
Procedia PDF Downloads 279178 Evaluation of Robust Feature Descriptors for Texture Classification
Authors: Jia-Hong Lee, Mei-Yi Wu, Hsien-Tsung Kuo
Abstract:
Texture is an important characteristic in real and synthetic scenes. Texture analysis plays a critical role in inspecting surfaces and provides important techniques in a variety of applications. Although several descriptors have been presented to extract texture features, the development of object recognition is still a difficult task due to the complex aspects of texture. Recently, many robust and scaling-invariant image features such as SIFT, SURF and ORB have been successfully used in image retrieval and object recognition. In this paper, we have tried to compare the performance for texture classification using these feature descriptors with k-means clustering. Different classifiers including K-NN, Naive Bayes, Back Propagation Neural Network , Decision Tree and Kstar were applied in three texture image sets - UIUCTex, KTH-TIPS and Brodatz, respectively. Experimental results reveal SIFTS as the best average accuracy rate holder in UIUCTex, KTH-TIPS and SURF is advantaged in Brodatz texture set. BP neuro network works best in the test set classification among all used classifiers.Keywords: texture classification, texture descriptor, SIFT, SURF, ORB
Procedia PDF Downloads 371177 Machine Learning Automatic Detection on Twitter Cyberbullying
Authors: Raghad A. Altowairgi
Abstract:
With the wide spread of social media platforms, young people tend to use them extensively as the first means of communication due to their ease and modernity. But these platforms often create a fertile ground for bullies to practice their aggressive behavior against their victims. Platform usage cannot be reduced, but intelligent mechanisms can be implemented to reduce the abuse. This is where machine learning comes in. Understanding and classifying text can be helpful in order to minimize the act of cyberbullying. Artificial intelligence techniques have expanded to formulate an applied tool to address the phenomenon of cyberbullying. In this research, machine learning models are built to classify text into two classes; cyberbullying and non-cyberbullying. After preprocessing the data in 4 stages; removing characters that do not provide meaningful information to the models, tokenization, removing stop words, and lowering text. BoW and TF-IDF are used as the main features for the five classifiers, which are; logistic regression, Naïve Bayes, Random Forest, XGboost, and Catboost classifiers. Each of them scores 92%, 90%, 92%, 91%, 86% respectively.Keywords: cyberbullying, machine learning, Bag-of-Words, term frequency-inverse document frequency, natural language processing, Catboost
Procedia PDF Downloads 132176 An Ensemble System of Classifiers for Computer-Aided Volcano Monitoring
Authors: Flavio Cannavo
Abstract:
Continuous evaluation of the status of potentially hazardous volcanos plays a key role for civil protection purposes. The importance of monitoring volcanic activity, especially for energetic paroxysms that usually come with tephra emissions, is crucial not only for exposures to the local population but also for airline traffic. Presently, real-time surveillance of most volcanoes worldwide is essentially delegated to one or more human experts in volcanology, who interpret data coming from different kind of monitoring networks. Unfavorably, the high nonlinearity of the complex and coupled volcanic dynamics leads to a large variety of different volcanic behaviors. Moreover, continuously measured parameters (e.g. seismic, deformation, infrasonic and geochemical signals) are often not able to fully explain the ongoing phenomenon, thus making the fast volcano state assessment a very puzzling task for the personnel on duty at the control rooms. With the aim of aiding the personnel on duty in volcano surveillance, here we introduce a system based on an ensemble of data-driven classifiers to infer automatically the ongoing volcano status from all the available different kind of measurements. The system consists of a heterogeneous set of independent classifiers, each one built with its own data and algorithm. Each classifier gives an output about the volcanic status. The ensemble technique allows weighting the single classifier output to combine all the classifications into a single status that maximizes the performance. We tested the model on the Mt. Etna (Italy) case study by considering a long record of multivariate data from 2011 to 2015 and cross-validated it. Results indicate that the proposed model is effective and of great power for decision-making purposes.Keywords: Bayesian networks, expert system, mount Etna, volcano monitoring
Procedia PDF Downloads 247175 Semi-Automatic Method to Assist Expert for Association Rules Validation
Authors: Amdouni Hamida, Gammoudi Mohamed Mohsen
Abstract:
In order to help the expert to validate association rules extracted from data, some quality measures are proposed in the literature. We distinguish two categories: objective and subjective measures. The first one depends on a fixed threshold and on data quality from which the rules are extracted. The second one consists on providing to the expert some tools in the objective to explore and visualize rules during the evaluation step. However, the number of extracted rules to validate remains high. Thus, the manually mining rules task is very hard. To solve this problem, we propose, in this paper, a semi-automatic method to assist the expert during the association rule's validation. Our method uses rule-based classification as follow: (i) We transform association rules into classification rules (classifiers), (ii) We use the generated classifiers for data classification. (iii) We visualize association rules with their quality classification to give an idea to the expert and to assist him during validation process.Keywords: association rules, rule-based classification, classification quality, validation
Procedia PDF Downloads 440174 Comparison of Linear Discriminant Analysis and Support Vector Machine Classifications for Electromyography Signals Acquired at Five Positions of Elbow Joint
Authors: Amna Khan, Zareena Kausar, Saad Malik
Abstract:
Bio Mechatronics has extended applications in the field of rehabilitation. It has been contributing since World War II in improving the applicability of prosthesis and assistive devices in real life scenarios. In this paper, classification accuracies have been compared for two classifiers against five positions of elbow. Electromyography (EMG) signals analysis have been acquired directly from skeletal muscles of human forearm for each of the three defined positions and at modified extreme positions of elbow flexion and extension using 8 electrode Myo armband sensor. Features were extracted from filtered EMG signals for each position. Performance of two classifiers, support vector machine (SVM) and linear discriminant analysis (LDA) has been compared by analyzing the classification accuracies. SVM illustrated classification accuracies between 90-96%, in contrast to 84-87% depicted by LDA for five defined positions of elbow keeping the number of samples and selected feature the same for both SVM and LDA.Keywords: classification accuracies, electromyography, linear discriminant analysis (LDA), Myo armband sensor, support vector machine (SVM)
Procedia PDF Downloads 368173 Margin-Based Feed-Forward Neural Network Classifiers
Authors: Xiaohan Bookman, Xiaoyan Zhu
Abstract:
Margin-Based Principle has been proposed for a long time, it has been proved that this principle could reduce the structural risk and improve the performance in both theoretical and practical aspects. Meanwhile, feed-forward neural network is a traditional classifier, which is very hot at present with a deeper architecture. However, the training algorithm of feed-forward neural network is developed and generated from Widrow-Hoff Principle that means to minimize the squared error. In this paper, we propose a new training algorithm for feed-forward neural networks based on Margin-Based Principle, which could effectively promote the accuracy and generalization ability of neural network classifiers with less labeled samples and flexible network. We have conducted experiments on four UCI open data sets and achieved good results as expected. In conclusion, our model could handle more sparse labeled and more high-dimension data set in a high accuracy while modification from old ANN method to our method is easy and almost free of work.Keywords: Max-Margin Principle, Feed-Forward Neural Network, classifier, structural risk
Procedia PDF Downloads 346172 Visual Thing Recognition with Binary Scale-Invariant Feature Transform and Support Vector Machine Classifiers Using Color Information
Authors: Wei-Jong Yang, Wei-Hau Du, Pau-Choo Chang, Jar-Ferr Yang, Pi-Hsia Hung
Abstract:
The demands of smart visual thing recognition in various devices have been increased rapidly for daily smart production, living and learning systems in recent years. This paper proposed a visual thing recognition system, which combines binary scale-invariant feature transform (SIFT), bag of words model (BoW), and support vector machine (SVM) by using color information. Since the traditional SIFT features and SVM classifiers only use the gray information, color information is still an important feature for visual thing recognition. With color-based SIFT features and SVM, we can discard unreliable matching pairs and increase the robustness of matching tasks. The experimental results show that the proposed object recognition system with color-assistant SIFT SVM classifier achieves higher recognition rate than that with the traditional gray SIFT and SVM classification in various situations.Keywords: color moments, visual thing recognition system, SIFT, color SIFT
Procedia PDF Downloads 471171 Brain Computer Interface Implementation for Affective Computing Sensing: Classifiers Comparison
Authors: Ramón Aparicio-García, Gustavo Juárez Gracia, Jesús Álvarez Cedillo
Abstract:
A research line of the computer science that involve the study of the Human-Computer Interaction (HCI), which search to recognize and interpret the user intent by the storage and the subsequent analysis of the electrical signals of the brain, for using them in the control of electronic devices. On the other hand, the affective computing research applies the human emotions in the HCI process helping to reduce the user frustration. This paper shows the results obtained during the hardware and software development of a Brain Computer Interface (BCI) capable of recognizing the human emotions through the association of the brain electrical activity patterns. The hardware involves the sensing stage and analogical-digital conversion. The interface software involves algorithms for pre-processing of the signal in time and frequency analysis and the classification of patterns associated with the electrical brain activity. The methods used for the analysis and classification of the signal have been tested separately, by using a database that is accessible to the public, besides to a comparison among classifiers in order to know the best performing.Keywords: affective computing, interface, brain, intelligent interaction
Procedia PDF Downloads 390170 Biofilm Text Classifiers Developed Using Natural Language Processing and Unsupervised Learning Approach
Authors: Kanika Gupta, Ashok Kumar
Abstract:
Biofilms are dense, highly hydrated cell clusters that are irreversibly attached to a substratum, to an interface or to each other, and are embedded in a self-produced gelatinous matrix composed of extracellular polymeric substances. Research in biofilm field has become very significant, as biofilm has shown high mechanical resilience and resistance to antibiotic treatment and constituted as a significant problem in both healthcare and other industry related to microorganisms. The massive information both stated and hidden in the biofilm literature are growing exponentially therefore it is not possible for researchers and practitioners to automatically extract and relate information from different written resources. So, the current work proposes and discusses the use of text mining techniques for the extraction of information from biofilm literature corpora containing 34306 documents. It is very difficult and expensive to obtain annotated material for biomedical literature as the literature is unstructured i.e. free-text. Therefore, we considered unsupervised approach, where no annotated training is necessary and using this approach we developed a system that will classify the text on the basis of growth and development, drug effects, radiation effects, classification and physiology of biofilms. For this, a two-step structure was used where the first step is to extract keywords from the biofilm literature using a metathesaurus and standard natural language processing tools like Rapid Miner_v5.3 and the second step is to discover relations between the genes extracted from the whole set of biofilm literature using pubmed.mineR_v1.0.11. We used unsupervised approach, which is the machine learning task of inferring a function to describe hidden structure from 'unlabeled' data, in the above-extracted datasets to develop classifiers using WinPython-64 bit_v3.5.4.0Qt5 and R studio_v0.99.467 packages which will automatically classify the text by using the mentioned sets. The developed classifiers were tested on a large data set of biofilm literature which showed that the unsupervised approach proposed is promising as well as suited for a semi-automatic labeling of the extracted relations. The entire information was stored in the relational database which was hosted locally on the server. The generated biofilm vocabulary and genes relations will be significant for researchers dealing with biofilm research, making their search easy and efficient as the keywords and genes could be directly mapped with the documents used for database development.Keywords: biofilms literature, classifiers development, text mining, unsupervised learning approach, unstructured data, relational database
Procedia PDF Downloads 172169 Neuroevolution Based on Adaptive Ensembles of Biologically Inspired Optimization Algorithms Applied for Modeling a Chemical Engineering Process
Authors: Sabina-Adriana Floria, Marius Gavrilescu, Florin Leon, Silvia Curteanu, Costel Anton
Abstract:
Neuroevolution is a subfield of artificial intelligence used to solve various problems in different application areas. Specifically, neuroevolution is a technique that applies biologically inspired methods to generate neural network architectures and optimize their parameters automatically. In this paper, we use different biologically inspired optimization algorithms in an ensemble strategy with the aim of training multilayer perceptron neural networks, resulting in regression models used to simulate the industrial chemical process of obtaining bricks from silicone-based materials. Installations in the raw ceramics industry, i.e., bricks, are characterized by significant energy consumption and large quantities of emissions. In addition, the initial conditions that were taken into account during the design and commissioning of the installation can change over time, which leads to the need to add new mixes to adjust the operating conditions for the desired purpose, e.g., material properties and energy saving. The present approach follows the study by simulation of a process of obtaining bricks from silicone-based materials, i.e., the modeling and optimization of the process. Optimization aims to determine the working conditions that minimize the emissions represented by nitrogen monoxide. We first use a search procedure to find the best values for the parameters of various biologically inspired optimization algorithms. Then, we propose an adaptive ensemble strategy that uses only a subset of the best algorithms identified in the search stage. The adaptive ensemble strategy combines the results of selected algorithms and automatically assigns more processing capacity to the more efficient algorithms. Their efficiency may also vary at different stages of the optimization process. In a given ensemble iteration, the most efficient algorithms aim to maintain good convergence, while the less efficient algorithms can improve population diversity. The proposed adaptive ensemble strategy outperforms the individual optimizers and the non-adaptive ensemble strategy in convergence speed, and the obtained results provide lower error values.Keywords: optimization, biologically inspired algorithm, neuroevolution, ensembles, bricks, emission minimization
Procedia PDF Downloads 118168 Comparison of Artificial Neural Networks and Statistical Classifiers in Olive Sorting Using Near-Infrared Spectroscopy
Authors: İsmail Kavdır, M. Burak Büyükcan, Ferhat Kurtulmuş
Abstract:
Table olive is a valuable product especially in Mediterranean countries. It is usually consumed after some fermentation process. Defects happened naturally or as a result of an impact while olives are still fresh may become more distinct after processing period. Defected olives are not desired both in table olive and olive oil industries as it will affect the final product quality and reduce market prices considerably. Therefore it is critical to sort table olives before processing or even after processing according to their quality and surface defects. However, doing manual sorting has many drawbacks such as high expenses, subjectivity, tediousness and inconsistency. Quality criterions for green olives were accepted as color and free of mechanical defects, wrinkling, surface blemishes and rotting. In this study, it was aimed to classify fresh table olives using different classifiers and NIR spectroscopy readings and also to compare the classifiers. For this purpose, green (Ayvalik variety) olives were classified based on their surface feature properties such as defect-free, with bruised defect and with fly defect using FT-NIR spectroscopy and classification algorithms such as artificial neural networks, ident and cluster. Bruker multi-purpose analyzer (MPA) FT-NIR spectrometer (Bruker Optik, GmbH, Ettlingen Germany) was used for spectral measurements. The spectrometer was equipped with InGaAs detectors (TE-InGaAs internal for reflectance and RT-InGaAs external for transmittance) and a 20-watt high intensity tungsten–halogen NIR light source. Reflectance measurements were performed with a fiber optic probe (type IN 261) which covered the wavelengths between 780–2500 nm, while transmittance measurements were performed between 800 and 1725 nm. Thirty-two scans were acquired for each reflectance spectrum in about 15.32 s while 128 scans were obtained for transmittance in about 62 s. Resolution was 8 cm⁻¹ for both spectral measurement modes. Instrument control was done using OPUS software (Bruker Optik, GmbH, Ettlingen Germany). Classification applications were performed using three classifiers; Backpropagation Neural Networks, ident and cluster classification algorithms. For these classification applications, Neural Network tool box in Matlab, ident and cluster modules in OPUS software were used. Classifications were performed considering different scenarios; two quality conditions at once (good vs bruised, good vs fly defect) and three quality conditions at once (good, bruised and fly defect). Two spectrometer readings were used in classification applications; reflectance and transmittance. Classification results obtained using artificial neural networks algorithm in discriminating good olives from bruised olives, from olives with fly defect and from the olive group including both bruised and fly defected olives with success rates respectively changing between 97 and 99%, 61 and 94% and between 58.67 and 92%. On the other hand, classification results obtained for discriminating good olives from bruised ones and also for discriminating good olives from fly defected olives using the ident method ranged between 75-97.5% and 32.5-57.5%, respectfully; results obtained for the same classification applications using the cluster method ranged between 52.5-97.5% and between 22.5-57.5%.Keywords: artificial neural networks, statistical classifiers, NIR spectroscopy, reflectance, transmittance
Procedia PDF Downloads 248167 Gait Biometric for Person Re-Identification
Authors: Lavanya Srinivasan
Abstract:
Biometric identification is to identify unique features in a person like fingerprints, iris, ear, and voice recognition that need the subject's permission and physical contact. Gait biometric is used to identify the unique gait of the person by extracting moving features. The main advantage of gait biometric to identify the gait of a person at a distance, without any physical contact. In this work, the gait biometric is used for person re-identification. The person walking naturally compared with the same person walking with bag, coat, and case recorded using longwave infrared, short wave infrared, medium wave infrared, and visible cameras. The videos are recorded in rural and in urban environments. The pre-processing technique includes human identified using YOLO, background subtraction, silhouettes extraction, and synthesis Gait Entropy Image by averaging the silhouettes. The moving features are extracted from the Gait Entropy Energy Image. The extracted features are dimensionality reduced by the principal component analysis and recognised using different classifiers. The comparative results with the different classifier show that linear discriminant analysis outperforms other classifiers with 95.8% for visible in the rural dataset and 94.8% for longwave infrared in the urban dataset.Keywords: biometric, gait, silhouettes, YOLO
Procedia PDF Downloads 173