Search results for: text classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3235

Search results for: text classification

2875 Adapting Tools for Text Monitoring and for Scenario Analysis Related to the Field of Social Disasters

Authors: Svetlana Cojocaru, Mircea Petic, Inga Titchiev

Abstract:

Humanity faces more and more often with different social disasters, which in turn can generate new accidents and catastrophes. To mitigate their consequences, it is important to obtain early possible signals about the events which are or can occur and to prepare the corresponding scenarios that could be applied. Our research is focused on solving two problems in this domain: identifying signals related that an accident occurred or may occur and mitigation of some consequences of disasters. To solve the first problem, methods of selecting and processing texts from global network Internet are developed. Information in Romanian is of special interest for us. In order to obtain the mentioned tools, we should follow several steps, divided into preparatory stage and processing stage. Throughout the first stage, we manually collected over 724 news articles and classified them into 10 categories of social disasters. It constitutes more than 150 thousand words. Using this information, a controlled vocabulary of more than 300 keywords was elaborated, that will help in the process of classification and identification of the texts related to the field of social disasters. To solve the second problem, the formalism of Petri net has been used. We deal with the problem of inhabitants’ evacuation in useful time. The analysis methods such as reachability or coverability tree and invariants technique to determine dynamic properties of the modeled systems will be used. To perform a case study of properties of extended evacuation system by adding time, the analysis modules of PIPE such as Generalized Stochastic Petri Nets (GSPN) Analysis, Simulation, State Space Analysis, and Invariant Analysis have been used. These modules helped us to obtain the average number of persons situated in the rooms and the other quantitative properties and characteristics related to its dynamics.

Keywords: lexicon of disasters, modelling, Petri nets, text annotation, social disasters

Procedia PDF Downloads 181
2874 Causal Relation Identification Using Convolutional Neural Networks and Knowledge Based Features

Authors: Tharini N. de Silva, Xiao Zhibo, Zhao Rui, Mao Kezhi

Abstract:

Causal relation identification is a crucial task in information extraction and knowledge discovery. In this work, we present two approaches to causal relation identification. The first is a classification model trained on a set of knowledge-based features. The second is a deep learning based approach training a model using convolutional neural networks to classify causal relations. We experiment with several different convolutional neural networks (CNN) models based on previous work on relation extraction as well as our own research. Our models are able to identify both explicit and implicit causal relations as well as the direction of the causal relation. The results of our experiments show a higher accuracy than previously achieved for causal relation identification tasks.

Keywords: causal realtion extraction, relation extracton, convolutional neural network, text representation

Procedia PDF Downloads 691
2873 Multimodal Sentiment Analysis With Web Based Application

Authors: Shreyansh Singh, Afroz Ahmed

Abstract:

Sentiment Analysis intends to naturally reveal the hidden mentality that we hold towards an entity. The total of this assumption over a populace addresses sentiment surveying and has various applications. Current text-based sentiment analysis depends on the development of word embeddings and Machine Learning models that take in conclusion from enormous text corpora. Sentiment Analysis from text is presently generally utilized for consumer loyalty appraisal and brand insight investigation. With the expansion of online media, multimodal assessment investigation is set to carry new freedoms with the appearance of integral information streams for improving and going past text-based feeling examination using the new transforms methods. Since supposition can be distinguished through compelling follows it leaves, like facial and vocal presentations, multimodal opinion investigation offers good roads for examining facial and vocal articulations notwithstanding the record or printed content. These methodologies use the Recurrent Neural Networks (RNNs) with the LSTM modes to increase their performance. In this study, we characterize feeling and the issue of multimodal assessment investigation and audit ongoing advancements in multimodal notion examination in various spaces, including spoken surveys, pictures, video websites, human-machine, and human-human connections. Difficulties and chances of this arising field are additionally examined, promoting our theory that multimodal feeling investigation holds critical undiscovered potential.

Keywords: sentiment analysis, RNN, LSTM, word embeddings

Procedia PDF Downloads 95
2872 A Word-to-Vector Formulation for Word Representation

Authors: Sandra Rizkallah, Amir F. Atiya

Abstract:

This work presents a novel word to vector representation that is based on embedding the words into a sphere, whereby the dot product of the corresponding vectors represents the similarity between any two words. Embedding the vectors into a sphere enabled us to take into consideration the antonymity between words, not only the synonymity, because of the suitability to handle the polarity nature of words. For example, a word and its antonym can be represented as a vector and its negative. Moreover, we have managed to extract an adequate vocabulary. The obtained results show that the proposed approach can capture the essence of the language, and can be generalized to estimate a correct similarity of any new pair of words.

Keywords: natural language processing, word to vector, text similarity, text mining

Procedia PDF Downloads 243
2871 Comparative Analysis of Spectral Estimation Methods for Brain-Computer Interfaces

Authors: Rafik Djemili, Hocine Bourouba, M. C. Amara Korba

Abstract:

In this paper, we present a method in order to classify EEG signals for Brain-Computer Interfaces (BCI). EEG signals are first processed by means of spectral estimation methods to derive reliable features before classification step. Spectral estimation methods used are standard periodogram and the periodogram calculated by the Welch method; both methods are compared with Logarithm of Band Power (logBP) features. In the method proposed, we apply Linear Discriminant Analysis (LDA) followed by Support Vector Machine (SVM). Classification accuracy reached could be as high as 85%, which proves the effectiveness of classification of EEG signals based BCI using spectral methods.

Keywords: brain-computer interface, motor imagery, electroencephalogram, linear discriminant analysis, support vector machine

Procedia PDF Downloads 477
2870 Optimizing Perennial Plants Image Classification by Fine-Tuning Deep Neural Networks

Authors: Khairani Binti Supyan, Fatimah Khalid, Mas Rina Mustaffa, Azreen Bin Azman, Amirul Azuani Romle

Abstract:

Perennial plant classification plays a significant role in various agricultural and environmental applications, assisting in plant identification, disease detection, and biodiversity monitoring. Nevertheless, attaining high accuracy in perennial plant image classification remains challenging due to the complex variations in plant appearance, the diverse range of environmental conditions under which images are captured, and the inherent variability in image quality stemming from various factors such as lighting conditions, camera settings, and focus. This paper proposes an adaptation approach to optimize perennial plant image classification by fine-tuning the pre-trained DNNs model. This paper explores the efficacy of fine-tuning prevalent architectures, namely VGG16, ResNet50, and InceptionV3, leveraging transfer learning to tailor the models to the specific characteristics of perennial plant datasets. A subset of the MYLPHerbs dataset consisted of 6 perennial plant species of 13481 images under various environmental conditions that were used in the experiments. Different strategies for fine-tuning, including adjusting learning rates, training set sizes, data augmentation, and architectural modifications, were investigated. The experimental outcomes underscore the effectiveness of fine-tuning deep neural networks for perennial plant image classification, with ResNet50 showcasing the highest accuracy of 99.78%. Despite ResNet50's superior performance, both VGG16 and InceptionV3 achieved commendable accuracy of 99.67% and 99.37%, respectively. The overall outcomes reaffirm the robustness of the fine-tuning approach across different deep neural network architectures, offering insights into strategies for optimizing model performance in the domain of perennial plant image classification.

Keywords: perennial plants, image classification, deep neural networks, fine-tuning, transfer learning, VGG16, ResNet50, InceptionV3

Procedia PDF Downloads 31
2869 Obstacle Classification Method Based on 2D LIDAR Database

Authors: Moohyun Lee, Soojung Hur, Yongwan Park

Abstract:

In this paper is proposed a method uses only LIDAR system to classification an obstacle and determine its type by establishing database for classifying obstacles based on LIDAR. The existing LIDAR system, in determining the recognition of obstruction in an autonomous vehicle, has an advantage in terms of accuracy and shorter recognition time. However, it was difficult to determine the type of obstacle and therefore accurate path planning based on the type of obstacle was not possible. In order to overcome this problem, a method of classifying obstacle type based on existing LIDAR and using the width of obstacle materials was proposed. However, width measurement was not sufficient to improve accuracy. In this research, the width data was used to do the first classification; database for LIDAR intensity data by four major obstacle materials on the road were created; comparison is made to the LIDAR intensity data of actual obstacle materials; and determine the obstacle type by finding the one with highest similarity values. An experiment using an actual autonomous vehicle under real environment shows that data declined in quality in comparison to 3D LIDAR and it was possible to classify obstacle materials using 2D LIDAR.

Keywords: obstacle, classification, database, LIDAR, segmentation, intensity

Procedia PDF Downloads 314
2868 Fake News Detection for Korean News Using Machine Learning Techniques

Authors: Tae-Uk Yun, Pullip Chung, Kee-Young Kwahk, Hyunchul Ahn

Abstract:

Fake news is defined as the news articles that are intentionally and verifiably false, and could mislead readers. Spread of fake news may provoke anxiety, chaos, fear, or irrational decisions of the public. Thus, detecting fake news and preventing its spread has become very important issue in our society. However, due to the huge amount of fake news produced every day, it is almost impossible to identify it by a human. Under this context, researchers have tried to develop automated fake news detection using machine learning techniques over the past years. But, there have been no prior studies proposed an automated fake news detection method for Korean news to our best knowledge. In this study, we aim to detect Korean fake news using text mining and machine learning techniques. Our proposed method consists of two steps. In the first step, the news contents to be analyzed is convert to quantified values using various text mining techniques (topic modeling, TF-IDF, and so on). After that, in step 2, classifiers are trained using the values produced in step 1. As the classifiers, machine learning techniques such as logistic regression, backpropagation network, support vector machine, and deep neural network can be applied. To validate the effectiveness of the proposed method, we collected about 200 short Korean news from Seoul National University’s FactCheck. which provides with detailed analysis reports from 20 media outlets and links to source documents for each case. Using this dataset, we will identify which text features are important as well as which classifiers are effective in detecting Korean fake news.

Keywords: fake news detection, Korean news, machine learning, text mining

Procedia PDF Downloads 245
2867 Metamorphic Computer Virus Classification Using Hidden Markov Model

Authors: Babak Bashari Rad

Abstract:

A metamorphic computer virus uses different code transformation techniques to mutate its body in duplicated instances. Characteristics and function of new instances are mostly similar to their parents, but they cannot be easily detected by the majority of antivirus in market, as they depend on string signature-based detection techniques. The purpose of this research is to propose a Hidden Markov Model for classification of metamorphic viruses in executable files. In the proposed solution, portable executable files are inspected to extract the instructions opcodes needed for the examination of code. A Hidden Markov Model trained on portable executable files is employed to classify the metamorphic viruses of the same family. The proposed model is able to generate and recognize common statistical features of mutated code. The model has been evaluated by examining the model on a test data set. The performance of the model has been practically tested and evaluated based on False Positive Rate, Detection Rate and Overall Accuracy. The result showed an acceptable performance with high average of 99.7% Detection Rate.

Keywords: malware classification, computer virus classification, metamorphic virus, metamorphic malware, Hidden Markov Model

Procedia PDF Downloads 290
2866 Road Vehicle Recognition Using Magnetic Sensing Feature Extraction and Classification

Authors: Xiao Chen, Xiaoying Kong, Min Xu

Abstract:

This paper presents a road vehicle detection approach for the intelligent transportation system. This approach mainly uses low-cost magnetic sensor and associated data collection system to collect magnetic signals. This system can measure the magnetic field changing, and it also can detect and count vehicles. We extend Mel Frequency Cepstral Coefficients to analyze vehicle magnetic signals. Vehicle type features are extracted using representation of cepstrum, frame energy, and gap cepstrum of magnetic signals. We design a 2-dimensional map algorithm using Vector Quantization to classify vehicle magnetic features to four typical types of vehicles in Australian suburbs: sedan, VAN, truck, and bus. Experiments results show that our approach achieves a high level of accuracy for vehicle detection and classification.

Keywords: vehicle classification, signal processing, road traffic model, magnetic sensing

Procedia PDF Downloads 296
2865 Classification of Contexts for Mentioning Love in Interviews with Victims of the Holocaust

Authors: Marina Yurievna Aleksandrova

Abstract:

Research of the Holocaust retains value not only for history but also for sociology and psychology. One of the most important fields of study is how people were coping during and after this traumatic event. The aim of this paper is to identify the main contexts of the topic of love and to determine which contexts are more characteristic for different groups of victims of the Holocaust (gender, nationality, age). In this research, transcripts of interviews with Holocaust victims that were collected during 1946 for the "Voices of the Holocaust" project were used as data. Main contexts were analyzed with methods of network analysis and latent semantic analysis and classified by gender, age, and nationality with random forest. The results show that love is articulated and described significantly differently for male and female informants, nationality is shown results with lower values of quality metrics, as well as the age.

Keywords: Holocaust, latent semantic analysis, network analysis, text-mining, random forest

Procedia PDF Downloads 159
2864 Comparative Study of Accuracy of Land Cover/Land Use Mapping Using Medium Resolution Satellite Imagery: A Case Study

Authors: M. C. Paliwal, A. K. Jain, S. K. Katiyar

Abstract:

Classification of satellite imagery is very important for the assessment of its accuracy. In order to determine the accuracy of the classified image, usually the assumed-true data are derived from ground truth data using Global Positioning System. The data collected from satellite imagery and ground truth data is then compared to find out the accuracy of data and error matrices are prepared. Overall and individual accuracies are calculated using different methods. The study illustrates advanced classification and accuracy assessment of land use/land cover mapping using satellite imagery. IRS-1C-LISS IV data were used for classification of satellite imagery. The satellite image was classified using the software in fourteen classes namely water bodies, agricultural fields, forest land, urban settlement, barren land and unclassified area etc. Classification of satellite imagery and calculation of accuracy was done by using ERDAS-Imagine software to find out the best method. This study is based on the data collected for Bhopal city boundaries of Madhya Pradesh State of India.

Keywords: resolution, accuracy assessment, land use mapping, satellite imagery, ground truth data, error matrices

Procedia PDF Downloads 478
2863 Translation and Ideology: New Perspectives

Authors: Hamza Salih

Abstract:

Since translation is no longer viewed as a mere replacement of linguistic codes from one language to another, it has increasingly been considered, especially with the advent of the cultural turn in the late 70's, in relation to the broader external context in which it takes place. According to scholars in the field, the translation process is determined by the political, economic and cultural values which exert external pressures on the translator. Correspondingly, the relationship between translation as an act of re-writing the original text and ideology has already been established. This paper addresses the issue of how ideology comes into play in the translational process and what strategies the translator adopts to foreground or circumvent ideological constraints. Along with this, the paper will touch upon the notions of censorship, manipulation, subversion and domestication which are deemed of relevance to this very topic. In fact, after the domination of the empirically-oriented linguistic approaches in translation studies, the relationship between translation and ideology has to be foregrounded to draw attention to the fact that the translation process is not a mere text-to-text linguistic transfer, but, on the contrary, takes place in the midst of economic, political, cultural and religious variables, which some scholars subsume under the category ideology.

Keywords: translation, language, ideology, subversion, censorship and manipulation

Procedia PDF Downloads 226
2862 MSIpred: A Python 2 Package for the Classification of Tumor Microsatellite Instability from Tumor Mutation Annotation Data Using a Support Vector Machine

Authors: Chen Wang, Chun Liang

Abstract:

Microsatellite instability (MSI) is characterized by high degree of polymorphism in microsatellite (MS) length due to a deficiency in mismatch repair (MMR) system. MSI is associated with several tumor types and its status can be considered as an important indicator for tumor prognostic. Conventional clinical diagnosis of MSI examines PCR products of a panel of MS markers using electrophoresis (MSI-PCR) which is laborious, time consuming, and less reliable. MSIpred, a python 2 package for automatic classification of MSI was released by this study. It computes important somatic mutation features from files in mutation annotation format (MAF) generated from paired tumor-normal exome sequencing data, subsequently using these to predict tumor MSI status with a support vector machine (SVM) classifier trained by MAF files of 1074 tumors belonging to four types. Evaluation of MSIpred on an independent 358-tumor test set achieved overall accuracy of over 98% and area under receiver operating characteristic (ROC) curve of 0.967. These results indicated that MSIpred is a robust pan-cancer MSI classification tool and can serve as a complementary diagnostic to MSI-PCR in MSI diagnosis.

Keywords: microsatellite instability, pan-cancer classification, somatic mutation, support vector machine

Procedia PDF Downloads 146
2861 The Effect of Feature Selection on Pattern Classification

Authors: Chih-Fong Tsai, Ya-Han Hu

Abstract:

The aim of feature selection (or dimensionality reduction) is to filter out unrepresentative features (or variables) making the classifier perform better than the one without feature selection. Since there are many well-known feature selection algorithms, and different classifiers based on different selection results may perform differently, very few studies consider examining the effect of performing different feature selection algorithms on the classification performances by different classifiers over different types of datasets. In this paper, two widely used algorithms, which are the genetic algorithm (GA) and information gain (IG), are used to perform feature selection. On the other hand, three well-known classifiers are constructed, which are the CART decision tree (DT), multi-layer perceptron (MLP) neural network, and support vector machine (SVM). Based on 14 different types of datasets, the experimental results show that in most cases IG is a better feature selection algorithm than GA. In addition, the combinations of IG with DT and IG with SVM perform best and second best for small and large scale datasets.

Keywords: data mining, feature selection, pattern classification, dimensionality reduction

Procedia PDF Downloads 641
2860 How Is a Machine-Translated Literary Text Organized in Coherence? An Analysis Based upon Theme-Rheme Structure

Authors: Jiang Niu, Yue Jiang

Abstract:

With the ultimate goal to automatically generate translated texts with high quality, machine translation has made tremendous improvements. However, its translations of literary works are still plagued with problems in coherence, esp. the translation between distant language pairs. One of the causes of the problems is probably the lack of linguistic knowledge to be incorporated into the training of machine translation systems. In order to enable readers to better understand the problems of machine translation in coherence, to seek out the potential knowledge to be incorporated, and thus to improve the quality of machine translation products, this study applies Theme-Rheme structure to examine how a machine-translated literary text is organized and developed in terms of coherence. Theme-Rheme structure in Systemic Functional Linguistics is a useful tool for analysis of textual coherence. Theme is the departure point of a clause and Rheme is the rest of the clause. In a text, as Themes and Rhemes may be connected with each other in meaning, they form thematic and rhematic progressions throughout the text. Based on this structure, we can look into how a text is organized and developed in terms of coherence. Methodologically, we chose Chinese and English as the language pair to be studied. Specifically, we built a comparable corpus with two modes of English translations, viz. machine translation (MT) and human translation (HT) of one Chinese literary source text. The translated texts were annotated with Themes, Rhemes and their progressions throughout the texts. The annotated texts were analyzed from two respects, the different types of Themes functioning differently in achieving coherence, and the different types of thematic and rhematic progressions functioning differently in constructing texts. By analyzing and contrasting the two modes of translations, it is found that compared with the HT, 1) the MT features “pseudo-coherence”, with lots of ill-connected fragments of information using “and”; 2) the MT system produces a static and less interconnected text that reads like a list; these two points, in turn, lead to the less coherent organization and development of the MT than that of the HT; 3) novel to traditional and previous studies, Rhemes do contribute to textual connection and coherence though less than Themes do and thus are worthy of notice in further studies. Hence, the findings suggest that Theme-Rheme structure be applied to measuring and assessing the coherence of machine translation, to being incorporated into the training of the machine translation system, and Rheme be taken into account when studying the textual coherence of both MT and HT.

Keywords: coherence, corpus-based, literary translation, machine translation, Theme-Rheme structure

Procedia PDF Downloads 180
2859 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Application of five implementations of three data mining classification techniques was experimented for extracting important insights from tourism data. The aim was to find out the best performing algorithm among the compared ones for tourism knowledge discovery. Knowledge discovery process from data was used as a process model. 10-fold cross validation method is used for testing purpose. Various data preprocessing activities were performed to get the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The outperformed algorithm tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of top relevant attributes to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented, which showed intricate, non-trivial knowledge/insight that would otherwise not be discovered by simple statistical analysis with mediocre accuracy of the machine using classification algorithms.

Keywords: classification algorithms, data mining, knowledge discovery, tourism

Procedia PDF Downloads 271
2858 Accuracy Improvement of Traffic Participant Classification Using Millimeter-Wave Radar by Leveraging Simulator Based on Domain Adaptation

Authors: Tokihiko Akita, Seiichi Mita

Abstract:

A millimeter-wave radar is the most robust against adverse environments, making it an essential environment recognition sensor for automated driving. However, the reflection signal is sparse and unstable, so it is difficult to obtain the high recognition accuracy. Deep learning provides high accuracy even for them in recognition, but requires large scale datasets with ground truth. Specially, it takes a lot of cost to annotate for a millimeter-wave radar. For the solution, utilizing a simulator that can generate an annotated huge dataset is effective. Simulation of the radar is more difficult to match with real world data than camera image, and recognition by deep learning with higher-order features using the simulator causes further deviation. We have challenged to improve the accuracy of traffic participant classification by fusing simulator and real-world data with domain adaptation technique. Experimental results with the domain adaptation network created by us show that classification accuracy can be improved even with a few real-world data.

Keywords: millimeter-wave radar, object classification, deep learning, simulation, domain adaptation

Procedia PDF Downloads 66
2857 Attribute Index and Classification Method of Earthquake Damage Photographs of Engineering Structure

Authors: Ming Lu, Xiaojun Li, Bodi Lu, Juehui Xing

Abstract:

Earthquake damage phenomenon of each large earthquake gives comprehensive and profound real test to the dynamic performance and failure mechanism of different engineering structures. Cognitive engineering structure characteristics through seismic damage phenomenon are often far superior to expensive shaking table experiments. After the earthquake, people will record a variety of different types of engineering damage photos. However, a large number of earthquake damage photographs lack sufficient information and reduce their using value. To improve the research value and the use efficiency of engineering seismic damage photographs, this paper objects to explore and show seismic damage background information, which includes the earthquake magnitude, earthquake intensity, and the damaged structure characteristics. From the research requirement in earthquake engineering field, the authors use the 2008 China Wenchuan M8.0 earthquake photographs, and provide four kinds of attribute indexes and classification, which are seismic information, structure types, earthquake damage parts and disaster causation factors. The final object is to set up an engineering structural seismic damage database based on these four attribute indicators and classification, and eventually build a website providing seismic damage photographs.

Keywords: attribute index, classification method, earthquake damage picture, engineering structure

Procedia PDF Downloads 739
2856 Classification of Cosmological Wormhole Solutions in the Framework of General Relativity

Authors: Usamah Al-Ali

Abstract:

We explore the effect of expanding space on the exoticity of the matter supporting a traversable Lorentzian wormhole of zero radial tide whose line element is given by ds2 = dt^2 − a^2(t)[ dr^2/(1 − kr2 −b(r)/r)+ r2dΩ^2 in the context of General Relativity. This task is achieved by deriving the Einstein field equations for anisotropic matter field corresponding to the considered cosmological wormhole metric and performing a classification of their solutions on the basis of a variable equations of state (EoS) of the form p = ω(r)ρ. Explicit forms of the shape function b(r) and the scale factor a(t) arising in the classification are utilized to construct the corresponding energy-momentum tensor where the energy conditions for each case is investigated. While the violation of energy conditions is inevitable in case of static wormholes, the classification we performed leads to interesting solutions in which this violation is either reduced or eliminated.

Keywords: general relativity, Einstein field equations, energy conditions, cosmological wormhole

Procedia PDF Downloads 45
2855 Emotion Detection in Twitter Messages Using Combination of Long Short-Term Memory and Convolutional Deep Neural Networks

Authors: Bahareh Golchin, Nooshin Riahi

Abstract:

One of the most significant issues as attended a lot in recent years is that of recognizing the sentiments and emotions in social media texts. The analysis of sentiments and emotions is intended to recognize the conceptual information such as the opinions, feelings, attitudes and emotions of people towards the products, services, organizations, people, topics, events and features in the written text. These indicate the greatness of the problem space. In the real world, businesses and organizations are always looking for tools to gather ideas, emotions, and directions of people about their products, services, or events related to their own. This article uses the Twitter social network, one of the most popular social networks with about 420 million active users, to extract data. Using this social network, users can share their information and opinions about personal issues, policies, products, events, etc. It can be used with appropriate classification of emotional states due to the availability of its data. In this study, supervised learning and deep neural network algorithms are used to classify the emotional states of Twitter users. The use of deep learning methods to increase the learning capacity of the model is an advantage due to the large amount of available data. Tweets collected on various topics are classified into four classes using a combination of two Bidirectional Long Short Term Memory network and a Convolutional network. The results obtained from this study with an average accuracy of 93%, show good results extracted from the proposed framework and improved accuracy compared to previous work.

Keywords: emotion classification, sentiment analysis, social networks, deep neural networks

Procedia PDF Downloads 114
2854 Application of Argumentation for Improving the Classification Accuracy in Inductive Concept Formation

Authors: Vadim Vagin, Marina Fomina, Oleg Morosin

Abstract:

This paper contains the description of argumentation approach for the problem of inductive concept formation. It is proposed to use argumentation, based on defeasible reasoning with justification degrees, to improve the quality of classification models, obtained by generalization algorithms. The experiment’s results on both clear and noisy data are also presented.

Keywords: argumentation, justification degrees, inductive concept formation, noise, generalization

Procedia PDF Downloads 414
2853 Comparison of Various Classification Techniques Using WEKA for Colon Cancer Detection

Authors: Beema Akbar, Varun P. Gopi, V. Suresh Babu

Abstract:

Colon cancer causes the deaths of about half a million people every year. The common method of its detection is histopathological tissue analysis, it leads to tiredness and workload to the pathologist. A novel method is proposed that combines both structural and statistical pattern recognition used for the detection of colon cancer. This paper presents a comparison among the different classifiers such as Multilayer Perception (MLP), Sequential Minimal Optimization (SMO), Bayesian Logistic Regression (BLR) and k-star by using classification accuracy and error rate based on the percentage split method. The result shows that the best algorithm in WEKA is MLP classifier with an accuracy of 83.333% and kappa statistics is 0.625. The MLP classifier which has a lower error rate, will be preferred as more powerful classification capability.

Keywords: colon cancer, histopathological image, structural and statistical pattern recognition, multilayer perception

Procedia PDF Downloads 552
2852 Tomato-Weed Classification by RetinaNet One-Step Neural Network

Authors: Dionisio Andujar, Juan lópez-Correa, Hugo Moreno, Angela Ri

Abstract:

The increased number of weeds in tomato crops highly lower yields. Weed identification with the aim of machine learning is important to carry out site-specific control. The last advances in computer vision are a powerful tool to face the problem. The analysis of RGB (Red, Green, Blue) images through Artificial Neural Networks had been rapidly developed in the past few years, providing new methods for weed classification. The development of the algorithms for crop and weed species classification looks for a real-time classification system using Object Detection algorithms based on Convolutional Neural Networks. The site study was located in commercial corn fields. The classification system has been tested. The procedure can detect and classify weed seedlings in tomato fields. The input to the Neural Network was a set of 10,000 RGB images with a natural infestation of Cyperus rotundus l., Echinochloa crus galli L., Setaria italica L., Portulaca oeracea L., and Solanum nigrum L. The validation process was done with a random selection of RGB images containing the aforementioned species. The mean average precision (mAP) was established as the metric for object detection. The results showed agreements higher than 95 %. The system will provide the input for an online spraying system. Thus, this work plays an important role in Site Specific Weed Management by reducing herbicide use in a single step.

Keywords: deep learning, object detection, cnn, tomato, weeds

Procedia PDF Downloads 81
2851 Model of Optimal Centroids Approach for Multivariate Data Classification

Authors: Pham Van Nha, Le Cam Binh

Abstract:

Particle swarm optimization (PSO) is a population-based stochastic optimization algorithm. PSO was inspired by the natural behavior of birds and fish in migration and foraging for food. PSO is considered as a multidisciplinary optimization model that can be applied in various optimization problems. PSO’s ideas are simple and easy to understand but PSO is only applied in simple model problems. We think that in order to expand the applicability of PSO in complex problems, PSO should be described more explicitly in the form of a mathematical model. In this paper, we represent PSO in a mathematical model and apply in the multivariate data classification. First, PSOs general mathematical model (MPSO) is analyzed as a universal optimization model. Then, Model of Optimal Centroids (MOC) is proposed for the multivariate data classification. Experiments were conducted on some benchmark data sets to prove the effectiveness of MOC compared with several proposed schemes.

Keywords: analysis of optimization, artificial intelligence based optimization, optimization for learning and data analysis, global optimization

Procedia PDF Downloads 182
2850 Applying Semi-Automatic Digital Aerial Survey Technology and Canopy Characters Classification for Surface Vegetation Interpretation of Archaeological Sites

Authors: Yung-Chung Chuang

Abstract:

The cultural layers of archaeological sites are mainly affected by surface land use, land cover, and root system of surface vegetation. For this reason, continuous monitoring of land use and land cover change is important for archaeological sites protection and management. However, in actual operation, on-site investigation and orthogonal photograph interpretation require a lot of time and manpower. For this reason, it is necessary to perform a good alternative for surface vegetation survey in an automated or semi-automated manner. In this study, we applied semi-automatic digital aerial survey technology and canopy characters classification with very high-resolution aerial photographs for surface vegetation interpretation of archaeological sites. The main idea is based on different landscape or forest type can easily be distinguished with canopy characters (e.g., specific texture distribution, shadow effects and gap characters) extracted by semi-automatic image classification. A novel methodology to classify the shape of canopy characters using landscape indices and multivariate statistics was also proposed. Non-hierarchical cluster analysis was used to assess the optimal number of canopy character clusters and canonical discriminant analysis was used to generate the discriminant functions for canopy character classification (seven categories). Therefore, people could easily predict the forest type and vegetation land cover by corresponding to the specific canopy character category. The results showed that the semi-automatic classification could effectively extract the canopy characters of forest and vegetation land cover. As for forest type and vegetation type prediction, the average prediction accuracy reached 80.3%~91.7% with different sizes of test frame. It represented this technology is useful for archaeological site survey, and can improve the classification efficiency and data update rate.

Keywords: digital aerial survey, canopy characters classification, archaeological sites, multivariate statistics

Procedia PDF Downloads 116
2849 Spontaneous and Posed Smile Detection: Deep Learning, Traditional Machine Learning, and Human Performance

Authors: Liang Wang, Beste F. Yuksel, David Guy Brizan

Abstract:

A computational model of affect that can distinguish between spontaneous and posed smiles with no errors on a large, popular data set using deep learning techniques is presented in this paper. A Long Short-Term Memory (LSTM) classifier, a type of Recurrent Neural Network, is utilized and compared to human classification. Results showed that while human classification (mean of 0.7133) was above chance, the LSTM model was more accurate than human classification and other comparable state-of-the-art systems. Additionally, a high accuracy rate was maintained with small amounts of training videos (70 instances). The derivation of important features to further understand the success of our computational model were analyzed, and it was inferred that thousands of pairs of points within the eyes and mouth are important throughout all time segments in a smile. This suggests that distinguishing between a posed and spontaneous smile is a complex task, one which may account for the difficulty and lower accuracy of human classification compared to machine learning models.

Keywords: affective computing, affect detection, computer vision, deep learning, human-computer interaction, machine learning, posed smile detection, spontaneous smile detection

Procedia PDF Downloads 105
2848 An AK-Chart for the Non-Normal Data

Authors: Chia-Hau Liu, Tai-Yue Wang

Abstract:

Traditional multivariate control charts assume that measurement from manufacturing processes follows a multivariate normal distribution. However, this assumption may not hold or may be difficult to verify because not all the measurement from manufacturing processes are normal distributed in practice. This study develops a new multivariate control chart for monitoring the processes with non-normal data. We propose a mechanism based on integrating the one-class classification method and the adaptive technique. The adaptive technique is used to improve the sensitivity to small shift on one-class classification in statistical process control. In addition, this design provides an easy way to allocate the value of type I error so it is easier to be implemented. Finally, the simulation study and the real data from industry are used to demonstrate the effectiveness of the propose control charts.

Keywords: multivariate control chart, statistical process control, one-class classification method, non-normal data

Procedia PDF Downloads 397
2847 Electroencephalogram Based Alzheimer Disease Classification using Machine and Deep Learning Methods

Authors: Carlos Roncero-Parra, Alfonso Parreño-Torres, Jorge Mateo Sotos, Alejandro L. Borja

Abstract:

In this research, different methods based on machine/deep learning algorithms are presented for the classification and diagnosis of patients with mental disorders such as alzheimer. For this purpose, the signals obtained from 32 unipolar electrodes identified by non-invasive EEG were examined, and their basic properties were obtained. More specifically, different well-known machine learning based classifiers have been used, i.e., support vector machine (SVM), Bayesian linear discriminant analysis (BLDA), decision tree (DT), Gaussian Naïve Bayes (GNB), K-nearest neighbor (KNN) and Convolutional Neural Network (CNN). A total of 668 patients from five different hospitals have been studied in the period from 2011 to 2021. The best accuracy is obtained was around 93 % in both ADM and ADA classifications. It can be concluded that such a classification will enable the training of algorithms that can be used to identify and classify different mental disorders with high accuracy.

Keywords: alzheimer, machine learning, deep learning, EEG

Procedia PDF Downloads 88
2846 Constraining the Potential Nickel Laterite Area Using Geographic Information System-Based Multi-Criteria Rating in Surigao Del Sur

Authors: Reiner-Ace P. Mateo, Vince Paolo F. Obille

Abstract:

The traditional method of classifying the potential mineral resources requires a significant amount of time and money. In this paper, an alternative way to classify potential mineral resources with GIS application in Surigao del Sur. The three (3) analog map data inputs integrated to GIS are geologic map, topographic map, and land cover/vegetation map. The indicators used in the classification of potential nickel laterite integrated from the analog map data inputs are a geologic indicator, which is the presence of ultramafic rock from the geologic map; slope indicator and the presence of plateau edges from the topographic map; areas of forest land, grassland, and shrublands from the land cover/vegetation map. The potential mineral of the area was classified from low up to very high potential. The produced mineral potential classification map of Surigao del Sur has an estimated 4.63% low nickel laterite potential, 42.15% medium nickel laterite potential, 43.34% high nickel laterite potential, and 9.88% very high nickel laterite from its ultramafic terrains. For the validation of the produced map, it was compared with known occurrences of nickel laterite in the area using a nickel mining tenement map from the area with the application of remote sensing. Three (3) prominent nickel mining companies were delineated in the study area. The generated potential classification map of nickel-laterite in Surigao Del Sur may be of aid to the mining companies which are currently in the exploration phase in the study area. Also, the currently operating nickel mines in the study area can help to validate the reliability of the mineral classification map produced.

Keywords: mineral potential classification, nickel laterites, GIS, remote sensing, Surigao del Sur

Procedia PDF Downloads 99