Search results for: DNA sequences classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2640

Search results for: DNA sequences classification

2370 A Text Classification Approach Based on Natural Language Processing and Machine Learning Techniques

Authors: Rim Messaoudi, Nogaye-Gueye Gning, François Azelart

Abstract:

Automatic text classification applies mostly natural language processing (NLP) and other AI-guided techniques to automatically classify text in a faster and more accurate manner. This paper discusses the subject of using predictive maintenance to manage incident tickets inside the sociality. It focuses on proposing a tool that treats and analyses comments and notes written by administrators after resolving an incident ticket. The goal here is to increase the quality of these comments. Additionally, this tool is based on NLP and machine learning techniques to realize the textual analytics of the extracted data. This approach was tested using real data taken from the French National Railways (SNCF) company and was given a high-quality result.

Keywords: machine learning, text classification, NLP techniques, semantic representation

Procedia PDF Downloads 76
2369 Well-Defined Polypeptides: Synthesis and Selective Attachment of Poly(ethylene glycol) Functionalities

Authors: Cristina Lavilla, Andreas Heise

Abstract:

The synthesis of sequence-controlled polymers has received increasing attention in the last years. Well-defined polyacrylates, polyacrylamides and styrene-maleimide copolymers have been synthesized by sequential or kinetic addition of comonomers. However this approach has not yet been introduced to the synthesis of polypeptides, which are in fact polymers developed by nature in a sequence-controlled way. Polypeptides are natural materials that possess the ability to self-assemble into complex and highly ordered structures. Their folding and properties arise from precisely controlled sequences and compositions in their constituent amino acid monomers. So far, solid-phase peptide synthesis is the only technique that allows preparing short peptide sequences with excellent sequence control, but also requires extensive protection/deprotection steps and it is a difficult technique to scale-up. A new strategy towards sequence control in the synthesis of polypeptides is introduced, based on the sequential addition of α-amino acid-N-carboxyanhydrides (NCAs). The living ring-opening process is conducted to full conversion and no purification or deprotection is needed before addition of a new amino acid. The length of every block is predefined by the NCA:initiator ratio in every step. This method yields polypeptides with a specific sequence and controlled molecular weights. A series of polypeptides with varying block sequences have been synthesized with the aim to identify structure-property relationships. All of them are able to adopt secondary structures similar to natural polypeptides, and display properties in the solid state and in solution that are characteristic of the primary structure. By design the prepared polypeptides allow selective modification of individual block sequences, which has been exploited to introduce functionalities in defined positions along the polypeptide chain. Poly(ethylene glycol)(PEG) was the functionality chosen, as it is known to favor hydrophilicity and also yield thermoresponsive materials. After PEGylation, hydrophilicity of the polypeptides is enhanced, and their thermal response in H2O has been studied. Noteworthy differences in the behavior of the polypeptides having different sequences have been found. Circular dichroism measurements confirmed that the α-helical conformation is stable over the examined temperature range (5-90 °C). It is concluded that PEG units are the main responsible of the changes in H-bonding interactions with H2O upon variation of temperature, and the position of these functional units along the backbone is a factor of utmost importance in the resulting properties of the α-helical polypeptides.

Keywords: α-amino acid N-carboxyanhydrides, multiblock copolymers, poly(ethylene glycol), polypeptides, ring-opening polymerization, sequence control

Procedia PDF Downloads 177
2368 Wolof Voice Response Recognition System: A Deep Learning Model for Wolof Audio Classification

Authors: Krishna Mohan Bathula, Fatou Bintou Loucoubar, FNU Kaleemunnisa, Christelle Scharff, Mark Anthony De Castro

Abstract:

Voice recognition algorithms such as automatic speech recognition and text-to-speech systems with African languages can play an important role in bridging the digital divide of Artificial Intelligence in Africa, contributing to the establishment of a fully inclusive information society. This paper proposes a Deep Learning model that can classify the user responses as inputs for an interactive voice response system. A dataset with Wolof language words ‘yes’ and ‘no’ is collected as audio recordings. A two stage Data Augmentation approach is adopted for enhancing the dataset size required by the deep neural network. Data preprocessing and feature engineering with Mel-Frequency Cepstral Coefficients are implemented. Convolutional Neural Networks (CNNs) have proven to be very powerful in image classification and are promising for audio processing when sounds are transformed into spectra. For performing voice response classification, the recordings are transformed into sound frequency feature spectra and then applied image classification methodology using a deep CNN model. The inference model of this trained and reusable Wolof voice response recognition system can be integrated with many applications associated with both web and mobile platforms.

Keywords: automatic speech recognition, interactive voice response, voice response recognition, wolof word classification

Procedia PDF Downloads 92
2367 A Deep Learning Approach to Subsection Identification in Electronic Health Records

Authors: Nitin Shravan, Sudarsun Santhiappan, B. Sivaselvan

Abstract:

Subsection identification, in the context of Electronic Health Records (EHRs), is identifying the important sections for down-stream tasks like auto-coding. In this work, we classify the text present in EHRs according to their information, using machine learning and deep learning techniques. We initially describe briefly about the problem and formulate it as a text classification problem. Then, we discuss upon the methods from the literature. We try two approaches - traditional feature extraction based machine learning methods and deep learning methods. Through experiments on a private dataset, we establish that the deep learning methods perform better than the feature extraction based Machine Learning Models.

Keywords: deep learning, machine learning, semantic clinical classification, subsection identification, text classification

Procedia PDF Downloads 192
2366 Comparative Analysis of Spectral Estimation Methods for Brain-Computer Interfaces

Authors: Rafik Djemili, Hocine Bourouba, M. C. Amara Korba

Abstract:

In this paper, we present a method in order to classify EEG signals for Brain-Computer Interfaces (BCI). EEG signals are first processed by means of spectral estimation methods to derive reliable features before classification step. Spectral estimation methods used are standard periodogram and the periodogram calculated by the Welch method; both methods are compared with Logarithm of Band Power (logBP) features. In the method proposed, we apply Linear Discriminant Analysis (LDA) followed by Support Vector Machine (SVM). Classification accuracy reached could be as high as 85%, which proves the effectiveness of classification of EEG signals based BCI using spectral methods.

Keywords: brain-computer interface, motor imagery, electroencephalogram, linear discriminant analysis, support vector machine

Procedia PDF Downloads 480
2365 Optimizing Perennial Plants Image Classification by Fine-Tuning Deep Neural Networks

Authors: Khairani Binti Supyan, Fatimah Khalid, Mas Rina Mustaffa, Azreen Bin Azman, Amirul Azuani Romle

Abstract:

Perennial plant classification plays a significant role in various agricultural and environmental applications, assisting in plant identification, disease detection, and biodiversity monitoring. Nevertheless, attaining high accuracy in perennial plant image classification remains challenging due to the complex variations in plant appearance, the diverse range of environmental conditions under which images are captured, and the inherent variability in image quality stemming from various factors such as lighting conditions, camera settings, and focus. This paper proposes an adaptation approach to optimize perennial plant image classification by fine-tuning the pre-trained DNNs model. This paper explores the efficacy of fine-tuning prevalent architectures, namely VGG16, ResNet50, and InceptionV3, leveraging transfer learning to tailor the models to the specific characteristics of perennial plant datasets. A subset of the MYLPHerbs dataset consisted of 6 perennial plant species of 13481 images under various environmental conditions that were used in the experiments. Different strategies for fine-tuning, including adjusting learning rates, training set sizes, data augmentation, and architectural modifications, were investigated. The experimental outcomes underscore the effectiveness of fine-tuning deep neural networks for perennial plant image classification, with ResNet50 showcasing the highest accuracy of 99.78%. Despite ResNet50's superior performance, both VGG16 and InceptionV3 achieved commendable accuracy of 99.67% and 99.37%, respectively. The overall outcomes reaffirm the robustness of the fine-tuning approach across different deep neural network architectures, offering insights into strategies for optimizing model performance in the domain of perennial plant image classification.

Keywords: perennial plants, image classification, deep neural networks, fine-tuning, transfer learning, VGG16, ResNet50, InceptionV3

Procedia PDF Downloads 35
2364 Obstacle Classification Method Based on 2D LIDAR Database

Authors: Moohyun Lee, Soojung Hur, Yongwan Park

Abstract:

In this paper is proposed a method uses only LIDAR system to classification an obstacle and determine its type by establishing database for classifying obstacles based on LIDAR. The existing LIDAR system, in determining the recognition of obstruction in an autonomous vehicle, has an advantage in terms of accuracy and shorter recognition time. However, it was difficult to determine the type of obstacle and therefore accurate path planning based on the type of obstacle was not possible. In order to overcome this problem, a method of classifying obstacle type based on existing LIDAR and using the width of obstacle materials was proposed. However, width measurement was not sufficient to improve accuracy. In this research, the width data was used to do the first classification; database for LIDAR intensity data by four major obstacle materials on the road were created; comparison is made to the LIDAR intensity data of actual obstacle materials; and determine the obstacle type by finding the one with highest similarity values. An experiment using an actual autonomous vehicle under real environment shows that data declined in quality in comparison to 3D LIDAR and it was possible to classify obstacle materials using 2D LIDAR.

Keywords: obstacle, classification, database, LIDAR, segmentation, intensity

Procedia PDF Downloads 317
2363 Performance Analysis with the Combination of Visualization and Classification Technique for Medical Chatbot

Authors: Shajida M., Sakthiyadharshini N. P., Kamalesh S., Aswitha B.

Abstract:

Natural Language Processing (NLP) continues to play a strategic part in complaint discovery and medicine discovery during the current epidemic. This abstract provides an overview of performance analysis with a combination of visualization and classification techniques of NLP for a medical chatbot. Sentiment analysis is an important aspect of NLP that is used to determine the emotional tone behind a piece of text. This technique has been applied to various domains, including medical chatbots. In this, we have compared the combination of the decision tree with heatmap and Naïve Bayes with Word Cloud. The performance of the chatbot was evaluated using accuracy, and the results indicate that the combination of visualization and classification techniques significantly improves the chatbot's performance.

Keywords: sentimental analysis, NLP, medical chatbot, decision tree, heatmap, naïve bayes, word cloud

Procedia PDF Downloads 53
2362 Metamorphic Computer Virus Classification Using Hidden Markov Model

Authors: Babak Bashari Rad

Abstract:

A metamorphic computer virus uses different code transformation techniques to mutate its body in duplicated instances. Characteristics and function of new instances are mostly similar to their parents, but they cannot be easily detected by the majority of antivirus in market, as they depend on string signature-based detection techniques. The purpose of this research is to propose a Hidden Markov Model for classification of metamorphic viruses in executable files. In the proposed solution, portable executable files are inspected to extract the instructions opcodes needed for the examination of code. A Hidden Markov Model trained on portable executable files is employed to classify the metamorphic viruses of the same family. The proposed model is able to generate and recognize common statistical features of mutated code. The model has been evaluated by examining the model on a test data set. The performance of the model has been practically tested and evaluated based on False Positive Rate, Detection Rate and Overall Accuracy. The result showed an acceptable performance with high average of 99.7% Detection Rate.

Keywords: malware classification, computer virus classification, metamorphic virus, metamorphic malware, Hidden Markov Model

Procedia PDF Downloads 296
2361 Road Vehicle Recognition Using Magnetic Sensing Feature Extraction and Classification

Authors: Xiao Chen, Xiaoying Kong, Min Xu

Abstract:

This paper presents a road vehicle detection approach for the intelligent transportation system. This approach mainly uses low-cost magnetic sensor and associated data collection system to collect magnetic signals. This system can measure the magnetic field changing, and it also can detect and count vehicles. We extend Mel Frequency Cepstral Coefficients to analyze vehicle magnetic signals. Vehicle type features are extracted using representation of cepstrum, frame energy, and gap cepstrum of magnetic signals. We design a 2-dimensional map algorithm using Vector Quantization to classify vehicle magnetic features to four typical types of vehicles in Australian suburbs: sedan, VAN, truck, and bus. Experiments results show that our approach achieves a high level of accuracy for vehicle detection and classification.

Keywords: vehicle classification, signal processing, road traffic model, magnetic sensing

Procedia PDF Downloads 300
2360 Comparative Study of Accuracy of Land Cover/Land Use Mapping Using Medium Resolution Satellite Imagery: A Case Study

Authors: M. C. Paliwal, A. K. Jain, S. K. Katiyar

Abstract:

Classification of satellite imagery is very important for the assessment of its accuracy. In order to determine the accuracy of the classified image, usually the assumed-true data are derived from ground truth data using Global Positioning System. The data collected from satellite imagery and ground truth data is then compared to find out the accuracy of data and error matrices are prepared. Overall and individual accuracies are calculated using different methods. The study illustrates advanced classification and accuracy assessment of land use/land cover mapping using satellite imagery. IRS-1C-LISS IV data were used for classification of satellite imagery. The satellite image was classified using the software in fourteen classes namely water bodies, agricultural fields, forest land, urban settlement, barren land and unclassified area etc. Classification of satellite imagery and calculation of accuracy was done by using ERDAS-Imagine software to find out the best method. This study is based on the data collected for Bhopal city boundaries of Madhya Pradesh State of India.

Keywords: resolution, accuracy assessment, land use mapping, satellite imagery, ground truth data, error matrices

Procedia PDF Downloads 486
2359 MSIpred: A Python 2 Package for the Classification of Tumor Microsatellite Instability from Tumor Mutation Annotation Data Using a Support Vector Machine

Authors: Chen Wang, Chun Liang

Abstract:

Microsatellite instability (MSI) is characterized by high degree of polymorphism in microsatellite (MS) length due to a deficiency in mismatch repair (MMR) system. MSI is associated with several tumor types and its status can be considered as an important indicator for tumor prognostic. Conventional clinical diagnosis of MSI examines PCR products of a panel of MS markers using electrophoresis (MSI-PCR) which is laborious, time consuming, and less reliable. MSIpred, a python 2 package for automatic classification of MSI was released by this study. It computes important somatic mutation features from files in mutation annotation format (MAF) generated from paired tumor-normal exome sequencing data, subsequently using these to predict tumor MSI status with a support vector machine (SVM) classifier trained by MAF files of 1074 tumors belonging to four types. Evaluation of MSIpred on an independent 358-tumor test set achieved overall accuracy of over 98% and area under receiver operating characteristic (ROC) curve of 0.967. These results indicated that MSIpred is a robust pan-cancer MSI classification tool and can serve as a complementary diagnostic to MSI-PCR in MSI diagnosis.

Keywords: microsatellite instability, pan-cancer classification, somatic mutation, support vector machine

Procedia PDF Downloads 153
2358 The Effect of Feature Selection on Pattern Classification

Authors: Chih-Fong Tsai, Ya-Han Hu

Abstract:

The aim of feature selection (or dimensionality reduction) is to filter out unrepresentative features (or variables) making the classifier perform better than the one without feature selection. Since there are many well-known feature selection algorithms, and different classifiers based on different selection results may perform differently, very few studies consider examining the effect of performing different feature selection algorithms on the classification performances by different classifiers over different types of datasets. In this paper, two widely used algorithms, which are the genetic algorithm (GA) and information gain (IG), are used to perform feature selection. On the other hand, three well-known classifiers are constructed, which are the CART decision tree (DT), multi-layer perceptron (MLP) neural network, and support vector machine (SVM). Based on 14 different types of datasets, the experimental results show that in most cases IG is a better feature selection algorithm than GA. In addition, the combinations of IG with DT and IG with SVM perform best and second best for small and large scale datasets.

Keywords: data mining, feature selection, pattern classification, dimensionality reduction

Procedia PDF Downloads 646
2357 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Application of five implementations of three data mining classification techniques was experimented for extracting important insights from tourism data. The aim was to find out the best performing algorithm among the compared ones for tourism knowledge discovery. Knowledge discovery process from data was used as a process model. 10-fold cross validation method is used for testing purpose. Various data preprocessing activities were performed to get the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The outperformed algorithm tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of top relevant attributes to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented, which showed intricate, non-trivial knowledge/insight that would otherwise not be discovered by simple statistical analysis with mediocre accuracy of the machine using classification algorithms.

Keywords: classification algorithms, data mining, knowledge discovery, tourism

Procedia PDF Downloads 272
2356 Accuracy Improvement of Traffic Participant Classification Using Millimeter-Wave Radar by Leveraging Simulator Based on Domain Adaptation

Authors: Tokihiko Akita, Seiichi Mita

Abstract:

A millimeter-wave radar is the most robust against adverse environments, making it an essential environment recognition sensor for automated driving. However, the reflection signal is sparse and unstable, so it is difficult to obtain the high recognition accuracy. Deep learning provides high accuracy even for them in recognition, but requires large scale datasets with ground truth. Specially, it takes a lot of cost to annotate for a millimeter-wave radar. For the solution, utilizing a simulator that can generate an annotated huge dataset is effective. Simulation of the radar is more difficult to match with real world data than camera image, and recognition by deep learning with higher-order features using the simulator causes further deviation. We have challenged to improve the accuracy of traffic participant classification by fusing simulator and real-world data with domain adaptation technique. Experimental results with the domain adaptation network created by us show that classification accuracy can be improved even with a few real-world data.

Keywords: millimeter-wave radar, object classification, deep learning, simulation, domain adaptation

Procedia PDF Downloads 72
2355 Genomic and Evolutionary Diversity of Long Terminal Repeat (LTR) Retrotransposons in Date Palm (Phoenix dactylifera)

Authors: Faisal Nouroz, Mukaramin Mukaramin

Abstract:

Of the transposable elements (TEs), the retrotransposons are the most copious elements identified from many sequenced genomes. They have played a major role in genome evolution, rearrangement, and expansions based on their copy and paste mode of proliferation. They are further divided into LTR and Non-LTR retrotransposons. The purpose of the current study was to identify the LTR REs in sequenced Phoenix dactylifera genome and to study their structural diversity. A total of 150 P. dactylifera BAC sequences with > 60kb sizes were randomly retrieved from National Center for Biotechnology Information (NCBI) database and screened for the presence of LTR retrotransposons. Seven bacterial artificial chromosomes (BAC) sequences showed full-length LTR Retrotransposons with 4 Copia and 3 Gypsy families having variable copy numbers in respective families. Reverse transcriptase (RT) domain was found as the most conserved domain among Copia and Gypsy superfamilies and was used to deduce evolutionary analysis. The amino acid residues among various RT sequences showed variability in their percentages indicating post divergence evolution. Amino acid Leucine was found in highest proportions followed by Lysine, while Methionine and Tryptophan were in lowest percentages. The phylogenetic analysis based on RT domains confirmed that although having most conserved RT regions, several evolutionary events occurred causing nucleotide polymorphisms and hence clustering of Gypsy and Copia superfamilies into their respective lineages. The study will be helpful in identification and annotation of these elements in other species and genera and their distribution patterns on chromosomes by fluorescent in situ hybridization techniques.

Keywords: transposable elements, Phoenix dactylifera, retrotransposons, phylogenetic analysis

Procedia PDF Downloads 116
2354 Attribute Index and Classification Method of Earthquake Damage Photographs of Engineering Structure

Authors: Ming Lu, Xiaojun Li, Bodi Lu, Juehui Xing

Abstract:

Earthquake damage phenomenon of each large earthquake gives comprehensive and profound real test to the dynamic performance and failure mechanism of different engineering structures. Cognitive engineering structure characteristics through seismic damage phenomenon are often far superior to expensive shaking table experiments. After the earthquake, people will record a variety of different types of engineering damage photos. However, a large number of earthquake damage photographs lack sufficient information and reduce their using value. To improve the research value and the use efficiency of engineering seismic damage photographs, this paper objects to explore and show seismic damage background information, which includes the earthquake magnitude, earthquake intensity, and the damaged structure characteristics. From the research requirement in earthquake engineering field, the authors use the 2008 China Wenchuan M8.0 earthquake photographs, and provide four kinds of attribute indexes and classification, which are seismic information, structure types, earthquake damage parts and disaster causation factors. The final object is to set up an engineering structural seismic damage database based on these four attribute indicators and classification, and eventually build a website providing seismic damage photographs.

Keywords: attribute index, classification method, earthquake damage picture, engineering structure

Procedia PDF Downloads 744
2353 Classification of Cosmological Wormhole Solutions in the Framework of General Relativity

Authors: Usamah Al-Ali

Abstract:

We explore the effect of expanding space on the exoticity of the matter supporting a traversable Lorentzian wormhole of zero radial tide whose line element is given by ds2 = dt^2 − a^2(t)[ dr^2/(1 − kr2 −b(r)/r)+ r2dΩ^2 in the context of General Relativity. This task is achieved by deriving the Einstein field equations for anisotropic matter field corresponding to the considered cosmological wormhole metric and performing a classification of their solutions on the basis of a variable equations of state (EoS) of the form p = ω(r)ρ. Explicit forms of the shape function b(r) and the scale factor a(t) arising in the classification are utilized to construct the corresponding energy-momentum tensor where the energy conditions for each case is investigated. While the violation of energy conditions is inevitable in case of static wormholes, the classification we performed leads to interesting solutions in which this violation is either reduced or eliminated.

Keywords: general relativity, Einstein field equations, energy conditions, cosmological wormhole

Procedia PDF Downloads 48
2352 Assessment on Rumen Microbial Diversity of Bali Cattle Using 16S rRNA Sequencing

Authors: Asmuddin Natsir, A. Mujnisa, Syahriani Syahrir, Marhamah Nadir, Nurul Purnomo

Abstract:

Bacteria, protozoa, Archaea, and fungi are the dominant microorganisms found in the rumen ecosystem that has an important role in converting feed ingredients into components that can be digested and utilized by the livestock host. This study was conducted to assess the diversity of rumen bacteria of bali cattle raised under traditional farming condition. Three adult bali cattle were used in this experiment. The rumen fluid samples from the three experimental animals were obtained by the Stomach Tube method before the morning feeding. The results of study indicated that the Illumina sequencing was successful in identifying 301,589 sequences, averaging 100,533 sequences, from three rumen fluid samples of three cattle. Furthermore, based on the SILVA taxonomic database, there were 19 kinds of phyla that had been successfully identified. Of the 19 phyla, there were only two dominant groups across the three samples, namely Bacteroidetes and Firmicutes, with an average percentage of 83.68% and 13.43%, respectively. Other groups such as Synergistetes, Spirochaetae, Planctomycetes can also be identified but in relatively small percentage. At the genus level, there were 157 sequences obtained from all three samples. Of this number, the most dominant group was Prevotella 1 with a percentage of 71.82% followed by 6.94% of Christencenellaceae R-7 group. Other groups such as Prevotellaceae UCG-001, Ruminococcaceae NK4A214 group, Sphaerochaeta, Ruminococcus 2, Rikenellaceae RC9 gut group, Quinella were also identified but with very low percentages. The sequencing results were able to detect the presence of 3.06% and 3.92% respectively for uncultured rumen bacterium and uncultured bacterium. In conclusion, the results of this experiment can provide an opportunity for a better understanding of the rumen bacterial diversity of the bali cattle raised under traditional farming condition and insight regarding the uncultured rumen bacterium and uncultured bacterium that need to be further explored.

Keywords: 16S rRNA sequencing, bali cattle, rumen microbial diversity, uncultured rumen bacterium

Procedia PDF Downloads 312
2351 Application of Argumentation for Improving the Classification Accuracy in Inductive Concept Formation

Authors: Vadim Vagin, Marina Fomina, Oleg Morosin

Abstract:

This paper contains the description of argumentation approach for the problem of inductive concept formation. It is proposed to use argumentation, based on defeasible reasoning with justification degrees, to improve the quality of classification models, obtained by generalization algorithms. The experiment’s results on both clear and noisy data are also presented.

Keywords: argumentation, justification degrees, inductive concept formation, noise, generalization

Procedia PDF Downloads 415
2350 Comparison of Various Classification Techniques Using WEKA for Colon Cancer Detection

Authors: Beema Akbar, Varun P. Gopi, V. Suresh Babu

Abstract:

Colon cancer causes the deaths of about half a million people every year. The common method of its detection is histopathological tissue analysis, it leads to tiredness and workload to the pathologist. A novel method is proposed that combines both structural and statistical pattern recognition used for the detection of colon cancer. This paper presents a comparison among the different classifiers such as Multilayer Perception (MLP), Sequential Minimal Optimization (SMO), Bayesian Logistic Regression (BLR) and k-star by using classification accuracy and error rate based on the percentage split method. The result shows that the best algorithm in WEKA is MLP classifier with an accuracy of 83.333% and kappa statistics is 0.625. The MLP classifier which has a lower error rate, will be preferred as more powerful classification capability.

Keywords: colon cancer, histopathological image, structural and statistical pattern recognition, multilayer perception

Procedia PDF Downloads 556
2349 Tomato-Weed Classification by RetinaNet One-Step Neural Network

Authors: Dionisio Andujar, Juan lópez-Correa, Hugo Moreno, Angela Ri

Abstract:

The increased number of weeds in tomato crops highly lower yields. Weed identification with the aim of machine learning is important to carry out site-specific control. The last advances in computer vision are a powerful tool to face the problem. The analysis of RGB (Red, Green, Blue) images through Artificial Neural Networks had been rapidly developed in the past few years, providing new methods for weed classification. The development of the algorithms for crop and weed species classification looks for a real-time classification system using Object Detection algorithms based on Convolutional Neural Networks. The site study was located in commercial corn fields. The classification system has been tested. The procedure can detect and classify weed seedlings in tomato fields. The input to the Neural Network was a set of 10,000 RGB images with a natural infestation of Cyperus rotundus l., Echinochloa crus galli L., Setaria italica L., Portulaca oeracea L., and Solanum nigrum L. The validation process was done with a random selection of RGB images containing the aforementioned species. The mean average precision (mAP) was established as the metric for object detection. The results showed agreements higher than 95 %. The system will provide the input for an online spraying system. Thus, this work plays an important role in Site Specific Weed Management by reducing herbicide use in a single step.

Keywords: deep learning, object detection, cnn, tomato, weeds

Procedia PDF Downloads 85
2348 Model of Optimal Centroids Approach for Multivariate Data Classification

Authors: Pham Van Nha, Le Cam Binh

Abstract:

Particle swarm optimization (PSO) is a population-based stochastic optimization algorithm. PSO was inspired by the natural behavior of birds and fish in migration and foraging for food. PSO is considered as a multidisciplinary optimization model that can be applied in various optimization problems. PSO’s ideas are simple and easy to understand but PSO is only applied in simple model problems. We think that in order to expand the applicability of PSO in complex problems, PSO should be described more explicitly in the form of a mathematical model. In this paper, we represent PSO in a mathematical model and apply in the multivariate data classification. First, PSOs general mathematical model (MPSO) is analyzed as a universal optimization model. Then, Model of Optimal Centroids (MOC) is proposed for the multivariate data classification. Experiments were conducted on some benchmark data sets to prove the effectiveness of MOC compared with several proposed schemes.

Keywords: analysis of optimization, artificial intelligence based optimization, optimization for learning and data analysis, global optimization

Procedia PDF Downloads 188
2347 Applying Semi-Automatic Digital Aerial Survey Technology and Canopy Characters Classification for Surface Vegetation Interpretation of Archaeological Sites

Authors: Yung-Chung Chuang

Abstract:

The cultural layers of archaeological sites are mainly affected by surface land use, land cover, and root system of surface vegetation. For this reason, continuous monitoring of land use and land cover change is important for archaeological sites protection and management. However, in actual operation, on-site investigation and orthogonal photograph interpretation require a lot of time and manpower. For this reason, it is necessary to perform a good alternative for surface vegetation survey in an automated or semi-automated manner. In this study, we applied semi-automatic digital aerial survey technology and canopy characters classification with very high-resolution aerial photographs for surface vegetation interpretation of archaeological sites. The main idea is based on different landscape or forest type can easily be distinguished with canopy characters (e.g., specific texture distribution, shadow effects and gap characters) extracted by semi-automatic image classification. A novel methodology to classify the shape of canopy characters using landscape indices and multivariate statistics was also proposed. Non-hierarchical cluster analysis was used to assess the optimal number of canopy character clusters and canonical discriminant analysis was used to generate the discriminant functions for canopy character classification (seven categories). Therefore, people could easily predict the forest type and vegetation land cover by corresponding to the specific canopy character category. The results showed that the semi-automatic classification could effectively extract the canopy characters of forest and vegetation land cover. As for forest type and vegetation type prediction, the average prediction accuracy reached 80.3%~91.7% with different sizes of test frame. It represented this technology is useful for archaeological site survey, and can improve the classification efficiency and data update rate.

Keywords: digital aerial survey, canopy characters classification, archaeological sites, multivariate statistics

Procedia PDF Downloads 120
2346 Spontaneous and Posed Smile Detection: Deep Learning, Traditional Machine Learning, and Human Performance

Authors: Liang Wang, Beste F. Yuksel, David Guy Brizan

Abstract:

A computational model of affect that can distinguish between spontaneous and posed smiles with no errors on a large, popular data set using deep learning techniques is presented in this paper. A Long Short-Term Memory (LSTM) classifier, a type of Recurrent Neural Network, is utilized and compared to human classification. Results showed that while human classification (mean of 0.7133) was above chance, the LSTM model was more accurate than human classification and other comparable state-of-the-art systems. Additionally, a high accuracy rate was maintained with small amounts of training videos (70 instances). The derivation of important features to further understand the success of our computational model were analyzed, and it was inferred that thousands of pairs of points within the eyes and mouth are important throughout all time segments in a smile. This suggests that distinguishing between a posed and spontaneous smile is a complex task, one which may account for the difficulty and lower accuracy of human classification compared to machine learning models.

Keywords: affective computing, affect detection, computer vision, deep learning, human-computer interaction, machine learning, posed smile detection, spontaneous smile detection

Procedia PDF Downloads 108
2345 Diversity and Phylogenetic Placement of Seven Inocybe (Inocybaceae, Fungi) from Benin

Authors: Hyppolite Aignon, Souleymane Yorou, Martin Ryberg, Anneli Svanholm

Abstract:

Climate change and human actions cause the extinction of wild mushrooms. In Benin, the diversity of fungi is large and may still contain species new to science but the inventory effort remains low and focuses on particularly edible species (Russula, Lactarius, Lactifluus, and also Amanita). In addition, inventories have started recently and some groups of fungi are not sufficiently sampled, however, the degradation of fungal habitat continues to increase and some species are already disappearing. (Yorou and De Kesel, 2011), however, the degradation of fungi habitat continues to increase and some species may disappear without being known. This genus (Inocybe) overlooked has a worldwide distribution and includes more than 700 species with many undiscovered or poorly known species worldwide and particularly in tropical Africa. It is therefore important to orient the inventory to other genera or important families such as Inocybe (Fungi, Agaricales) in order to highlight their diversity and also to know their phylogenetic positions with a combined approach of gene regions. This study aims to evaluate the species richness and phylogenetic position of Inocybe species and affiliated taxa in West Africa. Thus, in North Benin, we visited the Forest Reserve of Ouémé Supérieur, the Okpara forest and the Alibori Supérieur Forest Reserve. In the center, we targeted the Forest Reserve of Toui-Kilibo. The surveys have been carried during the raining season in the study area meaning from June to October. A total of 24 taxa were collected, photographed and described. The DNA was extracted, the Polymerase Chain Reaction was carried out using primers (ITS1-F, ITS4-B) for Internal transcribed spacer (ITS), (LROR, LWRB, LR7, LR5) for nuclear ribosomal (LSU), (RPB2-f5F, RPB2-b6F, RPB2- b6R2, RPB2-b7R) for RNA polymerase II gene (RPB2) and sequenced. The ITS sequences of the 24 collections of Inocybaceae were edited in Staden and all the sequences were aligned and edited with Aliview v1.17. The sequences were examined by eye for sufficient similarity to be considered the same species. 13 different species were present in the collections. In addition, sequences similar to the ITS sequences of the thirteen final species were searched using BLAST. The nLSU and RPB2 markers for these species have been inserted in a complete alignment, where species from all major Inocybaceae clades as well as from all continents except Antarctica are present. Our new sequences for nLSU and RPB2 have been manually aligned in this dataset. Phylogenetic analysis was performed using the RAxML v7.2.6 maximum likelihood software. Bootstrap replications have been set to 100 and no partitioning of the dataset has been performed. The resulting tree was viewed and edited with FigTree v1.4.3. The preliminary tree resulting from the analysis of maximum likelihood shows us that these species coming from Benin are much diversified and are distributed in four different clades (Inosperma, Inocybe, Mallocybe and Pseudosperma) on the seven clades of Inocybaceae but the phylogeny position of 7 is currently known. This study marks the diversity of Inocybe in Benin and the investigations will continue and a protection plan will be developed in the coming years.

Keywords: Benin, diversity, Inocybe, phylogeny placement

Procedia PDF Downloads 121
2344 An AK-Chart for the Non-Normal Data

Authors: Chia-Hau Liu, Tai-Yue Wang

Abstract:

Traditional multivariate control charts assume that measurement from manufacturing processes follows a multivariate normal distribution. However, this assumption may not hold or may be difficult to verify because not all the measurement from manufacturing processes are normal distributed in practice. This study develops a new multivariate control chart for monitoring the processes with non-normal data. We propose a mechanism based on integrating the one-class classification method and the adaptive technique. The adaptive technique is used to improve the sensitivity to small shift on one-class classification in statistical process control. In addition, this design provides an easy way to allocate the value of type I error so it is easier to be implemented. Finally, the simulation study and the real data from industry are used to demonstrate the effectiveness of the propose control charts.

Keywords: multivariate control chart, statistical process control, one-class classification method, non-normal data

Procedia PDF Downloads 407
2343 Electroencephalogram Based Alzheimer Disease Classification using Machine and Deep Learning Methods

Authors: Carlos Roncero-Parra, Alfonso Parreño-Torres, Jorge Mateo Sotos, Alejandro L. Borja

Abstract:

In this research, different methods based on machine/deep learning algorithms are presented for the classification and diagnosis of patients with mental disorders such as alzheimer. For this purpose, the signals obtained from 32 unipolar electrodes identified by non-invasive EEG were examined, and their basic properties were obtained. More specifically, different well-known machine learning based classifiers have been used, i.e., support vector machine (SVM), Bayesian linear discriminant analysis (BLDA), decision tree (DT), Gaussian Naïve Bayes (GNB), K-nearest neighbor (KNN) and Convolutional Neural Network (CNN). A total of 668 patients from five different hospitals have been studied in the period from 2011 to 2021. The best accuracy is obtained was around 93 % in both ADM and ADA classifications. It can be concluded that such a classification will enable the training of algorithms that can be used to identify and classify different mental disorders with high accuracy.

Keywords: alzheimer, machine learning, deep learning, EEG

Procedia PDF Downloads 99
2342 Constraining the Potential Nickel Laterite Area Using Geographic Information System-Based Multi-Criteria Rating in Surigao Del Sur

Authors: Reiner-Ace P. Mateo, Vince Paolo F. Obille

Abstract:

The traditional method of classifying the potential mineral resources requires a significant amount of time and money. In this paper, an alternative way to classify potential mineral resources with GIS application in Surigao del Sur. The three (3) analog map data inputs integrated to GIS are geologic map, topographic map, and land cover/vegetation map. The indicators used in the classification of potential nickel laterite integrated from the analog map data inputs are a geologic indicator, which is the presence of ultramafic rock from the geologic map; slope indicator and the presence of plateau edges from the topographic map; areas of forest land, grassland, and shrublands from the land cover/vegetation map. The potential mineral of the area was classified from low up to very high potential. The produced mineral potential classification map of Surigao del Sur has an estimated 4.63% low nickel laterite potential, 42.15% medium nickel laterite potential, 43.34% high nickel laterite potential, and 9.88% very high nickel laterite from its ultramafic terrains. For the validation of the produced map, it was compared with known occurrences of nickel laterite in the area using a nickel mining tenement map from the area with the application of remote sensing. Three (3) prominent nickel mining companies were delineated in the study area. The generated potential classification map of nickel-laterite in Surigao Del Sur may be of aid to the mining companies which are currently in the exploration phase in the study area. Also, the currently operating nickel mines in the study area can help to validate the reliability of the mineral classification map produced.

Keywords: mineral potential classification, nickel laterites, GIS, remote sensing, Surigao del Sur

Procedia PDF Downloads 102
2341 Performance Enrichment of Deep Feed Forward Neural Network and Deep Belief Neural Networks for Fault Detection of Automobile Gearbox Using Vibration Signal

Authors: T. Praveenkumar, Kulpreet Singh, Divy Bhanpuriya, M. Saimurugan

Abstract:

This study analysed the classification accuracy for gearbox faults using Machine Learning Techniques. Gearboxes are widely used for mechanical power transmission in rotating machines. Its rotating components such as bearings, gears, and shafts tend to wear due to prolonged usage, causing fluctuating vibrations. Increasing the dependability of mechanical components like a gearbox is hampered by their sealed design, which makes visual inspection difficult. One way of detecting impending failure is to detect a change in the vibration signature. The current study proposes various machine learning algorithms, with aid of these vibration signals for obtaining the fault classification accuracy of an automotive 4-Speed synchromesh gearbox. Experimental data in the form of vibration signals were acquired from a 4-Speed synchromesh gearbox using Data Acquisition System (DAQs). Statistical features were extracted from the acquired vibration signal under various operating conditions. Then the extracted features were given as input to the algorithms for fault classification. Supervised Machine Learning algorithms such as Support Vector Machines (SVM) and unsupervised algorithms such as Deep Feed Forward Neural Network (DFFNN), Deep Belief Networks (DBN) algorithms are used for fault classification. The fusion of DBN & DFFNN classifiers were architected to further enhance the classification accuracy and to reduce the computational complexity. The fault classification accuracy for each algorithm was thoroughly studied, tabulated, and graphically analysed for fused and individual algorithms. In conclusion, the fusion of DBN and DFFNN algorithm yielded the better classification accuracy and was selected for fault detection due to its faster computational processing and greater efficiency.

Keywords: deep belief networks, DBN, deep feed forward neural network, DFFNN, fault diagnosis, fusion of algorithm, vibration signal

Procedia PDF Downloads 91