Search results for: classification and regression tree
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5862

Search results for: classification and regression tree

5322 The Profit Trend of Cosmetics Products Using Bootstrap Edgeworth Approximation

Authors: Edlira Donefski, Lorenc Ekonomi, Tina Donefski

Abstract:

Edgeworth approximation is one of the most important statistical methods that has a considered contribution in the reduction of the sum of standard deviation of the independent variables’ coefficients in a Quantile Regression Model. This model estimates the conditional median or other quantiles. In this paper, we have applied approximating statistical methods in an economical problem. We have created and generated a quantile regression model to see how the profit gained is connected with the realized sales of the cosmetic products in a real data, taken from a local business. The Linear Regression of the generated profit and the realized sales was not free of autocorrelation and heteroscedasticity, so this is the reason that we have used this model instead of Linear Regression. Our aim is to analyze in more details the relation between the variables taken into study: the profit and the finalized sales and how to minimize the standard errors of the independent variable involved in this study, the level of realized sales. The statistical methods that we have applied in our work are Edgeworth Approximation for Independent and Identical distributed (IID) cases, Bootstrap version of the Model and the Edgeworth approximation for Bootstrap Quantile Regression Model. The graphics and the results that we have presented here identify the best approximating model of our study.

Keywords: bootstrap, edgeworth approximation, IID, quantile

Procedia PDF Downloads 155
5321 Random Subspace Ensemble of CMAC Classifiers

Authors: Somaiyeh Dehghan, Mohammad Reza Kheirkhahan Haghighi

Abstract:

The rapid growth of domains that have data with a large number of features, while the number of samples is limited has caused difficulty in constructing strong classifiers. To reduce the dimensionality of the feature space becomes an essential step in classification task. Random subspace method (or attribute bagging) is an ensemble classifier that consists of several classifiers that each base learner in ensemble has subset of features. In the present paper, we introduce Random Subspace Ensemble of CMAC neural network (RSE-CMAC), each of which has training with subset of features. Then we use this model for classification task. For evaluation performance of our model, we compare it with bagging algorithm on 36 UCI datasets. The results reveal that the new model has better performance.

Keywords: classification, random subspace, ensemble, CMAC neural network

Procedia PDF Downloads 327
5320 Crop Classification using Unmanned Aerial Vehicle Images

Authors: Iqra Yaseen

Abstract:

One of the well-known areas of computer science and engineering, image processing in the context of computer vision has been essential to automation. In remote sensing, medical science, and many other fields, it has made it easier to uncover previously undiscovered facts. Grading of diverse items is now possible because of neural network algorithms, categorization, and digital image processing. Its use in the classification of agricultural products, particularly in the grading of seeds or grains and their cultivars, is widely recognized. A grading and sorting system enables the preservation of time, consistency, and uniformity. Global population growth has led to an increase in demand for food staples, biofuel, and other agricultural products. To meet this demand, available resources must be used and managed more effectively. Image processing is rapidly growing in the field of agriculture. Many applications have been developed using this approach for crop identification and classification, land and disease detection and for measuring other parameters of crop. Vegetation localization is the base of performing these task. Vegetation helps to identify the area where the crop is present. The productivity of the agriculture industry can be increased via image processing that is based upon Unmanned Aerial Vehicle photography and satellite. In this paper we use the machine learning techniques like Convolutional Neural Network, deep learning, image processing, classification, You Only Live Once to UAV imaging dataset to divide the crop into distinct groups and choose the best way to use it.

Keywords: image processing, UAV, YOLO, CNN, deep learning, classification

Procedia PDF Downloads 100
5319 Application of Remote Sensing and GIS in Assessing Land Cover Changes within Granite Quarries around Brits Area, South Africa

Authors: Refilwe Moeletsi

Abstract:

Dimension stone quarrying around Brits and Belfast areas started in the early 1930s and has been growing rapidly since then. Environmental impacts associated with these quarries have not been documented, and hence this study aims at detecting any change in the environment that might have been caused by these activities. Landsat images that were used to assess land use/land cover changes in Brits quarries from 1998 - 2015. A supervised classification using maximum likelihood classifier was applied to classify each image into different land use/land cover types. Classification accuracy was assessed using Google Earth™ as a source of reference data. Post-classification change detection method was used to determine changes. The results revealed significant increase in granite quarries and corresponding decrease in vegetation cover within the study region.

Keywords: remote sensing, GIS, change detection, granite quarries

Procedia PDF Downloads 308
5318 Effect of Ramp Rate on the Preparation of Activated Carbon from Saudi Date Tree Fronds (Agro Waste) by Physical Activation Method

Authors: Muhammad Shoaib, Hassan M Al-Swaidan

Abstract:

Saudi Arabia is the major date producer in the world. In order to maximize the production from date tree, pruning of the date trees is required annually. Large amount of this agriculture waste material (palm tree fronds) is available in Saudi Arabia and considered as an ideal source as a precursor for production of activated carbon (AC). The single step procedure for the preparation of micro porous activated carbon (AC) from Saudi date tree fronds using mixture of gases (N2 and CO2) is carried out at carbonization/activation temperature at 850°C and at different ramp rates of 10, 20 and 30 degree per minute. Alloy 330 horizontal reactor is used for tube furnace. Flow rate of nitrogen and carbon dioxide gases are kept at 150 ml/min and 50 ml/min respectively during the preparation. Characterization results reveal that the BET surface area, pore volume, and average pore diameter of the resulting activated carbon generally decreases with the increase in ramp rate. The activated carbon prepared at a ramp rate of 10 degrees/minute attains larger surface area and can offer higher potential to produce activated carbon of greater adsorption capacity from agriculture wastes such as date fronds. The BET surface areas of the activated carbons prepared at a ramp rate of 10, 20 and 30 degree/minute after 30 minutes activation time are 1094, 1020 and 515 m2/g, respectively. Scanning electron microscopy (SEM) for surface morphology, and FTIR for functional groups was carried out that also verified the same trend. Moreover, by increasing the ramp rate from 10 and 20 degrees/min the yield remains same, i.e. 18%, whereas at a ramp rate of 30 degrees/min the yield increases from 18 to 20%. Thus, it is feasible to produce high-quality micro porous activated carbon from date frond agro waste using N2 carbonization followed by physical activation with CO2 and N2 mixture. This micro porous activated carbon can be used as adsorbent of heavy metals from wastewater, NOx SOx emission adsorption from ambient air and electricity generation plants, purification of gases, sewage treatment and many other applications.

Keywords: activated carbon, date tree fronds, agricultural waste, applied chemistry

Procedia PDF Downloads 275
5317 Hyperspectral Data Classification Algorithm Based on the Deep Belief and Self-Organizing Neural Network

Authors: Li Qingjian, Li Ke, He Chun, Huang Yong

Abstract:

In this paper, the method of combining the Pohl Seidman's deep belief network with the self-organizing neural network is proposed to classify the target. This method is mainly aimed at the high nonlinearity of the hyperspectral image, the high sample dimension and the difficulty in designing the classifier. The main feature of original data is extracted by deep belief network. In the process of extracting features, adding known labels samples to fine tune the network, enriching the main characteristics. Then, the extracted feature vectors are classified into the self-organizing neural network. This method can effectively reduce the dimensions of data in the spectrum dimension in the preservation of large amounts of raw data information, to solve the traditional clustering and the long training time when labeled samples less deep learning algorithm for training problems, improve the classification accuracy and robustness. Through the data simulation, the results show that the proposed network structure can get a higher classification precision in the case of a small number of known label samples.

Keywords: DBN, SOM, pattern classification, hyperspectral, data compression

Procedia PDF Downloads 338
5316 Automatic Method for Classification of Informative and Noninformative Images in Colonoscopy Video

Authors: Nidhal K. Azawi, John M. Gauch

Abstract:

Colorectal cancer is one of the leading causes of cancer death in the US and the world, which is why millions of colonoscopy examinations are performed annually. Unfortunately, noise, specular highlights, and motion artifacts corrupt many images in a typical colonoscopy exam. The goal of our research is to produce automated techniques to detect and correct or remove these noninformative images from colonoscopy videos, so physicians can focus their attention on informative images. In this research, we first automatically extract features from images. Then we use machine learning and deep neural network to classify colonoscopy images as either informative or noninformative. Our results show that we achieve image classification accuracy between 92-98%. We also show how the removal of noninformative images together with image alignment can aid in the creation of image panoramas and other visualizations of colonoscopy images.

Keywords: colonoscopy classification, feature extraction, image alignment, machine learning

Procedia PDF Downloads 249
5315 Predicting Groundwater Areas Using Data Mining Techniques: Groundwater in Jordan as Case Study

Authors: Faisal Aburub, Wael Hadi

Abstract:

Data mining is the process of extracting useful or hidden information from a large database. Extracted information can be used to discover relationships among features, where data objects are grouped according to logical relationships; or to predict unseen objects to one of the predefined groups. In this paper, we aim to investigate four well-known data mining algorithms in order to predict groundwater areas in Jordan. These algorithms are Support Vector Machines (SVMs), Naïve Bayes (NB), K-Nearest Neighbor (kNN) and Classification Based on Association Rule (CBA). The experimental results indicate that the SVMs algorithm outperformed other algorithms in terms of classification accuracy, precision and F1 evaluation measures using the datasets of groundwater areas that were collected from Jordanian Ministry of Water and Irrigation.

Keywords: classification, data mining, evaluation measures, groundwater

Procedia PDF Downloads 275
5314 Spatio-Temporal Assessment of Urban Growth and Land Use Change in Islamabad Using Object-Based Classification Method

Authors: Rabia Shabbir, Sheikh Saeed Ahmad, Amna Butt

Abstract:

Rapid land use changes have taken place in Islamabad, the capital city of Pakistan, over the past decades due to accelerated urbanization and industrialization. In this study, land use changes in the metropolitan area of Islamabad was observed by the combined use of GIS and satellite remote sensing for a time period of 15 years. High-resolution Google Earth images were downloaded from 2000-2015, and object-based classification method was used for accurate classification using eCognition software. The information regarding urban settlements, industrial area, barren land, agricultural area, vegetation, water, and transportation infrastructure was extracted. The results showed that the city experienced a spatial expansion, rapid urban growth, land use change and expanding transportation infrastructure. The study concluded the integration of GIS and remote sensing as an effective approach for analyzing the spatial pattern of urban growth and land use change.

Keywords: land use change, urban growth, Islamabad, object-based classification, Google Earth, remote sensing, GIS

Procedia PDF Downloads 147
5313 Effectiveness of ISSR Technique in Revealing Genetic Diversity of Phaseolus vulgaris L. Representing Various Parts of the World

Authors: Mohamed El-Shikh

Abstract:

Phaseolus vulgaris L. is the world’s second most important bean after soybeans; used for human food and animal feed. It has generally been linked to reduced risk of cardiovascular disease, diabetes mellitus, obesity, cancer and diseases of digestive tract. The effectiveness of ISSR in achievement of the genetic diversity among 60 common bean accessions; represent various germplasms around the world was investigated. In general, the studied Phaseolus vulgaris accessions were divided into 2 major groups. All of the South-American accessions were separated into the second major group. These accessions may have different genetic features that are distinct from the rest of the accessions clustered in the major group. Asia and Europe accessions (1-20) seem to be more genetically similar (99%) to each other as they clustered in the same sub-group. The American and African varieties showed similarities as well and clustered in the same sub-tree group. In contrast, Asian and American accessions No. 22 and 23 showed a high level of genetic similarities, although these were isolated from different regions. The phylogenetic tree showed that all the Asian accessions (along with Australian No. 59 and 60) were similar except Indian and Yemen accessions No. 9 and 20. Only Netherlands accession No. 3 was different from the rest of European accessions. Morocco accession No. 52 was genetically different from the rest of the African accessions. Canadian accession No. 44 seems to be different from the other North American accessions including Guatemala, Mexico and USA.

Keywords: phylogenetic tree, Phaseolus vulgaris, ISSR technique, genetics

Procedia PDF Downloads 405
5312 Analyzing Tools and Techniques for Classification In Educational Data Mining: A Survey

Authors: D. I. George Amalarethinam, A. Emima

Abstract:

Educational Data Mining (EDM) is one of the newest topics to emerge in recent years, and it is concerned with developing methods for analyzing various types of data gathered from the educational circle. EDM methods and techniques with machine learning algorithms are used to extract meaningful and usable information from huge databases. For scientists and researchers, realistic applications of Machine Learning in the EDM sectors offer new frontiers and present new problems. One of the most important research areas in EDM is predicting student success. The prediction algorithms and techniques must be developed to forecast students' performance, which aids the tutor, institution to boost the level of student’s performance. This paper examines various classification techniques in prediction methods and data mining tools used in EDM.

Keywords: classification technique, data mining, EDM methods, prediction methods

Procedia PDF Downloads 114
5311 Normalized Laplacian Eigenvalues of Graphs

Authors: Shaowei Sun

Abstract:

Let G be a graph with vertex set V(G)={v_1,v_2,...,v_n} and edge set E(G). For any vertex v belong to V(G), let d_v denote the degree of v. The normalized Laplacian matrix of the graph G is the matrix where the non-diagonal (i,j)-th entry is -1/(d_id_j) when vertex i is adjacent to vertex j and 0 when they are not adjacent, and the diagonal (i,i)-th entry is the di. In this paper, we discuss some bounds on the largest and the second smallest normalized Laplacian eigenvalue of trees and graphs. As following, we found some new bounds on the second smallest normalized Laplacian eigenvalue of tree T in terms of graph parameters. Moreover, we use Sage to give some conjectures on the second largest and the third smallest normalized eigenvalues of graph.

Keywords: graph, normalized Laplacian eigenvalues, normalized Laplacian matrix, tree

Procedia PDF Downloads 326
5310 Morphological Processing of Punjabi Text for Sentiment Analysis of Farmer Suicides

Authors: Jaspreet Singh, Gurvinder Singh, Prabhsimran Singh, Rajinder Singh, Prithvipal Singh, Karanjeet Singh Kahlon, Ravinder Singh Sawhney

Abstract:

Morphological evaluation of Indian languages is one of the burgeoning fields in the area of Natural Language Processing (NLP). The evaluation of a language is an eminent task in the era of information retrieval and text mining. The extraction and classification of knowledge from text can be exploited for sentiment analysis and morphological evaluation. This study coalesce morphological evaluation and sentiment analysis for the task of classification of farmer suicide cases reported in Punjab state of India. The pre-processing of Punjabi text involves morphological evaluation and normalization of Punjabi word tokens followed by the training of proposed model using deep learning classification on Punjabi language text extracted from online Punjabi news reports. The class-wise accuracies of sentiment prediction for four negatively oriented classes of farmer suicide cases are 93.85%, 88.53%, 83.3%, and 95.45% respectively. The overall accuracy of sentiment classification obtained using proposed framework on 275 Punjabi text documents is found to be 90.29%.

Keywords: deep neural network, farmer suicides, morphological processing, punjabi text, sentiment analysis

Procedia PDF Downloads 317
5309 A Nonlinear Feature Selection Method for Hyperspectral Image Classification

Authors: Pei-Jyun Hsieh, Cheng-Hsuan Li, Bor-Chen Kuo

Abstract:

For hyperspectral image classification, feature reduction is an important pre-processing for avoiding the Hughes phenomena due to the difficulty for collecting training samples. Hence, lots of researches developed feature selection methods such as F-score, HSIC (Hilbert-Schmidt Independence Criterion), and etc., to improve hyperspectral image classification. However, most of them only consider the class separability in the original space, i.e., a linear class separability. In this study, we proposed a nonlinear class separability measure based on kernel trick for selecting an appropriate feature subset. The proposed nonlinear class separability was formed by a generalized RBF kernel with different bandwidths with respect to different features. Moreover, it considered the within-class separability and the between-class separability. A genetic algorithm was applied to tune these bandwidths such that the smallest with-class separability and the largest between-class separability simultaneously. This indicates the corresponding feature space is more suitable for classification. In addition, the corresponding nonlinear classification boundary can separate classes very well. These optimal bandwidths also show the importance of bands for hyperspectral image classification. The reciprocals of these bandwidths can be viewed as weights of bands. The smaller bandwidth, the larger weight of the band, and the more importance for classification. Hence, the descending order of the reciprocals of the bands gives an order for selecting the appropriate feature subsets. In the experiments, three hyperspectral image data sets, the Indian Pine Site data set, the PAVIA data set, and the Salinas A data set, were used to demonstrate the selected feature subsets by the proposed nonlinear feature selection method are more appropriate for hyperspectral image classification. Only ten percent of samples were randomly selected to form the training dataset. All non-background samples were used to form the testing dataset. The support vector machine was applied to classify these testing samples based on selected feature subsets. According to the experiments on the Indian Pine Site data set with 220 bands, the highest accuracies by applying the proposed method, F-score, and HSIC are 0.8795, 0.8795, and 0.87404, respectively. However, the proposed method selects 158 features. F-score and HSIC select 168 features and 217 features, respectively. Moreover, the classification accuracies increase dramatically only using first few features. The classification accuracies with respect to feature subsets of 10 features, 20 features, 50 features, and 110 features are 0.69587, 0.7348, 0.79217, and 0.84164, respectively. Furthermore, only using half selected features (110 features) of the proposed method, the corresponding classification accuracy (0.84168) is approximate to the highest classification accuracy, 0.8795. For other two hyperspectral image data sets, the PAVIA data set and Salinas A data set, we can obtain the similar results. These results illustrate our proposed method can efficiently find feature subsets to improve hyperspectral image classification. One can apply the proposed method to determine the suitable feature subset first according to specific purposes. Then researchers can only use the corresponding sensors to obtain the hyperspectral image and classify the samples. This can not only improve the classification performance but also reduce the cost for obtaining hyperspectral images.

Keywords: hyperspectral image classification, nonlinear feature selection, kernel trick, support vector machine

Procedia PDF Downloads 260
5308 Personal Information Classification Based on Deep Learning in Automatic Form Filling System

Authors: Shunzuo Wu, Xudong Luo, Yuanxiu Liao

Abstract:

Recently, the rapid development of deep learning makes artificial intelligence (AI) penetrate into many fields, replacing manual work there. In particular, AI systems also become a research focus in the field of automatic office. To meet real needs in automatic officiating, in this paper we develop an automatic form filling system. Specifically, it uses two classical neural network models and several word embedding models to classify various relevant information elicited from the Internet. When training the neural network models, we use less noisy and balanced data for training. We conduct a series of experiments to test my systems and the results show that our system can achieve better classification results.

Keywords: artificial intelligence and office, NLP, deep learning, text classification

Procedia PDF Downloads 196
5307 Efficacy of Deep Learning for Below-Canopy Reconstruction of Satellite and Aerial Sensing Point Clouds through Fractal Tree Symmetry

Authors: Dhanuj M. Gandikota

Abstract:

Sensor-derived three-dimensional (3D) point clouds of trees are invaluable in remote sensing analysis for the accurate measurement of key structural metrics, bio-inventory values, spatial planning/visualization, and ecological modeling. Machine learning (ML) holds the potential in addressing the restrictive tradeoffs in cost, spatial coverage, resolution, and information gain that exist in current point cloud sensing methods. Terrestrial laser scanning (TLS) remains the highest fidelity source of both canopy and below-canopy structural features, but usage is limited in both coverage and cost, requiring manual deployment to map out large, forested areas. While aerial laser scanning (ALS) remains a reliable avenue of LIDAR active remote sensing, ALS is also cost-restrictive in deployment methods. Space-borne photogrammetry from high-resolution satellite constellations is an avenue of passive remote sensing with promising viability in research for the accurate construction of vegetation 3-D point clouds. It provides both the lowest comparative cost and the largest spatial coverage across remote sensing methods. However, both space-borne photogrammetry and ALS demonstrate technical limitations in the capture of valuable below-canopy point cloud data. Looking to minimize these tradeoffs, we explored a class of powerful ML algorithms called Deep Learning (DL) that show promise in recent research on 3-D point cloud reconstruction and interpolation. Our research details the efficacy of applying these DL techniques to reconstruct accurate below-canopy point clouds from space-borne and aerial remote sensing through learned patterns of tree species fractal symmetry properties and the supplementation of locally sourced bio-inventory metrics. From our dataset, consisting of tree point clouds obtained from TLS, we deconstructed the point clouds of each tree into those that would be obtained through ALS and satellite photogrammetry of varying resolutions. We fed this ALS/satellite point cloud dataset, along with the simulated local bio-inventory metrics, into the DL point cloud reconstruction architectures to generate the full 3-D tree point clouds (the truth values are denoted by the full TLS tree point clouds containing the below-canopy information). Point cloud reconstruction accuracy was validated both through the measurement of error from the original TLS point clouds as well as the error of extraction of key structural metrics, such as crown base height, diameter above root crown, and leaf/wood volume. The results of this research additionally demonstrate the supplemental performance gain of using minimum locally sourced bio-inventory metric information as an input in ML systems to reach specified accuracy thresholds of tree point cloud reconstruction. This research provides insight into methods for the rapid, cost-effective, and accurate construction of below-canopy tree 3-D point clouds, as well as the supported potential of ML and DL to learn complex, unmodeled patterns of fractal tree growth symmetry.

Keywords: deep learning, machine learning, satellite, photogrammetry, aerial laser scanning, terrestrial laser scanning, point cloud, fractal symmetry

Procedia PDF Downloads 98
5306 Auto Classification of Multiple ECG Arrhythmic Detection via Machine Learning Techniques: A Review

Authors: Ng Liang Shen, Hau Yuan Wen

Abstract:

Arrhythmia analysis of ECG signal plays a major role in diagnosing most of the cardiac diseases. Therefore, a single arrhythmia detection of an electrocardiographic (ECG) record can determine multiple pattern of various algorithms and match accordingly each ECG beats based on Machine Learning supervised learning. These researchers used different features and classification methods to classify different arrhythmia types. A major problem in these studies is the fact that the symptoms of the disease do not show all the time in the ECG record. Hence, a successful diagnosis might require the manual investigation of several hours of ECG records. The point of this paper presents investigations cardiovascular ailment in Electrocardiogram (ECG) Signals for Cardiac Arrhythmia utilizing examination of ECG irregular wave frames via heart beat as correspond arrhythmia which with Machine Learning Pattern Recognition.

Keywords: electrocardiogram, ECG, classification, machine learning, pattern recognition, detection, QRS

Procedia PDF Downloads 370
5305 Cytotoxic Effect of Neem Seed Extract (Azadirachta indica) in Comparison with Artificial Insecticide Novastar on Haemocytes (THC and DHC) of Musca domestica

Authors: Muhammad Zaheer Awan, Adnan Qadir, Zeeshan Anjum

Abstract:

Housefly, Musca domestica Linnaeus is ubiquitous and hazardous for Homo sapiens and livestock in sundry venerations. Musca domestica cart 100 different pathogens, such as typhoid, salmonella, bacillary dysentery, tuberculosis, anthrax and parasitic worms. The flies in rural areas usually carry more pathogens. Houseflies feed on liquid or semi-liquid substances besides solid materials which are softened by saliva. Neem botanically known as Azadirachta indica belongs to the family Meliaceae and is an indigenous tree to Pakistan. The neem tree is also one such tree which has been revered by the Pakistanis and Kashmiris for its medicinal properties. Present study showed neem seed extract has potentially toxic ability that affect Total Haemocyte Count (THC) and Differential Haemocytes Count (DHC) in insect’s blood cells, of the housefly. A significant variation in haemolymph density was observed just after application, 30 minutes and 60 minutes post treatment in term of THC and DHC in comparison with novastar. The study strappingly acclaim use of neem seed extract as insecticide as compare to artificial insecticides.

Keywords: neem, Azadirachta indica, Musca domestica, differential haemocyte count (DHC), total haemocytes count (DHC), novastar

Procedia PDF Downloads 199
5304 Land Use/Land Cover Mapping Using Landsat 8 and Sentinel-2 in a Mediterranean Landscape

Authors: Moschos Vogiatzis, K. Perakis

Abstract:

Spatial-explicit and up-to-date land use/land cover information is fundamental for spatial planning, land management, sustainable development, and sound decision-making. In the last decade, many satellite-derived land cover products at different spatial, spectral, and temporal resolutions have been developed, such as the European Copernicus Land Cover product. However, more efficient and detailed information for land use/land cover is required at the regional or local scale. A typical Mediterranean basin with a complex landscape comprised of various forest types, crops, artificial surfaces, and wetlands was selected to test and develop our approach. In this study, we investigate the improvement of Copernicus Land Cover product (CLC2018) using Landsat 8 and Sentinel-2 pixel-based classification based on all available existing geospatial data (Forest Maps, LPIS, Natura2000 habitats, cadastral parcels, etc.). We examined and compared the performance of the Random Forest classifier for land use/land cover mapping. In total, 10 land use/land cover categories were recognized in Landsat 8 and 11 in Sentinel-2A. A comparison of the overall classification accuracies for 2018 shows that Landsat 8 classification accuracy was slightly higher than Sentinel-2A (82,99% vs. 80,30%). We concluded that the main land use/land cover types of CLC2018, even within a heterogeneous area, can be successfully mapped and updated according to CLC nomenclature. Future research should be oriented toward integrating spatiotemporal information from seasonal bands and spectral indexes in the classification process.

Keywords: classification, land use/land cover, mapping, random forest

Procedia PDF Downloads 117
5303 Terrain Classification for Ground Robots Based on Acoustic Features

Authors: Bernd Kiefer, Abraham Gebru Tesfay, Dietrich Klakow

Abstract:

The motivation of our work is to detect different terrain types traversed by a robot based on acoustic data from the robot-terrain interaction. Different acoustic features and classifiers were investigated, such as Mel-frequency cepstral coefficient and Gamma-tone frequency cepstral coefficient for the feature extraction, and Gaussian mixture model and Feed forward neural network for the classification. We analyze the system’s performance by comparing our proposed techniques with some other features surveyed from distinct related works. We achieve precision and recall values between 87% and 100% per class, and an average accuracy at 95.2%. We also study the effect of varying audio chunk size in the application phase of the models and find only a mild impact on performance.

Keywords: acoustic features, autonomous robots, feature extraction, terrain classification

Procedia PDF Downloads 359
5302 Short Text Classification Using Part of Speech Feature to Analyze Students' Feedback of Assessment Components

Authors: Zainab Mutlaq Ibrahim, Mohamed Bader-El-Den, Mihaela Cocea

Abstract:

Students' textual feedback can hold unique patterns and useful information about learning process, it can hold information about advantages and disadvantages of teaching methods, assessment components, facilities, and other aspects of teaching. The results of analysing such a feedback can form a key point for institutions’ decision makers to advance and update their systems accordingly. This paper proposes a data mining framework for analysing end of unit general textual feedback using part of speech feature (PoS) with four machine learning algorithms: support vector machines, decision tree, random forest, and naive bays. The proposed framework has two tasks: first, to use the above algorithms to build an optimal model that automatically classifies the whole data set into two subsets, one subset is tailored to assessment practices (assessment related), and the other one is the non-assessment related data. Second task to use the same algorithms to build an optimal model for whole data set, and the new data subsets to automatically detect their sentiment. The significance of this paper is to compare the performance of the above four algorithms using part of speech feature to the performance of the same algorithms using n-grams feature. The paper follows Knowledge Discovery and Data Mining (KDDM) framework to construct the classification and sentiment analysis models, which is understanding the assessment domain, cleaning and pre-processing the data set, selecting and running the data mining algorithm, interpreting mined patterns, and consolidating the discovered knowledge. The results of this paper experiments show that both models which used both features performed very well regarding first task. But regarding the second task, models that used part of speech feature has underperformed in comparison with models that used unigrams and bigrams.

Keywords: assessment, part of speech, sentiment analysis, student feedback

Procedia PDF Downloads 140
5301 The Implementation of the Multi-Agent Classification System (MACS) in Compliance with FIPA Specifications

Authors: Mohamed R. Mhereeg

Abstract:

The paper discusses the implementation of the MultiAgent classification System (MACS) and utilizing it to provide an automated and accurate classification of end users developing applications in the spreadsheet domain. However, different technologies have been brought together to build MACS. The strength of the system is the integration of the agent technology with the FIPA specifications together with other technologies, which are the .NET widows service based agents, the Windows Communication Foundation (WCF) services, the Service Oriented Architecture (SOA), and Oracle Data Mining (ODM). Microsoft's .NET windows service based agents were utilized to develop the monitoring agents of MACS, the .NET WCF services together with SOA approach allowed the distribution and communication between agents over the WWW. The Monitoring Agents (MAs) were configured to execute automatically to monitor excel spreadsheets development activities by content. Data gathered by the Monitoring Agents from various resources over a period of time was collected and filtered by a Database Updater Agent (DUA) residing in the .NET client application of the system. This agent then transfers and stores the data in Oracle server database via Oracle stored procedures for further processing that leads to the classification of the end user developers.

Keywords: MACS, implementation, multi-agent, SOA, autonomous, WCF

Procedia PDF Downloads 269
5300 A Text Classification Approach Based on Natural Language Processing and Machine Learning Techniques

Authors: Rim Messaoudi, Nogaye-Gueye Gning, François Azelart

Abstract:

Automatic text classification applies mostly natural language processing (NLP) and other AI-guided techniques to automatically classify text in a faster and more accurate manner. This paper discusses the subject of using predictive maintenance to manage incident tickets inside the sociality. It focuses on proposing a tool that treats and analyses comments and notes written by administrators after resolving an incident ticket. The goal here is to increase the quality of these comments. Additionally, this tool is based on NLP and machine learning techniques to realize the textual analytics of the extracted data. This approach was tested using real data taken from the French National Railways (SNCF) company and was given a high-quality result.

Keywords: machine learning, text classification, NLP techniques, semantic representation

Procedia PDF Downloads 94
5299 Wolof Voice Response Recognition System: A Deep Learning Model for Wolof Audio Classification

Authors: Krishna Mohan Bathula, Fatou Bintou Loucoubar, FNU Kaleemunnisa, Christelle Scharff, Mark Anthony De Castro

Abstract:

Voice recognition algorithms such as automatic speech recognition and text-to-speech systems with African languages can play an important role in bridging the digital divide of Artificial Intelligence in Africa, contributing to the establishment of a fully inclusive information society. This paper proposes a Deep Learning model that can classify the user responses as inputs for an interactive voice response system. A dataset with Wolof language words ‘yes’ and ‘no’ is collected as audio recordings. A two stage Data Augmentation approach is adopted for enhancing the dataset size required by the deep neural network. Data preprocessing and feature engineering with Mel-Frequency Cepstral Coefficients are implemented. Convolutional Neural Networks (CNNs) have proven to be very powerful in image classification and are promising for audio processing when sounds are transformed into spectra. For performing voice response classification, the recordings are transformed into sound frequency feature spectra and then applied image classification methodology using a deep CNN model. The inference model of this trained and reusable Wolof voice response recognition system can be integrated with many applications associated with both web and mobile platforms.

Keywords: automatic speech recognition, interactive voice response, voice response recognition, wolof word classification

Procedia PDF Downloads 109
5298 A Deep Learning Approach to Subsection Identification in Electronic Health Records

Authors: Nitin Shravan, Sudarsun Santhiappan, B. Sivaselvan

Abstract:

Subsection identification, in the context of Electronic Health Records (EHRs), is identifying the important sections for down-stream tasks like auto-coding. In this work, we classify the text present in EHRs according to their information, using machine learning and deep learning techniques. We initially describe briefly about the problem and formulate it as a text classification problem. Then, we discuss upon the methods from the literature. We try two approaches - traditional feature extraction based machine learning methods and deep learning methods. Through experiments on a private dataset, we establish that the deep learning methods perform better than the feature extraction based Machine Learning Models.

Keywords: deep learning, machine learning, semantic clinical classification, subsection identification, text classification

Procedia PDF Downloads 209
5297 Comparative Analysis of Spectral Estimation Methods for Brain-Computer Interfaces

Authors: Rafik Djemili, Hocine Bourouba, M. C. Amara Korba

Abstract:

In this paper, we present a method in order to classify EEG signals for Brain-Computer Interfaces (BCI). EEG signals are first processed by means of spectral estimation methods to derive reliable features before classification step. Spectral estimation methods used are standard periodogram and the periodogram calculated by the Welch method; both methods are compared with Logarithm of Band Power (logBP) features. In the method proposed, we apply Linear Discriminant Analysis (LDA) followed by Support Vector Machine (SVM). Classification accuracy reached could be as high as 85%, which proves the effectiveness of classification of EEG signals based BCI using spectral methods.

Keywords: brain-computer interface, motor imagery, electroencephalogram, linear discriminant analysis, support vector machine

Procedia PDF Downloads 495
5296 Long-Term Resilience Performance Assessment of Dual and Singular Water Distribution Infrastructures Using a Complex Systems Approach

Authors: Kambiz Rasoulkhani, Jeanne Cole, Sybil Sharvelle, Ali Mostafavi

Abstract:

Dual water distribution systems have been proposed as solutions to enhance the sustainability and resilience of urban water systems by improving performance and decreasing energy consumption. The objective of this study was to evaluate the long-term resilience and robustness of dual water distribution systems versus singular water distribution systems under various stressors such as demand fluctuation, aging infrastructure, and funding constraints. To this end, the long-term dynamics of these infrastructure systems was captured using a simulation model that integrates institutional agency decision-making processes with physical infrastructure degradation to evaluate the long-term transformation of water infrastructure. A set of model parameters that varies for dual and singular distribution infrastructure based on the system attributes, such as pipes length and material, energy intensity, water demand, water price, average pressure and flow rate, as well as operational expenditures, were considered and input in the simulation model. Accordingly, the model was used to simulate various scenarios of demand changes, funding levels, water price growth, and renewal strategies. The long-term resilience and robustness of each distribution infrastructure were evaluated based on various performance measures including network average condition, break frequency, network leakage, and energy use. An ecologically-based resilience approach was used to examine regime shifts and tipping points in the long-term performance of the systems under different stressors. Also, Classification and Regression Tree analysis was adopted to assess the robustness of each system under various scenarios. Using data from the City of Fort Collins, the long-term resilience and robustness of the dual and singular water distribution systems were evaluated over a 100-year analysis horizon for various scenarios. The results of the analysis enabled: (i) comparison between dual and singular water distribution systems in terms of long-term performance, resilience, and robustness; (ii) identification of renewal strategies and decision factors that enhance the long-term resiliency and robustness of dual and singular water distribution systems under different stressors.

Keywords: complex systems, dual water distribution systems, long-term resilience performance, multi-agent modeling, sustainable and resilient water systems

Procedia PDF Downloads 286
5295 Optimizing Perennial Plants Image Classification by Fine-Tuning Deep Neural Networks

Authors: Khairani Binti Supyan, Fatimah Khalid, Mas Rina Mustaffa, Azreen Bin Azman, Amirul Azuani Romle

Abstract:

Perennial plant classification plays a significant role in various agricultural and environmental applications, assisting in plant identification, disease detection, and biodiversity monitoring. Nevertheless, attaining high accuracy in perennial plant image classification remains challenging due to the complex variations in plant appearance, the diverse range of environmental conditions under which images are captured, and the inherent variability in image quality stemming from various factors such as lighting conditions, camera settings, and focus. This paper proposes an adaptation approach to optimize perennial plant image classification by fine-tuning the pre-trained DNNs model. This paper explores the efficacy of fine-tuning prevalent architectures, namely VGG16, ResNet50, and InceptionV3, leveraging transfer learning to tailor the models to the specific characteristics of perennial plant datasets. A subset of the MYLPHerbs dataset consisted of 6 perennial plant species of 13481 images under various environmental conditions that were used in the experiments. Different strategies for fine-tuning, including adjusting learning rates, training set sizes, data augmentation, and architectural modifications, were investigated. The experimental outcomes underscore the effectiveness of fine-tuning deep neural networks for perennial plant image classification, with ResNet50 showcasing the highest accuracy of 99.78%. Despite ResNet50's superior performance, both VGG16 and InceptionV3 achieved commendable accuracy of 99.67% and 99.37%, respectively. The overall outcomes reaffirm the robustness of the fine-tuning approach across different deep neural network architectures, offering insights into strategies for optimizing model performance in the domain of perennial plant image classification.

Keywords: perennial plants, image classification, deep neural networks, fine-tuning, transfer learning, VGG16, ResNet50, InceptionV3

Procedia PDF Downloads 58
5294 Obstacle Classification Method Based on 2D LIDAR Database

Authors: Moohyun Lee, Soojung Hur, Yongwan Park

Abstract:

In this paper is proposed a method uses only LIDAR system to classification an obstacle and determine its type by establishing database for classifying obstacles based on LIDAR. The existing LIDAR system, in determining the recognition of obstruction in an autonomous vehicle, has an advantage in terms of accuracy and shorter recognition time. However, it was difficult to determine the type of obstacle and therefore accurate path planning based on the type of obstacle was not possible. In order to overcome this problem, a method of classifying obstacle type based on existing LIDAR and using the width of obstacle materials was proposed. However, width measurement was not sufficient to improve accuracy. In this research, the width data was used to do the first classification; database for LIDAR intensity data by four major obstacle materials on the road were created; comparison is made to the LIDAR intensity data of actual obstacle materials; and determine the obstacle type by finding the one with highest similarity values. An experiment using an actual autonomous vehicle under real environment shows that data declined in quality in comparison to 3D LIDAR and it was possible to classify obstacle materials using 2D LIDAR.

Keywords: obstacle, classification, database, LIDAR, segmentation, intensity

Procedia PDF Downloads 340
5293 Metamorphic Computer Virus Classification Using Hidden Markov Model

Authors: Babak Bashari Rad

Abstract:

A metamorphic computer virus uses different code transformation techniques to mutate its body in duplicated instances. Characteristics and function of new instances are mostly similar to their parents, but they cannot be easily detected by the majority of antivirus in market, as they depend on string signature-based detection techniques. The purpose of this research is to propose a Hidden Markov Model for classification of metamorphic viruses in executable files. In the proposed solution, portable executable files are inspected to extract the instructions opcodes needed for the examination of code. A Hidden Markov Model trained on portable executable files is employed to classify the metamorphic viruses of the same family. The proposed model is able to generate and recognize common statistical features of mutated code. The model has been evaluated by examining the model on a test data set. The performance of the model has been practically tested and evaluated based on False Positive Rate, Detection Rate and Overall Accuracy. The result showed an acceptable performance with high average of 99.7% Detection Rate.

Keywords: malware classification, computer virus classification, metamorphic virus, metamorphic malware, Hidden Markov Model

Procedia PDF Downloads 312