Search results for: vegetation classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2651

Search results for: vegetation classification

2351 Recurrent Neural Networks with Deep Hierarchical Mixed Structures for Chinese Document Classification

Authors: Zhaoxin Luo, Michael Zhu

Abstract:

In natural languages, there are always complex semantic hierarchies. Obtaining the feature representation based on these complex semantic hierarchies becomes the key to the success of the model. Several RNN models have recently been proposed to use latent indicators to obtain the hierarchical structure of documents. However, the model that only uses a single-layer latent indicator cannot achieve the true hierarchical structure of the language, especially a complex language like Chinese. In this paper, we propose a deep layered model that stacks arbitrarily many RNN layers equipped with latent indicators. After using EM and training it hierarchically, our model solves the computational problem of stacking RNN layers and makes it possible to stack arbitrarily many RNN layers. Our deep hierarchical model not only achieves comparable results to large pre-trained models on the Chinese short text classification problem but also achieves state of art results on the Chinese long text classification problem.

Keywords: nature language processing, recurrent neural network, hierarchical structure, document classification, Chinese

Procedia PDF Downloads 61
2350 Effects of Drought and Anthropism on Vegetation and Soil Elements in the Steppe of Algeria: Case of the Station of Tadmit (Wilaya of Djelfa)

Authors: L. Benseghir, H. Kadi-Hanifi

Abstract:

Vegetation of the high steppic plains of southern Algiers region has ever been used by human occupation. The harsh climatic context characterized by long periods of drought and an ovine livestock in constant growth lead us to devote a particular attention to the biodiversity of those living environment. The diachronic study made in Tadmit (50 km south of the district of Djelfa) about the specific recording led us to notice that: The floristic recording of Tadmit is not reduced in time but fluctuate, depending on the pasture intensity, the annual rainfall and especially by the protection area of the following two years from January 2004. The forming specific recording of the station undergo significant changes from a period to another. Those changes in floristic list concern nearly 50% of the initial flora that could disappear or be replaced by new species. Finally, the alfa steppe is in a marked decline and is substituted by new facies that were privileged by the overgrazing, stranding or clearance.

Keywords: overgrazing, diachronic study, protection area, climate, desertification

Procedia PDF Downloads 264
2349 A Novel PSO Based Decision Tree Classification

Authors: Ali Farzan

Abstract:

Classification of data objects or patterns is a major part in most of Decision making systems. One of the popular and commonly used classification methods is Decision Tree (DT). It is a hierarchical decision making system by which a binary tree is constructed and starting from root, at each node some of the classes is rejected until reaching the leaf nods. Each leaf node is a representative of one specific class. Finding the splitting criteria in each node for constructing or training the tree is a major problem. Particle Swarm Optimization (PSO) has been adopted as a metaheuristic searching method for finding the best splitting criteria. Result of evaluating the proposed method over benchmark datasets indicates the higher accuracy of the new PSO based decision tree.

Keywords: decision tree, particle swarm optimization, splitting criteria, metaheuristic

Procedia PDF Downloads 401
2348 Soil Quality Status under Dryland Vegetation of Yabello District, Southern Ethiopia

Authors: Mohammed Abaoli, Omer Kara

Abstract:

The current research has investigated the soil quality status under dryland vegetation of Yabello district, Southern Ethiopia in which we should identify the nature and extent of salinity problem of the area for further research bases. About 48 soil samples were taken from 0-30, 31-60, 61-90 and 91-120 cm soil depths by opening 12 representative soil profile pits at 1.5 m depth. Soil color, texture, bulk density, Soil Organic Carbon (SOC), Cation Exchange Capacity (CEC), Na, K, Mg, Ca, CaCO3, gypsum (CaSO4), pH, Sodium Adsorption Ratio (SAR), Exchangeable Sodium Percentage (ESP) were analyzed. The dominant soil texture was silty-clay-loam.  Bulk density varied from 1.1 to 1.31 g/cm3. High SOC content was observed in 0-30 cm. The soil pH ranged from 7.1 to 8.6. The electrical conductivity shows indirect relationship with soil depth while CaCO3 and CaSO4 concentrations were observed in a direct relationship with depth. About 41% are non-saline, 38.31% saline, 15.23% saline-sodic and 5.46% sodic soils. Na concentration in saline soils was greater than Ca and Mg in all the soil depths. Ca and Mg contents were higher above 60 cm soil depth in non-saline soils. The concentrations of SO2-4 and HCO-3 were observed to be higher at the most lower depth than upper. SAR value tends to be higher at lower depths in saline and saline-sodic soils, but decreases at lower depth of the non-saline soils. The distribution of ESP above 60 cm depth was in an increasing order in saline and saline-sodic soils. The result of the research has shown the direction to which extent of salinity we should consider for the Commiphora plant species we want to grow on the area. 

Keywords: commiphora species, dryland vegetation, ecological significance, soil quality, salinity problem

Procedia PDF Downloads 193
2347 Impact Assessment of Tropical Cyclone Hudhud on Visakhapatnam, Andhra Pradesh

Authors: Vivek Ganesh

Abstract:

Tropical cyclones are some of the most damaging events. They occur in yearly cycles and affect the coastal population with three dangerous effects: heavy rain, strong wind and storm surge. In order to estimate the area and the population affected by a cyclone, all the three types of physical impacts must be taken into account. Storm surge is an abnormal rise of water above the astronomical tides, generated by strong winds and drop in the atmospheric pressure. The main aim of the study is to identify the impact by comparing three different months data. The technique used here is NDVI classification technique for change detection and other techniques like storm surge modelling for finding the tide height. Current study emphasize on recent very severe cyclonic storm Hud Hud of category 3 hurricane which had developed on 8 October 2014 and hit the coast on 12 October 2014 which caused significant changes on land and coast of Visakhapatnam, Andhra Pradesh. In the present study, we have used Remote Sensing and GIS tools for investigating and quantifying the changes in vegetation and settlement.

Keywords: inundation map, NDVI map, storm tide map, track map

Procedia PDF Downloads 261
2346 Application of Support Vector Machines in Fault Detection and Diagnosis of Power Transmission Lines

Authors: I. A. Farhat, M. Bin Hasan

Abstract:

A developed approach for the protection of power transmission lines using Support Vector Machines (SVM) technique is presented. In this paper, the SVM technique is utilized for the classification and isolation of faults in power transmission lines. Accurate fault classification and location results are obtained for all possible types of short circuit faults. As in distance protection, the approach utilizes the voltage and current post-fault samples as inputs. The main advantage of the method introduced here is that the method could easily be extended to any power transmission line.

Keywords: fault detection, classification, diagnosis, power transmission line protection, support vector machines (SVM)

Procedia PDF Downloads 553
2345 Statistical Classification, Downscaling and Uncertainty Assessment for Global Climate Model Outputs

Authors: Queen Suraajini Rajendran, Sai Hung Cheung

Abstract:

Statistical down scaling models are required to connect the global climate model outputs and the local weather variables for climate change impact prediction. For reliable climate change impact studies, the uncertainty associated with the model including natural variability, uncertainty in the climate model(s), down scaling model, model inadequacy and in the predicted results should be quantified appropriately. In this work, a new approach is developed by the authors for statistical classification, statistical down scaling and uncertainty assessment and is applied to Singapore rainfall. It is a robust Bayesian uncertainty analysis methodology and tools based on coupling dependent modeling error with classification and statistical down scaling models in a way that the dependency among modeling errors will impact the results of both classification and statistical down scaling model calibration and uncertainty analysis for future prediction. Singapore data are considered here and the uncertainty and prediction results are obtained. From the results obtained, directions of research for improvement are briefly presented.

Keywords: statistical downscaling, global climate model, climate change, uncertainty

Procedia PDF Downloads 365
2344 Automatic Moment-Based Texture Segmentation

Authors: Tudor Barbu

Abstract:

An automatic moment-based texture segmentation approach is proposed in this paper. First, we describe the related work in this computer vision domain. Our texture feature extraction, the first part of the texture recognition process, produces a set of moment-based feature vectors. For each image pixel, a texture feature vector is computed as a sequence of area moments. Second, an automatic pixel classification approach is proposed. The feature vectors are clustered using some unsupervised classification algorithm, the optimal number of clusters being determined using a measure based on validation indexes. From the resulted pixel classes one determines easily the desired texture regions of the image.

Keywords: image segmentation, moment-based, texture analysis, automatic classification, validation indexes

Procedia PDF Downloads 413
2343 Using Gene Expression Programming in Learning Process of Rough Neural Networks

Authors: Sanaa Rashed Abdallah, Yasser F. Hassan

Abstract:

The paper will introduce an approach where a rough sets, gene expression programming and rough neural networks are used cooperatively for learning and classification support. The Objective of gene expression programming rough neural networks (GEP-RNN) approach is to obtain new classified data with minimum error in training and testing process. Starting point of gene expression programming rough neural networks (GEP-RNN) approach is an information system and the output from this approach is a structure of rough neural networks which is including the weights and thresholds with minimum classification error.

Keywords: rough sets, gene expression programming, rough neural networks, classification

Procedia PDF Downloads 379
2342 A Statistical Approach to Classification of Agricultural Regions

Authors: Hasan Vural

Abstract:

Turkey is a favorable country to produce a great variety of agricultural products because of her different geographic and climatic conditions which have been used to divide the country into four main and seven sub regions. This classification into seven regions traditionally has been used in order to data collection and publication especially related with agricultural production. Afterwards, nine agricultural regions were considered. Recently, the governmental body which is responsible of data collection and dissemination (Turkish Institute of Statistics-TIS) has used 12 classes which include 11 sub regions and Istanbul province. This study aims to evaluate these classification efforts based on the acreage of ten main crops in a ten years time period (1996-2005). The panel data grouped in 11 subregions has been evaluated by cluster and multivariate statistical methods. It was concluded that from the agricultural production point of view, it will be rather meaningful to consider three main and eight sub-agricultural regions throughout the country.

Keywords: agricultural region, factorial analysis, cluster analysis,

Procedia PDF Downloads 412
2341 The Change of Urban Land Use/Cover Using Object Based Approach for Southern Bali

Authors: I. Gusti A. A. Rai Asmiwyati, Robert J. Corner, Ashraf M. Dewan

Abstract:

Change on land use/cover (LULC) dominantly affects spatial structure and function. It can have such impacts by disrupting social culture practice and disturbing physical elements. Thus, it has become essential to understand of the dynamics in time and space of LULC as it can be used as a critical input for developing sustainable LULC. This study was an attempt to map and monitor the LULC change in Bali Indonesia from 2003 to 2013. Using object based classification to improve the accuracy, and change detection, multi temporal land use/cover data were extracted from a set of ASTER satellite image. The overall accuracies of the classification maps of 2003 and 2013 were 86.99% and 80.36%, respectively. Built up area and paddy field were the dominant type of land use/cover in both years. Patch increase dominantly in 2003 illustrated the rapid paddy field fragmentation and the huge occurring transformation. This approach is new for the case of diverse urban features of Bali that has been growing fast and increased the classification accuracy than the manual pixel based classification.

Keywords: land use/cover, urban, Bali, ASTER

Procedia PDF Downloads 539
2340 Winter Wheat Yield Forecasting Using Sentinel-2 Imagery at the Early Stages

Authors: Chunhua Liao, Jinfei Wang, Bo Shan, Yang Song, Yongjun He, Taifeng Dong

Abstract:

Winter wheat is one of the main crops in Canada. Forecasting of within-field variability of yield in winter wheat at the early stages is essential for precision farming. However, the crop yield modelling based on high spatial resolution satellite data is generally affected by the lack of continuous satellite observations, resulting in reducing the generalization ability of the models and increasing the difficulty of crop yield forecasting at the early stages. In this study, the correlations between Sentinel-2 data (vegetation indices and reflectance) and yield data collected by combine harvester were investigated and a generalized multivariate linear regression (MLR) model was built and tested with data acquired in different years. It was found that the four-band reflectance (blue, green, red, near-infrared) performed better than their vegetation indices (NDVI, EVI, WDRVI and OSAVI) in wheat yield prediction. The optimum phenological stage for wheat yield prediction with highest accuracy was at the growing stages from the end of the flowering to the beginning of the filling stage. The best MLR model was therefore built to predict wheat yield before harvest using Sentinel-2 data acquired at the end of the flowering stage. Further, to improve the ability of the yield prediction at the early stages, three simple unsupervised domain adaptation (DA) methods were adopted to transform the reflectance data at the early stages to the optimum phenological stage. The winter wheat yield prediction using multiple vegetation indices showed higher accuracy than using single vegetation index. The optimum stage for winter wheat yield forecasting varied with different fields when using vegetation indices, while it was consistent when using multispectral reflectance and the optimum stage for winter wheat yield prediction was at the end of flowering stage. The average testing RMSE of the MLR model at the end of the flowering stage was 604.48 kg/ha. Near the booting stage, the average testing RMSE of yield prediction using the best MLR was reduced to 799.18 kg/ha when applying the mean matching domain adaptation approach to transform the data to the target domain (at the end of the flowering) compared to that using the original data based on the models developed at the booting stage directly (“MLR at the early stage”) (RMSE =1140.64 kg/ha). This study demonstrated that the simple mean matching (MM) performed better than other DA methods and it was found that “DA then MLR at the optimum stage” performed better than “MLR directly at the early stages” for winter wheat yield forecasting at the early stages. The results indicated that the DA had a great potential in near real-time crop yield forecasting at the early stages. This study indicated that the simple domain adaptation methods had a great potential in crop yield prediction at the early stages using remote sensing data.

Keywords: wheat yield prediction, domain adaptation, Sentinel-2, within-field scale

Procedia PDF Downloads 63
2339 From Type-I to Type-II Fuzzy System Modeling for Diagnosis of Hepatitis

Authors: Shahabeddin Sotudian, M. H. Fazel Zarandi, I. B. Turksen

Abstract:

Hepatitis is one of the most common and dangerous diseases that affects humankind, and exposes millions of people to serious health risks every year. Diagnosis of Hepatitis has always been a challenge for physicians. This paper presents an effective method for diagnosis of hepatitis based on interval Type-II fuzzy. This proposed system includes three steps: pre-processing (feature selection), Type-I and Type-II fuzzy classification, and system evaluation. KNN-FD feature selection is used as the preprocessing step in order to exclude irrelevant features and to improve classification performance and efficiency in generating the classification model. In the fuzzy classification step, an “indirect approach” is used for fuzzy system modeling by implementing the exponential compactness and separation index for determining the number of rules in the fuzzy clustering approach. Therefore, we first proposed a Type-I fuzzy system that had an accuracy of approximately 90.9%. In the proposed system, the process of diagnosis faces vagueness and uncertainty in the final decision. Thus, the imprecise knowledge was managed by using interval Type-II fuzzy logic. The results that were obtained show that interval Type-II fuzzy has the ability to diagnose hepatitis with an average accuracy of 93.94%. The classification accuracy obtained is the highest one reached thus far. The aforementioned rate of accuracy demonstrates that the Type-II fuzzy system has a better performance in comparison to Type-I and indicates a higher capability of Type-II fuzzy system for modeling uncertainty.

Keywords: hepatitis disease, medical diagnosis, type-I fuzzy logic, type-II fuzzy logic, feature selection

Procedia PDF Downloads 303
2338 DeClEx-Processing Pipeline for Tumor Classification

Authors: Gaurav Shinde, Sai Charan Gongiguntla, Prajwal Shirur, Ahmed Hambaba

Abstract:

Health issues are significantly increasing, putting a substantial strain on healthcare services. This has accelerated the integration of machine learning in healthcare, particularly following the COVID-19 pandemic. The utilization of machine learning in healthcare has grown significantly. We introduce DeClEx, a pipeline that ensures that data mirrors real-world settings by incorporating Gaussian noise and blur and employing autoencoders to learn intermediate feature representations. Subsequently, our convolutional neural network, paired with spatial attention, provides comparable accuracy to state-of-the-art pre-trained models while achieving a threefold improvement in training speed. Furthermore, we provide interpretable results using explainable AI techniques. We integrate denoising and deblurring, classification, and explainability in a single pipeline called DeClEx.

Keywords: machine learning, healthcare, classification, explainability

Procedia PDF Downloads 50
2337 A Survey of Skin Cancer Detection and Classification from Skin Lesion Images Using Deep Learning

Authors: Joseph George, Anne Kotteswara Roa

Abstract:

Skin disease is one of the most common and popular kinds of health issues faced by people nowadays. Skin cancer (SC) is one among them, and its detection relies on the skin biopsy outputs and the expertise of the doctors, but it consumes more time and some inaccurate results. At the early stage, skin cancer detection is a challenging task, and it easily spreads to the whole body and leads to an increase in the mortality rate. Skin cancer is curable when it is detected at an early stage. In order to classify correct and accurate skin cancer, the critical task is skin cancer identification and classification, and it is more based on the cancer disease features such as shape, size, color, symmetry and etc. More similar characteristics are present in many skin diseases; hence it makes it a challenging issue to select important features from a skin cancer dataset images. Hence, the skin cancer diagnostic accuracy is improved by requiring an automated skin cancer detection and classification framework; thereby, the human expert’s scarcity is handled. Recently, the deep learning techniques like Convolutional neural network (CNN), Deep belief neural network (DBN), Artificial neural network (ANN), Recurrent neural network (RNN), and Long and short term memory (LSTM) have been widely used for the identification and classification of skin cancers. This survey reviews different DL techniques for skin cancer identification and classification. The performance metrics such as precision, recall, accuracy, sensitivity, specificity, and F-measures are used to evaluate the effectiveness of SC identification using DL techniques. By using these DL techniques, the classification accuracy increases along with the mitigation of computational complexities and time consumption.

Keywords: skin cancer, deep learning, performance measures, accuracy, datasets

Procedia PDF Downloads 125
2336 Random Subspace Ensemble of CMAC Classifiers

Authors: Somaiyeh Dehghan, Mohammad Reza Kheirkhahan Haghighi

Abstract:

The rapid growth of domains that have data with a large number of features, while the number of samples is limited has caused difficulty in constructing strong classifiers. To reduce the dimensionality of the feature space becomes an essential step in classification task. Random subspace method (or attribute bagging) is an ensemble classifier that consists of several classifiers that each base learner in ensemble has subset of features. In the present paper, we introduce Random Subspace Ensemble of CMAC neural network (RSE-CMAC), each of which has training with subset of features. Then we use this model for classification task. For evaluation performance of our model, we compare it with bagging algorithm on 36 UCI datasets. The results reveal that the new model has better performance.

Keywords: classification, random subspace, ensemble, CMAC neural network

Procedia PDF Downloads 327
2335 Influence of Water Physicochemical Properties and Vegetation Type on the Distribution of Schistosomiasis Intermediate Host Snails in Nelson Mandela Bay

Authors: Prince S. Campbell, Janine B. Adams, Melusi Thwala, Opeoluwa Oyedele, Paula E. Melariri

Abstract:

Schistosomiasis is an infectious water-borne disease that holds substantial medical and veterinary importance and is transmitted by Schistosoma flatworms. The transmission and spread of the disease are geographically and temporally confined to water bodies (rivers, lakes, lagoons, dams, etc.) inhabited by its obligate intermediate host snails and human water contact. Human infection with the parasite occurs via skin penetration subsequent to exposure to water infested with schistosome cercariae. Environmental factors play a crucial role in the spread of the disease, as the survival of intermediate host snails is dependent on favourable conditions. These factors include physical and chemical components of water, including pH, salinity, temperature, electrical conductivity, dissolved oxygen, turbidity, water hardness, total dissolved solids, and velocity, as well as biological factors such as predator-prey interactions, competition, food availability, and the presence and density of aquatic vegetation. This study evaluated the physicochemical properties of the water bodies, vegetation type, distribution, and habitat presence of the snail intermediate host. A quantitative cross-sectional research design approach was employed in this study. Eight sampling sites were selected based on their proximity to residential areas. Snails and water physicochemical properties were collected over different seasons for 9 months. A simple dip method was used for surface water samples and measurements were done using multiparameter meters. Snails captured using a 300 µm mesh scoop net and predominant plant species were gathered and transported to experts for identification. Vegetation composition and cover were visually estimated and recorded at each sampling point. Data was analysed using R software (version 4.3.1). A total of 844 freshwater snails were collected, with Physa genera accounting for 95.9% of the snails. Bulinus and Biomphalaria snails, which serve as intermediate hosts for the disease, accounted for (0.9%) and (0.6%) respectively. Indicator macrophytes such as Eicchornia crassipes, Stuckenia pectinate, Typha capensis, and floating macroalgae were found in several water bodies. A negative and weak correlation existed between the number of snails and physicochemical properties such as electrical conductivity (r=-0.240), dissolved oxygen (r=-0.185), hardness (r=-0.210), pH (r=-0.235), salinity (r=-0.242), temperature (r=-0.273), and total dissolved solids (r=-0.236). There was no correlation between the number of snails and turbidity (r=-0.070). Moreover, there was a negative and weak correlation between snails and vegetation coverage (r=-0.127). Findings indicated that snail abundance marginally declined with rising physicochemical concentrations, and the majority of snails were located in regions with less vegetation cover. The reduction in Bulinus and Biomphalaria snail populations may also be attributed to other factors, such as competition among the snails. Snails of the Physa genus were abundant due to their noteworthy resilience in difficult environments. These snails have the potential to function as biological control agents in areas where the disease is endemic, as they outcompete other snails, including schistosomiasis intermediate host snails.

Keywords: intermediate host snails, physicochemical properties, schistosomiasis, vegetation type

Procedia PDF Downloads 8
2334 Hyperspectral Data Classification Algorithm Based on the Deep Belief and Self-Organizing Neural Network

Authors: Li Qingjian, Li Ke, He Chun, Huang Yong

Abstract:

In this paper, the method of combining the Pohl Seidman's deep belief network with the self-organizing neural network is proposed to classify the target. This method is mainly aimed at the high nonlinearity of the hyperspectral image, the high sample dimension and the difficulty in designing the classifier. The main feature of original data is extracted by deep belief network. In the process of extracting features, adding known labels samples to fine tune the network, enriching the main characteristics. Then, the extracted feature vectors are classified into the self-organizing neural network. This method can effectively reduce the dimensions of data in the spectrum dimension in the preservation of large amounts of raw data information, to solve the traditional clustering and the long training time when labeled samples less deep learning algorithm for training problems, improve the classification accuracy and robustness. Through the data simulation, the results show that the proposed network structure can get a higher classification precision in the case of a small number of known label samples.

Keywords: DBN, SOM, pattern classification, hyperspectral, data compression

Procedia PDF Downloads 338
2333 Evaluation of Agricultural Drought Impact in the Crop Productivity of East Gojjam Zone

Authors: Walelgn Dilnesa Cherie, Fasikaw Atanaw Zimale, Bekalu W. Asres

Abstract:

The most catastrophic condition for agricultural production is a drought event, which is also one of the most hydro-metrological-related hazards. According to the combined susceptibility of plants to meteorological and hydrological conditions, agricultural drought is defined as the magnitude, severity, and duration of a drought that affects crop production. The accurate and timely assessment of agricultural drought can lead to the development of risk management strategies, appropriate proactive mechanisms for the protection of farmers, and the improvement of food security. The evaluation of agricultural drought in the East Gojjam zone was the primary subject of this study. To identify the agricultural drought, soil moisture anomalies, soil moisture deficit indices, and Normalized Difference Vegetation Indices (NDVI) are used. The measured welting point, field capacity, and soil moisture were utilized to validate the soil water deficit indices computed from the satellite data. The soil moisture and soil water deficit indices in 2013 in all woredas were minimum; this makes vegetation stress also in all woredas. The soil moisture content decreased in 2013/2014/2019, and 2021 in Dejen, 2014, and 2019 in Awobel Woreda. The max/ min values of NDVI in 2013 are minimum; it dominantly shows vegetation stress and an observed agricultural drought that happened in all woredas. The validation process of satellite and in-situ soil moisture and soil water deficit indices shows a good agreement with a value of R²=0.87 and 0.56, respectively. The study area becomes drought detected region, so government officials, policymakers, and environmentalists pay attention to the protection of drought effects.

Keywords: NDVI, agricultural drought, SWDI, soil moisture

Procedia PDF Downloads 82
2332 Automatic Method for Classification of Informative and Noninformative Images in Colonoscopy Video

Authors: Nidhal K. Azawi, John M. Gauch

Abstract:

Colorectal cancer is one of the leading causes of cancer death in the US and the world, which is why millions of colonoscopy examinations are performed annually. Unfortunately, noise, specular highlights, and motion artifacts corrupt many images in a typical colonoscopy exam. The goal of our research is to produce automated techniques to detect and correct or remove these noninformative images from colonoscopy videos, so physicians can focus their attention on informative images. In this research, we first automatically extract features from images. Then we use machine learning and deep neural network to classify colonoscopy images as either informative or noninformative. Our results show that we achieve image classification accuracy between 92-98%. We also show how the removal of noninformative images together with image alignment can aid in the creation of image panoramas and other visualizations of colonoscopy images.

Keywords: colonoscopy classification, feature extraction, image alignment, machine learning

Procedia PDF Downloads 249
2331 Predicting Groundwater Areas Using Data Mining Techniques: Groundwater in Jordan as Case Study

Authors: Faisal Aburub, Wael Hadi

Abstract:

Data mining is the process of extracting useful or hidden information from a large database. Extracted information can be used to discover relationships among features, where data objects are grouped according to logical relationships; or to predict unseen objects to one of the predefined groups. In this paper, we aim to investigate four well-known data mining algorithms in order to predict groundwater areas in Jordan. These algorithms are Support Vector Machines (SVMs), Naïve Bayes (NB), K-Nearest Neighbor (kNN) and Classification Based on Association Rule (CBA). The experimental results indicate that the SVMs algorithm outperformed other algorithms in terms of classification accuracy, precision and F1 evaluation measures using the datasets of groundwater areas that were collected from Jordanian Ministry of Water and Irrigation.

Keywords: classification, data mining, evaluation measures, groundwater

Procedia PDF Downloads 275
2330 Analyzing Tools and Techniques for Classification In Educational Data Mining: A Survey

Authors: D. I. George Amalarethinam, A. Emima

Abstract:

Educational Data Mining (EDM) is one of the newest topics to emerge in recent years, and it is concerned with developing methods for analyzing various types of data gathered from the educational circle. EDM methods and techniques with machine learning algorithms are used to extract meaningful and usable information from huge databases. For scientists and researchers, realistic applications of Machine Learning in the EDM sectors offer new frontiers and present new problems. One of the most important research areas in EDM is predicting student success. The prediction algorithms and techniques must be developed to forecast students' performance, which aids the tutor, institution to boost the level of student’s performance. This paper examines various classification techniques in prediction methods and data mining tools used in EDM.

Keywords: classification technique, data mining, EDM methods, prediction methods

Procedia PDF Downloads 114
2329 Morphological Processing of Punjabi Text for Sentiment Analysis of Farmer Suicides

Authors: Jaspreet Singh, Gurvinder Singh, Prabhsimran Singh, Rajinder Singh, Prithvipal Singh, Karanjeet Singh Kahlon, Ravinder Singh Sawhney

Abstract:

Morphological evaluation of Indian languages is one of the burgeoning fields in the area of Natural Language Processing (NLP). The evaluation of a language is an eminent task in the era of information retrieval and text mining. The extraction and classification of knowledge from text can be exploited for sentiment analysis and morphological evaluation. This study coalesce morphological evaluation and sentiment analysis for the task of classification of farmer suicide cases reported in Punjab state of India. The pre-processing of Punjabi text involves morphological evaluation and normalization of Punjabi word tokens followed by the training of proposed model using deep learning classification on Punjabi language text extracted from online Punjabi news reports. The class-wise accuracies of sentiment prediction for four negatively oriented classes of farmer suicide cases are 93.85%, 88.53%, 83.3%, and 95.45% respectively. The overall accuracy of sentiment classification obtained using proposed framework on 275 Punjabi text documents is found to be 90.29%.

Keywords: deep neural network, farmer suicides, morphological processing, punjabi text, sentiment analysis

Procedia PDF Downloads 317
2328 A Nonlinear Feature Selection Method for Hyperspectral Image Classification

Authors: Pei-Jyun Hsieh, Cheng-Hsuan Li, Bor-Chen Kuo

Abstract:

For hyperspectral image classification, feature reduction is an important pre-processing for avoiding the Hughes phenomena due to the difficulty for collecting training samples. Hence, lots of researches developed feature selection methods such as F-score, HSIC (Hilbert-Schmidt Independence Criterion), and etc., to improve hyperspectral image classification. However, most of them only consider the class separability in the original space, i.e., a linear class separability. In this study, we proposed a nonlinear class separability measure based on kernel trick for selecting an appropriate feature subset. The proposed nonlinear class separability was formed by a generalized RBF kernel with different bandwidths with respect to different features. Moreover, it considered the within-class separability and the between-class separability. A genetic algorithm was applied to tune these bandwidths such that the smallest with-class separability and the largest between-class separability simultaneously. This indicates the corresponding feature space is more suitable for classification. In addition, the corresponding nonlinear classification boundary can separate classes very well. These optimal bandwidths also show the importance of bands for hyperspectral image classification. The reciprocals of these bandwidths can be viewed as weights of bands. The smaller bandwidth, the larger weight of the band, and the more importance for classification. Hence, the descending order of the reciprocals of the bands gives an order for selecting the appropriate feature subsets. In the experiments, three hyperspectral image data sets, the Indian Pine Site data set, the PAVIA data set, and the Salinas A data set, were used to demonstrate the selected feature subsets by the proposed nonlinear feature selection method are more appropriate for hyperspectral image classification. Only ten percent of samples were randomly selected to form the training dataset. All non-background samples were used to form the testing dataset. The support vector machine was applied to classify these testing samples based on selected feature subsets. According to the experiments on the Indian Pine Site data set with 220 bands, the highest accuracies by applying the proposed method, F-score, and HSIC are 0.8795, 0.8795, and 0.87404, respectively. However, the proposed method selects 158 features. F-score and HSIC select 168 features and 217 features, respectively. Moreover, the classification accuracies increase dramatically only using first few features. The classification accuracies with respect to feature subsets of 10 features, 20 features, 50 features, and 110 features are 0.69587, 0.7348, 0.79217, and 0.84164, respectively. Furthermore, only using half selected features (110 features) of the proposed method, the corresponding classification accuracy (0.84168) is approximate to the highest classification accuracy, 0.8795. For other two hyperspectral image data sets, the PAVIA data set and Salinas A data set, we can obtain the similar results. These results illustrate our proposed method can efficiently find feature subsets to improve hyperspectral image classification. One can apply the proposed method to determine the suitable feature subset first according to specific purposes. Then researchers can only use the corresponding sensors to obtain the hyperspectral image and classify the samples. This can not only improve the classification performance but also reduce the cost for obtaining hyperspectral images.

Keywords: hyperspectral image classification, nonlinear feature selection, kernel trick, support vector machine

Procedia PDF Downloads 260
2327 Personal Information Classification Based on Deep Learning in Automatic Form Filling System

Authors: Shunzuo Wu, Xudong Luo, Yuanxiu Liao

Abstract:

Recently, the rapid development of deep learning makes artificial intelligence (AI) penetrate into many fields, replacing manual work there. In particular, AI systems also become a research focus in the field of automatic office. To meet real needs in automatic officiating, in this paper we develop an automatic form filling system. Specifically, it uses two classical neural network models and several word embedding models to classify various relevant information elicited from the Internet. When training the neural network models, we use less noisy and balanced data for training. We conduct a series of experiments to test my systems and the results show that our system can achieve better classification results.

Keywords: artificial intelligence and office, NLP, deep learning, text classification

Procedia PDF Downloads 196
2326 Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine

Authors: Bingchun Liu, Pei-Chann Chang, Natasha Huang, Dun Li

Abstract:

Machine Learning and Data Mining are the two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a wildly used technique to predict qualitative variables and is generally preferred over regression from an operational point of view. Due to the enormous increase in air pollution in various countries especially China, Air Quality Classification has become one of the most important topics in air quality research and modelling. This study aims at introducing a hybrid classification model based on information theory and Support Vector Machine (SVM) using the air quality data of four cities in China namely Beijing, Guangzhou, Shanghai and Tianjin from Jan 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection has classified the daily air quality into 6 levels namely Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent based on their respective Air Quality Index (AQI) values. Using the information theory, information gain (IG) is calculated and feature selection is done for both categorical features and continuous numeric features. Then SVM Machine Learning algorithm is implemented on the selected features with cross-validation. The final evaluation reveals that the IG and SVM hybrid model performs better than SVM (alone), Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of accuracy as well as complexity.

Keywords: machine learning, air quality classification, air quality index, information gain, support vector machine, cross-validation

Procedia PDF Downloads 232
2325 Auto Classification of Multiple ECG Arrhythmic Detection via Machine Learning Techniques: A Review

Authors: Ng Liang Shen, Hau Yuan Wen

Abstract:

Arrhythmia analysis of ECG signal plays a major role in diagnosing most of the cardiac diseases. Therefore, a single arrhythmia detection of an electrocardiographic (ECG) record can determine multiple pattern of various algorithms and match accordingly each ECG beats based on Machine Learning supervised learning. These researchers used different features and classification methods to classify different arrhythmia types. A major problem in these studies is the fact that the symptoms of the disease do not show all the time in the ECG record. Hence, a successful diagnosis might require the manual investigation of several hours of ECG records. The point of this paper presents investigations cardiovascular ailment in Electrocardiogram (ECG) Signals for Cardiac Arrhythmia utilizing examination of ECG irregular wave frames via heart beat as correspond arrhythmia which with Machine Learning Pattern Recognition.

Keywords: electrocardiogram, ECG, classification, machine learning, pattern recognition, detection, QRS

Procedia PDF Downloads 370
2324 Land Use/Land Cover Mapping Using Landsat 8 and Sentinel-2 in a Mediterranean Landscape

Authors: Moschos Vogiatzis, K. Perakis

Abstract:

Spatial-explicit and up-to-date land use/land cover information is fundamental for spatial planning, land management, sustainable development, and sound decision-making. In the last decade, many satellite-derived land cover products at different spatial, spectral, and temporal resolutions have been developed, such as the European Copernicus Land Cover product. However, more efficient and detailed information for land use/land cover is required at the regional or local scale. A typical Mediterranean basin with a complex landscape comprised of various forest types, crops, artificial surfaces, and wetlands was selected to test and develop our approach. In this study, we investigate the improvement of Copernicus Land Cover product (CLC2018) using Landsat 8 and Sentinel-2 pixel-based classification based on all available existing geospatial data (Forest Maps, LPIS, Natura2000 habitats, cadastral parcels, etc.). We examined and compared the performance of the Random Forest classifier for land use/land cover mapping. In total, 10 land use/land cover categories were recognized in Landsat 8 and 11 in Sentinel-2A. A comparison of the overall classification accuracies for 2018 shows that Landsat 8 classification accuracy was slightly higher than Sentinel-2A (82,99% vs. 80,30%). We concluded that the main land use/land cover types of CLC2018, even within a heterogeneous area, can be successfully mapped and updated according to CLC nomenclature. Future research should be oriented toward integrating spatiotemporal information from seasonal bands and spectral indexes in the classification process.

Keywords: classification, land use/land cover, mapping, random forest

Procedia PDF Downloads 117
2323 Terrain Classification for Ground Robots Based on Acoustic Features

Authors: Bernd Kiefer, Abraham Gebru Tesfay, Dietrich Klakow

Abstract:

The motivation of our work is to detect different terrain types traversed by a robot based on acoustic data from the robot-terrain interaction. Different acoustic features and classifiers were investigated, such as Mel-frequency cepstral coefficient and Gamma-tone frequency cepstral coefficient for the feature extraction, and Gaussian mixture model and Feed forward neural network for the classification. We analyze the system’s performance by comparing our proposed techniques with some other features surveyed from distinct related works. We achieve precision and recall values between 87% and 100% per class, and an average accuracy at 95.2%. We also study the effect of varying audio chunk size in the application phase of the models and find only a mild impact on performance.

Keywords: acoustic features, autonomous robots, feature extraction, terrain classification

Procedia PDF Downloads 359
2322 The Implementation of the Multi-Agent Classification System (MACS) in Compliance with FIPA Specifications

Authors: Mohamed R. Mhereeg

Abstract:

The paper discusses the implementation of the MultiAgent classification System (MACS) and utilizing it to provide an automated and accurate classification of end users developing applications in the spreadsheet domain. However, different technologies have been brought together to build MACS. The strength of the system is the integration of the agent technology with the FIPA specifications together with other technologies, which are the .NET widows service based agents, the Windows Communication Foundation (WCF) services, the Service Oriented Architecture (SOA), and Oracle Data Mining (ODM). Microsoft's .NET windows service based agents were utilized to develop the monitoring agents of MACS, the .NET WCF services together with SOA approach allowed the distribution and communication between agents over the WWW. The Monitoring Agents (MAs) were configured to execute automatically to monitor excel spreadsheets development activities by content. Data gathered by the Monitoring Agents from various resources over a period of time was collected and filtered by a Database Updater Agent (DUA) residing in the .NET client application of the system. This agent then transfers and stores the data in Oracle server database via Oracle stored procedures for further processing that leads to the classification of the end user developers.

Keywords: MACS, implementation, multi-agent, SOA, autonomous, WCF

Procedia PDF Downloads 269