Search results for: species classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5078

Search results for: species classification

4688 Morphological Processing of Punjabi Text for Sentiment Analysis of Farmer Suicides

Authors: Jaspreet Singh, Gurvinder Singh, Prabhsimran Singh, Rajinder Singh, Prithvipal Singh, Karanjeet Singh Kahlon, Ravinder Singh Sawhney

Abstract:

Morphological evaluation of Indian languages is one of the burgeoning fields in the area of Natural Language Processing (NLP). The evaluation of a language is an eminent task in the era of information retrieval and text mining. The extraction and classification of knowledge from text can be exploited for sentiment analysis and morphological evaluation. This study coalesce morphological evaluation and sentiment analysis for the task of classification of farmer suicide cases reported in Punjab state of India. The pre-processing of Punjabi text involves morphological evaluation and normalization of Punjabi word tokens followed by the training of proposed model using deep learning classification on Punjabi language text extracted from online Punjabi news reports. The class-wise accuracies of sentiment prediction for four negatively oriented classes of farmer suicide cases are 93.85%, 88.53%, 83.3%, and 95.45% respectively. The overall accuracy of sentiment classification obtained using proposed framework on 275 Punjabi text documents is found to be 90.29%.

Keywords: deep neural network, farmer suicides, morphological processing, punjabi text, sentiment analysis

Procedia PDF Downloads 291
4687 Parasitological Study and Its Role in Fisheries Management and Stock Assessment of Boops boops (Lineauses, 1758) along the Tunisian Coast

Authors: I. Chebbi, L. Boudaya, L. Neifar

Abstract:

The bogue, Boops boops is an economically important fishery resource and commonly captured in the Mediterranean, and its diversity in parasites has been used as a tool to differentiate between stocks along with Tunisia since it is widely acceptable in fisheries management. In this study, a total of 90 fish are investigated from three localities off Tunisia, including Kelibia, Mahdia, and Zarzis. Fifteen species of parasites totaling 1270 individuals were harvested from B. boops, whereas ten parasites were used as biological tags. Based on Mahalanobis distance, each parasite species shows a great importance in the discrimination between groups. Tetraphyllidea larvae are the most influential parasites in determining the position of samples belonging to Kelibia. Monogenean species and Hysterothylacium sp. are the most important species for determining the position of samples from Mahdia. Specimens from Zarzis are characterized by the absence of the four Monogenean species and the Tetraphyllidea larvae. Parasites allocate B. boops population correctly to their origin communities with an accuracy of 83.3%. These results were corroborated by the discriminant analyses, highlighted the presence of three stocks, and improved that the parasitological method can be considered as a reliable key to provide imperative information for discriminating among B. boops stocks in Tunisian waters.

Keywords: biological marker, Boops boops, parasite, population structure

Procedia PDF Downloads 106
4686 A Nonlinear Feature Selection Method for Hyperspectral Image Classification

Authors: Pei-Jyun Hsieh, Cheng-Hsuan Li, Bor-Chen Kuo

Abstract:

For hyperspectral image classification, feature reduction is an important pre-processing for avoiding the Hughes phenomena due to the difficulty for collecting training samples. Hence, lots of researches developed feature selection methods such as F-score, HSIC (Hilbert-Schmidt Independence Criterion), and etc., to improve hyperspectral image classification. However, most of them only consider the class separability in the original space, i.e., a linear class separability. In this study, we proposed a nonlinear class separability measure based on kernel trick for selecting an appropriate feature subset. The proposed nonlinear class separability was formed by a generalized RBF kernel with different bandwidths with respect to different features. Moreover, it considered the within-class separability and the between-class separability. A genetic algorithm was applied to tune these bandwidths such that the smallest with-class separability and the largest between-class separability simultaneously. This indicates the corresponding feature space is more suitable for classification. In addition, the corresponding nonlinear classification boundary can separate classes very well. These optimal bandwidths also show the importance of bands for hyperspectral image classification. The reciprocals of these bandwidths can be viewed as weights of bands. The smaller bandwidth, the larger weight of the band, and the more importance for classification. Hence, the descending order of the reciprocals of the bands gives an order for selecting the appropriate feature subsets. In the experiments, three hyperspectral image data sets, the Indian Pine Site data set, the PAVIA data set, and the Salinas A data set, were used to demonstrate the selected feature subsets by the proposed nonlinear feature selection method are more appropriate for hyperspectral image classification. Only ten percent of samples were randomly selected to form the training dataset. All non-background samples were used to form the testing dataset. The support vector machine was applied to classify these testing samples based on selected feature subsets. According to the experiments on the Indian Pine Site data set with 220 bands, the highest accuracies by applying the proposed method, F-score, and HSIC are 0.8795, 0.8795, and 0.87404, respectively. However, the proposed method selects 158 features. F-score and HSIC select 168 features and 217 features, respectively. Moreover, the classification accuracies increase dramatically only using first few features. The classification accuracies with respect to feature subsets of 10 features, 20 features, 50 features, and 110 features are 0.69587, 0.7348, 0.79217, and 0.84164, respectively. Furthermore, only using half selected features (110 features) of the proposed method, the corresponding classification accuracy (0.84168) is approximate to the highest classification accuracy, 0.8795. For other two hyperspectral image data sets, the PAVIA data set and Salinas A data set, we can obtain the similar results. These results illustrate our proposed method can efficiently find feature subsets to improve hyperspectral image classification. One can apply the proposed method to determine the suitable feature subset first according to specific purposes. Then researchers can only use the corresponding sensors to obtain the hyperspectral image and classify the samples. This can not only improve the classification performance but also reduce the cost for obtaining hyperspectral images.

Keywords: hyperspectral image classification, nonlinear feature selection, kernel trick, support vector machine

Procedia PDF Downloads 244
4685 Comparison of Chlorophyll Contents in Common Bean (Phaseolus vulgaris L.) and Runner Bean (P. coccineous L.) Genotypes

Authors: Huseyin Canci

Abstract:

Chlorophylls are green photosynthetic pigment in plants. Therefore, photosynthesis in plants occurs in the leaves. Roles of chlorophylls help plants to get energy from light. The aim of the present study is to compare of chlorophyll contents in some bean species including common bean (Phaseolus vulgaris L.) and runner bean (P. coccineous L.) and genotypes. This research was carried out in fields of Faculty of Agriculture, Akdeniz University in Antalya. Species and genotypes were grown in 2 m single row and 50 cm row spacing. A randomized blocks design was used with two replications. Totally, 124 beans species and genotypes which 122 common beans and 2 runner beans were sown on February, 17th 2014 by hand. Chlorophyll a + b (SPAD values) were determined seedling stage, days to flowering 50% and pod setting stage on bean genotypes. Results showed that there were significant differences for genotypes, stages and interaction of genotypes X stages. There was statistically significant relationships between yield and chlorophyll content of bean species and genotypes.

Keywords: bean, chlorophyll, Phaseolus, SPAD values

Procedia PDF Downloads 214
4684 Personal Information Classification Based on Deep Learning in Automatic Form Filling System

Authors: Shunzuo Wu, Xudong Luo, Yuanxiu Liao

Abstract:

Recently, the rapid development of deep learning makes artificial intelligence (AI) penetrate into many fields, replacing manual work there. In particular, AI systems also become a research focus in the field of automatic office. To meet real needs in automatic officiating, in this paper we develop an automatic form filling system. Specifically, it uses two classical neural network models and several word embedding models to classify various relevant information elicited from the Internet. When training the neural network models, we use less noisy and balanced data for training. We conduct a series of experiments to test my systems and the results show that our system can achieve better classification results.

Keywords: artificial intelligence and office, NLP, deep learning, text classification

Procedia PDF Downloads 157
4683 Monitoring of Quantitative and Qualitative Changes in Combustible Material in the Białowieża Forest

Authors: Damian Czubak

Abstract:

The Białowieża Forest is a very valuable natural area, included in the World Natural Heritage at UNESCO, where, due to infestation by the bark beetle (Ips typographus), norway spruce (Picea abies) have deteriorated. This catastrophic scenario led to an increase in fire danger. This was due to the occurrence of large amounts of dead wood and grass cover, as light penetrated to the bottom of the stands. These factors in a dry state are materials that favour the possibility of fire and the rapid spread of fire. One of the objectives of the study was to monitor the quantitative and qualitative changes of combustible material on the permanent decay plots of spruce stands from 2012-2022. In addition, the size of the area with highly flammable vegetation was monitored and a classification of the stands of the Białowieża Forest by flammability classes was made. The key factor that determines the potential fire hazard of a forest is combustible material. Primarily its type, quantity, moisture content, size and spatial structure. Based on the inventory data on the areas of forest districts in the Białowieża Forest, the average fire load and its changes over the years were calculated. The analysis was carried out taking into account the changes in the health status of the stands and sanitary operations. The quantitative and qualitative assessment of fallen timber and fire load of ground cover used the results of the 2019 and 2021 inventories. Approximately 9,000 circular plots were used for the study. An assessment was made of the amount of potential fuel, understood as ground cover vegetation and dead wood debris. In addition, monitoring of areas with vegetation that poses a high fire risk was conducted using data from 2019 and 2021. All sub-areas were inventoried where vegetation posing a specific fire hazard represented at least 10% of the area with species characteristic of that cover. In addition to the size of the area with fire-prone vegetation, a very important element is the size of the fire load on the indicated plots. On representative plots, the biomass of the land cover was measured on an area of 10 m2 and then the amount of biomass of each component was determined. The resulting element of variability of ground covers in stands was their flammability classification. The classification developed made it possible to track changes in the flammability classes of stands over the period covered by the measurements.

Keywords: classification, combustible material, flammable vegetation, Norway spruce

Procedia PDF Downloads 65
4682 Chemical Properties of Yushania alpina and Bamusa oldhamii Bamboo Species

Authors: Getu Dessalegn Asfaw, Yalew Dessalegn Asfaw

Abstract:

This research aims to examine the chemical composition of bamboo species in Ethiopia under the effect of age and culm height. The chemical composition of bamboo species in Ethiopia has not been investigated so far. The highest to the lowest cellulose and hemicellulose contents are Injibara (Y. alpina), Mekaneselam (Y. alpina), and Kombolcha (B. oldhamii), whereas lignin, extractives, and ash contents are Kombolcha, Mekanesealm, and Injibra, respectively. As a result of this research, the highest and lowest cellulose, hemicelluloses and lignin contents are at the age of 2 and 1 year old, respectively. Whereas extractives and ash contents are decreased at the age of the culm matured. The cellulose, hemicelluloses, lignin, and ash contents of the culm increase from the bottom to top along the height, however, extractive contents decrease from the bottom to top position. The cellulose content of Injibara, Kombolch, and Mekaneselam bamboo was recorded at 51±1.7–53±1.8%, 45±1.6%–48±1.5%, and 48±1.8–51±1.6%, and hemicelluloses content was measured at 20±1.2–23±1.1%, 17±1.0–19±0.9%, and 18±1.0–20±1.0%, lignin content was measured 19±1.0–21±1.1%, 27±1.2–29±1.1%, and 21±1.1–24±1.1%, extractive content was measured 3.9±0.2 –4.5±0.2%, 6.6±0.3–7.8±0.4%, and 4.7±0.2–5.2±0.1%, ash content was measured 1.6±0.1–2.1±0.1%, 2.8±0.1–3.5±0.2%, and 1.9±0.1–2.5±0.1% at the ages of 1–3 years old, respectively. This result demonstrated that bamboo species in Ethiopia can be a source of feedstock for lignocelluloses ethanol and bamboo composite production since they have higher cellulose content.

Keywords: age, bamboo species, culm height, chemical composition

Procedia PDF Downloads 78
4681 Gut Mycobiome Dysbiosis and Its Impact on Intestinal Permeability in Attention-Deficit/Hyperactivity Disorder

Authors: Liang-Jen Wang, Sung-Chou Li, Yuan-Ming Yeh, Sheng-Yu Lee, Ho-Chang Kuo, Chia-Yu Yang

Abstract:

Background: Dysbiosis in the gut microbial community might be involved in the pathophysiology of attention deficit/hyperactivity disorder (ADHD). The fungal component of the gut microbiome, namely the mycobiota, is a hyperdiverse group of multicellular eukaryotes that can influence host intestinal permeability. This study therefore aimed to investigate the impact of fungal mycobiome dysbiosis and intestinal permeability on ADHD. Methods: Faecal samples were collected from 35 children with ADHD and from 35 healthy controls. Total DNA was extracted from the faecal samples, and the internal transcribed spacer (ITS) regions were sequenced using high-throughput next-generation sequencing (NGS). The fungal taxonomic classification was analysed using bioinformatics tools, and the differentially expressed fungal species between the ADHD and healthy control groups were identified. An in vitro permeability assay (Caco-2 cell layer) was used to evaluate the biological effects of fungal dysbiosis on intestinal epithelial barrier function. Results: The β-diversity (the species diversity between two communities), but not α-diversity (the species diversity within a community), reflected the differences in fungal community composition between ADHD and control groups. At the phylum level, the ADHD group displayed a significantly higher abundance of Ascomycota and significantly lower abundance of Basidiomycota than the healthy control group. At the genus level, the abundance of Candida (especially Candida albicans) was significantly increased in ADHD patients compared to the healthy controls. In addition, the in vitro cell assay revealed that C. albicans secretions significantly enhanced the permeability of Caco-2 cells. Conclusions: The current study is the first to explore altered gut mycobiome dysbiosis using the NGS platform in ADHD. The findings from this study indicated that dysbiosis of the fungal mycobiome and intestinal permeability might be associated with susceptibility to ADHD.

Keywords: ADHD, fungus, gut–brain axis, biomarker, child psychiatry

Procedia PDF Downloads 84
4680 Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine

Authors: Bingchun Liu, Pei-Chann Chang, Natasha Huang, Dun Li

Abstract:

Machine Learning and Data Mining are the two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a wildly used technique to predict qualitative variables and is generally preferred over regression from an operational point of view. Due to the enormous increase in air pollution in various countries especially China, Air Quality Classification has become one of the most important topics in air quality research and modelling. This study aims at introducing a hybrid classification model based on information theory and Support Vector Machine (SVM) using the air quality data of four cities in China namely Beijing, Guangzhou, Shanghai and Tianjin from Jan 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection has classified the daily air quality into 6 levels namely Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent based on their respective Air Quality Index (AQI) values. Using the information theory, information gain (IG) is calculated and feature selection is done for both categorical features and continuous numeric features. Then SVM Machine Learning algorithm is implemented on the selected features with cross-validation. The final evaluation reveals that the IG and SVM hybrid model performs better than SVM (alone), Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of accuracy as well as complexity.

Keywords: machine learning, air quality classification, air quality index, information gain, support vector machine, cross-validation

Procedia PDF Downloads 205
4679 Auto Classification of Multiple ECG Arrhythmic Detection via Machine Learning Techniques: A Review

Authors: Ng Liang Shen, Hau Yuan Wen

Abstract:

Arrhythmia analysis of ECG signal plays a major role in diagnosing most of the cardiac diseases. Therefore, a single arrhythmia detection of an electrocardiographic (ECG) record can determine multiple pattern of various algorithms and match accordingly each ECG beats based on Machine Learning supervised learning. These researchers used different features and classification methods to classify different arrhythmia types. A major problem in these studies is the fact that the symptoms of the disease do not show all the time in the ECG record. Hence, a successful diagnosis might require the manual investigation of several hours of ECG records. The point of this paper presents investigations cardiovascular ailment in Electrocardiogram (ECG) Signals for Cardiac Arrhythmia utilizing examination of ECG irregular wave frames via heart beat as correspond arrhythmia which with Machine Learning Pattern Recognition.

Keywords: electrocardiogram, ECG, classification, machine learning, pattern recognition, detection, QRS

Procedia PDF Downloads 344
4678 Land Use/Land Cover Mapping Using Landsat 8 and Sentinel-2 in a Mediterranean Landscape

Authors: Moschos Vogiatzis, K. Perakis

Abstract:

Spatial-explicit and up-to-date land use/land cover information is fundamental for spatial planning, land management, sustainable development, and sound decision-making. In the last decade, many satellite-derived land cover products at different spatial, spectral, and temporal resolutions have been developed, such as the European Copernicus Land Cover product. However, more efficient and detailed information for land use/land cover is required at the regional or local scale. A typical Mediterranean basin with a complex landscape comprised of various forest types, crops, artificial surfaces, and wetlands was selected to test and develop our approach. In this study, we investigate the improvement of Copernicus Land Cover product (CLC2018) using Landsat 8 and Sentinel-2 pixel-based classification based on all available existing geospatial data (Forest Maps, LPIS, Natura2000 habitats, cadastral parcels, etc.). We examined and compared the performance of the Random Forest classifier for land use/land cover mapping. In total, 10 land use/land cover categories were recognized in Landsat 8 and 11 in Sentinel-2A. A comparison of the overall classification accuracies for 2018 shows that Landsat 8 classification accuracy was slightly higher than Sentinel-2A (82,99% vs. 80,30%). We concluded that the main land use/land cover types of CLC2018, even within a heterogeneous area, can be successfully mapped and updated according to CLC nomenclature. Future research should be oriented toward integrating spatiotemporal information from seasonal bands and spectral indexes in the classification process.

Keywords: classification, land use/land cover, mapping, random forest

Procedia PDF Downloads 102
4677 Hyperspectral Imagery for Tree Speciation and Carbon Mass Estimates

Authors: Jennifer Buz, Alvin Spivey

Abstract:

The most common greenhouse gas emitted through human activities, carbon dioxide (CO2), is naturally consumed by plants during photosynthesis. This process is actively being monetized by companies wishing to offset their carbon dioxide emissions. For example, companies are now able to purchase protections for vegetated land due-to-be clear cut or purchase barren land for reforestation. Therefore, by actively preventing the destruction/decay of plant matter or by introducing more plant matter (reforestation), a company can theoretically offset some of their emissions. One of the biggest issues in the carbon credit market is validating and verifying carbon offsets. There is a need for a system that can accurately and frequently ensure that the areas sold for carbon credits have the vegetation mass (and therefore for carbon offset capability) they claim. Traditional techniques for measuring vegetation mass and determining health are costly and require many person-hours. Orbital Sidekick offers an alternative approach that accurately quantifies carbon mass and assesses vegetation health through satellite hyperspectral imagery, a technique which enables us to remotely identify material composition (including plant species) and condition (e.g., health and growth stage). How much carbon a plant is capable of storing ultimately is tied to many factors, including material density (primarily species-dependent), plant size, and health (trees that are actively decaying are not effectively storing carbon). All of these factors are capable of being observed through satellite hyperspectral imagery. This abstract focuses on speciation. To build a species classification model, we matched pixels in our remote sensing imagery to plants on the ground for which we know the species. To accomplish this, we collaborated with the researchers at the Teakettle Experimental Forest. Our remote sensing data comes from our airborne “Kato” sensor, which flew over the study area and acquired hyperspectral imagery (400-2500 nm, 472 bands) at ~0.5 m/pixel resolution. Coverage of the entire teakettle experimental forest required capturing dozens of individual hyperspectral images. In order to combine these images into a mosaic, we accounted for potential variations of atmospheric conditions throughout the data collection. To do this, we ran an open source atmospheric correction routine called ISOFIT1 (Imaging Spectrometer Optiman FITting), which converted all of our remote sensing data from radiance to reflectance. A database of reflectance spectra for each of the tree species within the study area was acquired using the Teakettle stem map and the geo-referenced hyperspectral images. We found that a wide variety of machine learning classifiers were able to identify the species within our images with high (>95%) accuracy. For the most robust quantification of carbon mass and the best assessment of the health of a vegetated area, speciation is critical. Through the use of high resolution hyperspectral data, ground-truth databases, and complex analytical techniques, we are able to determine the species present within a pixel to a high degree of accuracy. These species identifications will feed directly into our carbon mass model.

Keywords: hyperspectral, satellite, carbon, imagery, python, machine learning, speciation

Procedia PDF Downloads 96
4676 Terrain Classification for Ground Robots Based on Acoustic Features

Authors: Bernd Kiefer, Abraham Gebru Tesfay, Dietrich Klakow

Abstract:

The motivation of our work is to detect different terrain types traversed by a robot based on acoustic data from the robot-terrain interaction. Different acoustic features and classifiers were investigated, such as Mel-frequency cepstral coefficient and Gamma-tone frequency cepstral coefficient for the feature extraction, and Gaussian mixture model and Feed forward neural network for the classification. We analyze the system’s performance by comparing our proposed techniques with some other features surveyed from distinct related works. We achieve precision and recall values between 87% and 100% per class, and an average accuracy at 95.2%. We also study the effect of varying audio chunk size in the application phase of the models and find only a mild impact on performance.

Keywords: acoustic features, autonomous robots, feature extraction, terrain classification

Procedia PDF Downloads 339
4675 The Implementation of the Multi-Agent Classification System (MACS) in Compliance with FIPA Specifications

Authors: Mohamed R. Mhereeg

Abstract:

The paper discusses the implementation of the MultiAgent classification System (MACS) and utilizing it to provide an automated and accurate classification of end users developing applications in the spreadsheet domain. However, different technologies have been brought together to build MACS. The strength of the system is the integration of the agent technology with the FIPA specifications together with other technologies, which are the .NET widows service based agents, the Windows Communication Foundation (WCF) services, the Service Oriented Architecture (SOA), and Oracle Data Mining (ODM). Microsoft's .NET windows service based agents were utilized to develop the monitoring agents of MACS, the .NET WCF services together with SOA approach allowed the distribution and communication between agents over the WWW. The Monitoring Agents (MAs) were configured to execute automatically to monitor excel spreadsheets development activities by content. Data gathered by the Monitoring Agents from various resources over a period of time was collected and filtered by a Database Updater Agent (DUA) residing in the .NET client application of the system. This agent then transfers and stores the data in Oracle server database via Oracle stored procedures for further processing that leads to the classification of the end user developers.

Keywords: MACS, implementation, multi-agent, SOA, autonomous, WCF

Procedia PDF Downloads 258
4674 HPTLC Based Qualitative and Quantitative Evaluation of Uraria picta Desv: A Dashmool Species

Authors: Hari O. Saxena, Ganesh

Abstract:

In the present investigation, chemical fingerprints of methanolic extracts of roots, stem and leaves of Uraria picta were developed using HPTLC technique. These fingerprints will be useful for authentication as well as in differentiating the species from adulterants. These will also serve as a biochemical marker for this valuable species in pharmaceutical industries and plant systemic studies. Roots, stem and leaves of Uraria picta were further evaluated for quantification of an active ingredient lupeol to find out alternatives to roots. Results showed more content of lupeol in stem (0.048%, dry wt.) as compare to roots (0.017%, dry wt.) suggesting the utilization of stem in place of roots. It will avoid uprooting of this prestigious plant which ultimately will promote its conservation.

Keywords: chemical fingerprints, lupeol, quantification, Uraria picta

Procedia PDF Downloads 231
4673 Validation of the X-Ray Densitometry Method for Radial Density Pattern Determination of Acacia seyal var. seyal Tree Species

Authors: Hanadi Mohamed Shawgi Gamal, Claus Thomas Bues

Abstract:

Wood density is a variable influencing many of the technological and quality properties of wood. Understanding the pattern of wood density radial variation is important for its end-use. The X-ray technique, traditionally applied to softwood species to assess the wood quality properties, due to its simple and relatively uniform wood structure. On the other hand, very limited information is available about the validation of using this technique for hardwood species. The suitability of using the X-ray technique for the determination of hardwood density has a special significance in countries like Sudan, where only a few timbers are well known. This will not only save the time consumed by using the traditional methods, but it will also enhance the investigations of the great number of the lesser known species, the thing which will fill the huge cap of lake information of hardwood species growing in Sudan. The current study aimed to evaluate the validation of using the X-ray densitometry technique to determine the radial variation of wood density of Acacia seyal var. seyal. To this, a total of thirty trees were collected randomly from four states in Sudan. The wood density radial trend was determined using the basic density as well as density obtained by the X-ray densitometry method in order to assess the validation of X-ray technique in wood density radial variation determination. The results showed that the pattern of radial trend of density obtained by X-ray technique is very similar to that achieved by basic density. These results confirmed the validation of using the X-ray technique for Acacia seyal var. seyal density radial trend determination. It also promotes the suitability of using this method in other hardwood species.

Keywords: x-ray densitometry, wood density, Acacia seyal var. seyal, radial variation

Procedia PDF Downloads 124
4672 A Text Classification Approach Based on Natural Language Processing and Machine Learning Techniques

Authors: Rim Messaoudi, Nogaye-Gueye Gning, François Azelart

Abstract:

Automatic text classification applies mostly natural language processing (NLP) and other AI-guided techniques to automatically classify text in a faster and more accurate manner. This paper discusses the subject of using predictive maintenance to manage incident tickets inside the sociality. It focuses on proposing a tool that treats and analyses comments and notes written by administrators after resolving an incident ticket. The goal here is to increase the quality of these comments. Additionally, this tool is based on NLP and machine learning techniques to realize the textual analytics of the extracted data. This approach was tested using real data taken from the French National Railways (SNCF) company and was given a high-quality result.

Keywords: machine learning, text classification, NLP techniques, semantic representation

Procedia PDF Downloads 66
4671 Effect of Seasonal Variation on Two Introduced Columbiformes in Awba Dam Tourism Centre, University of Ibadan, Ibadan

Authors: Kolawole F. Farinloye, Samson O. Ojo

Abstract:

Two Columbiformes species were recently introduced to the newly established Awba Dam Tourism Centre [ADTC], hence there is need to investigate the effect of seasonal variation on these species with respect to hematological composition. Blood samples were obtained from superficial ulna vein of the 128 apparently healthy C. livia and C. guinea into tubes containing EDTA as anticoagulant. Thin blood smears (TBS) were prepared, stained and viewed under microscope. Values of Red Blood Cell (RBC) count, White Blood Cell (WBC) count, cholesterol (CH), Uric Acid (UA), Protein (PR), Mean Corpuscular Volume (MCV), Haemoglobin Content (HB), Blood Volume (BV), Plasma Glucose (PG) and Length/Width (L/W) ratio of red blood cells were assessed. The procedure was carried out on a seasonal basis (wet and dry seasons of 2013-2014). Data was analyzed using descriptive and inferential statistics. Lymphocyte count for C. livia was F3, 161 = 13.15, while for C. guinea was F3, 178 = 13.15. Heterophil, H/L ratio and Muscle score values for both species were (rs = -0.38, rs = -0.44), (rs = 0.51, rs = 0.31) (4, 3) respectively. Analyses also demonstrated a low WBC to RBC ratio (0.004: 25.3) in both species during the wet season compared to dry season, respectively. L/W varied significantly among sampling seasons i.e. wet (19.1% of BV, 12.6% of BV, 0.1% of BV) and dry (18.9% of BV, 12.7% of BV, 0.08% of BV). The level of HB in wet season (19.20±8.46108) is lower compared to dry season (19.70±8.48762). T-test also showed (wet=15.625, 0.111), (dry=12.125, 0.146) respectively, hence there is no association between species and haematological parameters. Species introduced were found to be haematologically stable. Although there were slight differences in seasonal composition, however this can be attributed to seasonal variation; suggesting little or no effect of seasons on their blood composition.

Keywords: seasonal variation, Columbiformes, Awba Dam tourism centre, University of Ibadan, Ibadan

Procedia PDF Downloads 300
4670 One Species into Five: Nucleo-Mito Barcoding Reveals Cryptic Species in 'Frankliniella Schultzei Complex': Vector for Tospoviruses

Authors: Vikas Kumar, Kailash Chandra, Kaomud Tyagi

Abstract:

The insect order Thysanoptera includes small insects commonly called thrips. As insect vectors, only thrips are capable of Tospoviruses transmission (genus Tospovirus, family Bunyaviridae) affecting various crops. Currently, fifteen species of subfamily Thripinae (Thripidae) have been reported as vectors for tospoviruses. Frankliniella schultzei, which is reported as act as a vector for at least five tospovirses, have been suspected to be a species complex with more than one species. It is one of the historical unresolved issues where, two species namely, F. schultzei Trybom and F. sulphurea Schmutz were erected from South Africa and Srilanaka respectively. These two species were considered to be valid until 1968 when sulphurea was treated as colour morph (pale form) and synonymised under schultzei (dark form) However, these two have been considered as valid species by some of the thrips workers. Parallel studies have indicated that brown form of schultzei is a vector for tospoviruses while yellow form is a non-vector. However, recent studies have shown that yellow populations have also been documented as vectors. In view of all these facts, it is highly important to have a clear understanding whether these colour forms represent true species or merely different populations with different vector carrying capacities and whether there is some hidden diversity in 'Frankliniella schultzei species complex'. In this study, we aim to study the 'Frankliniella schultzei species complex' with molecular spectacles with DNA data from India and Australia and Africa. A total of fifty-five specimens was collected from diverse locations in India and Australia. We generated molecular data using partial fragments of mitochondrial cytochrome c oxidase I gene (mtCOI) and 28S rRNA gene. For COI dataset, there were seventy-four sequences, out of which data on fifty-five was generated in the current study and others were retrieved from NCBI. All the four different tree construction methods: neighbor-joining, maximum parsimony, maximum likelihood and Bayesian analysis, yielded the same tree topology and produced five cryptic species with high genetic divergence. For, rDNA, there were forty-five sequences, out of which data on thirty-nine was generated in the current study and others were retrieved from NCBI. The four tree building methods yielded four cryptic species with high bootstrap support value/posterior probability. Here we could not retrieve one cryptic species from South Africa as we could not generate data on rDNA from South Africa and sequence for rDNA from African region were not available in the database. The results of multiple species delimitation methods (barcode index numbers, automatic barcode gap discovery, general mixed Yule-coalescent, and Poisson-tree-processes) also supported the phylogenetic data and produced 5 and 4 Molecular Operational Taxonomic Units (MOTUs) for mtCOI and 28S dataset respectively. These results of our study indicate the likelihood that F. sulphurea may be a valid species, however, more morphological and molecular data is required on specimens from type localities of these two species and comparison with type specimens.

Keywords: DNA barcoding, species complex, thrips, species delimitation

Procedia PDF Downloads 110
4669 Wolof Voice Response Recognition System: A Deep Learning Model for Wolof Audio Classification

Authors: Krishna Mohan Bathula, Fatou Bintou Loucoubar, FNU Kaleemunnisa, Christelle Scharff, Mark Anthony De Castro

Abstract:

Voice recognition algorithms such as automatic speech recognition and text-to-speech systems with African languages can play an important role in bridging the digital divide of Artificial Intelligence in Africa, contributing to the establishment of a fully inclusive information society. This paper proposes a Deep Learning model that can classify the user responses as inputs for an interactive voice response system. A dataset with Wolof language words ‘yes’ and ‘no’ is collected as audio recordings. A two stage Data Augmentation approach is adopted for enhancing the dataset size required by the deep neural network. Data preprocessing and feature engineering with Mel-Frequency Cepstral Coefficients are implemented. Convolutional Neural Networks (CNNs) have proven to be very powerful in image classification and are promising for audio processing when sounds are transformed into spectra. For performing voice response classification, the recordings are transformed into sound frequency feature spectra and then applied image classification methodology using a deep CNN model. The inference model of this trained and reusable Wolof voice response recognition system can be integrated with many applications associated with both web and mobile platforms.

Keywords: automatic speech recognition, interactive voice response, voice response recognition, wolof word classification

Procedia PDF Downloads 90
4668 A Deep Learning Approach to Subsection Identification in Electronic Health Records

Authors: Nitin Shravan, Sudarsun Santhiappan, B. Sivaselvan

Abstract:

Subsection identification, in the context of Electronic Health Records (EHRs), is identifying the important sections for down-stream tasks like auto-coding. In this work, we classify the text present in EHRs according to their information, using machine learning and deep learning techniques. We initially describe briefly about the problem and formulate it as a text classification problem. Then, we discuss upon the methods from the literature. We try two approaches - traditional feature extraction based machine learning methods and deep learning methods. Through experiments on a private dataset, we establish that the deep learning methods perform better than the feature extraction based Machine Learning Models.

Keywords: deep learning, machine learning, semantic clinical classification, subsection identification, text classification

Procedia PDF Downloads 186
4667 Impact of Land-Use and Climate Change on the Population Structure and Distribution Range of the Rare and Endangered Dracaena ombet and Dobera glabra in Northern Ethiopia

Authors: Emiru Birhane, Tesfay Gidey, Haftu Abrha, Abrha Brhan, Amanuel Zenebe, Girmay Gebresamuel, Florent Noulèkoun

Abstract:

Dracaena ombet and Dobera glabra are two of the most rare and endangered tree species in dryland areas. Unfortunately, their sustainability is being compromised by different anthropogenic and natural factors. However, the impacts of ongoing land use and climate change on the population structure and distribution of the species are less explored. This study was carried out in the grazing lands and hillside areas of the Desa'a dry Afromontane forest, northern Ethiopia, to characterize the population structure of the species and predict the impact of climate change on their potential distributions. In each land-use type, abundance, diameter at breast height, and height of the trees were collected using 70 sampling plots distributed over seven transects spaced one km apart. The geographic coordinates of each individual tree were also recorded. The results showed that the species populations were characterized by low abundance and unstable population structure. The latter was evinced by a lack of seedlings and mature trees. The study also revealed that the total abundance and dendrometric traits of the trees were significantly different between the two land uses. The hillside areas had a denser abundance of bigger and taller trees than the grazing lands. Climate change predictions using the MaxEnt model highlighted that future temperature increases coupled with reduced precipitation would lead to significant reductions in the suitable habitats of the species in northern Ethiopia. The species' suitable habitats were predicted to decline by 48–83% for D. ombet and 35–87% for D. glabra. Hence, to sustain the species populations, different strategies should be adopted, namely the introduction of alternative livelihoods (e.g., gathering NTFP) to reduce the overexploitation of the species for subsistence income and the protection of the current habitats that will remain suitable in the future using community-based exclosures. Additionally, the preservation of the species' seeds in gene banks is crucial to ensure their long-term conservation.

Keywords: grazing lands, hillside areas, land-use change, MaxEnt, range limitation, rare and endangered tree species

Procedia PDF Downloads 55
4666 Bird-Adapted Filter for Avian Species and Individual Identification Systems Improvement

Authors: Ladislav Ptacek, Jan Vanek, Jan Eisner, Alexandra Pruchova, Pavel Linhart, Ludek Muller, Dana Jirotkova

Abstract:

One of the essential steps of avian song processing is signal filtering. Currently, the standard methods of filtering are the Mel Bank Filter or linear filter distribution. In this article, a new type of bank filter called the Bird-Adapted Filter is introduced; whereby the signal filtering is modifiable, based upon a new mathematical description of audiograms for particular bird species or order, which was named the Avian Audiogram Unified Equation. According to the method, filters may be deliberately distributed by frequency. The filters are more concentrated in bands of higher sensitivity where there is expected to be more information transmitted and vice versa. Further, it is demonstrated a comparison of various filters for automatic individual recognition of chiffchaff (Phylloscopus collybita). The average Equal Error Rate (EER) value for Linear bank filter was 16.23%, for Mel Bank Filter 18.71%, the Bird-Adapted Filter gave 14.29%, and Bird-Adapted Filter with 1/3 modification was 12.95%. This approach would be useful for practical use in automatic systems for avian species and individual identification. Since the Bird-Adapted Filter filtration is based on the measured audiograms of particular species or orders, selecting the distribution according to the avian vocalization provides the most precise filter distribution to date.

Keywords: avian audiogram, bird individual identification, bird song processing, bird species recognition, filter bank

Procedia PDF Downloads 361
4665 Comparative Analysis of Spectral Estimation Methods for Brain-Computer Interfaces

Authors: Rafik Djemili, Hocine Bourouba, M. C. Amara Korba

Abstract:

In this paper, we present a method in order to classify EEG signals for Brain-Computer Interfaces (BCI). EEG signals are first processed by means of spectral estimation methods to derive reliable features before classification step. Spectral estimation methods used are standard periodogram and the periodogram calculated by the Welch method; both methods are compared with Logarithm of Band Power (logBP) features. In the method proposed, we apply Linear Discriminant Analysis (LDA) followed by Support Vector Machine (SVM). Classification accuracy reached could be as high as 85%, which proves the effectiveness of classification of EEG signals based BCI using spectral methods.

Keywords: brain-computer interface, motor imagery, electroencephalogram, linear discriminant analysis, support vector machine

Procedia PDF Downloads 477
4664 Obstacle Classification Method Based on 2D LIDAR Database

Authors: Moohyun Lee, Soojung Hur, Yongwan Park

Abstract:

In this paper is proposed a method uses only LIDAR system to classification an obstacle and determine its type by establishing database for classifying obstacles based on LIDAR. The existing LIDAR system, in determining the recognition of obstruction in an autonomous vehicle, has an advantage in terms of accuracy and shorter recognition time. However, it was difficult to determine the type of obstacle and therefore accurate path planning based on the type of obstacle was not possible. In order to overcome this problem, a method of classifying obstacle type based on existing LIDAR and using the width of obstacle materials was proposed. However, width measurement was not sufficient to improve accuracy. In this research, the width data was used to do the first classification; database for LIDAR intensity data by four major obstacle materials on the road were created; comparison is made to the LIDAR intensity data of actual obstacle materials; and determine the obstacle type by finding the one with highest similarity values. An experiment using an actual autonomous vehicle under real environment shows that data declined in quality in comparison to 3D LIDAR and it was possible to classify obstacle materials using 2D LIDAR.

Keywords: obstacle, classification, database, LIDAR, segmentation, intensity

Procedia PDF Downloads 314
4663 Performance Analysis with the Combination of Visualization and Classification Technique for Medical Chatbot

Authors: Shajida M., Sakthiyadharshini N. P., Kamalesh S., Aswitha B.

Abstract:

Natural Language Processing (NLP) continues to play a strategic part in complaint discovery and medicine discovery during the current epidemic. This abstract provides an overview of performance analysis with a combination of visualization and classification techniques of NLP for a medical chatbot. Sentiment analysis is an important aspect of NLP that is used to determine the emotional tone behind a piece of text. This technique has been applied to various domains, including medical chatbots. In this, we have compared the combination of the decision tree with heatmap and Naïve Bayes with Word Cloud. The performance of the chatbot was evaluated using accuracy, and the results indicate that the combination of visualization and classification techniques significantly improves the chatbot's performance.

Keywords: sentimental analysis, NLP, medical chatbot, decision tree, heatmap, naïve bayes, word cloud

Procedia PDF Downloads 51
4662 Authentication and Traceability of Meat Products from South Indian Market by Species-Specific Polymerase Chain Reaction

Authors: J. U. Santhosh Kumar, V. Krishna, Sebin Sebastian, G. S. Seethapathy, G. Ravikanth, R. Uma Shaanker

Abstract:

Food is one of the basic needs of human beings. It requires the normal function of the body part and a healthy growth. Recently, food adulteration increases day by day to increase the quantity and make more benefit. Animal source foods can provide a variety of micronutrients that are difficult to obtain in adequate quantities from plant source foods alone. Particularly in the meat industry, products from animals are susceptible targets for fraudulent labeling due to the economic profit that results from selling cheaper meat as meat from more profitable and desirable species. This work presents an overview of the main PCR-based techniques applied to date to verify the authenticity of beef meat and meat products from beef species. We were analyzed 25 market beef samples in South India. We examined PCR methods based on the sequence of the cytochrome b gene for source species identification. We found all sample were sold as beef meat as Bos Taurus. However, interestingly Male meats are more valuable high price compare to female meat, due to this reason most of the markets samples are susceptible. We were used sex determination gene of cattle like TSPY(Y-encoded, testis-specific protein TSPY is a Y-specific gene). TSPY homologs exist in several mammalian species, including humans, horses, and cattle. This gene is Y coded testis protein genes, which only amplify the male. We used multiple PCR products form species-specific “fingerprints” on gel electrophoresis, which may be useful for meat authentication. Amplicons were obtained only by the Cattle -specific PCR. We found 13 market meat samples sold as female beef samples. These results suggest that the species-specific PCR methods established in this study would be useful for simple and easy detection of adulteration of meat products.

Keywords: authentication, meat products, species-specific, TSPY

Procedia PDF Downloads 349
4661 Metamorphic Computer Virus Classification Using Hidden Markov Model

Authors: Babak Bashari Rad

Abstract:

A metamorphic computer virus uses different code transformation techniques to mutate its body in duplicated instances. Characteristics and function of new instances are mostly similar to their parents, but they cannot be easily detected by the majority of antivirus in market, as they depend on string signature-based detection techniques. The purpose of this research is to propose a Hidden Markov Model for classification of metamorphic viruses in executable files. In the proposed solution, portable executable files are inspected to extract the instructions opcodes needed for the examination of code. A Hidden Markov Model trained on portable executable files is employed to classify the metamorphic viruses of the same family. The proposed model is able to generate and recognize common statistical features of mutated code. The model has been evaluated by examining the model on a test data set. The performance of the model has been practically tested and evaluated based on False Positive Rate, Detection Rate and Overall Accuracy. The result showed an acceptable performance with high average of 99.7% Detection Rate.

Keywords: malware classification, computer virus classification, metamorphic virus, metamorphic malware, Hidden Markov Model

Procedia PDF Downloads 290
4660 Genomics of Aquatic Adaptation

Authors: Agostinho Antunes

Abstract:

The completion of the human genome sequencing in 2003 opened a new perspective into the importance of whole genome sequencing projects, and currently multiple species are having their genomes completed sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. This voluminous sequencing data generated across multiple organisms provides also the framework to better understand the genetic makeup of such species and related ones, allowing to explore the genetic changes underlining the evolution of diverse phenotypic traits. Here, recent results from our group retrieved from comparative evolutionary genomic analyses of selected marine animal species will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

Keywords: comparative genomics, adaptive evolution, bioinformatics, phylogenetics, genome mining

Procedia PDF Downloads 507
4659 Road Vehicle Recognition Using Magnetic Sensing Feature Extraction and Classification

Authors: Xiao Chen, Xiaoying Kong, Min Xu

Abstract:

This paper presents a road vehicle detection approach for the intelligent transportation system. This approach mainly uses low-cost magnetic sensor and associated data collection system to collect magnetic signals. This system can measure the magnetic field changing, and it also can detect and count vehicles. We extend Mel Frequency Cepstral Coefficients to analyze vehicle magnetic signals. Vehicle type features are extracted using representation of cepstrum, frame energy, and gap cepstrum of magnetic signals. We design a 2-dimensional map algorithm using Vector Quantization to classify vehicle magnetic features to four typical types of vehicles in Australian suburbs: sedan, VAN, truck, and bus. Experiments results show that our approach achieves a high level of accuracy for vehicle detection and classification.

Keywords: vehicle classification, signal processing, road traffic model, magnetic sensing

Procedia PDF Downloads 296