Search results for: DNA sequences classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2640

2400 Using Gene Expression Programming in Learning Process of Rough Neural Networks

Authors: Sanaa Rashed Abdallah, Yasser F. Hassan

Abstract:

The paper introduces an approach in which rough sets, gene expression programming and rough neural networks are used cooperatively for learning and classification support. The objective of the gene expression programming rough neural networks (GEP-RNN) approach is to obtain newly classified data with minimum error in the training and testing process. The starting point of the GEP-RNN approach is an information system, and its output is a rough neural network structure, including the weights and thresholds, with minimum classification error.

Keywords: rough sets, gene expression programming, rough neural networks, classification

Procedia PDF Downloads 355
2399 A Statistical Approach to Classification of Agricultural Regions

Authors: Hasan Vural

Abstract:

Turkey is a favorable country for producing a great variety of agricultural products because of its different geographic and climatic conditions, which have been used to divide the country into four main and seven sub-regions. This classification into seven regions has traditionally been used for data collection and publication, especially in relation to agricultural production. Afterwards, nine agricultural regions were considered. Recently, the governmental body responsible for data collection and dissemination (Turkish Institute of Statistics, TIS) has used 12 classes, which include 11 sub-regions and Istanbul province. This study aims to evaluate these classification efforts based on the acreage of ten main crops over a ten-year period (1996-2005). The panel data, grouped into 11 sub-regions, were evaluated by cluster and multivariate statistical methods. It was concluded that, from the agricultural production point of view, it would be more meaningful to consider three main and eight sub-agricultural regions throughout the country.

Keywords: agricultural region, factorial analysis, cluster analysis

Procedia PDF Downloads 394
2398 The Change of Urban Land Use/Cover Using Object Based Approach for Southern Bali

Authors: I. Gusti A. A. Rai Asmiwyati, Robert J. Corner, Ashraf M. Dewan

Abstract:

Change in land use/cover (LULC) strongly affects spatial structure and function. It can have such impacts by disrupting social and cultural practices and disturbing physical elements. Thus, it has become essential to understand the dynamics of LULC in time and space, as this understanding can be used as a critical input for developing sustainable LULC. This study was an attempt to map and monitor LULC change in Bali, Indonesia, from 2003 to 2013. Using object-based classification to improve accuracy, together with change detection, multi-temporal land use/cover data were extracted from a set of ASTER satellite images. The overall accuracies of the classification maps of 2003 and 2013 were 86.99% and 80.36%, respectively. Built-up area and paddy field were the dominant types of land use/cover in both years. The dominant patch increase in 2003 illustrated the rapid paddy field fragmentation and the large transformation taking place. This approach is new for the diverse urban features of Bali, which has been growing fast, and it increased the classification accuracy compared with manual pixel-based classification.

Keywords: land use/cover, urban, Bali, ASTER

Procedia PDF Downloads 527
2397 Advanced Magnetic Resonance Imaging in Differentiation of Neurocysticercosis and Tuberculoma

Authors: Rajendra N. Ghosh, Paramjeet Singh, Niranjan Khandelwal, Sameer Vyas, Pratibha Singhi, Naveen Sankhyan

Abstract:

Background: Tuberculoma and neurocysticercosis (NCC) are the two most common intracranial infections in developing countries. They often mimic each other on neuroimaging and, in the absence of typical imaging features, cause significant diagnostic dilemmas. Differentiation is extremely important to avoid empirical exposure to antitubercular medications or nonspecific treatment causing disease progression. Purpose: Better characterization and differentiation of CNS tuberculoma and NCC using morphological and multiple advanced functional MRI. Material and Methods: A total of fifty untreated patients (20 tuberculoma and 30 NCC) were evaluated using conventional and advanced sequences such as CISS, SWI, DWI, DTI, magnetization transfer (MT), T2 relaxometry (T2R), perfusion and spectroscopy. rCBV, ADC, FA, T2R and MTR values and metabolite ratios were calculated from lesion and normal parenchyma. Diagnosis was confirmed by typical biochemical, histopathological and imaging features. Results: CISS was the most useful sequence for scolex detection (90% on CISS vs 73% on routine sequences). SWI showed higher scolex detection ability. Mean values of ADC, FA and T2R from the core and rCBV from the wall of the lesion were significantly different in tuberculoma and NCC (P < 0.05). Mean values of rCBV, ADC, T2R and FA for tuberculoma and NCC were (3.36 vs 1.3), (1.09 x 10⁻³ vs 1.4 x 10⁻³), (0.13 x 10⁻³ vs 0.09 x 10⁻³) and (88.65 ms vs 272.3 ms), respectively. Tuberculomas showed a high lipid peak, more choline and lower creatinine, with a Ch/Cr ratio > 1. The T2R value was the most significant parameter for differentiation. Cut-off values for each significant parameter have been proposed. Conclusion: Quantitative MRI in combination with conventional sequences can better characterize and differentiate similar-appearing tuberculoma and NCC and may be incorporated into routine protocols, which may avoid brain biopsy and empirical therapy.

Keywords: advanced functional MRI, differentiation, neurocysticercosis, tuberculoma

Procedia PDF Downloads 539
2396 Genome-Wide Analysis of BES1/BZR1 Gene Family in Five Plant Species

Authors: Jafar Ahmadi, Zhohreh Asiaban, Sedigheh Fabriki Ourang

Abstract:

Brassinosteroids (BRs) regulate cell elongation, vascular differentiation, senescence and stress responses. BRs signal through the BES1/BZR1 family of transcription factors, which regulate hundreds of target genes involved in this pathway. In this research, a comprehensive genome-wide analysis of the BES1/BZR1 gene family was carried out in Arabidopsis thaliana, Cucumis sativus, Vitis vinifera, Glycine max, and Brachypodium distachyon. Specifications of the desired sequences, dot plots and hydropathy plots were analyzed in the protein and genome sequences of the five plant species. The maximum amino acid length was found in protein sequence Brdic3g, with 374 aa, and the minimum in protein sequence Gm7g, with 163 aa. The maximum instability index was found in protein sequence AT1G19350, at 79.99, and the minimum in protein sequence Gm5g, at 33.22. The aliphatic index of these protein sequences ranged from 47.82 to 78.79 in Arabidopsis thaliana, 49.91 to 57.50 in Vitis vinifera, 55.09 to 82.43 in Glycine max, 54.09 to 54.28 in Brachypodium distachyon, and 55.36 to 56.83 in Cucumis sativus. Overall, the data obtained from our investigation contribute to a better understanding of the complexity of the BES1/BZR1 gene family and provide a first step towards directing future experimental designs for systematic analysis of the functions of the BES1/BZR1 gene family.
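The abstract reports aliphatic index ranges but does not say how they were computed. The sketch below assumes the commonly used definition by Ikai (mole percent of alanine, plus 2.9 times valine, plus 3.9 times isoleucine and leucine combined). The function name and the toy peptide are illustrative only and are not taken from the study.

```python
from collections import Counter


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index, assuming Ikai's definition:
    X_Ala + 2.9 * X_Val + 3.9 * (X_Ile + X_Leu),
    where X is the mole percent of each residue in the sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    n = len(seq)
    # Mole percent of the four aliphatic residues used by the index.
    pct = {aa: 100.0 * counts.get(aa, 0) / n for aa in "AVIL"}
    return pct["A"] + 2.9 * pct["V"] + 3.9 * (pct["I"] + pct["L"])


# Hypothetical toy peptide, not one of the BES1/BZR1 sequences.
toy_peptide = "MAVILGAVLK"
print(round(aliphatic_index(toy_peptide), 2))
```

A higher value means a larger relative volume of aliphatic side chains, which is usually read as an indicator of thermostability.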

Keywords: BES1/BZR1, brassinosteroids, phylogenetic analysis, transcription factor

Procedia PDF Downloads 314
2395 Land Cover Classification System for the Estimation of Carbon Storage in Terrestrial Ecosystems

Authors: Lei Zhang

Abstract:

The carbon cycle greatly influences global change, and the land cover changes contribute to the status and rate of the carbon budget in ecosystems. This paper proposes a land cover classification system for mapping land cover, the national ecological environment assessment, and estimating carbon storage in ecosystems. The classification system consists of basic land cover classes at levels Ⅰ and Ⅱ and auxiliary features at level III. The basic 38 classes characterizing land cover features are derived from 19 criteria referring to composition, structure, pattern, phenology, etc. The basic classes reflect the status of carbon storage in ecosystems. The auxiliary classes at level III complement the attributes of higher levels by 9 criteria. The 5 environmental criteria of temperature, moisture, landform, aspect and slope mainly reflect the potential and intensity of carbon storage in ecosystems. The disturbance of vegetation succession caused by land use type influences the vegetation carbon budget. The other 3 vegetation cover criteria, growth period, and species characteristics further refine the vegetation types. The hierarchical structure of the land cover map (the classes of levels Ⅰ and Ⅱ) is independent of the products of level III, which is helpful for land cover product management and applications. The classification system has been adopted in the Chinese national land cover database for the carbon budget in ecosystems at a 30 m scale.

Keywords: classification system, land cover, ecosystem, carbon storage, object based

Procedia PDF Downloads 46
2394 Genotyping and Phylogeny of Phaeomoniella Genus Associated with Grapevine Trunk Diseases in Algeria

Authors: A. Berraf-Tebbal, Z. Bouznad, A.J.L. Phillips

Abstract:

Phaeomoniella is a fungal genus in the mitosporic Ascomycota that includes Phaeomoniella chlamydospora, a species associated with two decline diseases of grapevine (Vitis vinifera), namely Petri disease and esca. Recent studies have shown that several Phaeomoniella species also cause disease on many other woody crops, such as forest trees and woody ornamentals. Two new species, Phaeomoniella zymoides and Phaeomoniella pinifoliorum H.B. Lee, J.Y. Park, R.C. Summerbell et H.S. Jung, were isolated from the needle surface of Pinus densiflora Sieb. et Zucc. in Korea. The identification of species in the genus Phaeomoniella can be a difficult task if based solely on morphological and cultural characters. In this respect, the application of molecular methods, particularly PCR-based techniques, may provide an important contribution. MSP-PCR (microsatellite-primed PCR) fingerprinting has proven useful in the molecular typing of fungal strains. The high discriminatory potential of this method is particularly useful when dealing with closely related or cryptic species. In the present study, PCR fingerprinting was performed using the microsatellite primer M13 for the purpose of species identification and strain typing of 84 Phaeomoniella-like isolates collected from grapevines with typical symptoms of dieback. The bands produced by MSP-PCR profiles divided the strains into 3 clusters and 5 singletons with a reproducibility level of 80%. Representative isolates from each group and, when possible, isolates from Eutypa dieback and esca symptoms were selected for sequencing of the ITS region. The ITS sequences for the 16 isolates selected from the MSP-PCR profiles were combined and aligned with sequences of 18 isolates retrieved from GenBank, representing a selection of all known Phaeomoniella species. DNA sequences were compared with those available in GenBank using Neighbor-joining (NJ) and Maximum-parsimony (MP) analyses. The phylogenetic trees of the ITS region revealed that the Phaeomoniella isolates clustered with Phaeomoniella chlamydospora reference sequences with a bootstrap support of 100%. The complexity of the grapevine trunk disease pathosystems clearly shows the need to identify the fungal component unambiguously in order to allow a better understanding of the etiology of these diseases and to justify the establishment of control strategies against these fungal agents.

Keywords: Genotyping, MSP-PCR, ITS, phylogeny, trunk diseases

Procedia PDF Downloads 464
2393 From Type-I to Type-II Fuzzy System Modeling for Diagnosis of Hepatitis

Authors: Shahabeddin Sotudian, M. H. Fazel Zarandi, I. B. Turksen

Abstract:

Hepatitis is one of the most common and dangerous diseases that affects humankind, and exposes millions of people to serious health risks every year. Diagnosis of Hepatitis has always been a challenge for physicians. This paper presents an effective method for diagnosis of hepatitis based on interval Type-II fuzzy. This proposed system includes three steps: pre-processing (feature selection), Type-I and Type-II fuzzy classification, and system evaluation. KNN-FD feature selection is used as the preprocessing step in order to exclude irrelevant features and to improve classification performance and efficiency in generating the classification model. In the fuzzy classification step, an “indirect approach” is used for fuzzy system modeling by implementing the exponential compactness and separation index for determining the number of rules in the fuzzy clustering approach. Therefore, we first proposed a Type-I fuzzy system that had an accuracy of approximately 90.9%. In the proposed system, the process of diagnosis faces vagueness and uncertainty in the final decision. Thus, the imprecise knowledge was managed by using interval Type-II fuzzy logic. The results that were obtained show that interval Type-II fuzzy has the ability to diagnose hepatitis with an average accuracy of 93.94%. The classification accuracy obtained is the highest one reached thus far. The aforementioned rate of accuracy demonstrates that the Type-II fuzzy system has a better performance in comparison to Type-I and indicates a higher capability of Type-II fuzzy system for modeling uncertainty.

Keywords: hepatitis disease, medical diagnosis, type-I fuzzy logic, type-II fuzzy logic, feature selection

Procedia PDF Downloads 282
2392 DeClEx-Processing Pipeline for Tumor Classification

Authors: Gaurav Shinde, Sai Charan Gongiguntla, Prajwal Shirur, Ahmed Hambaba

Abstract:

Health issues are significantly increasing, putting a substantial strain on healthcare services. This has accelerated the integration of machine learning in healthcare, particularly following the COVID-19 pandemic. The utilization of machine learning in healthcare has grown significantly. We introduce DeClEx, a pipeline that ensures that data mirrors real-world settings by incorporating Gaussian noise and blur and employing autoencoders to learn intermediate feature representations. Subsequently, our convolutional neural network, paired with spatial attention, provides comparable accuracy to state-of-the-art pre-trained models while achieving a threefold improvement in training speed. Furthermore, we provide interpretable results using explainable AI techniques. We integrate denoising and deblurring, classification, and explainability in a single pipeline called DeClEx.

Keywords: machine learning, healthcare, classification, explainability

Procedia PDF Downloads 12
2391 Integrating Explicit Instruction and Problem-Solving Approaches for Efficient Learning

Authors: Slava Kalyuga

Abstract:

There are two opposing major points of view on the optimal degree of initial instructional guidance that is usually discussed in the literature by the advocates of the corresponding learning approaches. Using unguided or minimally guided problem-solving tasks prior to explicit instruction has been suggested by productive failure and several other instructional theories, whereas an alternative approach - using fully guided worked examples followed by problem solving - has been demonstrated as the most effective strategy within the framework of cognitive load theory. An integrated approach discussed in this paper could combine the above frameworks within a broader theoretical perspective which would allow bringing together their best features and advantages in the design of learning tasks for STEM education. This paper represents a systematic review of the available empirical studies comparing the above alternative sequences of instructional methods to explore effects of several possible moderating factors. The paper concludes that different approaches and instructional sequences should coexist within complex learning environments. Selecting optimal sequences depends on such factors as specific goals of learner activities, types of knowledge to learn, levels of element interactivity (task complexity), and levels of learner prior knowledge. This paper offers an outline of a theoretical framework for the design of complex learning tasks in STEM education that would integrate explicit instruction and inquiry (exploratory, discovery) learning approaches in ways that depend on a set of defined specific factors.

Keywords: cognitive load, explicit instruction, exploratory learning, worked examples

Procedia PDF Downloads 103
2390 Prediction of All-Beta Protein Secondary Structure Using Garnier-Osguthorpe-Robson Method

Authors: K. Tejasri, K. Suvarna Vani, S. Prathyusha, S. Ramya

Abstract:

Proteins are chains of amino acids joined by peptide bonds. Many different conformations of the chains are possible owing to the multiple combinations of amino acids and rotation at numerous positions along the chain. Protein structure prediction is one of the crucial goals pursued by researchers in bioinformatics and theoretical chemistry. Among the four structural levels of proteins, we focus mainly on the secondary structure. In general, protein secondary structure comprises alpha-helices and beta-sheets. Multi-class classification of imbalanced data is a genuine challenge and has to be addressed for the beta strands. In an imbalanced data distribution, a couple of the classes have very few training samples compared with the other classes. The secondary structure data are extracted from the protein primary sequence, and the beta-strands are predicted using suitable machine learning algorithms.

Keywords: proteins, secondary structure elements, beta-sheets, beta-strands, alpha-helices, machine learning algorithms

Procedia PDF Downloads 76
2389 A Survey of Skin Cancer Detection and Classification from Skin Lesion Images Using Deep Learning

Authors: Joseph George, Anne Kotteswara Roa

Abstract:

Skin disease is one of the most common health issues faced by people nowadays. Skin cancer (SC) is one such disease, and its detection relies on skin biopsy results and the expertise of doctors, which takes considerable time and can produce inaccurate results. Detecting skin cancer at an early stage is a challenging task, and the disease spreads easily to the whole body and leads to an increase in the mortality rate. Skin cancer is curable when it is detected at an early stage. Identifying and classifying skin cancer correctly is the critical task, and it depends largely on disease features such as shape, size, color and symmetry. Many skin diseases share similar characteristics, which makes it challenging to select important features from skin cancer dataset images. An automated skin cancer detection and classification framework is therefore required to improve diagnostic accuracy and to address the scarcity of human experts. Recently, deep learning techniques such as the convolutional neural network (CNN), deep belief neural network (DBN), artificial neural network (ANN), recurrent neural network (RNN), and long short-term memory (LSTM) have been widely used for the identification and classification of skin cancers. This survey reviews different DL techniques for skin cancer identification and classification. Performance metrics such as precision, recall, accuracy, sensitivity, specificity, and F-measure are used to evaluate the effectiveness of SC identification using DL techniques. With these DL techniques, classification accuracy increases while computational complexity and time consumption are reduced.
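The performance metrics named in the abstract can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch for illustration. The function name and the counts are hypothetical and are not results from the survey.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Binary classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives
    and false negatives. Undefined ratios are reported as 0.0.
    """
    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)          # recall is the same as sensitivity
    specificity = ratio(tn, tn + fp)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f1 = ratio(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for a lesion classifier (malignant = positive class).
for name, value in classification_metrics(tp=40, fp=10, tn=45, fn=5).items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters for medical datasets, where the positive class is often much rarer than the negative class.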

Keywords: skin cancer, deep learning, performance measures, accuracy, datasets

Procedia PDF Downloads 103
2388 Random Subspace Ensemble of CMAC Classifiers

Authors: Somaiyeh Dehghan, Mohammad Reza Kheirkhahan Haghighi

Abstract:

The rapid growth of domains whose data have a large number of features but a limited number of samples has made it difficult to construct strong classifiers. Reducing the dimensionality of the feature space therefore becomes an essential step in the classification task. The random subspace method (or attribute bagging) is an ensemble classifier consisting of several base learners, each of which is trained on a subset of the features. In the present paper, we introduce a Random Subspace Ensemble of CMAC neural networks (RSE-CMAC), in which each network is trained on a subset of the features. We then use this model for the classification task. To evaluate the performance of our model, we compare it with the bagging algorithm on 36 UCI datasets. The results reveal that the new model performs better.
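The random subspace idea described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' RSE-CMAC: a tiny nearest-centroid classifier stands in for the CMAC network, and all class names, parameters and data are hypothetical.

```python
import random
from collections import Counter


class NearestCentroid:
    """Small stand-in base learner (the paper uses CMAC networks instead)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
        self.centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
        return self

    def predict_one(self, row):
        def dist2(label):
            return sum((a - b) ** 2 for a, b in zip(row, self.centroids[label]))
        return min(self.centroids, key=dist2)


class RandomSubspaceEnsemble:
    """Each member sees only a random subset of the feature indices."""

    def __init__(self, n_estimators=10, subspace_size=2, seed=0):
        self.n_estimators = n_estimators
        self.subspace_size = subspace_size
        self.seed = seed

    def fit(self, X, y):
        rng = random.Random(self.seed)
        n_features = len(X[0])
        self.members = []
        for _ in range(self.n_estimators):
            idx = sorted(rng.sample(range(n_features), self.subspace_size))
            X_sub = [[row[j] for j in idx] for row in X]
            self.members.append((idx, NearestCentroid().fit(X_sub, y)))
        return self

    def predict_one(self, row):
        # Majority vote over the members, each using its own feature subset.
        votes = Counter(m.predict_one([row[j] for j in idx]) for idx, m in self.members)
        return votes.most_common(1)[0][0]


# Hypothetical toy data: four features, two well-separated classes.
X = [[0, 0, 1, 0], [1, 0, 0, 1], [10, 9, 10, 10], [9, 10, 10, 9]]
y = [0, 0, 1, 1]
ensemble = RandomSubspaceEnsemble(n_estimators=7, subspace_size=2, seed=1).fit(X, y)
print(ensemble.predict_one([0.5, 0.5, 0.5, 0.5]), ensemble.predict_one([9.5, 9.5, 9.5, 9.5]))
```

Bagging, the baseline in the abstract, resamples training rows instead. The random subspace method resamples feature columns, which is why it suits data with many features and few samples.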

Keywords: classification, random subspace, ensemble, CMAC neural network

Procedia PDF Downloads 311
2387 Crop Classification using Unmanned Aerial Vehicle Images

Authors: Iqra Yaseen

Abstract:

Image processing in the context of computer vision, one of the well-known areas of computer science and engineering, has been essential to automation. In remote sensing, medical science, and many other fields, it has made it easier to uncover previously undiscovered facts. Grading of diverse items is now possible because of neural network algorithms, categorization, and digital image processing. Its use in the classification of agricultural products, particularly in the grading of seeds or grains and their cultivars, is widely recognized. A grading and sorting system saves time and preserves consistency and uniformity. Global population growth has led to an increase in demand for food staples, biofuel, and other agricultural products. To meet this demand, available resources must be used and managed more effectively. Image processing is rapidly growing in the field of agriculture. Many applications have been developed using this approach for crop identification and classification, land and disease detection, and for measuring other crop parameters. Vegetation localization is the basis for performing these tasks. Vegetation helps to identify the area where the crop is present. The productivity of the agriculture industry can be increased via image processing based on Unmanned Aerial Vehicle photography and satellite imagery. In this paper, we apply machine learning techniques such as Convolutional Neural Networks, deep learning, image processing, classification, and You Only Look Once (YOLO) to a UAV imaging dataset to divide the crop into distinct groups and choose the best way to use it.

Keywords: image processing, UAV, YOLO, CNN, deep learning, classification

Procedia PDF Downloads 80
2386 Application of Remote Sensing and GIS in Assessing Land Cover Changes within Granite Quarries around Brits Area, South Africa

Authors: Refilwe Moeletsi

Abstract:

Dimension stone quarrying around the Brits and Belfast areas started in the early 1930s and has been growing rapidly since then. Environmental impacts associated with these quarries have not been documented, and hence this study aims at detecting any change in the environment that might have been caused by these activities. Landsat images were used to assess land use/land cover changes in the Brits quarries from 1998 to 2015. A supervised classification using a maximum likelihood classifier was applied to classify each image into different land use/land cover types. Classification accuracy was assessed using Google Earth™ as a source of reference data. A post-classification change detection method was used to determine changes. The results revealed a significant increase in granite quarries and a corresponding decrease in vegetation cover within the study region.
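Post-classification change detection compares two independently classified maps pixel by pixel. The sketch below assumes the two maps are co-registered and flattened into equal-length label lists. The function names, class labels and pixel values are hypothetical and do not come from the study.

```python
from collections import Counter


def change_matrix(before, after):
    """Count pixels for every (class_before, class_after) transition."""
    if len(before) != len(after):
        raise ValueError("maps must cover the same pixels")
    return Counter(zip(before, after))


def net_change(before, after):
    """Net gain or loss in pixel count per class between the two dates."""
    b, a = Counter(before), Counter(after)
    return {cls: a[cls] - b[cls] for cls in sorted(set(b) | set(a))}


# Hypothetical six-pixel maps for two dates.
map_early = ["veg", "veg", "veg", "quarry", "bare", "veg"]
map_late = ["quarry", "veg", "quarry", "quarry", "bare", "veg"]

for (src, dst), count in sorted(change_matrix(map_early, map_late).items()):
    print(f"{src} -> {dst}: {count}")
print(net_change(map_early, map_late))
```

Multiplying each count by the pixel area (for example 30 m x 30 m for Landsat multispectral bands) converts the table into area statistics.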

Keywords: remote sensing, GIS, change detection, granite quarries

Procedia PDF Downloads 290
2385 Hyperspectral Data Classification Algorithm Based on the Deep Belief and Self-Organizing Neural Network

Authors: Li Qingjian, Li Ke, He Chun, Huang Yong

Abstract:

In this paper, a method combining Pohl Seidman's deep belief network with a self-organizing neural network is proposed to classify the target. The method mainly addresses the high nonlinearity of hyperspectral images, the high sample dimension and the difficulty of designing the classifier. The main features of the original data are extracted by the deep belief network. During feature extraction, samples with known labels are added to fine-tune the network and enrich the main characteristics. The extracted feature vectors are then classified by the self-organizing neural network. This method can effectively reduce the spectral dimension of the data while preserving a large amount of the raw data information, which addresses the long training time of traditional clustering and the difficulty of training deep learning algorithms when labeled samples are scarce, and improves classification accuracy and robustness. Data simulation results show that the proposed network structure achieves higher classification precision with a small number of labeled samples.

Keywords: DBN, SOM, pattern classification, hyperspectral, data compression

Procedia PDF Downloads 321
2384 Automatic Method for Classification of Informative and Noninformative Images in Colonoscopy Video

Authors: Nidhal K. Azawi, John M. Gauch

Abstract:

Colorectal cancer is one of the leading causes of cancer death in the US and the world, which is why millions of colonoscopy examinations are performed annually. Unfortunately, noise, specular highlights, and motion artifacts corrupt many images in a typical colonoscopy exam. The goal of our research is to produce automated techniques to detect and correct or remove these noninformative images from colonoscopy videos, so physicians can focus their attention on informative images. In this research, we first automatically extract features from images. Then we use machine learning and a deep neural network to classify colonoscopy images as either informative or noninformative. Our results show that we achieve image classification accuracy between 92% and 98%. We also show how the removal of noninformative images together with image alignment can aid in the creation of image panoramas and other visualizations of colonoscopy images.

Keywords: colonoscopy classification, feature extraction, image alignment, machine learning

Procedia PDF Downloads 236
2383 Predicting Groundwater Areas Using Data Mining Techniques: Groundwater in Jordan as Case Study

Authors: Faisal Aburub, Wael Hadi

Abstract:

Data mining is the process of extracting useful or hidden information from a large database. Extracted information can be used to discover relationships among features, where data objects are grouped according to logical relationships, or to assign unseen objects to one of the predefined groups. In this paper, we aim to investigate four well-known data mining algorithms in order to predict groundwater areas in Jordan. These algorithms are Support Vector Machines (SVMs), Naïve Bayes (NB), K-Nearest Neighbor (kNN) and Classification Based on Association Rule (CBA). The experimental results indicate that the SVMs algorithm outperformed the other algorithms in terms of classification accuracy, precision and F1 evaluation measures using the datasets of groundwater areas that were collected from the Jordanian Ministry of Water and Irrigation.

Keywords: classification, data mining, evaluation measures, groundwater

Procedia PDF Downloads 257
2382 Spatio-Temporal Assessment of Urban Growth and Land Use Change in Islamabad Using Object-Based Classification Method

Authors: Rabia Shabbir, Sheikh Saeed Ahmad, Amna Butt

Abstract:

Rapid land use changes have taken place in Islamabad, the capital city of Pakistan, over the past decades due to accelerated urbanization and industrialization. In this study, land use changes in the metropolitan area of Islamabad were observed through the combined use of GIS and satellite remote sensing over a period of 15 years. High-resolution Google Earth images for 2000-2015 were downloaded, and an object-based classification method was applied using eCognition software for accurate classification. The information regarding urban settlements, industrial area, barren land, agricultural area, vegetation, water, and transportation infrastructure was extracted. The results showed that the city experienced spatial expansion, rapid urban growth, land use change and expanding transportation infrastructure. The study concluded that the integration of GIS and remote sensing is an effective approach for analyzing the spatial pattern of urban growth and land use change.

Keywords: land use change, urban growth, Islamabad, object-based classification, Google Earth, remote sensing, GIS

Procedia PDF Downloads 138
2381 Analyzing Tools and Techniques for Classification In Educational Data Mining: A Survey

Authors: D. I. George Amalarethinam, A. Emima

Abstract:

Educational Data Mining (EDM) is one of the newest topics to emerge in recent years, and it is concerned with developing methods for analyzing various types of data gathered from educational settings. EDM methods and techniques with machine learning algorithms are used to extract meaningful and usable information from huge databases. For scientists and researchers, realistic applications of machine learning in the EDM sector offer new frontiers and present new problems. One of the most important research areas in EDM is predicting student success. Prediction algorithms and techniques must be developed to forecast students' performance, which helps tutors and institutions to raise the level of students' performance. This paper examines various classification techniques in prediction methods and data mining tools used in EDM.

Keywords: classification technique, data mining, EDM methods, prediction methods

Procedia PDF Downloads 102
2380 Morphological Processing of Punjabi Text for Sentiment Analysis of Farmer Suicides

Authors: Jaspreet Singh, Gurvinder Singh, Prabhsimran Singh, Rajinder Singh, Prithvipal Singh, Karanjeet Singh Kahlon, Ravinder Singh Sawhney

Abstract:

Morphological evaluation of Indian languages is one of the burgeoning fields in the area of Natural Language Processing (NLP). The evaluation of a language is an eminent task in the era of information retrieval and text mining. The extraction and classification of knowledge from text can be exploited for sentiment analysis and morphological evaluation. This study combines morphological evaluation and sentiment analysis for the task of classifying farmer suicide cases reported in the Punjab state of India. The pre-processing of Punjabi text involves morphological evaluation and normalization of Punjabi word tokens, followed by training of the proposed model using deep learning classification on Punjabi language text extracted from online Punjabi news reports. The class-wise accuracies of sentiment prediction for four negatively oriented classes of farmer suicide cases are 93.85%, 88.53%, 83.3%, and 95.45%, respectively. The overall accuracy of sentiment classification obtained using the proposed framework on 275 Punjabi text documents is found to be 90.29%.

Keywords: deep neural network, farmer suicides, morphological processing, punjabi text, sentiment analysis

Procedia PDF Downloads 296
2379 A Nonlinear Feature Selection Method for Hyperspectral Image Classification

Authors: Pei-Jyun Hsieh, Cheng-Hsuan Li, Bor-Chen Kuo

Abstract:

For hyperspectral image classification, feature reduction is an important pre-processing step for avoiding the Hughes phenomenon, which arises from the difficulty of collecting training samples. Hence, many researchers have developed feature selection methods, such as F-score and HSIC (Hilbert-Schmidt Independence Criterion), to improve hyperspectral image classification. However, most of them only consider the class separability in the original space, i.e., a linear class separability. In this study, we proposed a nonlinear class separability measure based on the kernel trick for selecting an appropriate feature subset. The proposed nonlinear class separability was formed by a generalized RBF kernel with a different bandwidth for each feature. Moreover, it considered both the within-class separability and the between-class separability. A genetic algorithm was applied to tune these bandwidths so that the within-class separability is smallest and the between-class separability is largest simultaneously. This indicates that the corresponding feature space is more suitable for classification. In addition, the corresponding nonlinear classification boundary can separate classes very well. These optimal bandwidths also show the importance of bands for hyperspectral image classification. The reciprocals of these bandwidths can be viewed as weights of bands: the smaller the bandwidth, the larger the weight of the band and the more important it is for classification. Hence, the descending order of the reciprocals of the bandwidths gives an order for selecting the appropriate feature subsets. In the experiments, three hyperspectral image data sets, the Indian Pine Site data set, the PAVIA data set, and the Salinas A data set, were used to demonstrate that the feature subsets selected by the proposed nonlinear feature selection method are more appropriate for hyperspectral image classification. Only ten percent of samples were randomly selected to form the training dataset. 
All non-background samples were used to form the testing dataset. The support vector machine was applied to classify these testing samples based on the selected feature subsets. According to the experiments on the Indian Pine Site data set with 220 bands, the highest accuracies obtained by the proposed method, F-score, and HSIC are 0.8795, 0.8795, and 0.87404, respectively. However, the proposed method selects 158 features, whereas F-score and HSIC select 168 features and 217 features, respectively. Moreover, the classification accuracies increase dramatically using only the first few features. The classification accuracies with respect to feature subsets of 10 features, 20 features, 50 features, and 110 features are 0.69587, 0.7348, 0.79217, and 0.84164, respectively. Furthermore, using only half of the features (110 features) selected by the proposed method, the corresponding classification accuracy (0.84168) is close to the highest classification accuracy, 0.8795. For the other two hyperspectral image data sets, the PAVIA data set and the Salinas A data set, we obtain similar results. These results illustrate that our proposed method can efficiently find feature subsets to improve hyperspectral image classification. One can apply the proposed method to first determine the suitable feature subset according to specific purposes. Researchers can then use only the corresponding sensors to obtain the hyperspectral image and classify the samples. This not only improves the classification performance but also reduces the cost of obtaining hyperspectral images.
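
As a rough illustration of the idea described in this abstract, the following Python sketch shows one common way to write an RBF kernel with a separate bandwidth per feature, and how bands could be ranked by the reciprocals of those bandwidths. The exact kernel form used by the authors is not given in the abstract, so the formula, the function names `generalized_rbf` and `rank_bands`, and the example bandwidth values are all assumptions made for illustration only. The genetic algorithm that tunes the bandwidths is not shown.

```python
import math


def generalized_rbf(x, z, bandwidths):
    """Assumed kernel form: exp(-sum_i (x_i - z_i)^2 / (2 * sigma_i^2)).

    Each feature i has its own bandwidth sigma_i (hypothetical formulation).
    """
    total = sum(
        (a - b) ** 2 / (2.0 * sigma ** 2)
        for a, b, sigma in zip(x, z, bandwidths)
    )
    return math.exp(-total)


def rank_bands(bandwidths):
    """Return band indices ordered by descending weight, where weight = 1 / bandwidth."""
    weights = [1.0 / sigma for sigma in bandwidths]
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)


# Hypothetical bandwidths for three bands; smaller bandwidth means larger weight.
example_bandwidths = [2.0, 0.5, 1.0]
ranking = rank_bands(example_bandwidths)
similarity = generalized_rbf([0.0, 0.0, 0.0], [1.0, 0.0, 0.0], example_bandwidths)
```

With these made-up bandwidths, the band with the smallest bandwidth is ranked first, which matches the abstract's statement that a smaller bandwidth implies a more important band.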

Keywords: hyperspectral image classification, nonlinear feature selection, kernel trick, support vector machine

Procedia PDF Downloads 245
2378 Personal Information Classification Based on Deep Learning in Automatic Form Filling System

Authors: Shunzuo Wu, Xudong Luo, Yuanxiu Liao

Abstract:

Recently, the rapid development of deep learning has allowed artificial intelligence (AI) to penetrate many fields, replacing manual work there. In particular, AI systems have become a research focus in the field of office automation. To meet real needs in office automation, in this paper we develop an automatic form filling system. Specifically, it uses two classical neural network models and several word embedding models to classify various relevant information elicited from the Internet. When training the neural network models, we use less noisy and balanced data. We conduct a series of experiments to test our system, and the results show that it can achieve better classification results.

Keywords: artificial intelligence and office, NLP, deep learning, text classification

Procedia PDF Downloads 166
2377 A Topological Approach for Motion Track Discrimination

Authors: Tegan H. Emerson, Colin C. Olson, George Stantchev, Jason A. Edelberg, Michael Wilson

Abstract:

Detecting small targets at range is difficult because there is not enough spatial information present in an image sub-region containing the target to use correlation-based methods to differentiate it from dynamic confusers present in the scene. Moreover, this lack of spatial information also disqualifies the use of most state-of-the-art deep learning image-based classifiers. Here, we use characteristics of target tracks extracted from video sequences as data from which to derive distinguishing topological features that help robustly differentiate targets of interest from confusers. In particular, we calculate persistent homology from time-delayed embeddings of dynamic statistics calculated from motion tracks extracted from a wide field-of-view video stream. In short, we use topological methods to extract features related to target motion dynamics that are useful for classification and disambiguation and show that small targets can be detected at range with high probability.
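
The time-delayed embedding step mentioned in this abstract can be sketched in a few lines of Python. This is a minimal illustration of the standard construction only: the function name `time_delay_embedding`, the parameter names, and the sample series are assumptions, and the persistent homology computation that the authors apply afterwards is not shown.

```python
def time_delay_embedding(series, dim, delay):
    """Map a 1-D series to points (s[i], s[i + delay], ..., s[i + (dim - 1) * delay])."""
    n_points = len(series) - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("series is too short for the requested dim and delay")
    return [
        tuple(series[i + j * delay] for j in range(dim))
        for i in range(n_points)
    ]


# Hypothetical per-frame statistic from a motion track (for example, speed).
track_statistic = [0, 1, 2, 3, 4, 5]
points = time_delay_embedding(track_statistic, dim=3, delay=2)
```

Each resulting tuple is one point in a `dim`-dimensional space; a topological summary would then be computed on this point cloud.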

Keywords: motion tracks, persistence images, time-delay embedding, topological data analysis

Procedia PDF Downloads 93
2376 Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine

Authors: Bingchun Liu, Pei-Chann Chang, Natasha Huang, Dun Li

Abstract:

Machine learning and data mining are two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a widely used technique to predict qualitative variables and is generally preferred over regression from an operational point of view. Due to the enormous increase in air pollution in various countries, especially China, air quality classification has become one of the most important topics in air quality research and modelling. This study aims at introducing a hybrid classification model based on information theory and Support Vector Machine (SVM), using the air quality data of four cities in China, namely Beijing, Guangzhou, Shanghai and Tianjin, from Jan 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection has classified the daily air quality into 6 levels, namely Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent, based on their respective Air Quality Index (AQI) values. Using information theory, information gain (IG) is calculated and feature selection is done for both categorical features and continuous numeric features. Then the SVM machine learning algorithm is implemented on the selected features with cross-validation. The final evaluation reveals that the IG and SVM hybrid model performs better than SVM (alone), Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of accuracy as well as complexity.
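
For readers unfamiliar with information gain, the sketch below computes it for a single categorical feature using the standard entropy-based definition. It is only an illustration under stated assumptions: the function names and the toy feature and label values are hypothetical, and the abstract does not say how the authors handled continuous features (a discretization step would be needed before applying this function to them).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG = H(labels) - sum_v P(feature = v) * H(labels | feature = v)."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data (made up): one feature that separates the classes, one that does not.
levels = ["Good", "Good", "Light Pollution", "Light Pollution"]
informative = ["low", "low", "high", "high"]
uninformative = ["a", "b", "a", "b"]
ig_informative = information_gain(informative, levels)
ig_uninformative = information_gain(uninformative, levels)
```

Features would then be ranked by their gain, and only the highest-ranked ones passed to the classifier.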

Keywords: machine learning, air quality classification, air quality index, information gain, support vector machine, cross-validation

Procedia PDF Downloads 214
2375 Self-Supervised Pretraining on Sequences of Functional Magnetic Resonance Imaging Data for Transfer Learning to Brain Decoding Tasks

Authors: Sean Paulsen, Michael Casey

Abstract:

In this work we present a self-supervised pretraining framework for transformers on functional Magnetic Resonance Imaging (fMRI) data. First, we pretrain our architecture on two self-supervised tasks simultaneously to teach the model a general understanding of the temporal and spatial dynamics of human auditory cortex during music listening. Our pretraining results are the first to suggest a synergistic effect of multitask training on fMRI data. Second, we finetune the pretrained models and train additional fresh models on a supervised fMRI classification task. We observe significantly improved accuracy on held-out runs with the finetuned models, which demonstrates the ability of our pretraining tasks to facilitate transfer learning. This work contributes to the growing body of literature on transformer architectures for pretraining and transfer learning with fMRI data, and serves as a proof of concept for our pretraining tasks and multitask pretraining on fMRI data.

Keywords: transfer learning, fMRI, self-supervised, brain decoding, transformer, multitask training

Procedia PDF Downloads 65
2374 Auto Classification of Multiple ECG Arrhythmic Detection via Machine Learning Techniques: A Review

Authors: Ng Liang Shen, Hau Yuan Wen

Abstract:

Arrhythmia analysis of ECG signals plays a major role in diagnosing most cardiac diseases. Arrhythmia detection on an electrocardiographic (ECG) record can therefore apply various algorithms to identify multiple patterns and match each ECG beat accordingly, based on supervised machine learning. Researchers have used different features and classification methods to classify different arrhythmia types. A major problem in these studies is that the symptoms of the disease do not show all the time in the ECG record. Hence, a successful diagnosis might require the manual investigation of several hours of ECG records. This paper reviews investigations of cardiac arrhythmia in electrocardiogram (ECG) signals that examine irregular ECG waveforms, beat by beat, and match them to the corresponding arrhythmia using machine learning and pattern recognition.

Keywords: electrocardiogram, ECG, classification, machine learning, pattern recognition, detection, QRS

Procedia PDF Downloads 347
2373 Land Use/Land Cover Mapping Using Landsat 8 and Sentinel-2 in a Mediterranean Landscape

Authors: Moschos Vogiatzis, K. Perakis

Abstract:

Spatially explicit and up-to-date land use/land cover information is fundamental for spatial planning, land management, sustainable development, and sound decision-making. In the last decade, many satellite-derived land cover products at different spatial, spectral, and temporal resolutions have been developed, such as the European Copernicus Land Cover product. However, more efficient and detailed information for land use/land cover is required at the regional or local scale. A typical Mediterranean basin with a complex landscape comprising various forest types, crops, artificial surfaces, and wetlands was selected to test and develop our approach. In this study, we investigate the improvement of the Copernicus Land Cover product (CLC2018) using Landsat 8 and Sentinel-2 pixel-based classification based on all available existing geospatial data (Forest Maps, LPIS, Natura2000 habitats, cadastral parcels, etc.). We examined and compared the performance of the Random Forest classifier for land use/land cover mapping. In total, 10 land use/land cover categories were recognized in Landsat 8 and 11 in Sentinel-2A. A comparison of the overall classification accuracies for 2018 shows that the Landsat 8 classification accuracy was slightly higher than that of Sentinel-2A (82.99% vs. 80.30%). We concluded that the main land use/land cover types of CLC2018, even within a heterogeneous area, can be successfully mapped and updated according to CLC nomenclature. Future research should be oriented toward integrating spatiotemporal information from seasonal bands and spectral indexes in the classification process.

Keywords: classification, land use/land cover, mapping, random forest

Procedia PDF Downloads 104
2372 Terrain Classification for Ground Robots Based on Acoustic Features

Authors: Bernd Kiefer, Abraham Gebru Tesfay, Dietrich Klakow

Abstract:

The motivation of our work is to detect different terrain types traversed by a robot based on acoustic data from the robot-terrain interaction. Different acoustic features and classifiers were investigated, such as Mel-frequency cepstral coefficients and Gamma-tone frequency cepstral coefficients for the feature extraction, and a Gaussian mixture model and a feed-forward neural network for the classification. We analyze the system’s performance by comparing our proposed techniques with some other features surveyed from distinct related works. We achieve precision and recall values between 87% and 100% per class, and an average accuracy of 95.2%. We also study the effect of varying audio chunk size in the application phase of the models and find only a mild impact on performance.
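
The per-class precision and recall figures quoted in this abstract are typically derived from a confusion matrix. The Python sketch below shows that calculation under stated assumptions: the function name `per_class_metrics`, the two-class layout, and the counts are invented for illustration and are not the authors' data, and "accuracy" here means overall accuracy, which may differ from how the authors averaged theirs.

```python
def per_class_metrics(confusion):
    """Compute (precision, recall) per class and overall accuracy.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    metrics = []
    for c in range(k):
        true_positive = confusion[c][c]
        predicted_as_c = sum(confusion[r][c] for r in range(k))
        actually_c = sum(confusion[c])
        precision = true_positive / predicted_as_c if predicted_as_c else 0.0
        recall = true_positive / actually_c if actually_c else 0.0
        metrics.append((precision, recall))
    accuracy = sum(confusion[c][c] for c in range(k)) / total
    return metrics, accuracy


# Made-up counts for two hypothetical terrain classes.
example_confusion = [
    [9, 1],
    [2, 8],
]
metrics, accuracy = per_class_metrics(example_confusion)
```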

Keywords: acoustic features, autonomous robots, feature extraction, terrain classification

Procedia PDF Downloads 342
2371 The Implementation of the Multi-Agent Classification System (MACS) in Compliance with FIPA Specifications

Authors: Mohamed R. Mhereeg

Abstract:

The paper discusses the implementation of the Multi-Agent Classification System (MACS) and its use to provide an automated and accurate classification of end users developing applications in the spreadsheet domain. Different technologies have been brought together to build MACS. The strength of the system is the integration of agent technology with the FIPA specifications together with other technologies, namely .NET Windows service based agents, Windows Communication Foundation (WCF) services, the Service Oriented Architecture (SOA), and Oracle Data Mining (ODM). Microsoft's .NET Windows service based agents were utilized to develop the monitoring agents of MACS, while the .NET WCF services together with the SOA approach allowed the distribution and communication between agents over the WWW. The Monitoring Agents (MAs) were configured to execute automatically to monitor Excel spreadsheet development activities by content. Data gathered by the Monitoring Agents from various resources over a period of time was collected and filtered by a Database Updater Agent (DUA) residing in the .NET client application of the system. This agent then transfers and stores the data in an Oracle server database via Oracle stored procedures for further processing that leads to the classification of the end user developers.

Keywords: MACS, implementation, multi-agent, SOA, autonomous, WCF

Procedia PDF Downloads 259