Search results for: Meyer classification
2181 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning
Authors: Walid Cherif
Abstract:
Data mining has, over recent years, seen big advances because of the spread of internet, which generates everyday a tremendous volume of data, and also the immense advances in technologies which facilitate the analysis of these data. In particular, classification techniques are a subdomain of Data Mining which determines in which group each data instance is related within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning. Each type of these techniques has its own limits. Nowadays, current data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which is different from actual algorithms by its reliability computations. Results of the proposed approach exceeded most common classification techniques with an f-measure exceeding 97% on the IRIS Dataset.Keywords: data mining, knowledge discovery, machine learning, similarity measurement, supervised classification
Procedia PDF Downloads 4652180 Statistical Wavelet Features, PCA, and SVM-Based Approach for EEG Signals Classification
Authors: R. K. Chaurasiya, N. D. Londhe, S. Ghosh
Abstract:
The study of the electrical signals produced by neural activities of human brain is called Electroencephalography. In this paper, we propose an automatic and efficient EEG signal classification approach. The proposed approach is used to classify the EEG signal into two classes: epileptic seizure or not. In the proposed approach, we start with extracting the features by applying Discrete Wavelet Transform (DWT) in order to decompose the EEG signals into sub-bands. These features, extracted from details and approximation coefficients of DWT sub-bands, are used as input to Principal Component Analysis (PCA). The classification is based on reducing the feature dimension using PCA and deriving the support-vectors using Support Vector Machine (SVM). The experimental are performed on real and standard dataset. A very high level of classification accuracy is obtained in the result of classification.Keywords: discrete wavelet transform, electroencephalogram, pattern recognition, principal component analysis, support vector machine
Procedia PDF Downloads 6392179 Lipschitz Classifiers Ensembles: Usage for Classification of Target Events in C-OTDR Monitoring Systems
Authors: Andrey V. Timofeev
Abstract:
This paper introduces an original method for guaranteed estimation of the accuracy of an ensemble of Lipschitz classifiers. The solution was obtained as a finite closed set of alternative hypotheses, which contains an object of classification with a probability of not less than the specified value. Thus, the classification is represented by a set of hypothetical classes. In this case, the smaller the cardinality of the discrete set of hypothetical classes is, the higher is the classification accuracy. Experiments have shown that if the cardinality of the classifiers ensemble is increased then the cardinality of this set of hypothetical classes is reduced. The problem of the guaranteed estimation of the accuracy of an ensemble of Lipschitz classifiers is relevant in the multichannel classification of target events in C-OTDR monitoring systems. Results of suggested approach practical usage to accuracy control in C-OTDR monitoring systems are present.Keywords: Lipschitz classifiers, confidence set, C-OTDR monitoring, classifiers accuracy, classifiers ensemble
Procedia PDF Downloads 4932178 Heuristic of Style Transfer for Real-Time Detection or Classification of Weather Conditions from Camera Images
Authors: Hamed Ouattara, Pierre Duthon, Frédéric Bernardin, Omar Ait Aider, Pascal Salmane
Abstract:
In this article, we present three neural network architectures for real-time classification of weather conditions (sunny, rainy, snowy, foggy) from images. Inspired by recent advances in style transfer, two of these architectures -Truncated ResNet50 and Truncated ResNet50 with Gram Matrix and Attention- surpass the state of the art and demonstrate re-markable generalization capability on several public databases, including Kaggle (2000 images), Kaggle 850 images, MWI (1996 images) [1], and Image2Weather [2]. Although developed for weather detection, these architectures are also suitable for other appearance-based classification tasks, such as animal species recognition, texture classification, disease detection in medical images, and industrial defect identification. We illustrate these applications in the section “Applications of Our Models to Other Tasks” with the “SIIM-ISIC Melanoma Classification Challenge 2020” [3].Keywords: weather simulation, weather measurement, weather classification, weather detection, style transfer, Pix2Pix, CycleGAN, CUT, neural style transfer
Procedia PDF Downloads 122177 A Review of Effective Gene Selection Methods for Cancer Classification Using Microarray Gene Expression Profile
Authors: Hala Alshamlan, Ghada Badr, Yousef Alohali
Abstract:
Cancer is one of the dreadful diseases, which causes considerable death rate in humans. DNA microarray-based gene expression profiling has been emerged as an efficient technique for cancer classification, as well as for diagnosis, prognosis, and treatment purposes. In recent years, a DNA microarray technique has gained more attraction in both scientific and in industrial fields. It is important to determine the informative genes that cause cancer to improve early cancer diagnosis and to give effective chemotherapy treatment. In order to gain deep insight into the cancer classification problem, it is necessary to take a closer look at the proposed gene selection methods. We believe that they should be an integral preprocessing step for cancer classification. Furthermore, finding an accurate gene selection method is a very significant issue in a cancer classification area because it reduces the dimensionality of microarray dataset and selects informative genes. In this paper, we classify and review the state-of-art gene selection methods. We proceed by evaluating the performance of each gene selection approach based on their classification accuracy and number of informative genes. In our evaluation, we will use four benchmark microarray datasets for the cancer diagnosis (leukemia, colon, lung, and prostate). In addition, we compare the performance of gene selection method to investigate the effective gene selection method that has the ability to identify a small set of marker genes, and ensure high cancer classification accuracy. To the best of our knowledge, this is the first attempt to compare gene selection approaches for cancer classification using microarray gene expression profile.Keywords: gene selection, feature selection, cancer classification, microarray, gene expression profile
Procedia PDF Downloads 4552176 Preliminary Study of Sediment-Derived Plastiglomerate: Proposal to Classification
Authors: Agung Rizki Perdana, Asrofi Mursalin, Adniwan Shubhi Banuzaki, M. Indra Novian
Abstract:
The understanding about sediment-derived plastiglomerate has a wide-range of merit in the academic realm. It can cover discussions about the Anthropocene Epoch in the scope of geoscience knowledge to even provide a solution for the environmental problem of plastic waste. Albeit its importance, very few research has been done regarding this issue. This research aims to create a classification as a pioneer for the study of sediment-derived plastiglomerate. This research was done in Bantul Regency, Daerah Istimewa Yogyakarta Province as an analogue of plastic debris sedimentation process. Observation is carried out in five observation points that shows three different depositional environments, which are terrestrial, fluvial, and transitional environment. The resulting classification uses three parameters and forms in a taxonomical manner. These parameters are composition, degree of lithification, and abundance of matrix respectively in advancing order. There is also a compositional ternary diagram which should be followed before entering the plastiglomerate nomenclature classification.Keywords: plastiglomerate, classification, sedimentary mechanism, microplastic
Procedia PDF Downloads 1322175 Use of Interpretable Evolved Search Query Classifiers for Sinhala Documents
Authors: Prasanna Haddela
Abstract:
Document analysis is a well matured yet still active research field, partly as a result of the intricate nature of building computational tools but also due to the inherent problems arising from the variety and complexity of human languages. Breaking down language barriers is vital in enabling access to a number of recent technologies. This paper investigates the application of document classification methods to new Sinhalese datasets. This language is geographically isolated and rich with many of its own unique features. We will examine the interpretability of the classification models with a particular focus on the use of evolved Lucene search queries generated using a Genetic Algorithm (GA) as a method of document classification. We will compare the accuracy and interpretability of these search queries with other popular classifiers. The results are promising and are roughly in line with previous work on English language datasets.Keywords: evolved search queries, Sinhala document classification, Lucene Sinhala analyzer, interpretable text classification, genetic algorithm
Procedia PDF Downloads 1142174 Classification Systems of Peat Soils Based on Their Geotechnical, Physical and Chemical Properties
Authors: Mohammad Saberian, Reza Porhoseini, Mohammad Ali Rahgozar
Abstract:
Peat is a partially carbonized vegetable tissue which is formed in wet conditions by decomposition of various plants, mosses and animal remains. This restricted definition, including only materials which are entirely of vegetative origin, conflicts with several established soil classification systems. Peat soils are usually defined as soils having more than 75 percent organic matter. Due to this composition, the structure of peat soil is highly different from the mineral soils such as silt, clay and sand. Peat has high compressibility, high moisture content, low shear strength and low bearing capacity, so it is considered to be in the category of problematic. Since this kind of soil is generally found in many countries and various zones, except for desert and polar zones, recognizing this soil is inevitably significant. The objective of this paper is to review the classification of peats based on various properties of peat soils such as organic contents, water content, color, odor, and decomposition, scholars offer various classification systems which Von Post classification system is one of the most well-known and efficient system.Keywords: peat soil, degree of decomposition, organic content, water content, Von Post classification
Procedia PDF Downloads 5972173 The Use of Hedging Devices in Studens’ Oral Presentation
Authors: Siti Navila
Abstract:
Hedging as a kind of pragmatic competence is an essential part in achieving the goal in communication, especially in academic discourse where the process of sharing knowledge among academic community takes place. Academic discourse demands an appropriateness and modesty of an author or speaker in stating arguments, to name but few, by considering the politeness, being cautious and tentative, and differentiating personal opinions and facts in which these aspects can be achieved through hedging. This study was conducted to find the hedging devices used by students as well as to analyze how they use them in their oral presentation. Some oral presentations from English Department students of the State University of Jakarta on their Academic Presentation course final test were recorded and explored formally and functionally. It was found that the most frequent hedging devices used by students were shields from all hedging devices that students commonly used when they showed suggestion, stated claims, showed opinion to provide possible but still valid answer, and offered the appropriate solution. The researcher suggests that hedging can be familiarized in learning, since potential conflicts that is likely to occur while delivering ideas in academic contexts such as disagreement, criticism, and personal judgment can be reduced with the use of hedging. It will also benefit students in achieving the academic competence with an ability to demonstrate their ideas appropriately and more acceptable in academic discourse.Keywords: academic discourse, hedging, hedging devices, lexical hedges, Meyer classification
Procedia PDF Downloads 4612172 INRAM-3DCNN: Multi-Scale Convolutional Neural Network Based on Residual and Attention Module Combined with Multilayer Perceptron for Hyperspectral Image Classification
Authors: Jianhong Xiang, Rui Sun, Linyu Wang
Abstract:
In recent years, due to the continuous improvement of deep learning theory, Convolutional Neural Network (CNN) has played a great superior performance in the research of Hyperspectral Image (HSI) classification. Since HSI has rich spatial-spectral information, only utilizing a single dimensional or single size convolutional kernel will limit the detailed feature information received by CNN, which limits the classification accuracy of HSI. In this paper, we design a multi-scale CNN with MLP based on residual and attention modules (INRAM-3DCNN) for the HSI classification task. We propose to use multiple 3D convolutional kernels to extract the packet feature information and fully learn the spatial-spectral features of HSI while designing residual 3D convolutional branches to avoid the decline of classification accuracy due to network degradation. Secondly, we also design the 2D Inception module with a joint channel attention mechanism to quickly extract key spatial feature information at different scales of HSI and reduce the complexity of the 3D model. Due to the high parallel processing capability and nonlinear global action of the Multilayer Perceptron (MLP), we use it in combination with the previous CNN structure for the final classification process. The experimental results on two HSI datasets show that the proposed INRAM-3DCNN method has superior classification performance and can perform the classification task excellently.Keywords: INRAM-3DCNN, residual, channel attention, hyperspectral image classification
Procedia PDF Downloads 812171 Deep Learning-Based Automated Structure Deterioration Detection for Building Structures: A Technological Advancement for Ensuring Structural Integrity
Authors: Kavita Bodke
Abstract:
Structural health monitoring (SHM) is experiencing growth, necessitating the development of distinct methodologies to address its expanding scope effectively. In this study, we developed automatic structure damage identification, which incorporates three unique types of a building’s structural integrity. The first pertains to the presence of fractures within the structure, the second relates to the issue of dampness within the structure, and the third involves corrosion inside the structure. This study employs image classification techniques to discern between intact and impaired structures within structural data. The aim of this research is to find automatic damage detection with the probability of each damage class being present in one image. Based on this probability, we know which class has a higher probability or is more affected than the other classes. Utilizing photographs captured by a mobile camera serves as the input for an image classification system. Image classification was employed in our study to perform multi-class and multi-label classification. The objective was to categorize structural data based on the presence of cracks, moisture, and corrosion. In the context of multi-class image classification, our study employed three distinct methodologies: Random Forest, Multilayer Perceptron, and CNN. For the task of multi-label image classification, the models employed were Rasnet, Xceptionet, and Inception.Keywords: SHM, CNN, deep learning, multi-class classification, multi-label classification
Procedia PDF Downloads 392170 Medical Image Classification Using Legendre Multifractal Spectrum Features
Authors: R. Korchiyne, A. Sbihi, S. M. Farssi, R. Touahni, M. Tahiri Alaoui
Abstract:
Trabecular bone structure is important texture in the study of osteoporosis. Legendre multifractal spectrum can reflect the complex and self-similarity characteristic of structures. The main objective of this paper is to develop a new technique of medical image classification based on Legendre multifractal spectrum. Novel features have been developed from basic geometrical properties of this spectrum in a supervised image classification. The proposed method has been successfully used to classify medical images of bone trabeculations, and could be a useful supplement to the clinical observations for osteoporosis diagnosis. A comparative study with existing data reveals that the results of this approach are concordant.Keywords: multifractal analysis, medical image, osteoporosis, fractal dimension, Legendre spectrum, supervised classification
Procedia PDF Downloads 5152169 Neural Network Approach to Classifying Truck Traffic
Authors: Ren Moses
Abstract:
The process of classifying vehicles on a highway is hereby viewed as a pattern recognition problem in which connectionist techniques such as artificial neural networks (ANN) can be used to assign vehicles to their correct classes and hence to establish optimum axle spacing thresholds. In the United States, vehicles are typically classified into 13 classes using a methodology commonly referred to as “Scheme F”. In this research, the ANN model was developed, trained, and applied to field data of vehicles. The data comprised of three vehicular features—axle spacing, number of axles per vehicle, and overall vehicle weight. The ANN reduced the classification error rate from 9.5 percent to 6.2 percent when compared to an existing classification algorithm that is not ANN-based and which uses two vehicular features for classification, that is, axle spacing and number of axles. The inclusion of overall vehicle weight as a third classification variable further reduced the error rate from 6.2 percent to only 3.0 percent. The promising results from the neural networks were used to set up new thresholds that reduce classification error rate.Keywords: artificial neural networks, vehicle classification, traffic flow, traffic analysis, and highway opera-tions
Procedia PDF Downloads 3122168 Combined Optical Coherence Microscopy and Spectrally Resolved Multiphoton Microscopy
Authors: Bjorn-Ole Meyer, Dominik Marti, Peter E. Andersen
Abstract:
A multimodal imaging system, combining spectrally resolved multiphoton microscopy (MPM) and optical coherence microscopy (OCM) is demonstrated. MPM and OCM are commonly integrated into multimodal imaging platforms to combine functional and morphological information. The MPM signals, such as two-photon fluorescence emission (TPFE) and signals created by second harmonic generation (SHG) are biomarkers which exhibit information on functional biological features such as the ratio of pyridine nucleotide (NAD(P)H) and flavin adenine dinucleotide (FAD) in the classification of cancerous tissue. While the spectrally resolved imaging allows for the study of biomarkers, using a spectrometer as a detector limits the imaging speed of the system significantly. To overcome those limitations, an OCM setup was added to the system, which allows for fast acquisition of structural information. Thus, after rapid imaging of larger specimens, navigation within the sample is possible. Subsequently, distinct features can be selected for further investigation using MPM. Additionally, by probing a different contrast, complementary information is obtained, and different biomarkers can be investigated. OCM images of tissue and cell samples are obtained, and distinctive features are evaluated using MPM to illustrate the benefits of the system.Keywords: optical coherence microscopy, multiphoton microscopy, multimodal imaging, two-photon fluorescence emission
Procedia PDF Downloads 5112167 Radar-Based Classification of Pedestrian and Dog Using High-Resolution Raw Range-Doppler Signatures
Authors: C. Mayr, J. Periya, A. Kariminezhad
Abstract:
In this paper, we developed a learning framework for the classification of vulnerable road users (VRU) by their range-Doppler signatures. The frequency-modulated continuous-wave (FMCW) radar raw data is first pre-processed to obtain robust object range-Doppler maps per coherent time interval. The complex-valued range-Doppler maps captured from our outdoor measurements are further fed into a convolutional neural network (CNN) to learn the classification. This CNN has gone through a hyperparameter optimization process for improved learning. By learning VRU range-Doppler signatures, the three classes 'pedestrian', 'dog', and 'noise' are classified with an average accuracy of almost 95%. Interestingly, this classification accuracy holds for a combined longitudinal and lateral object trajectories.Keywords: machine learning, radar, signal processing, autonomous driving
Procedia PDF Downloads 2462166 Classification of Poverty Level Data in Indonesia Using the Naïve Bayes Method
Authors: Anung Style Bukhori, Ani Dijah Rahajoe
Abstract:
Poverty poses a significant challenge in Indonesia, requiring an effective analytical approach to understand and address this issue. In this research, we applied the Naïve Bayes classification method to examine and classify poverty data in Indonesia. The main focus is on classifying data using RapidMiner, a powerful data analysis platform. The analysis process involves data splitting to train and test the classification model. First, we collected and prepared a poverty dataset that includes various factors such as education, employment, and health..The experimental results indicate that the Naïve Bayes classification model can provide accurate predictions regarding the risk of poverty. The use of RapidMiner in the analysis process offers flexibility and efficiency in evaluating the model's performance. The classification produces several values to serve as the standard for classifying poverty data in Indonesia using Naive Bayes. The accuracy result obtained is 40.26%, with a moderate recall result of 35.94%, a high recall result of 63.16%, and a low recall result of 38.03%. The precision for the moderate class is 58.97%, for the high class is 17.39%, and for the low class is 58.70%. These results can be seen from the graph below.Keywords: poverty, classification, naïve bayes, Indonesia
Procedia PDF Downloads 602165 Drone Classification Using Classification Methods Using Conventional Model With Embedded Audio-Visual Features
Authors: Hrishi Rakshit, Pooneh Bagheri Zadeh
Abstract:
This paper investigates the performance of drone classification methods using conventional DCNN with different hyperparameters, when additional drone audio data is embedded in the dataset for training and further classification. In this paper, first a custom dataset is created using different images of drones from University of South California (USC) datasets and Leeds Beckett university datasets with embedded drone audio signal. The three well-known DCNN architectures namely, Resnet50, Darknet53 and Shufflenet are employed over the created dataset tuning their hyperparameters such as, learning rates, maximum epochs, Mini Batch size with different optimizers. Precision-Recall curves and F1 Scores-Threshold curves are used to evaluate the performance of the named classification algorithms. Experimental results show that Resnet50 has the highest efficiency compared to other DCNN methods.Keywords: drone classifications, deep convolutional neural network, hyperparameters, drone audio signal
Procedia PDF Downloads 1042164 Application of Fuzzy Approach to the Vibration Fault Diagnosis
Authors: Jalel Khelil
Abstract:
In order to improve reliability of Gas Turbine machine especially its generator equipment, a fault diagnosis system based on fuzzy approach is proposed. Three various methods namely K-NN (K-nearest neighbors), F-KNN (Fuzzy K-nearest neighbors) and FNM (Fuzzy nearest mean) are adopted to provide the measurement of relative strength of vibration defaults. Both applications consist of two major steps: Feature extraction and default classification. 09 statistical features are extracted from vibration signals. 03 different classes are used in this study which describes vibrations condition: Normal, unbalance defect, and misalignment defect. The use of the fuzzy approaches and the classification results are discussed. Results show that these approaches yield high successful rates of vibration default classification.Keywords: fault diagnosis, fuzzy classification k-nearest neighbor, vibration
Procedia PDF Downloads 4682163 MhAGCN: Multi-Head Attention Graph Convolutional Network for Web Services Classification
Authors: Bing Li, Zhi Li, Yilong Yang
Abstract:
Web classification can promote the quality of service discovery and management in the service repository. It is widely used to locate developers desired services. Although traditional classification methods based on supervised learning models can achieve classification tasks, developers need to manually mark web services, and the quality of these tags may not be enough to establish an accurate classifier for service classification. With the doubling of the number of web services, the manual tagging method has become unrealistic. In recent years, the attention mechanism has made remarkable progress in the field of deep learning, and its huge potential has been fully demonstrated in various fields. This paper designs a multi-head attention graph convolutional network (MHAGCN) service classification method, which can assign different weights to the neighborhood nodes without complicated matrix operations or relying on understanding the entire graph structure. The framework combines the advantages of the attention mechanism and graph convolutional neural network. It can classify web services through automatic feature extraction. The comprehensive experimental results on a real dataset not only show the superior performance of the proposed model over the existing models but also demonstrate its potentially good interpretability for graph analysis.Keywords: attention mechanism, graph convolutional network, interpretability, service classification, service discovery
Procedia PDF Downloads 1372162 Musical Instruments Classification Using Machine Learning Techniques
Authors: Bhalke D. G., Bormane D. S., Kharate G. K.
Abstract:
This paper presents classification of musical instrument using machine learning techniques. The classification has been carried out using temporal, spectral, cepstral and wavelet features. Detail feature analysis is carried out using separate and combined features. Further, instrument model has been developed using K-Nearest Neighbor and Support Vector Machine (SVM). Benchmarked McGill university database has been used to test the performance of the system. Experimental result shows that SVM performs better as compared to KNN classifier.Keywords: feature extraction, SVM, KNN, musical instruments
Procedia PDF Downloads 4802161 Improved Classification Procedure for Imbalanced and Overlapped Situations
Authors: Hankyu Lee, Seoung Bum Kim
Abstract:
The issue with imbalance and overlapping in the class distribution becomes important in various applications of data mining. The imbalanced dataset is a special case in classification problems in which the number of observations of one class (i.e., major class) heavily exceeds the number of observations of the other class (i.e., minor class). Overlapped dataset is the case where many observations are shared together between the two classes. Imbalanced and overlapped data can be frequently found in many real examples including fraud and abuse patients in healthcare, quality prediction in manufacturing, text classification, oil spill detection, remote sensing, and so on. The class imbalance and overlap problem is the challenging issue because this situation degrades the performance of most of the standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting data space into three parts: nonoverlapping, light overlapping, and severe overlapping and applying the classification algorithm in each part. These three parts were determined based on the Hausdorff distance and the margin of the modified support vector machine. An experiments study was conducted to examine the properties of the proposed method and compared it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through the experiment with real data.Keywords: classification, imbalanced data with class overlap, split data space, support vector machine
Procedia PDF Downloads 3082160 Assessment of Planet Image for Land Cover Mapping Using Soft and Hard Classifiers
Authors: Lamyaa Gamal El-Deen Taha, Ashraf Sharawi
Abstract:
Planet image is a new data source from planet lab. This research is concerned with the assessment of Planet image for land cover mapping. Two pixel based classifiers and one subpixel based classifier were compared. Firstly, rectification of Planet image was performed. Secondly, a comparison between minimum distance, maximum likelihood and neural network classifications for classification of Planet image was performed. Thirdly, the overall accuracy of classification and kappa coefficient were calculated. Results indicate that neural network classification is best followed by maximum likelihood classifier then minimum distance classification for land cover mapping.Keywords: planet image, land cover mapping, rectification, neural network classification, multilayer perceptron, soft classifiers, hard classifiers
Procedia PDF Downloads 1882159 Satellite Image Classification Using Firefly Algorithm
Authors: Paramjit Kaur, Harish Kundra
Abstract:
In the recent years, swarm intelligence based firefly algorithm has become a great focus for the researchers to solve the real time optimization problems. Here, firefly algorithm is used for the application of satellite image classification. For experimentation, Alwar area is considered to multiple land features like vegetation, barren, hilly, residential and water surface. Alwar dataset is considered with seven band satellite images. Firefly Algorithm is based on the attraction of less bright fireflies towards more brightener one. For the evaluation of proposed concept accuracy assessment parameters are calculated using error matrix. With the help of Error matrix, parameters of Kappa Coefficient, Overall Accuracy and feature wise accuracy parameters of user’s accuracy & producer’s accuracy can be calculated. Overall results are compared with BBO, PSO, Hybrid FPAB/BBO, Hybrid ACO/SOFM and Hybrid ACO/BBO based on the kappa coefficient and overall accuracy parameters.Keywords: image classification, firefly algorithm, satellite image classification, terrain classification
Procedia PDF Downloads 4012158 Sentiment Classification Using Enhanced Contextual Valence Shifters
Authors: Vo Ngoc Phu, Phan Thi Tuoi
Abstract:
We have explored different methods of improving the accuracy of sentiment classification. The sentiment orientation of a document can be positive (+), negative (-), or neutral (0). We combine five dictionaries from [2, 3, 4, 5, 6] into the new one with 21137 entries. The new dictionary has many verbs, adverbs, phrases and idioms, that are not in five ones before. The paper shows that our proposed method based on the combination of Term-Counting method and Enhanced Contextual Valence Shifters method has improved the accuracy of sentiment classification. The combined method has accuracy 68.984% on the testing dataset, and 69.224% on the training dataset. All of these methods are implemented to classify the reviews based on our new dictionary and the Internet Movie data set.Keywords: sentiment classification, sentiment orientation, valence shifters, contextual, valence shifters, term counting
Procedia PDF Downloads 5052157 Optimal Classifying and Extracting Fuzzy Relationship from Query Using Text Mining Techniques
Authors: Faisal Alshuwaier, Ali Areshey
Abstract:
Text mining techniques are generally applied for classifying the text, finding fuzzy relations and structures in data sets. This research provides plenty text mining capabilities. One common application is text classification and event extraction, which encompass deducing specific knowledge concerning incidents referred to in texts. The main contribution of this paper is the clarification of a concept graph generation mechanism, which is based on a text classification and optimal fuzzy relationship extraction. Furthermore, the work presented in this paper explains the application of fuzzy relationship extraction and branch and bound method to simplify the texts.Keywords: extraction, max-prod, fuzzy relations, text mining, memberships, classification, memberships, classification
Procedia PDF Downloads 5832156 Detection and Classification of Mammogram Images Using Principle Component Analysis and Lazy Classifiers
Authors: Rajkumar Kolangarakandy
Abstract:
Feature extraction and selection is the primary part of any mammogram classification algorithms. The choice of feature, attribute or measurements have an important influence in any classification system. Discrete Wavelet Transformation (DWT) coefficients are one of the prominent features for representing images in frequency domain. The features obtained after the decomposition of the mammogram images using wavelet transformations have higher dimension. Even though the features are higher in dimension, they were highly correlated and redundant in nature. The dimensionality reduction techniques play an important role in selecting the optimum number of features from the higher dimension data, which are highly correlated. PCA is a mathematical tool that reduces the dimensionality of the data while retaining most of the variation in the dataset. In this paper, a multilevel classification of mammogram images using reduced discrete wavelet transformation coefficients and lazy classifiers is proposed. The classification is accomplished in two different levels. In the first level, mammogram ROIs extracted from the dataset is classified as normal and abnormal types. In the second level, all the abnormal mammogram ROIs is classified into benign and malignant too. A further classification is also accomplished based on the variation in structure and intensity distribution of the images in the dataset. The Lazy classifiers called Kstar, IBL and LWL are used for classification. The classification results obtained with the reduced feature set is highly promising and the result is also compared with the performance obtained without dimension reduction.Keywords: PCA, wavelet transformation, lazy classifiers, Kstar, IBL, LWL
Procedia PDF Downloads 3352155 Characterization, Classification and Fertility Capability Classification of Three Rice Zones of Ebonyi State, Southeastern Nigeria
Authors: Sunday Nathaniel Obasi, Chiamak Chinasa Obasi
Abstract:
Soil characterization and classification provide the basic information necessary to create a functional evaluation and soil classification schemes. Fertility capability classification (FCC) on the other hand is a technical system that groups the soils according to kinds of problems they present for management of soil physical and chemical properties. This research was carried out in Ebonyi state, southeastern Nigeria, which is an agrarian state and a leading rice producing part of southeastern Nigeria. In order to maximize the soil and enhance the productivity of rice in Ebonyi soils, soil classification, and fertility classification information need to be supplied. The state was grouped into three locations according to their agricultural zones namely; Ebonyi north, Ebonyi central and Ebonyi south representing Abakaliki, Ikwo and Ivo locations respectively. Major rice growing areas of the soils were located and two profile pits were sunk in each of the studied zones from which soils were characterized, classified and fertility capability classification (FCC) developed. Soil classification was done using United State Department of Agriculture (USDA) Soil Taxonomy and correlated with World Reference Base for soil resources. Results obtained classified Abakaliki 1 and Abakaliki 2 as Typic Fluvaquents (Ochric Fluvisols). Ikwo 1 was classified as Vertic Eutrudepts (Eutric Vertisols) while Ikwo 2 was classified as Typic Eutrudepts (Eutric Cambisols). Ivo 1 and Ivo 2 were both classified as Aquic Eutrudepts (Gleyic Leptosols). Fertility capability classification (FCC) revealed that all studied soils had mostly loamy topsoils and subsoils except Ikwo 1 with clayey topsoil. Limitations encountered in the studied soils include; dryness (d), low ECEC (e), low nutrient capital reserve (k) and water logging/ anaerobic condition (gley). Thus, FCC classifications were Ldek for Abakaliki 1 and 2, Ckv for Ikwo 1, LCk for Ikwo 2 while Ivo 1 and 2 were Legk and Lgk respectively.Keywords: soil classification, soil fertility, limitations, modifiers, Southeastern Nigeria
Procedia PDF Downloads 1302154 Land Cover Classification Using Sentinel-2 Image Data and Random Forest Algorithm
Authors: Thanh Noi Phan, Martin Kappas, Jan Degener
Abstract:
The currently launched Sentinel 2 (S2) satellite (June, 2015) bring a great potential and opportunities for land use/cover map applications, due to its fine spatial resolution multispectral as well as high temporal resolutions. So far, there are handful studies using S2 real data for land cover classification. Especially in northern Vietnam, to our best knowledge, there exist no studies using S2 data for land cover map application. The aim of this study is to provide the preliminary result of land cover classification using Sentinel -2 data with a rising state – of – art classifier, Random Forest. A case study with heterogeneous land use/cover in the eastern of Hanoi Capital – Vietnam was chosen for this study. All 10 spectral bands of 10 and 20 m pixel size of S2 images were used, the 10 m bands were resampled to 20 m. Among several classified algorithms, supervised Random Forest classifier (RF) was applied because it was reported as one of the most accuracy methods of satellite image classification. The results showed that the red-edge and shortwave infrared (SWIR) bands play an important role in land cover classified results. A very high overall accuracy above 90% of classification results was achieved.Keywords: classify algorithm, classification, land cover, random forest, sentinel 2, Vietnam
Procedia PDF Downloads 3892153 Classification of Cochannel Signals Using Cyclostationary Signal Processing and Deep Learning
Authors: Bryan Crompton, Daniel Giger, Tanay Mehta, Apurva Mody
Abstract:
The task of classifying radio frequency (RF) signals has seen recent success in employing deep neural network models. In this work, we present a combined signal processing and machine learning approach to signal classification for cochannel anomalous signals. The power spectral density and cyclostationary signal processing features of a captured signal are computed and fed into a neural net to produce a classification decision. Our combined signal preprocessing and machine learning approach allows for simpler neural networks with fast training times and small computational resource requirements for inference with longer preprocessing time.Keywords: signal processing, machine learning, cyclostationary signal processing, signal classification
Procedia PDF Downloads 1092152 Using Data Mining Technique for Scholarship Disbursement
Authors: J. K. Alhassan, S. A. Lawal
Abstract:
This work is on decision tree-based classification for the disbursement of scholarship. Tree-based data mining classification technique is used in other to determine the generic rule to be used to disburse the scholarship. The system based on the defined rules from the tree is able to determine the class (status) to which an applicant shall belong whether Granted or Not Granted. The applicants that fall to the class of granted denote a successful acquirement of scholarship while those in not granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from tree-based classification was also developed. The tree-based classification is adopted because of its efficiency, effectiveness, and easy to comprehend features. The system was tested with the data of National Information Technology Development Agency (NITDA) Abuja, a Parastatal of Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found working according to the specification. It is therefore recommended for all scholarship disbursement organizations.Keywords: classification, data mining, decision tree, scholarship
Procedia PDF Downloads 378