Search results for: based classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 28385

28325 Development of Fake News Model Using Machine Learning through Natural Language Processing

Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini

Abstract:

Fake news detection research is still at an early stage, as this is a relatively new phenomenon in terms of the interest it has raised in society. Machine learning helps to solve complex problems and to build AI systems, especially in cases where the knowledge is tacit or not known. For the identification of fake news, we applied three machine learning classifiers: Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification is not completely correct in fake news detection because classification methods are not specialized for fake news. With the integration of machine learning and text-based processing, we can detect fake news and build classifiers that can classify the news data. Text classification mainly focuses on extracting various features of text and then incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake news due to the unavailability of corpora. We applied the three classifiers to two publicly available datasets. Experimental analysis based on the existing datasets indicates very encouraging and improved performance.
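
As a rough illustration of the feature-extraction-then-classification pipeline described in this abstract, the sketch below trains a small multinomial Naïve Bayes classifier on bag-of-words counts. It is not the authors' implementation: the class name `NaiveBayesText`, the whitespace tokenizer, the add-one smoothing, and the toy sentences are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real NLP preprocessing.
    return text.lower().split()


class NaiveBayesText:
    """Hypothetical multinomial Naive Bayes over bag-of-words counts."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # per-class word frequencies
        self.class_counts = Counter(labels)      # per-class document counts
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)  # add-one smoothing
            score = math.log(n_docs / total_docs)           # log prior
            for word in tokenize(text):
                if word in self.vocab:                      # skip unseen words
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy sentences invented for illustration only; they are not from the paper's datasets.
train_texts = [
    "shocking miracle cure doctors hate",
    "aliens secretly control the weather",
    "council approves new budget",
    "city opens new library",
]
train_labels = ["fake", "fake", "real", "real"]
model = NaiveBayesText().fit(train_texts, train_labels)
print(model.predict("miracle cure shocking"))
```

A real system would replace the tokenizer with richer text features and compare this baseline against the other two classifiers named in the abstract.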

Keywords: fake news detection, natural language processing, machine learning, classification techniques.

Procedia PDF Downloads 135
28324 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has, over recent years, seen big advances because of the spread of the internet, which generates a tremendous volume of data every day, and because of the immense advances in technologies that facilitate the analysis of these data. In particular, classification techniques are a subdomain of data mining that determines to which group each data instance belongs within a given dataset. They are used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or based on machine learning. Each type of technique has its own limits. Nowadays, data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach that differs from existing algorithms in its reliability computations. The proposed approach outperformed the most common classification techniques, with an F-measure exceeding 97% on the IRIS Dataset.
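
For readers unfamiliar with the F-measure reported in this abstract, the following sketch computes it from confusion counts using the standard definition (the weighted harmonic mean of precision and recall). The function name `f_measure` and the counts in the usage line are illustrative assumptions, not values from the paper.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from true-positive, false-positive and false-negative counts."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative counts only (not results from the paper).
print(f_measure(tp=48, fp=2, fn=1))
```

With `beta=1.0` this is the usual F1 score; for multi-class data such as Iris, the per-class scores are typically averaged.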

Keywords: data mining, knowledge discovery, machine learning, similarity measurement, supervised classification

Procedia PDF Downloads 439
28323 Obstacle Classification Method Based on 2D LIDAR Database

Authors: Moohyun Lee, Soojung Hur, Yongwan Park

Abstract:

This paper proposes a method that uses only a LIDAR system to classify an obstacle and determine its type by establishing a LIDAR-based database for classifying obstacles. The existing LIDAR system, in recognizing obstructions for an autonomous vehicle, has an advantage in terms of accuracy and shorter recognition time. However, it was difficult to determine the type of obstacle, and therefore accurate path planning based on the type of obstacle was not possible. To overcome this problem, a method of classifying obstacle type based on existing LIDAR and using the width of obstacle materials was proposed. However, width measurement alone was not sufficient to improve accuracy. In this research, the width data were used for the first classification; a database of LIDAR intensity data for four major obstacle materials on the road was created; the LIDAR intensity data of actual obstacle materials were compared against it; and the obstacle type was determined by finding the entry with the highest similarity value. An experiment using an actual autonomous vehicle in a real environment shows that, although data quality declined in comparison to 3D LIDAR, it was possible to classify obstacle materials using 2D LIDAR.
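
The database lookup step described in this abstract can be sketched as a nearest-match search: an observed intensity profile is compared with each stored reference profile, and the label with the highest similarity wins. The abstract does not say which similarity measure is used, so the inverse-distance similarity below, the function names, and the material labels and numbers are all hypothetical placeholders.

```python
import math


def similarity(a, b):
    # Assumption: similarity decreases with Euclidean distance between profiles.
    return 1.0 / (1.0 + math.dist(a, b))


def classify_by_similarity(profile, database):
    """Return the database label whose reference profile is most similar."""
    return max(database, key=lambda label: similarity(profile, database[label]))


# Placeholder reference profiles; these are NOT measured LIDAR intensities.
reference_db = {
    "material_a": [0.9, 0.8, 0.9],
    "material_b": [0.2, 0.3, 0.2],
}
print(classify_by_similarity([0.85, 0.75, 0.95], reference_db))
```

In the two-stage scheme the abstract outlines, a width-based filter would narrow the candidate set before this intensity comparison runs.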

Keywords: obstacle, classification, database, LIDAR, segmentation, intensity

Procedia PDF Downloads 314
28322 Medical Image Classification Using Legendre Multifractal Spectrum Features

Authors: R. Korchiyne, A. Sbihi, S. M. Farssi, R. Touahni, M. Tahiri Alaoui

Abstract:

Trabecular bone structure is an important texture in the study of osteoporosis. The Legendre multifractal spectrum can reflect the complex and self-similar characteristics of structures. The main objective of this paper is to develop a new technique of medical image classification based on the Legendre multifractal spectrum. Novel features have been developed from basic geometrical properties of this spectrum for supervised image classification. The proposed method has been successfully used to classify medical images of bone trabeculations and could be a useful supplement to clinical observations for osteoporosis diagnosis. A comparative study with existing data reveals that the results of this approach are concordant.

Keywords: multifractal analysis, medical image, osteoporosis, fractal dimension, Legendre spectrum, supervised classification

Procedia PDF Downloads 491
28321 Automatic Classification of Periodic Heart Sounds Using Convolutional Neural Network

Authors: Jia Xin Low, Keng Wah Choo

Abstract:

This paper presents an automatic normal and abnormal heart sound classification model based on a deep learning algorithm. MITHSDB heart sound datasets obtained from the 2016 PhysioNet/Computing in Cardiology Challenge database were used in this research, with the assumption that the electrocardiograms (ECG) were recorded simultaneously with the heart sounds (phonocardiogram, PCG). The PCG time series are segmented per heartbeat, and each sub-segment is converted to a square intensity matrix and classified using convolutional neural network (CNN) models. This approach removes the need to provide classification features for the supervised machine learning algorithm. Instead, the features are determined automatically through training, from the time series provided. The results show that the prediction model is able to provide reasonable and comparable classification accuracy despite its simple implementation. This approach can be used for real-time classification of heart sounds in the Internet of Medical Things (IoMT), e.g., remote monitoring applications of PCG signals.

Keywords: convolutional neural network, discrete wavelet transform, deep learning, heart sound classification

Procedia PDF Downloads 324
28320 Real-Time Classification of Marbles with Decision-Tree Method

Authors: K. S. Parlak, E. Turan

Abstract:

The separation of marbles according to pattern quality is a process carried out according to expert decision. The classification phase is the most critical part in terms of economic value. In this study, a self-learning system is proposed that classifies marbles quickly and with a high success rate. The system extracts ten features from ten marble images taken by the camera. The marbles are classified by the decision tree method using the extracted features. The user forms the training set by training the system at the marble classification stage. The system improves with every marble image that is classified. The aim of the proposed system is to minimize the error caused by the person performing the classification and to perform the classification quickly.

Keywords: decision tree, feature extraction, k-means clustering, marble classification

Procedia PDF Downloads 358
28319 Radar-Based Classification of Pedestrian and Dog Using High-Resolution Raw Range-Doppler Signatures

Authors: C. Mayr, J. Periya, A. Kariminezhad

Abstract:

In this paper, we developed a learning framework for the classification of vulnerable road users (VRU) by their range-Doppler signatures. The frequency-modulated continuous-wave (FMCW) radar raw data are first pre-processed to obtain robust object range-Doppler maps per coherent time interval. The complex-valued range-Doppler maps captured from our outdoor measurements are then fed into a convolutional neural network (CNN) to learn the classification. This CNN has gone through a hyperparameter optimization process for improved learning. By learning VRU range-Doppler signatures, the three classes 'pedestrian', 'dog', and 'noise' are classified with an average accuracy of almost 95%. Interestingly, this classification accuracy holds for combined longitudinal and lateral object trajectories.

Keywords: machine learning, radar, signal processing, autonomous driving

Procedia PDF Downloads 214
28318 Application of Fuzzy Approach to the Vibration Fault Diagnosis

Authors: Jalel Khelil

Abstract:

In order to improve the reliability of gas turbine machines, especially their generator equipment, a fault diagnosis system based on a fuzzy approach is proposed. Three methods, namely K-NN (K-nearest neighbors), F-KNN (fuzzy K-nearest neighbors), and FNM (fuzzy nearest mean), are adopted to measure the relative strength of vibration faults. Each application consists of two major steps: feature extraction and fault classification. Nine statistical features are extracted from the vibration signals. Three classes are used in this study to describe the vibration condition: normal, unbalance defect, and misalignment defect. The use of the fuzzy approaches and the classification results are discussed. Results show that these approaches yield high success rates in vibration fault classification.
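
To make the difference between crisp and fuzzy K-nearest neighbors concrete, here is a minimal sketch of a fuzzy K-NN rule with inverse-distance weighting, in the spirit of the commonly cited formulation by Keller et al. It assumes crisp training labels and a fuzzifier `m`. The function name, the small `eps` guard, and the two-dimensional toy points are assumptions for illustration; the paper's nine statistical features and exact variant are not reproduced.

```python
import math
from collections import defaultdict


def fuzzy_knn(query, samples, labels, k=3, m=2.0, eps=1e-9):
    """Return class memberships in [0, 1] that sum to 1 for the query point."""
    # Keep the k training points closest to the query.
    nearest = sorted(
        (math.dist(query, s), lab) for s, lab in zip(samples, labels)
    )[:k]
    weights = defaultdict(float)
    for d, lab in nearest:
        # Closer neighbours get larger weight: 1 / d^(2/(m-1)); eps avoids division by zero.
        weights[lab] += 1.0 / (d ** (2.0 / (m - 1.0)) + eps)
    total = sum(weights.values())
    return {lab: w / total for lab, w in weights.items()}


# Toy 2-D feature vectors, invented for illustration only.
train_x = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
train_y = ["normal", "normal", "unbalance", "unbalance"]
memberships = fuzzy_knn((0.2, 0.1), train_x, train_y, k=3)
print(memberships)
```

Unlike crisp K-NN, the output is a membership per class rather than a single vote, which is what allows the relative strength of a fault to be expressed.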

Keywords: fault diagnosis, fuzzy classification k-nearest neighbor, vibration

Procedia PDF Downloads 445
28317 The Change of Urban Land Use/Cover Using Object Based Approach for Southern Bali

Authors: I. Gusti A. A. Rai Asmiwyati, Robert J. Corner, Ashraf M. Dewan

Abstract:

Change in land use/cover (LULC) dominantly affects spatial structure and function. It can have such impacts by disrupting social and cultural practices and disturbing physical elements. Thus, it has become essential to understand the dynamics of LULC in time and space, as this can be used as a critical input for developing sustainable LULC. This study was an attempt to map and monitor LULC change in Bali, Indonesia, from 2003 to 2013. Using object-based classification to improve accuracy, together with change detection, multi-temporal land use/cover data were extracted from a set of ASTER satellite images. The overall accuracies of the classification maps of 2003 and 2013 were 86.99% and 80.36%, respectively. Built-up area and paddy field were the dominant types of land use/cover in both years. The dominant patch increase in 2003 illustrated the rapid paddy field fragmentation and the large transformation under way. This approach is new for the diverse urban features of fast-growing Bali, and it increased classification accuracy compared with manual pixel-based classification.

Keywords: land use/cover, urban, Bali, ASTER

Procedia PDF Downloads 514
28316 Electroencephalogram Based Alzheimer Disease Classification using Machine and Deep Learning Methods

Authors: Carlos Roncero-Parra, Alfonso Parreño-Torres, Jorge Mateo Sotos, Alejandro L. Borja

Abstract:

In this research, different methods based on machine/deep learning algorithms are presented for the classification and diagnosis of patients with mental disorders such as Alzheimer's disease. For this purpose, the signals obtained from 32 unipolar electrodes identified by non-invasive EEG were examined, and their basic properties were obtained. More specifically, different well-known machine learning based classifiers have been used, i.e., support vector machine (SVM), Bayesian linear discriminant analysis (BLDA), decision tree (DT), Gaussian Naïve Bayes (GNB), K-nearest neighbor (KNN), and convolutional neural network (CNN). A total of 668 patients from five different hospitals were studied in the period from 2011 to 2021. The best accuracy obtained was around 93% in both ADM and ADA classifications. It can be concluded that such a classification will enable the training of algorithms that can be used to identify and classify different mental disorders with high accuracy.

Keywords: alzheimer, machine learning, deep learning, EEG

Procedia PDF Downloads 88
28315 The Design of the Multi-Agent Classification System (MACS)

Authors: Mohamed R. Mhereeg

Abstract:

The paper discusses the design of a .NET Windows Service based agent system called MACS (Multi-Agent Classification System). MACS is a system that aims to accurately classify spreadsheet developers' competency over a network. It is designed to automatically and autonomously monitor spreadsheet users and gather their development activities based on the utilization of software Multi-Agent Technology (MAS). This is accomplished in a way that enables management to efficiently tailor precise training activities for future spreadsheet development. The monitoring agents of MACS are intended to be distributed over the WWW in order to satisfy the monitoring and classification of multiple developers. The Prometheus methodology is used for the design of the agents of MACS. Prometheus has been used for this phase of the system design because it was developed specifically for specifying and designing agent-oriented systems. Additionally, Prometheus also specifies the communication needed between the agents so that they can coordinate to achieve their delegated tasks.

Keywords: classification, design, MACS, MAS, prometheus

Procedia PDF Downloads 373
28314 Optimal Classifying and Extracting Fuzzy Relationship from Query Using Text Mining Techniques

Authors: Faisal Alshuwaier, Ali Areshey

Abstract:

Text mining techniques are generally applied for classifying text and finding fuzzy relations and structures in data sets. This research provides plenty of text mining capabilities. One common application is text classification and event extraction, which encompasses deducing specific knowledge concerning incidents referred to in texts. The main contribution of this paper is the clarification of a concept graph generation mechanism, which is based on text classification and optimal fuzzy relationship extraction. Furthermore, the work presented in this paper explains the application of fuzzy relationship extraction and the branch and bound method to simplify the texts.

Keywords: extraction, max-prod, fuzzy relations, text mining, memberships, classification

Procedia PDF Downloads 551
28313 3D Reconstruction of Human Body Based on Gender Classification

Authors: Jiahe Liu, Hongyang Yu, Feng Qian, Miao Luo

Abstract:

SMPL-X was a powerful parametric human body model that included male, neutral, and female models, with significant gender differences between these three models. During the process of 3D human body reconstruction, the correct selection of standard templates was crucial for obtaining accurate results. To address this issue, we developed an efficient gender classification algorithm to automatically select the appropriate template for 3D human body reconstruction. The key to this gender classification algorithm was the precise analysis of human body features. By using the SMPL-X model, the algorithm could detect and identify gender features of the human body, thereby determining which standard template should be used. The accuracy of this algorithm made the 3D reconstruction process more accurate and reliable, as it could adjust model parameters based on individual gender differences. SMPL-X and the related gender classification algorithm have brought important advancements to the field of 3D human body reconstruction. By accurately selecting standard templates, they have improved the accuracy of reconstruction and have broad potential in various application fields. These technologies continue to drive the development of the 3D reconstruction field, providing us with more realistic and accurate human body models.

Keywords: gender classification, joint detection, SMPL-X, 3D reconstruction

Procedia PDF Downloads 43
28312 Classification of Hyperspectral Image Using Mathematical Morphological Operator-Based Distance Metric

Authors: Geetika Barman, B. S. Daya Sagar

Abstract:

In this article, we proposed a pixel-wise classification of hyperspectral images using a mathematical morphology operator-based distance metric called “dilation distance” and “erosion distance”. This method involves measuring the spatial distance between the spectral features of a hyperspectral image across the bands. The key concept of the proposed approach is that the “dilation distance” is the maximum distance a pixel can be moved without changing its classification, whereas the “erosion distance” is the maximum distance that a pixel can be moved before changing its classification. The spectral signature of the hyperspectral image carries unique class information and shape for each class. This article demonstrates how easily the dilation and erosion distance can measure spatial distance compared to other approaches. This property is used to calculate the spatial distance between hyperspectral image feature vectors across the bands. The dissimilarity matrix is then constructed using both measures extracted from the feature spaces. The measured distance metric is used to distinguish between the spectral features of various classes and precisely distinguish between each class. This is illustrated using both toy data and real datasets. Furthermore, we investigated the role of flat vs. non-flat structuring elements in capturing the spatial features of each class in the hyperspectral image. In order to validate, we compared the proposed approach to other existing methods and demonstrated empirically that mathematical operator-based distance metric classification provided competitive results and outperformed some of them.

Keywords: dilation distance, erosion distance, hyperspectral image classification, mathematical morphology

Procedia PDF Downloads 59
28311 Spatio-Temporal Assessment of Urban Growth and Land Use Change in Islamabad Using Object-Based Classification Method

Authors: Rabia Shabbir, Sheikh Saeed Ahmad, Amna Butt

Abstract:

Rapid land use changes have taken place in Islamabad, the capital city of Pakistan, over the past decades due to accelerated urbanization and industrialization. In this study, land use changes in the metropolitan area of Islamabad were observed by the combined use of GIS and satellite remote sensing over a period of 15 years. High-resolution Google Earth images for 2000-2015 were downloaded, and an object-based classification method was applied using eCognition software for accurate classification. Information regarding urban settlements, industrial area, barren land, agricultural area, vegetation, water, and transportation infrastructure was extracted. The results showed that the city experienced spatial expansion, rapid urban growth, land use change, and expanding transportation infrastructure. The study concluded that the integration of GIS and remote sensing is an effective approach for analyzing the spatial pattern of urban growth and land use change.

Keywords: land use change, urban growth, Islamabad, object-based classification, Google Earth, remote sensing, GIS

Procedia PDF Downloads 132
28310 Neural Network Approach to Classifying Truck Traffic

Authors: Ren Moses

Abstract:

The process of classifying vehicles on a highway is viewed here as a pattern recognition problem in which connectionist techniques such as artificial neural networks (ANN) can be used to assign vehicles to their correct classes and hence to establish optimum axle spacing thresholds. In the United States, vehicles are typically classified into 13 classes using a methodology commonly referred to as “Scheme F”. In this research, the ANN model was developed, trained, and applied to field data of vehicles. The data comprised three vehicular features: axle spacing, number of axles per vehicle, and overall vehicle weight. The ANN reduced the classification error rate from 9.5 percent to 6.2 percent when compared to an existing classification algorithm that is not ANN-based and that uses two vehicular features for classification, namely axle spacing and number of axles. The inclusion of overall vehicle weight as a third classification variable further reduced the error rate from 6.2 percent to only 3.0 percent. The promising results from the neural networks were used to set up new thresholds that reduce the classification error rate.

Keywords: artificial neural networks, vehicle classification, traffic flow, traffic analysis, highway operations

Procedia PDF Downloads 280
28309 Roof Material Detection Based on Object-Based Approach Using WorldView-2 Satellite Imagery

Authors: Ebrahim Taherzadeh, Helmi Z. M. Shafri, Kaveh Shahi

Abstract:

One of the most important tasks in urban area remote sensing is the detection of impervious surfaces (IS), such as building roofs and roads. However, detection of IS in heterogeneous areas remains one of the most challenging tasks. In this study, detection of concrete roofs using an object-oriented approach was proposed. A new rule-based classification was developed to detect concrete roof tiles. The proposed rule-based classification was applied to a WorldView-2 image. Results showed that the proposed rule has good potential to predict concrete roof material from WorldView-2 images with 85% accuracy.

Keywords: object-based, roof material, concrete tile, WorldView-2

Procedia PDF Downloads 400
28308 A Generalized Weighted Loss for Support Vector Classification and Multilayer Perceptron

Authors: Filippo Portera

Abstract:

Standard algorithms usually employ a loss where each error is the mere absolute difference between the true value and the prediction, in the case of a regression task. In the present work, we present several error weighting schemes that generalize this established routine. We study both a binary classification model for Support Vector Classification and a regression net for Multilayer Perceptron. Results show that the error is never worse than with the standard procedure, and several times it is better.

Keywords: loss, binary-classification, MLP, weights, regression

Procedia PDF Downloads 69
28307 A Novel PSO Based Decision Tree Classification

Authors: Ali Farzan

Abstract:

Classification of data objects or patterns is a major part of most decision making systems. One of the popular and commonly used classification methods is the Decision Tree (DT). It is a hierarchical decision making system in which a binary tree is constructed and, starting from the root, some of the classes are rejected at each node until the leaf nodes are reached. Each leaf node represents one specific class. Finding the splitting criteria at each node for constructing or training the tree is a major problem. Particle Swarm Optimization (PSO) has been adopted as a metaheuristic search method for finding the best splitting criteria. Evaluation of the proposed method over benchmark datasets indicates the higher accuracy of the new PSO-based decision tree.
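
A metaheuristic such as PSO needs a fitness function to score each candidate split. The abstract does not state which criterion is optimized, so the sketch below assumes the widely used weighted Gini impurity of a threshold split as one plausible choice. The function names and toy data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the list is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(values, labels, threshold):
    """Weighted Gini impurity after splitting one feature at `threshold`."""
    n = len(labels)
    if n == 0:
        return 0.0
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Toy feature values and labels, for illustration only.
feature = [1.0, 2.0, 3.0, 4.0]
classes = ["a", "a", "b", "b"]
print(split_impurity(feature, classes, threshold=2.5))
```

A swarm-based search would treat the threshold (and possibly the feature index) as a particle position and minimize `split_impurity` at each node.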

Keywords: decision tree, particle swarm optimization, splitting criteria, metaheuristic

Procedia PDF Downloads 383
28306 Multilabel Classification with Neural Network Ensemble Method

Authors: Sezin Ekşioğlu

Abstract:

Multilabel classification is of great importance for several applications, and it is also a challenging research topic. It is a kind of supervised learning that contains binary targets. The difference between multilabel and binary classification is that multilabel classification problems involve more than one class. Features can belong to one class or many classes. There is a wide range of applications for multilabel prediction, such as image labeling, text categorization, and gene functionality. Even though features are classified into many classes, they may not always be properly classified. There are many ensemble methods for classification. However, most researchers have been concerned with better multilabel methods. In particular, few focus on both the efficiency of classifiers and pairwise relationships at the same time in order to implement better multilabel classification. In this paper, we worked on modified ensemble methods that draw on k-Nearest Neighbors and a neural network structure to address these issues and to obtain better multilabel classification results. Publicly available datasets (yeast, emotion, scene, and birds) are used to demonstrate the efficiency of the developed algorithm, and the technique is measured by accuracy, F1 score, and Hamming loss metrics. Our algorithm improves on the benchmarks for each dataset with different metrics.
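
Of the three metrics named in this abstract, Hamming loss is the one specific to multilabel problems: it is the fraction of individual label slots that are predicted incorrectly. A minimal sketch follows, assuming labels are encoded as equal-length 0/1 indicator rows. The function name and the toy matrices are illustrative only.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of label positions where prediction and ground truth differ."""
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    return wrong / total if total else 0.0


# Toy indicator matrices: 2 samples, 3 labels each (illustrative only).
truth = [[1, 0, 1], [0, 1, 0]]
preds = [[1, 1, 1], [0, 1, 1]]
print(hamming_loss(truth, preds))
```

Lower is better. Unlike exact-match accuracy, a prediction that gets most of a sample's labels right is only penalized for the labels it misses.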

Keywords: multilabel, classification, neural network, KNN

Procedia PDF Downloads 130
28305 Enhanced Image Representation for Deep Belief Network Classification of Hyperspectral Images

Authors: Khitem Amiri, Mohamed Farah

Abstract:

Image classification is a challenging task and is gaining a lot of interest since it helps us to understand the content of images. Recently, Deep Learning (DL) based methods have given very interesting results on several benchmarks. For hyperspectral images (HSI), the application of DL techniques is still challenging due to the scarcity of labeled data and the curse of dimensionality. Among other approaches, Deep Belief Network (DBN) based approaches have given fair classification accuracy. In this paper, we address the problem of the curse of dimensionality by reducing the number of bands and replacing the HSI channels with channels representing radiometric indices. Therefore, instead of using all the HSI bands, we compute radiometric indices such as NDVI (Normalized Difference Vegetation Index), NDWI (Normalized Difference Water Index), etc., and we use the combination of these indices as input for the DBN-based classification model. Thus, we keep almost all the pertinent spectral information while considerably reducing the size of the image. In order to test our image representation, we applied our method to several HSI datasets, including the Indian Pines dataset and the Jasper Ridge data, and it gave results comparable to state-of-the-art methods while considerably reducing training and testing time.

Keywords: hyperspectral images, deep belief network, radiometric indices, image classification

Procedia PDF Downloads 248
28304 Automatic Classification Using Dynamic Fuzzy C Means Algorithm and Mathematical Morphology: Application in 3D MRI Image

Authors: Abdelkhalek Bakkari

Abstract:

Image segmentation is a critical step in image processing and pattern recognition. In this paper, we proposed a new robust automatic image classification based on a dynamic fuzzy c-means algorithm and mathematical morphology. The proposed segmentation algorithm (DFCM_MM) has been applied to MR perfusion images. The obtained results show the validity and robustness of the proposed approach.

Keywords: segmentation, classification, dynamic, fuzzy c-means, MR image

Procedia PDF Downloads 447
28303 MhAGCN: Multi-Head Attention Graph Convolutional Network for Web Services Classification

Authors: Bing Li, Zhi Li, Yilong Yang

Abstract:

Web classification can promote the quality of service discovery and management in the service repository. It is widely used to locate developers' desired services. Although traditional classification methods based on supervised learning models can achieve classification tasks, developers need to manually mark web services, and the quality of these tags may not be enough to establish an accurate classifier for service classification. With the doubling of the number of web services, the manual tagging method has become unrealistic. In recent years, the attention mechanism has made remarkable progress in the field of deep learning, and its huge potential has been fully demonstrated in various fields. This paper designs a multi-head attention graph convolutional network (MhAGCN) service classification method, which can assign different weights to the neighborhood nodes without complicated matrix operations or relying on an understanding of the entire graph structure. The framework combines the advantages of the attention mechanism and the graph convolutional neural network. It can classify web services through automatic feature extraction. The comprehensive experimental results on a real dataset not only show the superior performance of the proposed model over the existing models but also demonstrate its potentially good interpretability for graph analysis.

Keywords: attention mechanism, graph convolutional network, interpretability, service classification, service discovery

Procedia PDF Downloads 112
28302 Application of Rapid Eye Imagery in Crop Type Classification Using Vegetation Indices

Authors: Sunita Singh, Rajani Srivastava

Abstract:

Remote sensing technology plays a significant role in natural resource management and in other earth observation applications. One such application is the monitoring and classification of crop types at spatial and temporal scales, as it provides the latest, most precise, and cost-effective information. The present study emphasizes the use of three different vegetation indices of Rapid Eye imagery for crop type classification. It also analyzes the effect of each index on classification accuracy. Rapid Eye imagery is in high demand and preferred in the agricultural and forestry sectors as it has red-edge and NIR bands. The three indices used in this study were the Normalized Difference Vegetation Index (NDVI), the Green Normalized Difference Vegetation Index (GNDVI), and the Normalized Difference Red Edge Index (NDRE), and all of these incorporated the Red Edge band. The study area is the Varanasi district of Uttar Pradesh, India, and a Radial Basis Function (RBF) kernel was used for the Support Vector Machine (SVM) classification. Classification was performed with these three vegetation indices. The contribution of each index to image classification accuracy was also tested with single-band classification. The highest classification accuracy of 85% was obtained using the three vegetation indices. The study concluded that NDRE has the highest contribution to classification accuracy compared to the other vegetation indices and that Rapid Eye imagery can yield satisfactory classification accuracy without the original bands.
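
The three indices named in this abstract share one normalized-difference form, (a - b) / (a + b), applied to different band pairs. The sketch below uses the textbook band pairings (NIR with red, green, and red edge respectively). The study's own band choices may differ, since the abstract says all three incorporated the Red Edge band. Function names and reflectance values are hypothetical.

```python
def normalized_difference(a: float, b: float) -> float:
    """(a - b) / (a + b), returning 0.0 when both bands are zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir: float, red: float) -> float:
    # Textbook NDVI: near-infrared versus red reflectance.
    return normalized_difference(nir, red)


def gndvi(nir: float, green: float) -> float:
    # Textbook GNDVI: near-infrared versus green reflectance.
    return normalized_difference(nir, green)


def ndre(nir: float, red_edge: float) -> float:
    # Textbook NDRE: near-infrared versus red-edge reflectance.
    return normalized_difference(nir, red_edge)


# Placeholder reflectances for one pixel (not data from the study).
pixel = {"nir": 0.5, "red": 0.1, "green": 0.2, "red_edge": 0.3}
features = [
    ndvi(pixel["nir"], pixel["red"]),
    gndvi(pixel["nir"], pixel["green"]),
    ndre(pixel["nir"], pixel["red_edge"]),
]
print(features)
```

The resulting three-element vector per pixel is the kind of input an RBF-kernel SVM could be trained on in place of the original bands.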

Keywords: GNDVI, NDRE, NDVI, rapid eye, vegetation indices

Procedia PDF Downloads 334
28301 Enhanced Arabic Semantic Information Retrieval System Based on Arabic Text Classification

Authors: A. Elsehemy, M. Abdeen, T. Nazmy

Abstract:

Since the appearance of the Semantic Web, many semantic search techniques and models have been proposed to exploit the information in ontologies to enhance traditional keyword-based search. Many advances have been made in languages such as English, German, French, and Spanish. However, other languages such as Arabic are not fully supported yet. In this paper, we present a framework for ontology-based information retrieval for the Arabic language. Our system consists of four main modules, namely a query parser, an indexer, a search module, and a ranking module. Our approach includes building a semantic index by linking ontology concepts to documents, including an annotation weight for each link, to be used in ranking the results. We also augmented the framework with an automatic document categorizer, which enhances the overall document ranking. We have built three Arabic domain ontologies, Sports, Economic, and Politics, as examples for the Arabic language. We built a knowledge base that consists of 79 classes and more than 1456 instances. The system is evaluated using the precision and recall metrics. We performed many retrieval operations on a sample of 40,316 documents comprising 320 MB of pure text. The results show that semantic search enhanced with text classification performs better than the system without classification.

Keywords: Arabic text classification, ontology based retrieval, Arabic semantic web, information retrieval, Arabic ontology

Procedia PDF Downloads 501
28300 A Human Activity Recognition System Based on Sensory Data Related to Object Usage

Authors: M. Abdullah, Al-Wadud

Abstract:

Sensor-based activity recognition systems usually take into account which sensors have been activated while an activity is performed. The system then combines the conditional probabilities of those sensors to represent different activities and makes its decision on that basis. However, information about the sensors that are not activated may also be of great help in deciding which activity has been performed. This paper proposes an approach in which the sensory data related to both usage and non-usage of objects are utilized to classify activities. Experimental results show the promising performance of the proposed method.
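
The idea of using non-activated sensors as evidence can be sketched with a Bernoulli-style naive Bayes score. This is only one possible reading of the abstract, which does not give the paper's actual formulation; the activities, sensors, and probabilities below are hypothetical.

```python
import math


def log_posterior(active, prior, p_on, use_inactive=True):
    """Unnormalized log-posterior of one activity.

    active: set of sensors that fired.
    p_on:   dict sensor -> P(sensor fires | activity).
    When use_inactive is True, every silent sensor contributes
    log(1 - P(fires | activity)); otherwise silent sensors are ignored.
    """
    score = math.log(prior)
    for sensor, p in p_on.items():
        if sensor in active:
            score += math.log(p)
        elif use_inactive:
            score += math.log(1.0 - p)
    return score


def classify(active, model, use_inactive=True):
    """Return the activity with the highest log-posterior."""
    return max(
        model,
        key=lambda act: log_posterior(
            active, model[act]["prior"], model[act]["p_on"], use_inactive
        ),
    )


# Hypothetical model: both activities use the kettle and cup,
# but only cooking a meal normally triggers the stove sensor.
MODEL = {
    "make_tea": {"prior": 0.5, "p_on": {"kettle": 0.9, "cup": 0.9, "stove": 0.1}},
    "cook_meal": {"prior": 0.5, "p_on": {"kettle": 0.95, "cup": 0.95, "stove": 0.95}},
}
observed = {"kettle", "cup"}  # the stove sensor stayed silent

print(classify(observed, MODEL, use_inactive=False))  # cook_meal
print(classify(observed, MODEL, use_inactive=True))   # make_tea
```

With these numbers, ignoring the silent stove sensor slightly favors cooking, while counting its silence as evidence favors making tea.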

Keywords: Naïve Bayesian, based classification, activity recognition, sensor data, object-usage model

Procedia PDF Downloads 300
28299 Improved Classification Procedure for Imbalanced and Overlapped Situations

Authors: Hankyu Lee, Seoung Bum Kim

Abstract:

Imbalance and overlap in the class distribution are important issues in various applications of data mining. An imbalanced dataset is a special case in classification problems in which the number of observations of one class (i.e., the major class) heavily exceeds the number of observations of the other class (i.e., the minor class). An overlapped dataset is one in which many observations are shared between the two classes. Imbalanced and overlapped data are frequently found in real applications, including patient fraud and abuse in healthcare, quality prediction in manufacturing, text classification, oil spill detection, remote sensing, and so on. The class imbalance and overlap problem is challenging because it degrades the performance of most standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting the data space into three parts (nonoverlapping, light overlapping, and severe overlapping) and applying the classification algorithm in each part. These three parts were determined based on the Hausdorff distance and the margin of the modified support vector machine. An experimental study was conducted to examine the properties of the proposed method and compare it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through an experiment with real data.
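
For readers unfamiliar with the Hausdorff distance mentioned in this abstract, the sketch below computes it between two small point sets. It shows the distance only; how the authors combine it with the SVM margin to split the data space is not described in the abstract and is not reproduced here. The point sets are hypothetical.

```python
import math


def directed_hausdorff(set_a, set_b):
    """Largest distance from any point of set_a to its nearest point in set_b."""
    return max(min(math.dist(a, b) for b in set_b) for a in set_a)


def hausdorff(set_a, set_b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(set_a, set_b), directed_hausdorff(set_b, set_a))


# Hypothetical observations of a major and a minor class in 2-D.
major = [(0.0, 0.0), (1.0, 0.0)]
minor = [(0.0, 1.0), (3.0, 0.0)]

print(hausdorff(major, minor))  # 2.0
```

A small Hausdorff distance between two classes indicates that every point of each class lies close to some point of the other, which is one way to quantify overlap.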

Keywords: classification, imbalanced data with class overlap, split data space, support vector machine

Procedia PDF Downloads 281
28298 A Study on the Performance of 2-PC-D Classification Model

Authors: Nurul Aini Abdul Wahab, Nor Syamim Halidin, Sayidatina Aisah Masnan, Nur Izzati Romli

Abstract:

The principal component method has many applications in reducing large sets of variables in various fields. Fisher’s Discriminant function is also a popular tool for classification. This research studies the performance of a combined Principal Component-Fisher’s Discriminant function in classifying rice kernels into their defined classes. The data were collected on the odour of the rice kernels using an odour-detection sensor, Cyranose. This electronic nose (e-nose) captured 32 variables. The objective of this research is to measure how well a combined model of principal component analysis and linear discriminant analysis performs as a classification model. The principal component method was used to reduce all 32 variables to a smaller, more manageable set of components. The reduced components were then used to develop the Fisher’s Discriminant function. In this research, there are 4 defined classes of rice kernel: Aromatic, Brown, Ordinary, and Others. Based on the output of the principal component method, the 32 variables were reduced to only 2 components. Based on the classification table from the discriminant analysis, 40.76% of the total observations were correctly classified into their classes by the PC-Discriminant function. This implies that the developed classification model misclassified more than 50% of the observations. In conclusion, the Fisher’s Discriminant function built on 2 components from PCA (2-PC-D) is not satisfactory for classifying the rice kernels into their defined classes.
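
The step from a classification table to the reported accuracy can be sketched as follows. The counts in the table are hypothetical and are not the study's data; only the 40.76% figure comes from the abstract.

```python
def accuracy_from_table(table):
    """Fraction of correctly classified observations.

    table[i][j] is the number of class-i observations predicted as class j,
    so the correct classifications lie on the diagonal.
    """
    total = sum(sum(row) for row in table)
    correct = sum(table[i][i] for i in range(len(table)))
    return correct / total


# Hypothetical counts; rows and columns follow the order
# Aromatic, Brown, Ordinary, Others.
table = [
    [12, 5, 8, 5],
    [6, 10, 9, 5],
    [7, 6, 14, 3],
    [8, 7, 9, 6],
]

accuracy = accuracy_from_table(table)
print(f"accuracy: {accuracy:.2%}, misclassified: {1 - accuracy:.2%}")

# The accuracy reported in the abstract implies this misclassification rate.
reported_misclassified = round(100 - 40.76, 2)
print(reported_misclassified)  # 59.24
```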

Keywords: classification model, discriminant function, principal component analysis, variable reduction

Procedia PDF Downloads 311
28297 Classifying and Predicting Efficiencies Using Interval DEA Grid Setting

Authors: Yiannis G. Smirlis

Abstract:

The classification and prediction of efficiencies in Data Envelopment Analysis (DEA) are important issues, especially in large-scale problems or when new units frequently enter the set under assessment. In this paper, we contribute to the subject by proposing a grid structure based on interval segmentations of the range of values for the inputs and outputs. Combined, these intervals define hyper-rectangles that partition the space of the problem. This structure, exploited by Interval DEA models and a dominance relation, acts as a DEA pre-processor, enabling the classification and prediction of efficiency scores without applying any DEA models.
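
A minimal sketch of the grid idea, assuming each input and output range is cut at fixed breakpoints and that one cell surely dominates another when it lies in strictly lower intervals on every input and strictly higher intervals on every output. The abstract does not define the dominance relation or the interval DEA models, so this rule, the breakpoints, and the unit values are all assumptions made for illustration.

```python
from bisect import bisect_right


def cell_index(values, breakpoints):
    """Map a unit's values to the index of its interval on each dimension.

    breakpoints[d] is the sorted list of interior cut points for dimension d.
    """
    return tuple(bisect_right(cuts, v) for v, cuts in zip(values, breakpoints))


def surely_dominates(cell_a, cell_b, n_inputs):
    """True if any unit in cell_a dominates any unit in cell_b.

    The first n_inputs dimensions are inputs (lower is better);
    the remaining dimensions are outputs (higher is better).
    """
    inputs_ok = all(a < b for a, b in zip(cell_a[:n_inputs], cell_b[:n_inputs]))
    outputs_ok = all(a > b for a, b in zip(cell_a[n_inputs:], cell_b[n_inputs:]))
    return inputs_ok and outputs_ok


# Hypothetical grid: one input cut at 10, 20, 30 and one output cut at 100, 200.
BREAKPOINTS = [[10, 20, 30], [100, 200]]

unit_a = cell_index((8, 250), BREAKPOINTS)   # low input, high output
unit_b = cell_index((25, 150), BREAKPOINTS)  # higher input, lower output

print(unit_a, unit_b)                                 # (0, 2) (2, 1)
print(surely_dominates(unit_a, unit_b, n_inputs=1))   # True
```

Under this assumption, a newly arriving unit could be placed in its cell and compared with already assessed cells before any DEA model is solved.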

Keywords: data envelopment analysis, interval DEA, efficiency classification, efficiency prediction

Procedia PDF Downloads 146
28296 Ontology-Based Backpropagation Neural Network Classification and Reasoning Strategy for NoSQL and SQL Databases

Authors: Hao-Hsiang Ku, Ching-Ho Chi

Abstract:

Big data applications have become imperative in many fields, and many researchers have devoted themselves to increasing accuracy rates and reducing time complexity. This study therefore designs and proposes an ontology-based backpropagation neural network classification and reasoning strategy for NoSQL big data applications, called ON4NoSQL. ON4NoSQL is responsible for enhancing classification performance in NoSQL and SQL databases in order to build mass behavior models. The mass behavior models are built with MapReduce techniques and the Hadoop distributed file system on the Hadoop service platform. The reference engine of ON4NoSQL is the ontology-based backpropagation neural network classification and reasoning strategy. Simulation results indicate that ON4NoSQL can efficiently construct a high-performance environment for data storage, search, and retrieval.

Keywords: Hadoop, NoSQL, ontology, back propagation neural network, high distributed file system

Procedia PDF Downloads 238