Search results for: classification quality
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 11145

Search results for: classification quality

11025 Synthetic Aperture Radar Remote Sensing Classification Using the Bag of Visual Words Model to Land Cover Studies

Authors: Reza Mohammadi, Mahmod R. Sahebi, Mehrnoosh Omati, Milad Vahidi

Abstract:

Classification of high resolution polarimetric Synthetic Aperture Radar (PolSAR) images plays an important role in land cover and land use management. Recently, classification algorithms based on Bag of Visual Words (BOVW) model have attracted significant interest among scholars and researchers in and out of the field of remote sensing. In this paper, BOVW model with pixel based low-level features has been implemented to classify a subset of San Francisco bay PolSAR image, acquired by RADARSAR 2 in C-band. We have used segment-based decision-making strategy and compared the result with the result of traditional Support Vector Machine (SVM) classifier. 90.95% overall accuracy of the classification with the proposed algorithm has shown that the proposed algorithm is comparable with the state-of-the-art methods. In addition to increase in the classification accuracy, the proposed method has decreased undesirable speckle effect of SAR images.

Keywords: Bag of Visual Words (BOVW), classification, feature extraction, land cover management, Polarimetric Synthetic Aperture Radar (PolSAR)

Procedia PDF Downloads 173
11024 Novel Inference Algorithm for Gaussian Process Classification Model with Multiclass and Its Application to Human Action Classification

Authors: Wanhyun Cho, Soonja Kang, Sangkyoon Kim, Soonyoung Park

Abstract:

In this paper, we propose a novel inference algorithm for the multi-class Gaussian process classification model that can be used in the field of human behavior recognition. This algorithm can drive simultaneously both a posterior distribution of a latent function and estimators of hyper-parameters in a Gaussian process classification model with multi-class. Our algorithm is based on the Laplace approximation (LA) technique and variational EM framework. This is performed in two steps: called expectation and maximization steps. First, in the expectation step, using the Bayesian formula and LA technique, we derive approximately the posterior distribution of the latent function indicating the possibility that each observation belongs to a certain class in the Gaussian process classification model. Second, in the maximization step, using a derived posterior distribution of latent function, we compute the maximum likelihood estimator for hyper-parameters of a covariance matrix necessary to define prior distribution for latent function. These two steps iteratively repeat until a convergence condition satisfies. Moreover, we apply the proposed algorithm with human action classification problem using a public database, namely, the KTH human action data set. Experimental results reveal that the proposed algorithm shows good performance on this data set.

Keywords: bayesian rule, gaussian process classification model with multiclass, gaussian process prior, human action classification, laplace approximation, variational EM algorithm

Procedia PDF Downloads 300
11023 Classification of Generative Adversarial Network Generated Multivariate Time Series Data Featuring Transformer-Based Deep Learning Architecture

Authors: Thrivikraman Aswathi, S. Advaith

Abstract:

As there can be cases where the use of real data is somehow limited, such as when it is hard to get access to a large volume of real data, we need to go for synthetic data generation. This produces high-quality synthetic data while maintaining the statistical properties of a specific dataset. In the present work, a generative adversarial network (GAN) is trained to produce multivariate time series (MTS) data since the MTS is now being gathered more often in various real-world systems. Furthermore, the GAN-generated MTS data is fed into a transformer-based deep learning architecture that carries out the data categorization into predefined classes. Further, the model is evaluated across various distinct domains by generating corresponding MTS data.

Keywords: GAN, transformer, classification, multivariate time series

Procedia PDF Downloads 86
11022 Polarimetric Synthetic Aperture Radar Data Classification Using Support Vector Machine and Mahalanobis Distance

Authors: Najoua El Hajjaji El Idrissi, Necip Gokhan Kasapoglu

Abstract:

Polarimetric Synthetic Aperture Radar-based imaging is a powerful technique used for earth observation and classification of surfaces. Forest evolution has been one of the vital areas of attention for the remote sensing experts. The information about forest areas can be achieved by remote sensing, whether by using active radars or optical instruments. However, due to several weather constraints, such as cloud cover, limited information can be recovered using optical data and for that reason, Polarimetric Synthetic Aperture Radar (PolSAR) is used as a powerful tool for forestry inventory. In this [14paper, we applied support vector machine (SVM) and Mahalanobis distance to the fully polarimetric AIRSAR P, L, C-bands data from the Nezer forest areas, the classification is based in the separation of different tree ages. The classification results were evaluated and the results show that the SVM performs better than the Mahalanobis distance and SVM achieves approximately 75% accuracy. This result proves that SVM classification can be used as a useful method to evaluate fully polarimetric SAR data with sufficient value of accuracy.

Keywords: classification, synthetic aperture radar, SAR polarimetry, support vector machine, mahalanobis distance

Procedia PDF Downloads 97
11021 Classification of Land Cover Usage from Satellite Images Using Deep Learning Algorithms

Authors: Shaik Ayesha Fathima, Shaik Noor Jahan, Duvvada Rajeswara Rao

Abstract:

Earth's environment and its evolution can be seen through satellite images in near real-time. Through satellite imagery, remote sensing data provide crucial information that can be used for a variety of applications, including image fusion, change detection, land cover classification, agriculture, mining, disaster mitigation, and monitoring climate change. The objective of this project is to propose a method for classifying satellite images according to multiple predefined land cover classes. The proposed approach involves collecting data in image format. The data is then pre-processed using data pre-processing techniques. The processed data is fed into the proposed algorithm and the obtained result is analyzed. Some of the algorithms used in satellite imagery classification are U-Net, Random Forest, Deep Labv3, CNN, ANN, Resnet etc. In this project, we are using the DeepLabv3 (Atrous convolution) algorithm for land cover classification. The dataset used is the deep globe land cover classification dataset. DeepLabv3 is a semantic segmentation system that uses atrous convolution to capture multi-scale context by adopting multiple atrous rates in cascade or in parallel to determine the scale of segments.

Keywords: area calculation, atrous convolution, deep globe land cover classification, deepLabv3, land cover classification, resnet 50

Procedia PDF Downloads 112
11020 Classification of Opaque Exterior Walls of Buildings from a Sustainable Point of View

Authors: Michelle Sánchez de León Brajkovich, Nuria Martí Audi

Abstract:

The envelope is one of the most important elements when one analyzes the operation of the building in terms of sustainability. Taking this into consideration, this research focuses on setting a classification system of the envelopes opaque systems, crossing the knowledge and parameters of construction systems with requirements in terms of sustainability that they may have, to have a better understanding of how these systems work with respect to their sustainable contribution to the building. Therefore, this paper evaluates the importance of the envelope design on the building sustainability. It analyses the parameters that make the construction systems behave differently in terms of sustainability. At the same time it explains the classification process generated from this analysis that results in a classification where all opaque vertical envelope construction systems enter.

Keywords: sustainable, exterior walls, envelope, facades, construction systems, energy efficiency

Procedia PDF Downloads 534
11019 Multi-Classification Deep Learning Model for Diagnosing Different Chest Diseases

Authors: Bandhan Dey, Muhsina Bintoon Yiasha, Gulam Sulaman Choudhury

Abstract:

Chest disease is one of the most problematic ailments in our regular life. There are many known chest diseases out there. Diagnosing them correctly plays a vital role in the process of treatment. There are many methods available explicitly developed for different chest diseases. But the most common approach for diagnosing these diseases is through X-ray. In this paper, we proposed a multi-classification deep learning model for diagnosing COVID-19, lung cancer, pneumonia, tuberculosis, and atelectasis from chest X-rays. In the present work, we used the transfer learning method for better accuracy and fast training phase. The performance of three architectures is considered: InceptionV3, VGG-16, and VGG-19. We evaluated these deep learning architectures using public digital chest x-ray datasets with six classes (i.e., COVID-19, lung cancer, pneumonia, tuberculosis, atelectasis, and normal). The experiments are conducted on six-classification, and we found that VGG16 outperforms other proposed models with an accuracy of 95%.

Keywords: deep learning, image classification, X-ray images, Tensorflow, Keras, chest diseases, convolutional neural networks, multi-classification

Procedia PDF Downloads 55
11018 Determination of Water Pollution and Water Quality with Decision Trees

Authors: Çiğdem Bakır, Mecit Yüzkat

Abstract:

With the increasing emphasis on water quality worldwide, the search for and expanding the market for new and intelligent monitoring systems has increased. The current method is the laboratory process, where samples are taken from bodies of water, and tests are carried out in laboratories. This method is time-consuming, a waste of manpower, and uneconomical. To solve this problem, we used machine learning methods to detect water pollution in our study. We created decision trees with the Orange3 software we used in our study and tried to determine all the factors that cause water pollution. An automatic prediction model based on water quality was developed by taking many model inputs such as water temperature, pH, transparency, conductivity, dissolved oxygen, and ammonia nitrogen with machine learning methods. The proposed approach consists of three stages: preprocessing of the data used, feature detection, and classification. We tried to determine the success of our study with different accuracy metrics and the results. We presented it comparatively. In addition, we achieved approximately 98% success with the decision tree.

Keywords: decision tree, water quality, water pollution, machine learning

Procedia PDF Downloads 58
11017 Analyzing the Impact of Code Commenting on Software Quality

Authors: Thulya Premathilake, Tharushi Perera, Hansi Thathsarani, Tharushi Nethmini, Dilshan De Silva, Piyumika Samarasekara

Abstract:

One of the most efficient ways to assist developers in grasping the source code is to make use of comments, which can be found throughout the code. When working in fields such as software development, having comments in your code that are of good quality is a fundamental requirement. Tackling software problems while making use of programs that have already been built. It is essential for the intention of the source code to be made crystal apparent in the comments that are added to the code. This assists programmers in better comprehending the programs they are working on and enables them to complete software maintenance jobs in a more timely manner. In spite of the fact that comments and documentation are meant to improve readability and maintainability, the vast majority of programmers place the majority of their focus on the actual code that is being written. This study provides a complete and comprehensive overview of the previous research that has been conducted on the topic of code comments. The study focuses on four main topics, including automated comment production, comment consistency, comment classification, and comment quality rating. One is able to get the knowledge that is more complete for use in following inquiries if they conduct an analysis of the proper approaches that were used in this study issue.

Keywords: code commenting, source code, software quality, quality assurance

Procedia PDF Downloads 54
11016 Experimental Study of Hyperparameter Tuning a Deep Learning Convolutional Recurrent Network for Text Classification

Authors: Bharatendra Rai

Abstract:

The sequence of words in text data has long-term dependencies and is known to suffer from vanishing gradient problems when developing deep learning models. Although recurrent networks such as long short-term memory networks help to overcome this problem, achieving high text classification performance is a challenging problem. Convolutional recurrent networks that combine the advantages of long short-term memory networks and convolutional neural networks can be useful for text classification performance improvements. However, arriving at suitable hyperparameter values for convolutional recurrent networks is still a challenging task where fitting a model requires significant computing resources. This paper illustrates the advantages of using convolutional recurrent networks for text classification with the help of statistically planned computer experiments for hyperparameter tuning.

Keywords: long short-term memory networks, convolutional recurrent networks, text classification, hyperparameter tuning, Tukey honest significant differences

Procedia PDF Downloads 83
11015 Maturity Classification of Oil Palm Fresh Fruit Bunches Using Thermal Imaging Technique

Authors: Shahrzad Zolfagharnassab, Abdul Rashid Mohamed Shariff, Reza Ehsani, Hawa Ze Jaffar, Ishak Aris

Abstract:

Ripeness estimation of oil palm fresh fruit is important processes that affect the profitableness and salability of oil palm fruits. The adulthood or ripeness of the oil palm fruits influences the quality of oil palm. Conventional procedure includes physical grading of Fresh Fruit Bunches (FFB) maturity by calculating the number of loose fruits per bunch. This physical classification of oil palm FFB is costly, time consuming and the results may have human error. Hence, many researchers try to develop the methods for ascertaining the maturity of oil palm fruits and thereby, deviously the oil content of distinct palm fruits without the need for exhausting oil extraction and analysis. This research investigates the potential of infrared images (Thermal Images) as a predictor to classify the oil palm FFB ripeness. A total of 270 oil palm fresh fruit bunches from most common cultivar of oil palm bunches Nigresens according to three maturity categories: under ripe, ripe and over ripe were collected. Each sample was scanned by the thermal imaging cameras FLIR E60 and FLIR T440. The average temperature of each bunches were calculated by using image processing in FLIR Tools and FLIR ThermaCAM researcher pro 2.10 environment software. The results show that temperature content decreased from immature to over mature oil palm FFBs. An overall analysis-of-variance (ANOVA) test was proved that this predictor gave significant difference between underripe, ripe and overripe maturity categories. This shows that the temperature as predictors can be good indicators to classify oil palm FFB. Classification analysis was performed by using the temperature of the FFB as predictors through Linear Discriminant Analysis (LDA), Mahalanobis Discriminant Analysis (MDA), Artificial Neural Network (ANN) and K- Nearest Neighbor (KNN) methods. The highest overall classification accuracy was 88.2% by using Artificial Neural Network. This research proves that thermal imaging and neural network method can be used as predictors of oil palm maturity classification.

Keywords: artificial neural network, maturity classification, oil palm FFB, thermal imaging

Procedia PDF Downloads 322
11014 Performance Evaluation of Contemporary Classifiers for Automatic Detection of Epileptic EEG

Authors: K. E. Ch. Vidyasagar, M. Moghavvemi, T. S. S. T. Prabhat

Abstract:

Epilepsy is a global problem, and with seizures eluding even the smartest of diagnoses a requirement for automatic detection of the same using electroencephalogram (EEG) would have a huge impact in diagnosis of the disorder. Among a multitude of methods for automatic epilepsy detection, one should find the best method out, based on accuracy, for classification. This paper reasons out, and rationalizes, the best methods for classification. Accuracy is based on the classifier, and thus this paper discusses classifiers like quadratic discriminant analysis (QDA), classification and regression tree (CART), support vector machine (SVM), naive Bayes classifier (NBC), linear discriminant analysis (LDA), K-nearest neighbor (KNN) and artificial neural networks (ANN). Results show that ANN is the most accurate of all the above stated classifiers with 97.7% accuracy, 97.25% specificity and 98.28% sensitivity in its merit. This is followed closely by SVM with 1% variation in result. These results would certainly help researchers choose the best classifier for detection of epilepsy.

Keywords: classification, seizure, KNN, SVM, LDA, ANN, epilepsy

Procedia PDF Downloads 487
11013 The Impact of System and Data Quality on Organizational Success in the Kingdom of Bahrain

Authors: Amal M. Alrayes

Abstract:

Data and system quality play a central role in organizational success, and the quality of any existing information system has a major influence on the effectiveness of overall system performance.Given the importance of system and data quality to an organization, it is relevant to highlight their importance on organizational performance in the Kingdom of Bahrain. This research aims to discover whether system quality and data quality are related, and to study the impact of system and data quality on organizational success. A theoretical model based on previous research is used to show the relationship between data and system quality, and organizational impact. We hypothesize, first, that system quality is positively associated with organizational impact, secondly that system quality is positively associated with data quality, and finally that data quality is positively associated with organizational impact. A questionnaire was conducted among public and private organizations in the Kingdom of Bahrain. The results show that there is a strong association between data and system quality, that affects organizational success.

Keywords: data quality, performance, system quality, Kingdom of Bahrain

Procedia PDF Downloads 457
11012 3D Receiver Operator Characteristic Histogram

Authors: Xiaoli Zhang, Xiongfei Li, Yuncong Feng

Abstract:

ROC curves, as a widely used evaluating tool in machine learning field, are the tradeoff of true positive rate and negative rate. However, they are blamed for ignoring some vital information in the evaluation process, such as the amount of information about the target that each instance carries, predicted score given by each classification model to each instance. Hence, in this paper, a new classification performance method is proposed by extending the Receiver Operator Characteristic (ROC) curves to 3D space, which is denoted as 3D ROC Histogram. In the histogram, the

Keywords: classification, performance evaluation, receiver operating characteristic histogram, hardness prediction

Procedia PDF Downloads 285
11011 Combined Odd Pair Autoregressive Coefficients for Epileptic EEG Signals Classification by Radial Basis Function Neural Network

Authors: Boukari Nassim

Abstract:

This paper describes the use of odd pair autoregressive coefficients (Yule _Walker and Burg) for the feature extraction of electroencephalogram (EEG) signals. In the classification: the radial basis function neural network neural network (RBFNN) is employed. The RBFNN is described by his architecture and his characteristics: as the RBF is defined by the spread which is modified for improving the results of the classification. Five types of EEG signals are defined for this work: Set A, Set B for normal signals, Set C, Set D for interictal signals, set E for ictal signal (we can found that in Bonn university). In outputs, two classes are given (AC, AD, AE, BC, BD, BE, CE, DE), the best accuracy is calculated at 99% for the combined odd pair autoregressive coefficients. Our method is very effective for the diagnosis of epileptic EEG signals.

Keywords: epilepsy, EEG signals classification, combined odd pair autoregressive coefficients, radial basis function neural network

Procedia PDF Downloads 318
11010 A Comparative Analysis of E-Government Quality Models

Authors: Abdoullah Fath-Allah, Laila Cheikhi, Rafa E. Al-Qutaish, Ali Idri

Abstract:

Many quality models have been used to measure e-government portals quality. However, the absence of an international consensus for e-government portals quality models results in many differences in terms of quality attributes and measures. The aim of this paper is to compare and analyze the existing e-government quality models proposed in literature (those that are based on ISO standards and those that are not) in order to propose guidelines to build a good and useful e-government portals quality model. Our findings show that, there is no e-government portal quality model based on the new international standard ISO 25010. Besides that, the quality models are not based on a best practice model to allow agencies to both; measure e-government portals quality and identify missing best practices for those portals.

Keywords: e-government, portal, best practices, quality model, ISO, standard, ISO 25010, ISO 9126

Procedia PDF Downloads 522
11009 Automatic Classification Using Dynamic Fuzzy C Means Algorithm and Mathematical Morphology: Application in 3D MRI Image

Authors: Abdelkhalek Bakkari

Abstract:

Image segmentation is a critical step in image processing and pattern recognition. In this paper, we proposed a new robust automatic image classification based on a dynamic fuzzy c-means algorithm and mathematical morphology. The proposed segmentation algorithm (DFCM_MM) has been applied to MR perfusion images. The obtained results show the validity and robustness of the proposed approach.

Keywords: segmentation, classification, dynamic, fuzzy c-means, MR image

Procedia PDF Downloads 437
11008 Classification of Construction Projects

Authors: M. Safa, A. Sabet, S. MacGillivray, M. Davidson, K. Kaczmarczyk, C. T. Haas, G. E. Gibson, D. Rayside

Abstract:

To address construction project requirements and specifications, scholars and practitioners need to establish a taxonomy according to a scheme that best fits their need. While existing characterization methods are continuously being improved, new ones are devised to cover project properties which have not been previously addressed. One such method, the Project Definition Rating Index (PDRI), has received limited consideration strictly as a classification scheme. Developed by the Construction Industry Institute (CII) in 1996, the PDRI has been refined over the last two decades as a method for evaluating a project's scope definition completeness during front-end planning (FEP). The main contribution of this study is a review of practical project classification methods, and a discussion of how PDRI can be used to classify projects based on their readiness in the FEP phase. The proposed model has been applied to 59 construction projects in Ontario, and the results are discussed.

Keywords: project classification, project definition rating index (PDRI), risk, project goals alignment

Procedia PDF Downloads 644
11007 New Approach to Construct Phylogenetic Tree

Authors: Ouafae Baida, Najma Hamzaoui, Maha Akbib, Abdelfettah Sedqui, Abdelouahid Lyhyaoui

Abstract:

Numerous scientific works present various methods to analyze the data for several domains, specially the comparison of classifications. In our recent work, we presented a new approach to help the user choose the best classification method from the results obtained by every method, by basing itself on the distances between the trees of classification. The result of our approach was in the form of a dendrogram contains methods as a succession of connections. This approach is much needed in phylogeny analysis. This discipline is intended to analyze the sequences of biological macro molecules for information on the evolutionary history of living beings, including their relationship. The product of phylogeny analysis is a phylogenetic tree. In this paper, we recommend the use of a new method of construction the phylogenetic tree based on comparison of different classifications obtained by different molecular genes.

Keywords: hierarchical classification, classification methods, structure of tree, genes, phylogenetic analysis

Procedia PDF Downloads 471
11006 A Novel Heuristic for Analysis of Large Datasets by Selecting Wrapper-Based Features

Authors: Bushra Zafar, Usman Qamar

Abstract:

Large data sample size and dimensions render the effectiveness of conventional data mining methodologies. A data mining technique are important tools for collection of knowledgeable information from variety of databases and provides supervised learning in the form of classification to design models to describe vital data classes while structure of the classifier is based on class attribute. Classification efficiency and accuracy are often influenced to great extent by noisy and undesirable features in real application data sets. The inherent natures of data set greatly masks its quality analysis and leave us with quite few practical approaches to use. To our knowledge first time, we present a new approach for investigation of structure and quality of datasets by providing a targeted analysis of localization of noisy and irrelevant features of data sets. Machine learning is based primarily on feature selection as pre-processing step which offers us to select few features from number of features as a subset by reducing the space according to certain evaluation criterion. The primary objective of this study is to trim down the scope of the given data sample by searching a small set of important features which may results into good classification performance. For this purpose, a heuristic for wrapper-based feature selection using genetic algorithm and for discriminative feature selection an external classifier are used. Selection of feature based on its number of occurrence in the chosen chromosomes. Sample dataset has been used to demonstrate proposed idea effectively. A proposed method has improved average accuracy of different datasets is about 95%. Experimental results illustrate that proposed algorithm increases the accuracy of prediction of different diseases.

Keywords: data mining, generic algorithm, KNN algorithms, wrapper based feature selection

Procedia PDF Downloads 290
11005 Brainwave Classification for Brain Balancing Index (BBI) via 3D EEG Model Using k-NN Technique

Authors: N. Fuad, M. N. Taib, R. Jailani, M. E. Marwan

Abstract:

In this paper, the comparison between k-Nearest Neighbor (kNN) algorithms for classifying the 3D EEG model in brain balancing is presented. The EEG signal recording was conducted on 51 healthy subjects. Development of 3D EEG models involves pre-processing of raw EEG signals and construction of spectrogram images. Then, maximum PSD values were extracted as features from the model. There are three indexes for the balanced brain; index 3, index 4 and index 5. There are significant different of the EEG signals due to the brain balancing index (BBI). Alpha-α (8–13 Hz) and beta-β (13–30 Hz) were used as input signals for the classification model. The k-NN classification result is 88.46% accuracy. These results proved that k-NN can be used in order to predict the brain balancing application.

Keywords: power spectral density, 3D EEG model, brain balancing, kNN

Procedia PDF Downloads 451
11004 Quality Analysis of Vegetables Through Image Processing

Authors: Abdul Khalique Baloch, Ali Okatan

Abstract:

The quality analysis of food and vegetable from image is hot topic now a day, where researchers make them better then pervious findings through different technique and methods. In this research we have review the literature, and find gape from them, and suggest better proposed approach, design the algorithm, developed a software to measure the quality from images, where accuracy of image show better results, and compare the results with Perouse work done so for. The Application we uses an open-source dataset and python language with tensor flow lite framework. In this research we focus to sort food and vegetable from image, in the images, the application can sorts and make them grading after process the images, it could create less errors them human base sorting errors by manual grading. Digital pictures datasets were created. The collected images arranged by classes. The classification accuracy of the system was about 94%. As fruits and vegetables play main role in day-to-day life, the quality of fruits and vegetables is necessary in evaluating agricultural produce, the customer always buy good quality fruits and vegetables. This document is about quality detection of fruit and vegetables using images. Most of customers suffering due to unhealthy foods and vegetables by suppliers, so there is no proper quality measurement level followed by hotel managements. it have developed software to measure the quality of the fruits and vegetables by using images, it will tell you how is your fruits and vegetables are fresh or rotten. Some algorithms reviewed in this thesis including digital images, ResNet, VGG16, CNN and Transfer Learning grading feature extraction. This application used an open source dataset of images and language used python, and designs a framework of system.

Keywords: deep learning, computer vision, image processing, rotten fruit detection, fruits quality criteria, vegetables quality criteria

Procedia PDF Downloads 43
11003 Development of Fake News Model Using Machine Learning through Natural Language Processing

Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini

Abstract:

Fake news detection research is still in the early stage as this is a relatively new phenomenon in the interest raised by society. Machine learning helps to solve complex problems and to build AI systems nowadays and especially in those cases where we have tacit knowledge or the knowledge that is not known. We used machine learning algorithms and for identification of fake news; we applied three classifiers; Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification is not completely correct in fake news detection because classification methods are not specialized for fake news. With the integration of machine learning and text-based processing, we can detect fake news and build classifiers that can classify the news data. Text classification mainly focuses on extracting various features of text and after that incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake due to the unavailability of corpora. We applied three different machine learning classifiers on two publicly available datasets. Experimental analysis based on the existing dataset indicates a very encouraging and improved performance.

Keywords: fake news detection, natural language processing, machine learning, classification techniques.

Procedia PDF Downloads 129
11002 Concurrent Engineering Challenges and Resolution Mechanisms from Quality Perspectives

Authors: Grmanesh Gidey Kahsay

Abstract:

In modern technical engineering applications, quality is defined in two ways. The first one is that quality is the parameter that measures a product or service’s characteristics to meet and satisfy the pre-stated or fundamental needs (reliability, durability, serviceability). The second one is the quality of a product or service free of any defect or deficiencies. The American Society for Quality (ASQ) describes quality as a pursuit of optimal solutions to confirm successes and fulfillment to be accountable for the product or service's requirements and expectations. This article focuses on quality engineering tools in modern industrial applications. Quality engineering is a field of engineering that deals with the principles, techniques, models, and applications of the product or service to guarantee quality. Including the entire activities to analyze the product’s design and development, quality engineering emphasizes how to make sure that products and services are designed and developed to meet consumers’ requirements. This episode acquaints with quality tools such as quality systems, auditing, product design, and process control. The finding presents thoughts that aim to improve quality engineering proficiency and effectiveness by introducing essential quality techniques and tools in some selected industries.

Keywords: essential quality tools, quality systems and models, quality management systems, and quality assurance

Procedia PDF Downloads 119
11001 Classifying and Predicting Efficiencies Using Interval DEA Grid Setting

Authors: Yiannis G. Smirlis

Abstract:

The classification and the prediction of efficiencies in Data Envelopment Analysis (DEA) is an important issue, especially in large scale problems or when new units frequently enter the under-assessment set. In this paper, we contribute to the subject by proposing a grid structure based on interval segmentations of the range of values for the inputs and outputs. Such intervals combined, define hyper-rectangles that partition the space of the problem. This structure, exploited by Interval DEA models and a dominance relation, acts as a DEA pre-processor, enabling the classification and prediction of efficiency scores, without applying any DEA models.

Keywords: data envelopment analysis, interval DEA, efficiency classification, efficiency prediction

Procedia PDF Downloads 134
11000 A Supervised Learning Data Mining Approach for Object Recognition and Classification in High Resolution Satellite Data

Authors: Mais Nijim, Rama Devi Chennuboyina, Waseem Al Aqqad

Abstract:

Advances in spatial and spectral resolution of satellite images have led to tremendous growth in large image databases. The data we acquire through satellites, radars and sensors consists of important geographical information that can be used for remote sensing applications such as region planning, disaster management. Spatial data classification and object recognition are important tasks for many applications. However, classifying objects and identifying them manually from images is a difficult task. Object recognition is often considered as a classification problem, this task can be performed using machine-learning techniques. Despite of many machine-learning algorithms, the classification is done using supervised classifiers such as Support Vector Machines (SVM) as the area of interest is known. We proposed a classification method, which considers neighboring pixels in a region for feature extraction and it evaluates classifications precisely according to neighboring classes for semantic interpretation of region of interest (ROI). A dataset has been created for training and testing purpose; we generated the attributes by considering pixel intensity values and mean values of reflectance. We demonstrated the benefits of using knowledge discovery and data-mining techniques, which can be on image data for accurate information extraction and classification from high spatial resolution remote sensing imagery.

Keywords: remote sensing, object recognition, classification, data mining, waterbody identification, feature extraction

Procedia PDF Downloads 311
10999 Revisiting the Swadesh Wordlist: How Long Should It Be

Authors: Feda Negesse

Abstract:

One of the most important indicators of research quality is a good data - collection instrument that can yield reliable and valid data. The Swadesh wordlist has been used for more than half a century for collecting data in comparative and historical linguistics though arbitrariness is observed in its application and size. This research compare s the classification results of the 100 Swadesh wordlist with those of its subsets to determine if reducing the size of the wordlist impact s its effectiveness. In the comparison, the 100, 50 and 40 wordlists were used to compute lexical distances of 29 Cushitic and Semitic languages spoken in Ethiopia and neighbouring countries. Gabmap, a based application, was employed to compute the lexical distances and to divide the languages into related clusters. The study shows that the subsets are not as effective as the 100 wordlist in clustering languages into smaller subgroups but they are equally effective in di viding languages into bigger groups such as subfamilies. It is noted that the subsets may lead to an erroneous classification whereby unrelated languages by chance form a cluster which is not attested by a comparative study. The chance to get a wrong result is higher when the subsets are used to classify languages which are not closely related. Though a further study is still needed to settle the issues around the size of the Swadesh wordlist, this study indicates that the 50 and 40 wordlists cannot be recommended as reliable substitute s for the 100 wordlist under all circumstances. The choice seems to be determined by the objective of a researcher and the degree of affiliation among the languages to be classified.

Keywords: classification, Cushitic, Swadesh, wordlist

Procedia PDF Downloads 270
10998 Exploring the Role of Data Mining in Crime Classification: A Systematic Literature Review

Authors: Faisal Muhibuddin, Ani Dijah Rahajoe

Abstract:

This in-depth exploration, through a systematic literature review, scrutinizes the nuanced role of data mining in the classification of criminal activities. The research focuses on investigating various methodological aspects and recent developments in leveraging data mining techniques to enhance the effectiveness and precision of crime categorization. Commencing with an exposition of the foundational concepts of crime classification and its evolutionary dynamics, this study details the paradigm shift from conventional methods towards approaches supported by data mining, addressing the challenges and complexities inherent in the modern crime landscape. Specifically, the research delves into various data mining techniques, including K-means clustering, Naïve Bayes, K-nearest neighbour, and clustering methods. A comprehensive review of the strengths and limitations of each technique provides insights into their respective contributions to improving crime classification models. The integration of diverse data sources takes centre stage in this research. A detailed analysis explores how the amalgamation of structured data (such as criminal records) and unstructured data (such as social media) can offer a holistic understanding of crime, enriching classification models with more profound insights. Furthermore, the study explores the temporal implications in crime classification, emphasizing the significance of considering temporal factors to comprehend long-term trends and seasonality. The availability of real-time data is also elucidated as a crucial element in enhancing responsiveness and accuracy in crime classification.

Keywords: data mining, classification algorithm, naïve bayes, k-means clustering, k-nearest neigbhor, crime, data analysis, sistematic literature review

Procedia PDF Downloads 23
10997 Feature Weighting Comparison Based on Clustering Centers in the Detection of Diabetic Retinopathy

Authors: Kemal Polat

Abstract:

In this paper, three feature weighting methods have been used to improve the classification performance of diabetic retinopathy (DR). To classify the diabetic retinopathy, features extracted from the output of several retinal image processing algorithms, such as image-level, lesion-specific and anatomical components, have been used and fed them into the classifier algorithms. The dataset used in this study has been taken from University of California, Irvine (UCI) machine learning repository. Feature weighting methods including the fuzzy c-means clustering based feature weighting, subtractive clustering based feature weighting, and Gaussian mixture clustering based feature weighting, have been used and compered with each other in the classification of DR. After feature weighting, five different classifier algorithms comprising multi-layer perceptron (MLP), k- nearest neighbor (k-NN), decision tree, support vector machine (SVM), and Naïve Bayes have been used. The hybrid method based on combination of subtractive clustering based feature weighting and decision tree classifier has been obtained the classification accuracy of 100% in the screening of DR. These results have demonstrated that the proposed hybrid scheme is very promising in the medical data set classification.

Keywords: machine learning, data weighting, classification, data mining

Procedia PDF Downloads 300
10996 Feature Extraction and Classification Based on the Bayes Test for Minimum Error

Authors: Nasar Aldian Ambark Shashoa

Abstract:

Classification with a dimension reduction based on Bayesian approach is proposed in this paper . The first step is to generate a sample (parameter) of fault-free mode class and faulty mode class. The second, in order to obtain good classification performance, a selection of important features is done with the discrete karhunen-loeve expansion. Next, the Bayes test for minimum error is used to classify the classes. Finally, the results for simulated data demonstrate the capabilities of the proposed procedure.

Keywords: analytical redundancy, fault detection, feature extraction, Bayesian approach

Procedia PDF Downloads 497