Search results for: KaraAgroAI cocoa dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1224

Search results for: KaraAgroAI cocoa dataset

684 Intelligent Software Architecture and Automatic Re-Architecting Based on Machine Learning

Authors: Gebremeskel Hagos Gebremedhin, Feng Chong, Heyan Huang

Abstract:

Software system is the combination of architecture and organized components to accomplish a specific function or set of functions. A good software architecture facilitates application system development, promotes achievement of functional requirements, and supports system reconfiguration. We describe three studies demonstrating the utility of our architecture in the subdomain of mobile office robots and identify software engineering principles embodied in the architecture. The main aim of this paper is to analyze prove architecture design and automatic re-architecting using machine learning. Intelligence software architecture and automatic re-architecting process is reorganizing in to more suitable one of the software organizational structure system using the user access dataset for creating relationship among the components of the system. The 3-step approach of data mining was used to analyze effective recovery, transformation and implantation with the use of clustering algorithm. Therefore, automatic re-architecting without changing the source code is possible to solve the software complexity problem and system software reuse.

Keywords: intelligence, software architecture, re-architecting, software reuse, High level design

Procedia PDF Downloads 119
683 Image Retrieval Based on Multi-Feature Fusion for Heterogeneous Image Databases

Authors: N. W. U. D. Chathurani, Shlomo Geva, Vinod Chandran, Proboda Rajapaksha

Abstract:

Selecting an appropriate image representation is the most important factor in implementing an effective Content-Based Image Retrieval (CBIR) system. This paper presents a multi-feature fusion approach for efficient CBIR, based on the distance distribution of features and relative feature weights at the time of query processing. It is a simple yet effective approach, which is free from the effect of features' dimensions, ranges, internal feature normalization and the distance measure. This approach can easily be adopted in any feature combination to improve retrieval quality. The proposed approach is empirically evaluated using two benchmark datasets for image classification (a subset of the Corel dataset and Oliva and Torralba) and compared with existing approaches. The performance of the proposed approach is confirmed with the significantly improved performance in comparison with the independently evaluated baseline of the previously proposed feature fusion approaches.

Keywords: feature fusion, image retrieval, membership function, normalization

Procedia PDF Downloads 345
682 HPPDFIM-HD: Transaction Distortion and Connected Perturbation Approach for Hierarchical Privacy Preserving Distributed Frequent Itemset Mining over Horizontally-Partitioned Dataset

Authors: Fuad Ali Mohammed Al-Yarimi

Abstract:

Many algorithms have been proposed to provide privacy preserving in data mining. These protocols are based on two main approaches named as: the perturbation approach and the Cryptographic approach. The first one is based on perturbation of the valuable information while the second one uses cryptographic techniques. The perturbation approach is much more efficient with reduced accuracy while the cryptographic approach can provide solutions with perfect accuracy. However, the cryptographic approach is a much slower method and requires considerable computation and communication overhead. In this paper, a new scalable protocol is proposed which combines the advantages of the perturbation and distortion along with cryptographic approach to perform privacy preserving in distributed frequent itemset mining on horizontally distributed data. Both the privacy and performance characteristics of the proposed protocol are studied empirically.

Keywords: anonymity data, data mining, distributed frequent itemset mining, gaussian perturbation, perturbation approach, privacy preserving data mining

Procedia PDF Downloads 505
681 Towards an Enhanced Compartmental Model for Profiling Malware Dynamics

Authors: Jessemyn Modiini, Timothy Lynar, Elena Sitnikova

Abstract:

We present a novel enhanced compartmental model for malware spread analysis in cyber security. This paper applies cyber security data features to epidemiological compartmental models to model the infectious potential of malware. Compartmental models are most efficient for calculating the infectious potential of a disease. In this paper, we discuss and profile epidemiologically relevant data features from a Domain Name System (DNS) dataset. We then apply these features to epidemiological compartmental models to network traffic features. This paper demonstrates how epidemiological principles can be applied to the novel analysis of key cybersecurity behaviours and trends and provides insight into threat modelling above that of kill-chain analysis. In applying deterministic compartmental models to a cyber security use case, the authors analyse the deficiencies and provide an enhanced stochastic model for cyber epidemiology. This enhanced compartmental model (SUEICRN model) is contrasted with the traditional SEIR model to demonstrate its efficacy.

Keywords: cybersecurity, epidemiology, cyber epidemiology, malware

Procedia PDF Downloads 108
680 Using Machine Learning to Predict Answers to Big-Five Personality Questions

Authors: Aadityaa Singla

Abstract:

The big five personality traits are as follows: openness, conscientiousness, extraversion, agreeableness, and neuroticism. In order to get an insight into their personality, many flocks to these categories, which each have different meanings/characteristics. This information is important not only to individuals but also to career professionals and psychologists who can use this information for candidate assessment or job recruitment. The links between AI and psychology have been well studied in cognitive science, but it is still a rather novel development. It is possible for various AI classification models to accurately predict a personality question via ten input questions. This would contrast with the hundred questions that normal humans have to answer to gain a complete picture of their five personality traits. In order to approach this problem, various AI classification models were used on a dataset to predict what a user may answer. From there, the model's prediction was compared to its actual response. Normally, there are five answer choices (a 20% chance of correct guess), and the models exceed that value to different degrees, proving their significance. By utilizing an MLP classifier, decision tree, linear model, and K-nearest neighbors, they were able to obtain a test accuracy of 86.643, 54.625, 47.875, and 52.125, respectively. These approaches display that there is potential in the future for more nuanced predictions to be made regarding personality.

Keywords: machine learning, personally, big five personality traits, cognitive science

Procedia PDF Downloads 145
679 A Reliable Multi-Type Vehicle Classification System

Authors: Ghada S. Moussa

Abstract:

Vehicle classification is an important task in traffic surveillance and intelligent transportation systems. Classification of vehicle images is facing several problems such as: high intra-class vehicle variations, occlusion, shadow, illumination. These problems and others must be considered to develop a reliable vehicle classification system. In this study, a reliable multi-type vehicle classification system based on Bag-of-Words (BoW) paradigm is developed. Our proposed system used and compared four well-known classifiers; Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), k-Nearest Neighbour (KNN), and Decision Tree to classify vehicles into four categories: motorcycles, small, medium and large. Experiments on a large dataset show that our approach is efficient and reliable in classifying vehicles with accuracy of 95.7%. The SVM outperforms other classification algorithms in terms of both accuracy and robustness alongside considerable reduction in execution time. The innovativeness of developed system is it can serve as a framework for many vehicle classification systems.

Keywords: vehicle classification, bag-of-words technique, SVM classifier, LDA classifier, KNN classifier, decision tree classifier, SIFT algorithm

Procedia PDF Downloads 358
678 Large-Scale Electroencephalogram Biometrics through Contrastive Learning

Authors: Mostafa ‘Neo’ Mohsenvand, Mohammad Rasool Izadi, Pattie Maes

Abstract:

EEG-based biometrics (user identification) has been explored on small datasets of no more than 157 subjects. Here we show that the accuracy of modern supervised methods falls rapidly as the number of users increases to a few thousand. Moreover, supervised methods require a large amount of labeled data for training which limits their applications in real-world scenarios where acquiring data for training should not take more than a few minutes. We show that using contrastive learning for pre-training, it is possible to maintain high accuracy on a dataset of 2130 subjects while only using a fraction of labels. We compare 5 different self-supervised tasks for pre-training of the encoder where our proposed method achieves the accuracy of 96.4%, improving the baseline supervised models by 22.75% and the competing self-supervised model by 3.93%. We also study the effects of the length of the signal and the number of channels on the accuracy of the user-identification models. Our results reveal that signals from temporal and frontal channels contain more identifying features compared to other channels.

Keywords: brainprint, contrastive learning, electroencephalo-gram, self-supervised learning, user identification

Procedia PDF Downloads 157
677 Dissecting ESG: The Impact of Environmental, Social, and Governance Factors on Stock Price Risk in European Markets

Authors: Sylwia Frydrych, Jörg Prokop, Michał Buszko

Abstract:

This study investigates the complex relationship between corporate ESG (Environmental, Social, Governance) performance and stock price risk within the European market context. By analyzing a dataset of 435 companies across 19 European countries, the research assesses the impact of both combined ESG performance and its individual components on various risk measures, including volatility, idiosyncratic risk, systematic risk, and downside risk. The findings reveal that while overall ESG scores do not significantly influence stock price risk, disaggregating the ESG components uncovers significant relationships. Governance practices are shown to consistently reduce market risk, positioning them as critical in risk management. However, environmental engagement tends to increase risk, particularly in times of regulatory shifts like those introduced in the EU post-2018. This research provides valuable insights for investors and corporate managers on the nuanced roles of ESG factors in financial risk, emphasizing the need for careful consideration of each ESG pillar in decision-making processes.

Keywords: ESG performance, ESG factors, ESG pillars, ESG scores

Procedia PDF Downloads 25
676 Subspace Rotation Algorithm for Implementing Restricted Hopfield Network as an Auto-Associative Memory

Authors: Ci Lin, Tet Yeap, Iluju Kiringa

Abstract:

This paper introduces the subspace rotation algorithm (SRA) to train the Restricted Hopfield Network (RHN) as an auto-associative memory. Subspace rotation algorithm is a gradient-free subspace tracking approach based on the singular value decomposition (SVD). In comparison with Backpropagation Through Time (BPTT) on training RHN, it is observed that SRA could always converge to the optimal solution and BPTT could not achieve the same performance when the model becomes complex, and the number of patterns is large. The AUTS case study showed that the RHN model trained by SRA could achieve a better structure of attraction basin with larger radius(in general) than the Hopfield Network(HNN) model trained by Hebbian learning rule. Through learning 10000 patterns from MNIST dataset with RHN models with different number of hidden nodes, it is observed that an several components could be adjusted to achieve a balance between recovery accuracy and noise resistance.

Keywords: hopfield neural network, restricted hopfield network, subspace rotation algorithm, hebbian learning rule

Procedia PDF Downloads 117
675 Automatic Detection of Proliferative Cells in Immunohistochemically Images of Meningioma Using Fuzzy C-Means Clustering and HSV Color Space

Authors: Vahid Anari, Mina Bakhshi

Abstract:

Visual search and identification of immunohistochemically stained tissue of meningioma was performed manually in pathologic laboratories to detect and diagnose the cancers type of meningioma. This task is very tedious and time-consuming. Moreover, because of cell's complex nature, it still remains a challenging task to segment cells from its background and analyze them automatically. In this paper, we develop and test a computerized scheme that can automatically identify cells in microscopic images of meningioma and classify them into positive (proliferative) and negative (normal) cells. Dataset including 150 images are used to test the scheme. The scheme uses Fuzzy C-means algorithm as a color clustering method based on perceptually uniform hue, saturation, value (HSV) color space. Since the cells are distinguishable by the human eye, the accuracy and stability of the algorithm are quantitatively compared through application to a wide variety of real images.

Keywords: positive cell, color segmentation, HSV color space, immunohistochemistry, meningioma, thresholding, fuzzy c-means

Procedia PDF Downloads 210
674 An Auxiliary Technique for Coronary Heart Disease Prediction by Analyzing Electrocardiogram Based on ResNet and Bi-Long Short-Term Memory

Authors: Yang Zhang, Jian He

Abstract:

Heart disease is one of the leading causes of death in the world, and coronary heart disease (CHD) is one of the major heart diseases. Electrocardiogram (ECG) is widely used in the detection of heart diseases, but the traditional manual method for CHD prediction by analyzing ECG requires lots of professional knowledge for doctors. This paper introduces sliding window and continuous wavelet transform (CWT) to transform ECG signals into images, and then ResNet and Bi-LSTM are introduced to build the ECG feature extraction network (namely ECGNet). At last, an auxiliary system for coronary heart disease prediction was developed based on modified ResNet18 and Bi-LSTM, and the public ECG dataset of CHD from MIMIC-3 was used to train and test the system. The experimental results show that the accuracy of the method is 83%, and the F1-score is 83%. Compared with the available methods for CHD prediction based on ECG, such as kNN, decision tree, VGGNet, etc., this method not only improves the prediction accuracy but also could avoid the degradation phenomenon of the deep learning network.

Keywords: Bi-LSTM, CHD, ECG, ResNet, sliding window

Procedia PDF Downloads 89
673 Enhancing Code Security with AI-Powered Vulnerability Detection

Authors: Zzibu Mark Brian

Abstract:

As software systems become increasingly complex, ensuring code security is a growing concern. Traditional vulnerability detection methods often rely on manual code reviews or static analysis tools, which can be time-consuming and prone to errors. This paper presents a distinct approach to enhancing code security by leveraging artificial intelligence (AI) and machine learning (ML) techniques. Our proposed system utilizes a combination of natural language processing (NLP) and deep learning algorithms to identify and classify vulnerabilities in real-world codebases. By analyzing vast amounts of open-source code data, our AI-powered tool learns to recognize patterns and anomalies indicative of security weaknesses. We evaluated our system on a dataset of over 10,000 open-source projects, achieving an accuracy rate of 92% in detecting known vulnerabilities. Furthermore, our tool identified previously unknown vulnerabilities in popular libraries and frameworks, demonstrating its potential for improving software security.

Keywords: AI, machine language, cord security, machine leaning

Procedia PDF Downloads 36
672 Multinomial Dirichlet Gaussian Process Model for Classification of Multidimensional Data

Authors: Wanhyun Cho, Soonja Kang, Sanggoon Kim, Soonyoung Park

Abstract:

We present probabilistic multinomial Dirichlet classification model for multidimensional data and Gaussian process priors. Here, we have considered an efficient computational method that can be used to obtain the approximate posteriors for latent variables and parameters needed to define the multiclass Gaussian process classification model. We first investigated the process of inducing a posterior distribution for various parameters and latent function by using the variational Bayesian approximations and important sampling method, and next we derived a predictive distribution of latent function needed to classify new samples. The proposed model is applied to classify the synthetic multivariate dataset in order to verify the performance of our model. Experiment result shows that our model is more accurate than the other approximation methods.

Keywords: multinomial dirichlet classification model, Gaussian process priors, variational Bayesian approximation, importance sampling, approximate posterior distribution, marginal likelihood evidence

Procedia PDF Downloads 444
671 Key Factors Influencing Individual Knowledge Capability in KIFs

Authors: Salman Iqbal

Abstract:

Knowledge management (KM) literature has mainly focused on the antecedents of KM. The purpose of this study is to investigate the effect of specific human resource management (HRM) practices on employee knowledge sharing and its outcome as individual knowledge capability. Based on previous literature, a model is proposed for the study and hypotheses are formulated. The cross-sectional dataset comes from a sample of 19 knowledge intensive firms (KIFs). This study has run an item parceling technique followed by Confirmatory Factor Analysis (CFA) on the latent constructs of the research model. Employees’ collaboration and their interpersonal trust can help to improve their knowledge sharing behaviour and knowledge capability within organisations. This study suggests that in future, by using a larger sample, better statistical insight is possible. The findings of this study are beneficial for scholars, policy makers and practitioners. The empirical results of this study are entirely based on employees’ perceptions and make a significant research contribution, given there is a dearth of empirical research focusing on the subcontinent.

Keywords: employees’ collaboration, individual knowledge capability, knowledge sharing, monetary rewards, structural equation modelling

Procedia PDF Downloads 275
670 NFResNet: Multi-Scale and U-Shaped Networks for Deblurring

Authors: Tanish Mittal, Preyansh Agrawal, Esha Pahwa, Aarya Makwana

Abstract:

Multi-Scale and U-shaped Networks are widely used in various image restoration problems, including deblurring. Keeping in mind the wide range of applications, we present a comparison of these architectures and their effects on image deblurring. We also introduce a new block called as NFResblock. It consists of a Fast Fourier Transformation layer and a series of modified Non-Linear Activation Free Blocks. Based on these architectures and additions, we introduce NFResnet and NFResnet+, which are modified multi-scale and U-Net architectures, respectively. We also use three differ-ent loss functions to train these architectures: Charbonnier Loss, Edge Loss, and Frequency Reconstruction Loss. Extensive experiments on the Deep Video Deblurring dataset, along with ablation studies for each component, have been presented in this paper. The proposed architectures achieve a considerable increase in Peak Signal to Noise (PSNR) ratio and Structural Similarity Index (SSIM) value.

Keywords: multi-scale, Unet, deblurring, FFT, resblock, NAF-block, nfresnet, charbonnier, edge, frequency reconstruction

Procedia PDF Downloads 136
669 Multidirectional Product Support System for Decision Making in Textile Industry Using Collaborative Filtering Methods

Authors: A. Senthil Kumar, V. Murali Bhaskaran

Abstract:

In the information technology ground, people are using various tools and software for their official use and personal reasons. Nowadays, people are worrying to choose data accessing and extraction tools at the time of buying and selling their products. In addition, worry about various quality factors such as price, durability, color, size, and availability of the product. The main purpose of the research study is to find solutions to these unsolved existing problems. The proposed algorithm is a Multidirectional Rank Prediction (MDRP) decision making algorithm in order to take an effective strategic decision at all the levels of data extraction, uses a real time textile dataset and analyzes the results. Finally, the results are obtained and compared with the existing measurement methods such as PCC, SLCF, and VSS. The result accuracy is higher than the existing rank prediction methods.

Keywords: Knowledge Discovery in Database (KDD), Multidirectional Rank Prediction (MDRP), Pearson’s Correlation Coefficient (PCC), VSS (Vector Space Similarity)

Procedia PDF Downloads 286
668 Plant Leaf Recognition Using Deep Learning

Authors: Aadhya Kaul, Gautam Manocha, Preeti Nagrath

Abstract:

Our environment comprises of a wide variety of plants that are similar to each other and sometimes the similarity between the plants makes the identification process tedious thus increasing the workload of the botanist all over the world. Now all the botanists cannot be accessible all the time for such laborious plant identification; therefore, there is an urge for a quick classification model. Also, along with the identification of the plants, it is also necessary to classify the plant as healthy or not as for a good lifestyle, humans require good food and this food comes from healthy plants. A large number of techniques have been applied to classify the plants as healthy or diseased in order to provide the solution. This paper proposes one such method known as anomaly detection using autoencoders using a set of collections of leaves. In this method, an autoencoder model is built using Keras and then the reconstruction of the original images of the leaves is done and the threshold loss is found in order to classify the plant leaves as healthy or diseased. A dataset of plant leaves is considered to judge the reconstructed performance by convolutional autoencoders and the average accuracy obtained is 71.55% for the purpose.

Keywords: convolutional autoencoder, anomaly detection, web application, FLASK

Procedia PDF Downloads 163
667 Efficacy of Deep Learning for Below-Canopy Reconstruction of Satellite and Aerial Sensing Point Clouds through Fractal Tree Symmetry

Authors: Dhanuj M. Gandikota

Abstract:

Sensor-derived three-dimensional (3D) point clouds of trees are invaluable in remote sensing analysis for the accurate measurement of key structural metrics, bio-inventory values, spatial planning/visualization, and ecological modeling. Machine learning (ML) holds the potential in addressing the restrictive tradeoffs in cost, spatial coverage, resolution, and information gain that exist in current point cloud sensing methods. Terrestrial laser scanning (TLS) remains the highest fidelity source of both canopy and below-canopy structural features, but usage is limited in both coverage and cost, requiring manual deployment to map out large, forested areas. While aerial laser scanning (ALS) remains a reliable avenue of LIDAR active remote sensing, ALS is also cost-restrictive in deployment methods. Space-borne photogrammetry from high-resolution satellite constellations is an avenue of passive remote sensing with promising viability in research for the accurate construction of vegetation 3-D point clouds. It provides both the lowest comparative cost and the largest spatial coverage across remote sensing methods. However, both space-borne photogrammetry and ALS demonstrate technical limitations in the capture of valuable below-canopy point cloud data. Looking to minimize these tradeoffs, we explored a class of powerful ML algorithms called Deep Learning (DL) that show promise in recent research on 3-D point cloud reconstruction and interpolation. Our research details the efficacy of applying these DL techniques to reconstruct accurate below-canopy point clouds from space-borne and aerial remote sensing through learned patterns of tree species fractal symmetry properties and the supplementation of locally sourced bio-inventory metrics. From our dataset, consisting of tree point clouds obtained from TLS, we deconstructed the point clouds of each tree into those that would be obtained through ALS and satellite photogrammetry of varying resolutions. We fed this ALS/satellite point cloud dataset, along with the simulated local bio-inventory metrics, into the DL point cloud reconstruction architectures to generate the full 3-D tree point clouds (the truth values are denoted by the full TLS tree point clouds containing the below-canopy information). Point cloud reconstruction accuracy was validated both through the measurement of error from the original TLS point clouds as well as the error of extraction of key structural metrics, such as crown base height, diameter above root crown, and leaf/wood volume. The results of this research additionally demonstrate the supplemental performance gain of using minimum locally sourced bio-inventory metric information as an input in ML systems to reach specified accuracy thresholds of tree point cloud reconstruction. This research provides insight into methods for the rapid, cost-effective, and accurate construction of below-canopy tree 3-D point clouds, as well as the supported potential of ML and DL to learn complex, unmodeled patterns of fractal tree growth symmetry.

Keywords: deep learning, machine learning, satellite, photogrammetry, aerial laser scanning, terrestrial laser scanning, point cloud, fractal symmetry

Procedia PDF Downloads 103
666 Crop Recommendation System Using Machine Learning

Authors: Prathik Ranka, Sridhar K, Vasanth Daniel, Mithun Shankar

Abstract:

With growing global food needs and climate uncertainties, informed crop choices are critical for increasing agricultural productivity. Here we propose a machine learning-based crop recommendation system to help farmers in choosing the most proper crops according to their geographical regions and soil properties. We can deploy algorithms like Decision Trees, Random Forests and Support Vector Machines on a broad dataset that consists of climatic factors, soil characteristics and historical crop yields to predict the best choice of crops. The approach includes first preprocessing the data after assessing them for missing values, unlike in previous jobs where we used all the available information and then transformed because there was no way such a model could have worked with missing data, and normalizing as throughput that will be done over a network to get best results out of our machine learning division. The model effectiveness is measured through performance metrics like accuracy, precision and recall. The resultant app provides a farmer-friendly dashboard through which farmers can enter their local conditions and receive individualized crop suggestions.

Keywords: crop recommendation, precision agriculture, crop, machine learning

Procedia PDF Downloads 15
665 Amharic Text News Classification Using Supervised Learning

Authors: Misrak Assefa

Abstract:

The Amharic language is the second most widely spoken Semitic language in the world. There are several new overloaded on the web. Searching some useful documents from the web on a specific topic, which is written in the Amharic language, is a challenging task. Hence, document categorization is required for managing and filtering important information. In the classification of Amharic text news, there is still a gap in the domain of information that needs to be launch. This study attempts to design an automatic Amharic news classification using a supervised learning mechanism on four un-touch classes. To achieve this research, 4,182 news articles were used. Naive Bayes (NB) and Decision tree (j48) algorithms were used to classify the given Amharic dataset. In this paper, k-fold cross-validation is used to estimate the accuracy of the classifier. As a result, it shows those algorithms can be applicable in Amharic news categorization. The best average accuracy result is achieved by j48 decision tree and naïve Bayes is 95.2345 %, and 94.6245 % respectively using three categories. This research indicated that a typical decision tree algorithm is more applicable to Amharic news categorization.

Keywords: text categorization, supervised machine learning, naive Bayes, decision tree

Procedia PDF Downloads 211
664 Improved Rare Species Identification Using Focal Loss Based Deep Learning Models

Authors: Chad Goldsworthy, B. Rajeswari Matam

Abstract:

The use of deep learning for species identification in camera trap images has revolutionised our ability to study, conserve and monitor species in a highly efficient and unobtrusive manner, with state-of-the-art models achieving accuracies surpassing the accuracy of manual human classification. The high imbalance of camera trap datasets, however, results in poor accuracies for minority (rare or endangered) species due to their relative insignificance to the overall model accuracy. This paper investigates the use of Focal Loss, in comparison to the traditional Cross Entropy Loss function, to improve the identification of minority species in the “255 Bird Species” dataset from Kaggle. The results show that, although Focal Loss slightly decreased the accuracy of the majority species, it was able to increase the F1-score by 0.06 and improve the identification of the bottom two, five and ten (minority) species by 37.5%, 15.7% and 10.8%, respectively, as well as resulting in an improved overall accuracy of 2.96%.

Keywords: convolutional neural networks, data imbalance, deep learning, focal loss, species classification, wildlife conservation

Procedia PDF Downloads 191
663 An Experimental Study on Some Conventional and Hybrid Models of Fuzzy Clustering

Authors: Jeugert Kujtila, Kristi Hoxhalli, Ramazan Dalipi, Erjon Cota, Ardit Murati, Erind Bedalli

Abstract:

Clustering is a versatile instrument in the analysis of collections of data providing insights of the underlying structures of the dataset and enhancing the modeling capabilities. The fuzzy approach to the clustering problem increases the flexibility involving the concept of partial memberships (some value in the continuous interval [0, 1]) of the instances in the clusters. Several fuzzy clustering algorithms have been devised like FCM, Gustafson-Kessel, Gath-Geva, kernel-based FCM, PCM etc. Each of these algorithms has its own advantages and drawbacks, so none of these algorithms would be able to perform superiorly in all datasets. In this paper we will experimentally compare FCM, GK, GG algorithm and a hybrid two-stage fuzzy clustering model combining the FCM and Gath-Geva algorithms. Firstly we will theoretically dis-cuss the advantages and drawbacks for each of these algorithms and we will describe the hybrid clustering model exploiting the advantages and diminishing the drawbacks of each algorithm. Secondly we will experimentally compare the accuracy of the hybrid model by applying it on several benchmark and synthetic datasets.

Keywords: fuzzy clustering, fuzzy c-means algorithm (FCM), Gustafson-Kessel algorithm, hybrid clustering model

Procedia PDF Downloads 514
662 Estimating Gait Parameter from Digital RGB Camera Using Real Time AlphaPose Learning Architecture

Authors: Murad Almadani, Khalil Abu-Hantash, Xinyu Wang, Herbert Jelinek, Kinda Khalaf

Abstract:

Gait analysis is used by healthcare professionals as a tool to gain a better understanding of the movement impairment and track progress. In most circumstances, monitoring patients in their real-life environments with low-cost equipment such as cameras and wearable sensors is more important. Inertial sensors, on the other hand, cannot provide enough information on angular dynamics. This research offers a method for tracking 2D joint coordinates using cutting-edge vision algorithms and a single RGB camera. We provide an end-to-end comprehensive deep learning pipeline for marker-less gait parameter estimation, which, to our knowledge, has never been done before. To make our pipeline function in real-time for real-world applications, we leverage the AlphaPose human posture prediction model and a deep learning transformer. We tested our approach on the well-known GPJATK dataset, which produces promising results.

Keywords: gait analysis, human pose estimation, deep learning, real time gait estimation, AlphaPose, transformer

Procedia PDF Downloads 118
661 Examining Bulling Rates among Youth with Intellectual Disabilities

Authors: Kaycee L. Bills

Abstract:

Adolescents and youth who are members of a minority group are more likely to experience higher rates of bullying in comparison to other student demographics. Specifically, adolescents with intellectual disabilities are a minority population that is more susceptible to experience unfair treatment in social settings. This study employs the 2015 Wave of the National Crime Victimization Survey – School Crime Supplement (NCVS/SCS) longitudinal dataset to explore bullying rates experienced among adolescents with intellectual disabilities. This study uses chi-square testing and a logistic regression to analyze if having a disability influences the likelihood of being bullied in comparison to other student demographics. Results of the chi-square testing and the logistic regression indicate that adolescent students who were identified as having a disability were approximately four times more likely to experience higher bullying rates in comparison to all other majority and minority student populations. Thus, it means having a disability resulted in higher bullying rates in comparison to all student groups.

Keywords: disability, bullying, social work, school bullying

Procedia PDF Downloads 131
660 Heart Attack Prediction Using Several Machine Learning Methods

Authors: Suzan Anwar, Utkarsh Goyal

Abstract:

Heart rate (HR) is a predictor of cardiovascular, cerebrovascular, and all-cause mortality in the general population, as well as in patients with cardio and cerebrovascular diseases. Machine learning (ML) significantly improves the accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment while avoiding unnecessary treatment of others. This research examines relationship between the individual's various heart health inputs like age, sex, cp, trestbps, thalach, oldpeaketc, and the likelihood of developing heart disease. Machine learning techniques like logistic regression and decision tree, and Python are used. The results of testing and evaluating the model using the Heart Failure Prediction Dataset show the chance of a person having a heart disease with variable accuracy. Logistic regression has yielded an accuracy of 80.48% without data handling. With data handling (normalization, standardscaler), the logistic regression resulted in improved accuracy of 87.80%, decision tree 100%, random forest 100%, and SVM 100%.

Keywords: heart rate, machine learning, SVM, decision tree, logistic regression, random forest

Procedia PDF Downloads 138
659 Topic Modelling Using Latent Dirichlet Allocation and Latent Semantic Indexing on SA Telco Twitter Data

Authors: Phumelele Kubheka, Pius Owolawi, Gbolahan Aiyetoro

Abstract:

Twitter is one of the most popular social media platforms where users can share their opinions on different subjects. As of 2010, The Twitter platform generates more than 12 Terabytes of data daily, ~ 4.3 petabytes in a single year. For this reason, Twitter is a great source for big mining data. Many industries such as Telecommunication companies can leverage the availability of Twitter data to better understand their markets and make an appropriate business decision. This study performs topic modeling on Twitter data using Latent Dirichlet Allocation (LDA). The obtained results are benchmarked with another topic modeling technique, Latent Semantic Indexing (LSI). The study aims to retrieve topics on a Twitter dataset containing user tweets on South African Telcos. Results from this study show that LSI is much faster than LDA. However, LDA yields better results with higher topic coherence by 8% for the best-performing model represented in Table 1. A higher topic coherence score indicates better performance of the model.

Keywords: big data, latent Dirichlet allocation, latent semantic indexing, telco, topic modeling, twitter

Procedia PDF Downloads 150
658 Recognizing Human Actions by Multi-Layer Growing Grid Architecture

Authors: Z. Gharaee

Abstract:

Recognizing actions performed by others is important in our daily lives since it is necessary for communicating with others in a proper way. We perceive an action by observing the kinematics of motions involved in the performance. We use our experience and concepts to make a correct recognition of the actions. Although building the action concepts is a life-long process, which is repeated throughout life, we are very efficient in applying our learned concepts in analyzing motions and recognizing actions. Experiments on the subjects observing the actions performed by an actor show that an action is recognized after only about two hundred milliseconds of observation. In this study, hierarchical action recognition architecture is proposed by using growing grid layers. The first-layer growing grid receives the pre-processed data of consecutive 3D postures of joint positions and applies some heuristics during the growth phase to allocate areas of the map by inserting new neurons. As a result of training the first-layer growing grid, action pattern vectors are generated by connecting the elicited activations of the learned map. The ordered vector representation layer receives action pattern vectors to create time-invariant vectors of key elicited activations. Time-invariant vectors are sent to second-layer growing grid for categorization. This grid creates the clusters representing the actions. Finally, one-layer neural network developed by a delta rule labels the action categories in the last layer. System performance has been evaluated in an experiment with the publicly available MSR-Action3D dataset. There are actions performed by using different parts of human body: Hand Clap, Two Hands Wave, Side Boxing, Bend, Forward Kick, Side Kick, Jogging, Tennis Serve, Golf Swing, Pick Up and Throw. The growing grid architecture was trained by applying several random selections of generalization test data fed to the system during on average 100 epochs for each training of the first-layer growing grid and around 75 epochs for each training of the second-layer growing grid. The average generalization test accuracy is 92.6%. A comparison analysis between the performance of growing grid architecture and self-organizing map (SOM) architecture in terms of accuracy and learning speed show that the growing grid architecture is superior to the SOM architecture in action recognition task. The SOM architecture completes learning the same dataset of actions in around 150 epochs for each training of the first-layer SOM while it takes 1200 epochs for each training of the second-layer SOM and it achieves the average recognition accuracy of 90% for generalization test data. In summary, using the growing grid network preserves the fundamental features of SOMs, such as topographic organization of neurons, lateral interactions, the abilities of unsupervised learning and representing high dimensional input space in the lower dimensional maps. The architecture also benefits from an automatic size setting mechanism resulting in higher flexibility and robustness. Moreover, by utilizing growing grids the system automatically obtains a prior knowledge of input space during the growth phase and applies this information to expand the map by inserting new neurons wherever there is high representational demand.

Keywords: action recognition, growing grid, hierarchical architecture, neural networks, system performance

Procedia PDF Downloads 157
657 Using Artificial Vision Techniques for Dust Detection on Photovoltaic Panels

Authors: Gustavo Funes, Eduardo Peters, Jose Delpiano

Abstract:

It is widely known that photovoltaic technology has been massively distributed over the last decade despite its low-efficiency ratio. Dust deposition reduces this efficiency even more, lowering the energy production and module lifespan. In this work, we developed an artificial vision algorithm based on CIELAB color space to identify dust over panels in an autonomous way. We performed several experiments photographing three different types of panels, 30W, 340W and 410W. Those panels were soiled artificially with uniform and non-uniform distributed dust. The algorithm proposed uses statistical tools to provide a simulation with a 100% soiled panel and then performs a comparison to get the percentage of dirt in the experimental data set. The simulation uses a seed that is obtained by taking a dust sample from the maximum amount of dust from the dataset. The final result is the dirt percentage and the possible distribution of dust over the panel. Dust deposition is a key factor for plant owners to determine cleaning cycles or identify nonuniform depositions that could lead to module failure and hot spots.

Keywords: dust detection, photovoltaic, artificial vision, soiling

Procedia PDF Downloads 50
656 Automated Process Quality Monitoring and Diagnostics for Large-Scale Measurement Data

Authors: Hyun-Woo Cho

Abstract:

Continuous monitoring of industrial plants is one of necessary tasks when it comes to ensuring high-quality final products. In terms of monitoring and diagnosis, it is quite critical and important to detect some incipient abnormal events of manufacturing processes in order to improve safety and reliability of operations involved and to reduce related losses. In this work a new multivariate statistical online diagnostic method is presented using a case study. For building some reference models an empirical discriminant model is constructed based on various past operation runs. When a fault is detected on-line, an on-line diagnostic module is initiated. Finally, the status of the current operating conditions is compared with the reference model to make a diagnostic decision. The performance of the presented framework is evaluated using a dataset from complex industrial processes. It has been shown that the proposed diagnostic method outperforms other techniques especially in terms of incipient detection of any faults occurred.

Keywords: data mining, empirical model, on-line diagnostics, process fault, process monitoring

Procedia PDF Downloads 401
655 Machine Learning Driven Analysis of Kepler Objects of Interest to Identify Exoplanets

Authors: Akshat Kumar, Vidushi

Abstract:

This paper identifies 27 KOIs, 26 of which are currently classified as candidates and one as false positives that have a high probability of being confirmed. For this purpose, 11 machine learning algorithms were implemented on the cumulative kepler dataset sourced from the NASA exoplanet archive; it was observed that the best-performing model was HistGradientBoosting and XGBoost with a test accuracy of 93.5%, and the lowest-performing model was Gaussian NB with a test accuracy of 54%, to test model performance F1, cross-validation score and RUC curve was calculated. Based on the learned models, the significant characteristics for confirm exoplanets were identified, putting emphasis on the object’s transit and stellar properties; these characteristics were namely koi_count, koi_prad, koi_period, koi_dor, koi_ror, and koi_smass, which were later considered to filter out the potential KOIs. The paper also calculates the Earth similarity index based on the planetary radius and equilibrium temperature for each KOI identified to aid in their classification.

Keywords: Kepler objects of interest, exoplanets, space exploration, machine learning, earth similarity index, transit photometry

Procedia PDF Downloads 75