Search results for: EGY-BCD dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1095

Search results for: EGY-BCD dataset

645 [Keynote Speech]: Feature Selection and Predictive Modeling of Housing Data Using Random Forest

Authors: Bharatendra Rai

Abstract:

Predictive data analysis and modeling involving machine learning techniques become challenging in presence of too many explanatory variables or features. Presence of too many features in machine learning is known to not only cause algorithms to slow down, but they can also lead to decrease in model prediction accuracy. This study involves housing dataset with 79 quantitative and qualitative features that describe various aspects people consider while buying a new house. Boruta algorithm that supports feature selection using a wrapper approach build around random forest is used in this study. This feature selection process leads to 49 confirmed features which are then used for developing predictive random forest models. The study also explores five different data partitioning ratios and their impact on model accuracy are captured using coefficient of determination (r-square) and root mean square error (rsme).

Keywords: housing data, feature selection, random forest, Boruta algorithm, root mean square error

Procedia PDF Downloads 296
644 A Study of Permission-Based Malware Detection Using Machine Learning

Authors: Ratun Rahman, Rafid Islam, Akin Ahmed, Kamrul Hasan, Hasan Mahmud

Abstract:

Malware is becoming more prevalent, and several threat categories have risen dramatically in recent years. This paper provides a bird's-eye view of the world of malware analysis. The efficiency of five different machine learning methods (Naive Bayes, K-Nearest Neighbor, Decision Tree, Random Forest, and TensorFlow Decision Forest) combined with features picked from the retrieval of Android permissions to categorize applications as harmful or benign is investigated in this study. The test set consists of 1,168 samples (among these android applications, 602 are malware and 566 are benign applications), each consisting of 948 features (permissions). Using the permission-based dataset, the machine learning algorithms then produce accuracy rates above 80%, except the Naive Bayes Algorithm with 65% accuracy. Of the considered algorithms TensorFlow Decision Forest performed the best with an accuracy of 90%.

Keywords: android malware detection, machine learning, malware, malware analysis

Procedia PDF Downloads 132
643 Using Greywolf Optimized Machine Learning Algorithms to Improve Accuracy for Predicting Hospital Readmission for Diabetes

Authors: Vincent Liu

Abstract:

Machine learning algorithms (ML) can achieve high accuracy in predicting outcomes compared to classical models. Metaheuristic, nature-inspired algorithms can enhance traditional ML algorithms by optimizing them such as by performing feature selection. We compare ten ML algorithms to predict 30-day hospital readmission rates for diabetes patients in the US using a dataset from UCI Machine Learning Repository with feature selection performed by Greywolf nature-inspired algorithm. The baseline accuracy for the initial random forest model was 65%. After performing feature engineering, SMOTE for class balancing, and Greywolf optimization, the machine learning algorithms showed better metrics, including F1 scores, accuracy, and confusion matrix with improvements ranging in 10%-30%, and a best model of XGBoost with an accuracy of 95%. Applying machine learning this way can improve patient outcomes as unnecessary rehospitalizations can be prevented by focusing on patients that are at a higher risk of readmission.

Keywords: diabetes, machine learning, 30-day readmission, metaheuristic

Procedia PDF Downloads 32
642 Hyper Tuned RBF SVM: Approach for the Prediction of the Breast Cancer

Authors: Surita Maini, Sanjay Dhanka

Abstract:

Machine learning (ML) involves developing algorithms and statistical models that enable computers to learn and make predictions or decisions based on data without being explicitly programmed. Because of its unlimited abilities ML is gaining popularity in medical sectors; Medical Imaging, Electronic Health Records, Genomic Data Analysis, Wearable Devices, Disease Outbreak Prediction, Disease Diagnosis, etc. In the last few decades, many researchers have tried to diagnose Breast Cancer (BC) using ML, because early detection of any disease can save millions of lives. Working in this direction, the authors have proposed a hybrid ML technique RBF SVM, to predict the BC in earlier the stage. The proposed method is implemented on the Breast Cancer UCI ML dataset with 569 instances and 32 attributes. The authors recorded performance metrics of the proposed model i.e., Accuracy 98.24%, Sensitivity 98.67%, Specificity 97.43%, F1 Score 98.67%, Precision 98.67%, and run time 0.044769 seconds. The proposed method is validated by K-Fold cross-validation.

Keywords: breast cancer, support vector classifier, machine learning, hyper parameter tunning

Procedia PDF Downloads 50
641 Change Detection Method Based on Scale-Invariant Feature Transformation Keypoints and Segmentation for Synthetic Aperture Radar Image

Authors: Lan Du, Yan Wang, Hui Dai

Abstract:

Synthetic aperture radar (SAR) image change detection has recently become a challenging problem owing to the existence of speckle noises. In this paper, an unsupervised distribution-free change detection for SAR image based on scale-invariant feature transform (SIFT) keypoints and segmentation is proposed. Firstly, the noise-robust SIFT keypoints which reveal the blob-like structures in an image are extracted in the log-ratio image to reduce the detection range. Then, different from the traditional change detection which directly obtains the change-detection map from the difference image, segmentation is made around the extracted keypoints in the two original multitemporal SAR images to obtain accurate changed region. At last, the change-detection map is generated by comparing the two segmentations. Experimental results on the real SAR image dataset demonstrate the effectiveness of the proposed method.

Keywords: change detection, Synthetic Aperture Radar (SAR), Scale-Invariant Feature Transformation (SIFT), segmentation

Procedia PDF Downloads 358
640 Traffic Light Detection Using Image Segmentation

Authors: Vaishnavi Shivde, Shrishti Sinha, Trapti Mishra

Abstract:

Traffic light detection from a moving vehicle is an important technology both for driver safety assistance functions as well as for autonomous driving in the city. This paper proposed a deep-learning-based traffic light recognition method that consists of a pixel-wise image segmentation technique and a fully convolutional network i.e., UNET architecture. This paper has used a method for detecting the position and recognizing the state of the traffic lights in video sequences is presented and evaluated using Traffic Light Dataset which contains masked traffic light image data. The first stage is the detection, which is accomplished through image processing (image segmentation) techniques such as image cropping, color transformation, segmentation of possible traffic lights. The second stage is the recognition, which means identifying the color of the traffic light or knowing the state of traffic light which is achieved by using a Convolutional Neural Network (UNET architecture).

Keywords: traffic light detection, image segmentation, machine learning, classification, convolutional neural networks

Procedia PDF Downloads 146
639 Neural Network Based Compressor Flow Estimator in an Aircraft Vapor Cycle System

Authors: Justin Reverdi, Sixin Zhang, Serge Gratton, Said Aoues, Thomas Pellegrini

Abstract:

In Vapor Cycle Systems, the flow sensor plays a key role in different monitoring and control purposes. However, physical sensors can be expensive, inaccurate, heavy, cumbersome, or highly sensitive to vibrations, which is especially problematic when embedded into an aircraft. The conception of a virtual sensor based on other standard sensors is a good alternative. In this paper, a data-driven model using a Convolutional Neural Network is proposed to estimate the flow of the compressor. To fit the model to our dataset, we tested different loss functions. We show in our application that a Dynamic Time Warping based loss function called DILATE leads to better dynamical performance than the vanilla mean squared error (MSE) loss function. DILATE allows choosing a trade-off between static and dynamic performance.

Keywords: deep learning, dynamic time warping, vapor cycle system, virtual sensor

Procedia PDF Downloads 123
638 Application of Advanced Remote Sensing Data in Mineral Exploration in the Vicinity of Heavy Dense Forest Cover Area of Jharkhand and Odisha State Mining Area

Authors: Hemant Kumar, R. N. K. Sharma, A. P. Krishna

Abstract:

The study has been carried out on the Saranda in Jharkhand and a part of Odisha state. Geospatial data of Hyperion, a remote sensing satellite, have been used. This study has used a wide variety of patterns related to image processing to enhance and extract the mining class of Fe and Mn ores.Landsat-8, OLI sensor data have also been used to correctly explore related minerals. In this way, various processes have been applied to increase the mineralogy class and comparative evaluation with related frequency done. The Hyperion dataset for hyperspectral remote sensing has been specifically verified as an effective tool for mineral or rock information extraction within the band range of shortwave infrared used. The abundant spatial and spectral information contained in hyperspectral images enables the differentiation of different objects of any object into targeted applications for exploration such as exploration detection, mining.

Keywords: Hyperion, hyperspectral, sensor, Landsat-8

Procedia PDF Downloads 96
637 Vehicle Type Classification with Geometric and Appearance Attributes

Authors: Ghada S. Moussa

Abstract:

With the increase in population along with economic prosperity, an enormous increase in the number and types of vehicles on the roads occurred. This fact brings a growing need for efficiently yet effectively classifying vehicles into their corresponding categories, which play a crucial role in many areas of infrastructure planning and traffic management. This paper presents two vehicle-type classification approaches; 1) geometric-based and 2) appearance-based. The two classification approaches are used for two tasks: multi-class and intra-class vehicle classifications. For the evaluation purpose of the proposed classification approaches’ performance and the identification of the most effective yet efficient one, 10-fold cross-validation technique is used with a large dataset. The proposed approaches are distinguishable from previous research on vehicle classification in which: i) they consider both geometric and appearance attributes of vehicles, and ii) they perform remarkably well in both multi-class and intra-class vehicle classification. Experimental results exhibit promising potentials implementations of the proposed vehicle classification approaches into real-world applications.

Keywords: appearance attributes, geometric attributes, support vector machine, vehicle classification

Procedia PDF Downloads 317
636 Intelligent Software Architecture and Automatic Re-Architecting Based on Machine Learning

Authors: Gebremeskel Hagos Gebremedhin, Feng Chong, Heyan Huang

Abstract:

Software system is the combination of architecture and organized components to accomplish a specific function or set of functions. A good software architecture facilitates application system development, promotes achievement of functional requirements, and supports system reconfiguration. We describe three studies demonstrating the utility of our architecture in the subdomain of mobile office robots and identify software engineering principles embodied in the architecture. The main aim of this paper is to analyze prove architecture design and automatic re-architecting using machine learning. Intelligence software architecture and automatic re-architecting process is reorganizing in to more suitable one of the software organizational structure system using the user access dataset for creating relationship among the components of the system. The 3-step approach of data mining was used to analyze effective recovery, transformation and implantation with the use of clustering algorithm. Therefore, automatic re-architecting without changing the source code is possible to solve the software complexity problem and system software reuse.

Keywords: intelligence, software architecture, re-architecting, software reuse, High level design

Procedia PDF Downloads 95
635 Image Retrieval Based on Multi-Feature Fusion for Heterogeneous Image Databases

Authors: N. W. U. D. Chathurani, Shlomo Geva, Vinod Chandran, Proboda Rajapaksha

Abstract:

Selecting an appropriate image representation is the most important factor in implementing an effective Content-Based Image Retrieval (CBIR) system. This paper presents a multi-feature fusion approach for efficient CBIR, based on the distance distribution of features and relative feature weights at the time of query processing. It is a simple yet effective approach, which is free from the effect of features' dimensions, ranges, internal feature normalization and the distance measure. This approach can easily be adopted in any feature combination to improve retrieval quality. The proposed approach is empirically evaluated using two benchmark datasets for image classification (a subset of the Corel dataset and Oliva and Torralba) and compared with existing approaches. The performance of the proposed approach is confirmed with the significantly improved performance in comparison with the independently evaluated baseline of the previously proposed feature fusion approaches.

Keywords: feature fusion, image retrieval, membership function, normalization

Procedia PDF Downloads 325
634 HPPDFIM-HD: Transaction Distortion and Connected Perturbation Approach for Hierarchical Privacy Preserving Distributed Frequent Itemset Mining over Horizontally-Partitioned Dataset

Authors: Fuad Ali Mohammed Al-Yarimi

Abstract:

Many algorithms have been proposed to provide privacy preserving in data mining. These protocols are based on two main approaches named as: the perturbation approach and the Cryptographic approach. The first one is based on perturbation of the valuable information while the second one uses cryptographic techniques. The perturbation approach is much more efficient with reduced accuracy while the cryptographic approach can provide solutions with perfect accuracy. However, the cryptographic approach is a much slower method and requires considerable computation and communication overhead. In this paper, a new scalable protocol is proposed which combines the advantages of the perturbation and distortion along with cryptographic approach to perform privacy preserving in distributed frequent itemset mining on horizontally distributed data. Both the privacy and performance characteristics of the proposed protocol are studied empirically.

Keywords: anonymity data, data mining, distributed frequent itemset mining, gaussian perturbation, perturbation approach, privacy preserving data mining

Procedia PDF Downloads 484
633 Towards an Enhanced Compartmental Model for Profiling Malware Dynamics

Authors: Jessemyn Modiini, Timothy Lynar, Elena Sitnikova

Abstract:

We present a novel enhanced compartmental model for malware spread analysis in cyber security. This paper applies cyber security data features to epidemiological compartmental models to model the infectious potential of malware. Compartmental models are most efficient for calculating the infectious potential of a disease. In this paper, we discuss and profile epidemiologically relevant data features from a Domain Name System (DNS) dataset. We then apply these features to epidemiological compartmental models to network traffic features. This paper demonstrates how epidemiological principles can be applied to the novel analysis of key cybersecurity behaviours and trends and provides insight into threat modelling above that of kill-chain analysis. In applying deterministic compartmental models to a cyber security use case, the authors analyse the deficiencies and provide an enhanced stochastic model for cyber epidemiology. This enhanced compartmental model (SUEICRN model) is contrasted with the traditional SEIR model to demonstrate its efficacy.

Keywords: cybersecurity, epidemiology, cyber epidemiology, malware

Procedia PDF Downloads 87
632 Using Machine Learning to Predict Answers to Big-Five Personality Questions

Authors: Aadityaa Singla

Abstract:

The big five personality traits are as follows: openness, conscientiousness, extraversion, agreeableness, and neuroticism. In order to get an insight into their personality, many flocks to these categories, which each have different meanings/characteristics. This information is important not only to individuals but also to career professionals and psychologists who can use this information for candidate assessment or job recruitment. The links between AI and psychology have been well studied in cognitive science, but it is still a rather novel development. It is possible for various AI classification models to accurately predict a personality question via ten input questions. This would contrast with the hundred questions that normal humans have to answer to gain a complete picture of their five personality traits. In order to approach this problem, various AI classification models were used on a dataset to predict what a user may answer. From there, the model's prediction was compared to its actual response. Normally, there are five answer choices (a 20% chance of correct guess), and the models exceed that value to different degrees, proving their significance. By utilizing an MLP classifier, decision tree, linear model, and K-nearest neighbors, they were able to obtain a test accuracy of 86.643, 54.625, 47.875, and 52.125, respectively. These approaches display that there is potential in the future for more nuanced predictions to be made regarding personality.

Keywords: machine learning, personally, big five personality traits, cognitive science

Procedia PDF Downloads 119
631 A Reliable Multi-Type Vehicle Classification System

Authors: Ghada S. Moussa

Abstract:

Vehicle classification is an important task in traffic surveillance and intelligent transportation systems. Classification of vehicle images is facing several problems such as: high intra-class vehicle variations, occlusion, shadow, illumination. These problems and others must be considered to develop a reliable vehicle classification system. In this study, a reliable multi-type vehicle classification system based on Bag-of-Words (BoW) paradigm is developed. Our proposed system used and compared four well-known classifiers; Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), k-Nearest Neighbour (KNN), and Decision Tree to classify vehicles into four categories: motorcycles, small, medium and large. Experiments on a large dataset show that our approach is efficient and reliable in classifying vehicles with accuracy of 95.7%. The SVM outperforms other classification algorithms in terms of both accuracy and robustness alongside considerable reduction in execution time. The innovativeness of developed system is it can serve as a framework for many vehicle classification systems.

Keywords: vehicle classification, bag-of-words technique, SVM classifier, LDA classifier, KNN classifier, decision tree classifier, SIFT algorithm

Procedia PDF Downloads 332
630 Large-Scale Electroencephalogram Biometrics through Contrastive Learning

Authors: Mostafa ‘Neo’ Mohsenvand, Mohammad Rasool Izadi, Pattie Maes

Abstract:

EEG-based biometrics (user identification) has been explored on small datasets of no more than 157 subjects. Here we show that the accuracy of modern supervised methods falls rapidly as the number of users increases to a few thousand. Moreover, supervised methods require a large amount of labeled data for training which limits their applications in real-world scenarios where acquiring data for training should not take more than a few minutes. We show that using contrastive learning for pre-training, it is possible to maintain high accuracy on a dataset of 2130 subjects while only using a fraction of labels. We compare 5 different self-supervised tasks for pre-training of the encoder where our proposed method achieves the accuracy of 96.4%, improving the baseline supervised models by 22.75% and the competing self-supervised model by 3.93%. We also study the effects of the length of the signal and the number of channels on the accuracy of the user-identification models. Our results reveal that signals from temporal and frontal channels contain more identifying features compared to other channels.

Keywords: brainprint, contrastive learning, electroencephalo-gram, self-supervised learning, user identification

Procedia PDF Downloads 135
629 Subspace Rotation Algorithm for Implementing Restricted Hopfield Network as an Auto-Associative Memory

Authors: Ci Lin, Tet Yeap, Iluju Kiringa

Abstract:

This paper introduces the subspace rotation algorithm (SRA) to train the Restricted Hopfield Network (RHN) as an auto-associative memory. Subspace rotation algorithm is a gradient-free subspace tracking approach based on the singular value decomposition (SVD). In comparison with Backpropagation Through Time (BPTT) on training RHN, it is observed that SRA could always converge to the optimal solution and BPTT could not achieve the same performance when the model becomes complex, and the number of patterns is large. The AUTS case study showed that the RHN model trained by SRA could achieve a better structure of attraction basin with larger radius(in general) than the Hopfield Network(HNN) model trained by Hebbian learning rule. Through learning 10000 patterns from MNIST dataset with RHN models with different number of hidden nodes, it is observed that an several components could be adjusted to achieve a balance between recovery accuracy and noise resistance.

Keywords: hopfield neural network, restricted hopfield network, subspace rotation algorithm, hebbian learning rule

Procedia PDF Downloads 96
628 Automatic Detection of Proliferative Cells in Immunohistochemically Images of Meningioma Using Fuzzy C-Means Clustering and HSV Color Space

Authors: Vahid Anari, Mina Bakhshi

Abstract:

Visual search and identification of immunohistochemically stained tissue of meningioma was performed manually in pathologic laboratories to detect and diagnose the cancers type of meningioma. This task is very tedious and time-consuming. Moreover, because of cell's complex nature, it still remains a challenging task to segment cells from its background and analyze them automatically. In this paper, we develop and test a computerized scheme that can automatically identify cells in microscopic images of meningioma and classify them into positive (proliferative) and negative (normal) cells. Dataset including 150 images are used to test the scheme. The scheme uses Fuzzy C-means algorithm as a color clustering method based on perceptually uniform hue, saturation, value (HSV) color space. Since the cells are distinguishable by the human eye, the accuracy and stability of the algorithm are quantitatively compared through application to a wide variety of real images.

Keywords: positive cell, color segmentation, HSV color space, immunohistochemistry, meningioma, thresholding, fuzzy c-means

Procedia PDF Downloads 188
627 An Auxiliary Technique for Coronary Heart Disease Prediction by Analyzing Electrocardiogram Based on ResNet and Bi-Long Short-Term Memory

Authors: Yang Zhang, Jian He

Abstract:

Heart disease is one of the leading causes of death in the world, and coronary heart disease (CHD) is one of the major heart diseases. Electrocardiogram (ECG) is widely used in the detection of heart diseases, but the traditional manual method for CHD prediction by analyzing ECG requires lots of professional knowledge for doctors. This paper introduces sliding window and continuous wavelet transform (CWT) to transform ECG signals into images, and then ResNet and Bi-LSTM are introduced to build the ECG feature extraction network (namely ECGNet). At last, an auxiliary system for coronary heart disease prediction was developed based on modified ResNet18 and Bi-LSTM, and the public ECG dataset of CHD from MIMIC-3 was used to train and test the system. The experimental results show that the accuracy of the method is 83%, and the F1-score is 83%. Compared with the available methods for CHD prediction based on ECG, such as kNN, decision tree, VGGNet, etc., this method not only improves the prediction accuracy but also could avoid the degradation phenomenon of the deep learning network.

Keywords: Bi-LSTM, CHD, ECG, ResNet, sliding window

Procedia PDF Downloads 65
626 Multinomial Dirichlet Gaussian Process Model for Classification of Multidimensional Data

Authors: Wanhyun Cho, Soonja Kang, Sanggoon Kim, Soonyoung Park

Abstract:

We present probabilistic multinomial Dirichlet classification model for multidimensional data and Gaussian process priors. Here, we have considered an efficient computational method that can be used to obtain the approximate posteriors for latent variables and parameters needed to define the multiclass Gaussian process classification model. We first investigated the process of inducing a posterior distribution for various parameters and latent function by using the variational Bayesian approximations and important sampling method, and next we derived a predictive distribution of latent function needed to classify new samples. The proposed model is applied to classify the synthetic multivariate dataset in order to verify the performance of our model. Experiment result shows that our model is more accurate than the other approximation methods.

Keywords: multinomial dirichlet classification model, Gaussian process priors, variational Bayesian approximation, importance sampling, approximate posterior distribution, marginal likelihood evidence

Procedia PDF Downloads 415
625 Key Factors Influencing Individual Knowledge Capability in KIFs

Authors: Salman Iqbal

Abstract:

Knowledge management (KM) literature has mainly focused on the antecedents of KM. The purpose of this study is to investigate the effect of specific human resource management (HRM) practices on employee knowledge sharing and its outcome as individual knowledge capability. Based on previous literature, a model is proposed for the study and hypotheses are formulated. The cross-sectional dataset comes from a sample of 19 knowledge intensive firms (KIFs). This study has run an item parceling technique followed by Confirmatory Factor Analysis (CFA) on the latent constructs of the research model. Employees’ collaboration and their interpersonal trust can help to improve their knowledge sharing behaviour and knowledge capability within organisations. This study suggests that in future, by using a larger sample, better statistical insight is possible. The findings of this study are beneficial for scholars, policy makers and practitioners. The empirical results of this study are entirely based on employees’ perceptions and make a significant research contribution, given there is a dearth of empirical research focusing on the subcontinent.

Keywords: employees’ collaboration, individual knowledge capability, knowledge sharing, monetary rewards, structural equation modelling

Procedia PDF Downloads 251
624 NFResNet: Multi-Scale and U-Shaped Networks for Deblurring

Authors: Tanish Mittal, Preyansh Agrawal, Esha Pahwa, Aarya Makwana

Abstract:

Multi-Scale and U-shaped Networks are widely used in various image restoration problems, including deblurring. Keeping in mind the wide range of applications, we present a comparison of these architectures and their effects on image deblurring. We also introduce a new block called as NFResblock. It consists of a Fast Fourier Transformation layer and a series of modified Non-Linear Activation Free Blocks. Based on these architectures and additions, we introduce NFResnet and NFResnet+, which are modified multi-scale and U-Net architectures, respectively. We also use three differ-ent loss functions to train these architectures: Charbonnier Loss, Edge Loss, and Frequency Reconstruction Loss. Extensive experiments on the Deep Video Deblurring dataset, along with ablation studies for each component, have been presented in this paper. The proposed architectures achieve a considerable increase in Peak Signal to Noise (PSNR) ratio and Structural Similarity Index (SSIM) value.

Keywords: multi-scale, Unet, deblurring, FFT, resblock, NAF-block, nfresnet, charbonnier, edge, frequency reconstruction

Procedia PDF Downloads 100
623 Multidirectional Product Support System for Decision Making in Textile Industry Using Collaborative Filtering Methods

Authors: A. Senthil Kumar, V. Murali Bhaskaran

Abstract:

In the information technology ground, people are using various tools and software for their official use and personal reasons. Nowadays, people are worrying to choose data accessing and extraction tools at the time of buying and selling their products. In addition, worry about various quality factors such as price, durability, color, size, and availability of the product. The main purpose of the research study is to find solutions to these unsolved existing problems. The proposed algorithm is a Multidirectional Rank Prediction (MDRP) decision making algorithm in order to take an effective strategic decision at all the levels of data extraction, uses a real time textile dataset and analyzes the results. Finally, the results are obtained and compared with the existing measurement methods such as PCC, SLCF, and VSS. The result accuracy is higher than the existing rank prediction methods.

Keywords: Knowledge Discovery in Database (KDD), Multidirectional Rank Prediction (MDRP), Pearson’s Correlation Coefficient (PCC), VSS (Vector Space Similarity)

Procedia PDF Downloads 260
622 Plant Leaf Recognition Using Deep Learning

Authors: Aadhya Kaul, Gautam Manocha, Preeti Nagrath

Abstract:

Our environment comprises of a wide variety of plants that are similar to each other and sometimes the similarity between the plants makes the identification process tedious thus increasing the workload of the botanist all over the world. Now all the botanists cannot be accessible all the time for such laborious plant identification; therefore, there is an urge for a quick classification model. Also, along with the identification of the plants, it is also necessary to classify the plant as healthy or not as for a good lifestyle, humans require good food and this food comes from healthy plants. A large number of techniques have been applied to classify the plants as healthy or diseased in order to provide the solution. This paper proposes one such method known as anomaly detection using autoencoders using a set of collections of leaves. In this method, an autoencoder model is built using Keras and then the reconstruction of the original images of the leaves is done and the threshold loss is found in order to classify the plant leaves as healthy or diseased. A dataset of plant leaves is considered to judge the reconstructed performance by convolutional autoencoders and the average accuracy obtained is 71.55% for the purpose.

Keywords: convolutional autoencoder, anomaly detection, web application, FLASK

Procedia PDF Downloads 139
621 Efficacy of Deep Learning for Below-Canopy Reconstruction of Satellite and Aerial Sensing Point Clouds through Fractal Tree Symmetry

Authors: Dhanuj M. Gandikota

Abstract:

Sensor-derived three-dimensional (3D) point clouds of trees are invaluable in remote sensing analysis for the accurate measurement of key structural metrics, bio-inventory values, spatial planning/visualization, and ecological modeling. Machine learning (ML) holds the potential in addressing the restrictive tradeoffs in cost, spatial coverage, resolution, and information gain that exist in current point cloud sensing methods. Terrestrial laser scanning (TLS) remains the highest fidelity source of both canopy and below-canopy structural features, but usage is limited in both coverage and cost, requiring manual deployment to map out large, forested areas. While aerial laser scanning (ALS) remains a reliable avenue of LIDAR active remote sensing, ALS is also cost-restrictive in deployment methods. Space-borne photogrammetry from high-resolution satellite constellations is an avenue of passive remote sensing with promising viability in research for the accurate construction of vegetation 3-D point clouds. It provides both the lowest comparative cost and the largest spatial coverage across remote sensing methods. However, both space-borne photogrammetry and ALS demonstrate technical limitations in the capture of valuable below-canopy point cloud data. Looking to minimize these tradeoffs, we explored a class of powerful ML algorithms called Deep Learning (DL) that show promise in recent research on 3-D point cloud reconstruction and interpolation. Our research details the efficacy of applying these DL techniques to reconstruct accurate below-canopy point clouds from space-borne and aerial remote sensing through learned patterns of tree species fractal symmetry properties and the supplementation of locally sourced bio-inventory metrics. From our dataset, consisting of tree point clouds obtained from TLS, we deconstructed the point clouds of each tree into those that would be obtained through ALS and satellite photogrammetry of varying resolutions. We fed this ALS/satellite point cloud dataset, along with the simulated local bio-inventory metrics, into the DL point cloud reconstruction architectures to generate the full 3-D tree point clouds (the truth values are denoted by the full TLS tree point clouds containing the below-canopy information). Point cloud reconstruction accuracy was validated both through the measurement of error from the original TLS point clouds as well as the error of extraction of key structural metrics, such as crown base height, diameter above root crown, and leaf/wood volume. The results of this research additionally demonstrate the supplemental performance gain of using minimum locally sourced bio-inventory metric information as an input in ML systems to reach specified accuracy thresholds of tree point cloud reconstruction. This research provides insight into methods for the rapid, cost-effective, and accurate construction of below-canopy tree 3-D point clouds, as well as the supported potential of ML and DL to learn complex, unmodeled patterns of fractal tree growth symmetry.

Keywords: deep learning, machine learning, satellite, photogrammetry, aerial laser scanning, terrestrial laser scanning, point cloud, fractal symmetry

Procedia PDF Downloads 70
620 Amharic Text News Classification Using Supervised Learning

Authors: Misrak Assefa

Abstract:

The Amharic language is the second most widely spoken Semitic language in the world. There are several new overloaded on the web. Searching some useful documents from the web on a specific topic, which is written in the Amharic language, is a challenging task. Hence, document categorization is required for managing and filtering important information. In the classification of Amharic text news, there is still a gap in the domain of information that needs to be launch. This study attempts to design an automatic Amharic news classification using a supervised learning mechanism on four un-touch classes. To achieve this research, 4,182 news articles were used. Naive Bayes (NB) and Decision tree (j48) algorithms were used to classify the given Amharic dataset. In this paper, k-fold cross-validation is used to estimate the accuracy of the classifier. As a result, it shows those algorithms can be applicable in Amharic news categorization. The best average accuracy result is achieved by j48 decision tree and naïve Bayes is 95.2345 %, and 94.6245 % respectively using three categories. This research indicated that a typical decision tree algorithm is more applicable to Amharic news categorization.

Keywords: text categorization, supervised machine learning, naive Bayes, decision tree

Procedia PDF Downloads 167
619 Improved Rare Species Identification Using Focal Loss Based Deep Learning Models

Authors: Chad Goldsworthy, B. Rajeswari Matam

Abstract:

The use of deep learning for species identification in camera trap images has revolutionised our ability to study, conserve and monitor species in a highly efficient and unobtrusive manner, with state-of-the-art models achieving accuracies surpassing the accuracy of manual human classification. The high imbalance of camera trap datasets, however, results in poor accuracies for minority (rare or endangered) species due to their relative insignificance to the overall model accuracy. This paper investigates the use of Focal Loss, in comparison to the traditional Cross Entropy Loss function, to improve the identification of minority species in the “255 Bird Species” dataset from Kaggle. The results show that, although Focal Loss slightly decreased the accuracy of the majority species, it was able to increase the F1-score by 0.06 and improve the identification of the bottom two, five and ten (minority) species by 37.5%, 15.7% and 10.8%, respectively, as well as resulting in an improved overall accuracy of 2.96%.

Keywords: convolutional neural networks, data imbalance, deep learning, focal loss, species classification, wildlife conservation

Procedia PDF Downloads 159
618 An Experimental Study on Some Conventional and Hybrid Models of Fuzzy Clustering

Authors: Jeugert Kujtila, Kristi Hoxhalli, Ramazan Dalipi, Erjon Cota, Ardit Murati, Erind Bedalli

Abstract:

Clustering is a versatile instrument in the analysis of collections of data providing insights of the underlying structures of the dataset and enhancing the modeling capabilities. The fuzzy approach to the clustering problem increases the flexibility involving the concept of partial memberships (some value in the continuous interval [0, 1]) of the instances in the clusters. Several fuzzy clustering algorithms have been devised like FCM, Gustafson-Kessel, Gath-Geva, kernel-based FCM, PCM etc. Each of these algorithms has its own advantages and drawbacks, so none of these algorithms would be able to perform superiorly in all datasets. In this paper we will experimentally compare FCM, GK, GG algorithm and a hybrid two-stage fuzzy clustering model combining the FCM and Gath-Geva algorithms. Firstly we will theoretically dis-cuss the advantages and drawbacks for each of these algorithms and we will describe the hybrid clustering model exploiting the advantages and diminishing the drawbacks of each algorithm. Secondly we will experimentally compare the accuracy of the hybrid model by applying it on several benchmark and synthetic datasets.

Keywords: fuzzy clustering, fuzzy c-means algorithm (FCM), Gustafson-Kessel algorithm, hybrid clustering model

Procedia PDF Downloads 484
617 Estimating Gait Parameter from Digital RGB Camera Using Real Time AlphaPose Learning Architecture

Authors: Murad Almadani, Khalil Abu-Hantash, Xinyu Wang, Herbert Jelinek, Kinda Khalaf

Abstract:

Gait analysis is used by healthcare professionals as a tool to gain a better understanding of the movement impairment and track progress. In most circumstances, monitoring patients in their real-life environments with low-cost equipment such as cameras and wearable sensors is more important. Inertial sensors, on the other hand, cannot provide enough information on angular dynamics. This research offers a method for tracking 2D joint coordinates using cutting-edge vision algorithms and a single RGB camera. We provide an end-to-end comprehensive deep learning pipeline for marker-less gait parameter estimation, which, to our knowledge, has never been done before. To make our pipeline function in real-time for real-world applications, we leverage the AlphaPose human posture prediction model and a deep learning transformer. We tested our approach on the well-known GPJATK dataset, which produces promising results.

Keywords: gait analysis, human pose estimation, deep learning, real time gait estimation, AlphaPose, transformer

Procedia PDF Downloads 95
616 Examining Bulling Rates among Youth with Intellectual Disabilities

Authors: Kaycee L. Bills

Abstract:

Adolescents and youth who are members of a minority group are more likely to experience higher rates of bullying in comparison to other student demographics. Specifically, adolescents with intellectual disabilities are a minority population that is more susceptible to experience unfair treatment in social settings. This study employs the 2015 Wave of the National Crime Victimization Survey – School Crime Supplement (NCVS/SCS) longitudinal dataset to explore bullying rates experienced among adolescents with intellectual disabilities. This study uses chi-square testing and a logistic regression to analyze if having a disability influences the likelihood of being bullied in comparison to other student demographics. Results of the chi-square testing and the logistic regression indicate that adolescent students who were identified as having a disability were approximately four times more likely to experience higher bullying rates in comparison to all other majority and minority student populations. Thus, it means having a disability resulted in higher bullying rates in comparison to all student groups.

Keywords: disability, bullying, social work, school bullying

Procedia PDF Downloads 111