Search results for: dataset
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1171

Search results for: dataset

1081 Digital Forensics Showdown: Encase and FTK Head-to-Head

Authors: Rida Nasir, Waseem Iqbal

Abstract:

Due to the constant revolution in technology and the increase in anti-forensic techniques used by attackers to remove their traces, professionals often struggle to choose the best tool to be used in digital forensic investigations. This paper compares two of the most well-known and widely used licensed commercial tools, i.e., Encase & FTK. The comparison was drawn on various parameters and features to provide an authentic evaluation of licensed versions of these well-known commercial tools against various real-world scenarios. In order to discover the popularity of these tools within the digital forensic community, a survey was conducted publicly to determine the preferred choice. The dataset used is the Computer Forensics Reference Dataset (CFReDS). A total of 70 features were selected from various categories. Upon comparison, both FTK and EnCase produce remarkable results. However, each tool has some limitations, and none of the tools is declared best. The comparison drawn is completely unbiased, based on factual data.

Keywords: digital forensics, commercial tools, investigation, forensic evaluation

Procedia PDF Downloads 20
1080 Fast Short-Term Electrical Load Forecasting under High Meteorological Variability with a Multiple Equation Time Series Approach

Authors: Charline David, Alexandre Blondin Massé, Arnaud Zinflou

Abstract:

In 2016, Clements, Hurn, and Li proposed a multiple equation time series approach for the short-term load forecasting, reporting an average mean absolute percentage error (MAPE) of 1.36% on an 11-years dataset for the Queensland region in Australia. We present an adaptation of their model to the electrical power load consumption for the whole Quebec province in Canada. More precisely, we take into account two additional meteorological variables — cloudiness and wind speed — on top of temperature, as well as the use of multiple meteorological measurements taken at different locations on the territory. We also consider other minor improvements. Our final model shows an average MAPE score of 1:79% over an 8-years dataset.

Keywords: short-term load forecasting, special days, time series, multiple equations, parallelization, clustering

Procedia PDF Downloads 104
1079 Hybrid Reliability-Similarity-Based Approach for Supervised Machine Learning

Authors: Walid Cherif

Abstract:

Data mining has, over recent years, seen big advances because of the spread of internet, which generates everyday a tremendous volume of data, and also the immense advances in technologies which facilitate the analysis of these data. In particular, classification techniques are a subdomain of Data Mining which determines in which group each data instance is related within a given dataset. It is used to classify data into different classes according to desired criteria. Generally, a classification technique is either statistical or machine learning. Each type of these techniques has its own limits. Nowadays, current data are becoming increasingly heterogeneous; consequently, current classification techniques are encountering many difficulties. This paper defines new measure functions to quantify the resemblance between instances and then combines them in a new approach which is different from actual algorithms by its reliability computations. Results of the proposed approach exceeded most common classification techniques with an f-measure exceeding 97% on the IRIS Dataset.

Keywords: data mining, knowledge discovery, machine learning, similarity measurement, supervised classification

Procedia PDF Downloads 465
1078 Improving the Performance of Deep Learning in Facial Emotion Recognition with Image Sharpening

Authors: Ksheeraj Sai Vepuri, Nada Attar

Abstract:

We as humans use words with accompanying visual and facial cues to communicate effectively. Classifying facial emotion using computer vision methodologies has been an active research area in the computer vision field. In this paper, we propose a simple method for facial expression recognition that enhances accuracy. We tested our method on the FER-2013 dataset that contains static images. Instead of using Histogram equalization to preprocess the dataset, we used Unsharp Mask to emphasize texture and details and sharpened the edges. We also used ImageDataGenerator from Keras library for data augmentation. Then we used Convolutional Neural Networks (CNN) model to classify the images into 7 different facial expressions, yielding an accuracy of 69.46% on the test set. Our results show that using image preprocessing such as the sharpening technique for a CNN model can improve the performance, even when the CNN model is relatively simple.

Keywords: facial expression recognittion, image preprocessing, deep learning, CNN

Procedia PDF Downloads 143
1077 Sourcing and Compiling a Maltese Traffic Dataset MalTra

Authors: Gabriele Borg, Alexei De Bono, Charlie Abela

Abstract:

There on a constant rise in the availability of high volumes of data gathered from multiple sources, resulting in an abundance of unprocessed information that can be used to monitor patterns and trends in user behaviour. Similarly, year after year, Malta is also constantly experiencing ongoing population growth and an increase in mobilization demand. This research takes advantage of data which is continuously being sourced and converting it into useful information related to the traffic problem on the Maltese roads. The scope of this paper is to provide a methodology to create a custom dataset (MalTra - Malta Traffic) compiled from multiple participants from various locations across the island to identify the most common routes taken to expose the main areas of activity. This use of big data is seen being used in various technologies and is referred to as ITSs (Intelligent Transportation Systems), which has been concluded that there is significant potential in utilising such sources of data on a nationwide scale.

Keywords: Big Data, vehicular traffic, traffic management, mobile data patterns

Procedia PDF Downloads 109
1076 Image Ranking to Assist Object Labeling for Training Detection Models

Authors: Tonislav Ivanov, Oleksii Nedashkivskyi, Denis Babeshko, Vadim Pinskiy, Matthew Putman

Abstract:

Training a machine learning model for object detection that generalizes well is known to benefit from a training dataset with diverse examples. However, training datasets usually contain many repeats of common examples of a class and lack rarely seen examples. This is due to the process commonly used during human annotation where a person would proceed sequentially through a list of images labeling a sufficiently high total number of examples. Instead, the method presented involves an active process where, after the initial labeling of several images is completed, the next subset of images for labeling is selected by an algorithm. This process of algorithmic image selection and manual labeling continues in an iterative fashion. The algorithm used for the image selection is a deep learning algorithm, based on the U-shaped architecture, which quantifies the presence of unseen data in each image in order to find images that contain the most novel examples. Moreover, the location of the unseen data in each image is highlighted, aiding the labeler in spotting these examples. Experiments performed using semiconductor wafer data show that labeling a subset of the data, curated by this algorithm, resulted in a model with a better performance than a model produced from sequentially labeling the same amount of data. Also, similar performance is achieved compared to a model trained on exhaustive labeling of the whole dataset. Overall, the proposed approach results in a dataset that has a diverse set of examples per class as well as more balanced classes, which proves beneficial when training a deep learning model.

Keywords: computer vision, deep learning, object detection, semiconductor

Procedia PDF Downloads 136
1075 Global Based Histogram for 3D Object Recognition

Authors: Somar Boubou, Tatsuo Narikiyo, Michihiro Kawanishi

Abstract:

In this work, we address the problem of 3D object recognition with depth sensors such as Kinect or Structure sensor. Compared with traditional approaches based on local descriptors, which depends on local information around the object key points, we propose a global features based descriptor. Proposed descriptor, which we name as Differential Histogram of Normal Vectors (DHONV), is designed particularly to capture the surface geometric characteristics of the 3D objects represented by depth images. We describe the 3D surface of an object in each frame using a 2D spatial histogram capturing the normalized distribution of differential angles of the surface normal vectors. The object recognition experiments on the benchmark RGB-D object dataset and a self-collected dataset show that our proposed descriptor outperforms two others descriptors based on spin-images and histogram of normal vectors with linear-SVM classifier.

Keywords: vision in control, robotics, histogram, differential histogram of normal vectors

Procedia PDF Downloads 279
1074 Efficient Fake News Detection Using Machine Learning and Deep Learning Approaches

Authors: Chaima Babi, Said Gadri

Abstract:

The rapid increase in fake news continues to grow at a very fast rate; this requires implementing efficient techniques that allow testing the re-liability of online content. For that, the current research strives to illuminate the fake news problem using deep learning DL and machine learning ML ap-proaches. We have developed the traditional LSTM (Long short-term memory), and the bidirectional BiLSTM model. A such process is to perform a training task on almost of samples of the dataset, validate the model on a subset called the test set to provide an unbiased evaluation of the final model fit on the training dataset, then compute the accuracy of detecting classifica-tion and comparing the results. For the programming stage, we used Tensor-Flow and Keras libraries on Python to support Graphical Processing Units (GPUs) that are being used for developing deep learning applications.

Keywords: machine learning, deep learning, natural language, fake news, Bi-LSTM, LSTM, multiclass classification

Procedia PDF Downloads 95
1073 Semi-Supervised Outlier Detection Using a Generative and Adversary Framework

Authors: Jindong Gu, Matthias Schubert, Volker Tresp

Abstract:

In many outlier detection tasks, only training data belonging to one class, i.e., the positive class, is available. The task is then to predict a new data point as belonging either to the positive class or to the negative class, in which case the data point is considered an outlier. For this task, we propose a novel corrupted Generative Adversarial Network (CorGAN). In the adversarial process of training CorGAN, the Generator generates outlier samples for the negative class, and the Discriminator is trained to distinguish the positive training data from the generated negative data. The proposed framework is evaluated using an image dataset and a real-world network intrusion dataset. Our outlier-detection method achieves state-of-the-art performance on both tasks.

Keywords: one-class classification, outlier detection, generative adversary networks, semi-supervised learning

Procedia PDF Downloads 151
1072 Effect of Genuine Missing Data Imputation on Prediction of Urinary Incontinence

Authors: Suzan Arslanturk, Mohammad-Reza Siadat, Theophilus Ogunyemi, Ananias Diokno

Abstract:

Missing data is a common challenge in statistical analyses of most clinical survey datasets. A variety of methods have been developed to enable analysis of survey data to deal with missing values. Imputation is the most commonly used among the above methods. However, in order to minimize the bias introduced due to imputation, one must choose the right imputation technique and apply it to the correct type of missing data. In this paper, we have identified different types of missing values: missing data due to skip pattern (SPMD), undetermined missing data (UMD), and genuine missing data (GMD) and applied rough set imputation on only the GMD portion of the missing data. We have used rough set imputation to evaluate the effect of such imputation on prediction by generating several simulation datasets based on an existing epidemiological dataset (MESA). To measure how well each dataset lends itself to the prediction model (logistic regression), we have used p-values from the Wald test. To evaluate the accuracy of the prediction, we have considered the width of 95% confidence interval for the probability of incontinence. Both imputed and non-imputed simulation datasets were fit to the prediction model, and they both turned out to be significant (p-value < 0.05). However, the Wald score shows a better fit for the imputed compared to non-imputed datasets (28.7 vs. 23.4). The average confidence interval width was decreased by 10.4% when the imputed dataset was used, meaning higher precision. The results show that using the rough set method for missing data imputation on GMD data improve the predictive capability of the logistic regression. Further studies are required to generalize this conclusion to other clinical survey datasets.

Keywords: rough set, imputation, clinical survey data simulation, genuine missing data, predictive index

Procedia PDF Downloads 168
1071 Parallel Genetic Algorithms Clustering for Handling Recruitment Problem

Authors: Walid Moudani, Ahmad Shahin

Abstract:

This research presents a study to handle the recruitment services system. It aims to enhance a business intelligence system by embedding data mining in its core engine and to facilitate the link between job searchers and recruiters companies. The purpose of this study is to present an intelligent management system for supporting recruitment services based on data mining methods. It consists to apply segmentation on the extracted job postings offered by the different recruiters. The details of the job postings are associated to a set of relevant features that are extracted from the web and which are based on critical criterion in order to define consistent clusters. Thereafter, we assign the job searchers to the best cluster while providing a ranking according to the job postings of the selected cluster. The performance of the proposed model used is analyzed, based on a real case study, with the clustered job postings dataset and classified job searchers dataset by using some metrics.

Keywords: job postings, job searchers, clustering, genetic algorithms, business intelligence

Procedia PDF Downloads 329
1070 Cascaded Neural Network for Internal Temperature Forecasting in Induction Motor

Authors: Hidir S. Nogay

Abstract:

In this study, two systems were created to predict interior temperature in induction motor. One of them consisted of a simple ANN model which has two layers, ten input parameters and one output parameter. The other one consisted of eight ANN models connected each other as cascaded. Cascaded ANN system has 17 inputs. Main reason of cascaded system being used in this study is to accomplish more accurate estimation by increasing inputs in the ANN system. Cascaded ANN system is compared with simple conventional ANN model to prove mentioned advantages. Dataset was obtained from experimental applications. Small part of the dataset was used to obtain more understandable graphs. Number of data is 329. 30% of the data was used for testing and validation. Test data and validation data were determined for each ANN model separately and reliability of each model was tested. As a result of this study, it has been understood that the cascaded ANN system produced more accurate estimates than conventional ANN model.

Keywords: cascaded neural network, internal temperature, inverter, three-phase induction motor

Procedia PDF Downloads 345
1069 A t-SNE and UMAP Based Neural Network Image Classification Algorithm

Authors: Shelby Simpson, William Stanley, Namir Naba, Xiaodi Wang

Abstract:

Both t-SNE and UMAP are brand new state of art tools to predominantly preserve the local structure that is to group neighboring data points together, which indeed provides a very informative visualization of heterogeneity in our data. In this research, we develop a t-SNE and UMAP base neural network image classification algorithm to embed the original dataset to a corresponding low dimensional dataset as a preprocessing step, then use this embedded database as input to our specially designed neural network classifier for image classification. We use the fashion MNIST data set, which is a labeled data set of images of clothing objects in our experiments. t-SNE and UMAP are used for dimensionality reduction of the data set and thus produce low dimensional embeddings. Furthermore, we use the embeddings from t-SNE and UMAP to feed into two neural networks. The accuracy of the models from the two neural networks is then compared to a dense neural network that does not use embedding as an input to show which model can classify the images of clothing objects more accurately.

Keywords: t-SNE, UMAP, fashion MNIST, neural networks

Procedia PDF Downloads 198
1068 A Deep Learning Approach to Detect Complete Safety Equipment for Construction Workers Based on YOLOv7

Authors: Shariful Islam, Sharun Akter Khushbu, S. M. Shaqib, Shahriar Sultan Ramit

Abstract:

In the construction sector, ensuring worker safety is of the utmost significance. In this study, a deep learning-based technique is presented for identifying safety gear worn by construction workers, such as helmets, goggles, jackets, gloves, and footwear. The suggested method precisely locates these safety items by using the YOLO v7 (You Only Look Once) object detection algorithm. The dataset utilized in this work consists of labeled images split into training, testing and validation sets. Each image has bounding box labels that indicate where the safety equipment is located within the image. The model is trained to identify and categorize the safety equipment based on the labeled dataset through an iterative training approach. We used custom dataset to train this model. Our trained model performed admirably well, with good precision, recall, and F1-score for safety equipment recognition. Also, the model's evaluation produced encouraging results, with a [email protected] score of 87.7%. The model performs effectively, making it possible to quickly identify safety equipment violations on building sites. A thorough evaluation of the outcomes reveals the model's advantages and points up potential areas for development. By offering an automatic and trustworthy method for safety equipment detection, this research contributes to the fields of computer vision and workplace safety. The proposed deep learning-based approach will increase safety compliance and reduce the risk of accidents in the construction industry.

Keywords: deep learning, safety equipment detection, YOLOv7, computer vision, workplace safety

Procedia PDF Downloads 68
1067 Computational Cell Segmentation in Immunohistochemically Image of Meningioma Tumor Using Fuzzy C-Means and Adaptive Vector Directional Filter

Authors: Vahid Anari, Leila Shahmohammadi

Abstract:

Diagnosing and interpreting manually from a large cohort dataset of immunohistochemically stained tissue of tumors using an optical microscope involves subjectivity and also is tedious for pathologist specialists. Moreover, digital pathology today represents more of an evolution than a revolution in pathology. In this paper, we develop and test an unsupervised algorithm that can automatically enhance the IHC image of a meningioma tumor and classify cells into positive (proliferative) and negative (normal) cells. A dataset including 150 images is used to test the scheme. In addition, a new adaptive color image enhancement method is proposed based on a vector directional filter (VDF) and statistical properties of filtering the window. Since the cells are distinguishable by the human eye, the accuracy and stability of the algorithm are quantitatively compared through application to a wide variety of real images.

Keywords: digital pathology, cell segmentation, immunohistochemically, noise reduction

Procedia PDF Downloads 67
1066 COVID-19 Detection from Computed Tomography Images Using UNet Segmentation, Region Extraction, and Classification Pipeline

Authors: Kenan Morani, Esra Kaya Ayana

Abstract:

This study aimed to develop a novel pipeline for COVID-19 detection using a large and rigorously annotated database of computed tomography (CT) images. The pipeline consists of UNet-based segmentation, lung extraction, and a classification part, with the addition of optional slice removal techniques following the segmentation part. In this work, a batch normalization was added to the original UNet model to produce lighter and better localization, which is then utilized to build a full pipeline for COVID-19 diagnosis. To evaluate the effectiveness of the proposed pipeline, various segmentation methods were compared in terms of their performance and complexity. The proposed segmentation method with batch normalization outperformed traditional methods and other alternatives, resulting in a higher dice score on a publicly available dataset. Moreover, at the slice level, the proposed pipeline demonstrated high validation accuracy, indicating the efficiency of predicting 2D slices. At the patient level, the full approach exhibited higher validation accuracy and macro F1 score compared to other alternatives, surpassing the baseline. The classification component of the proposed pipeline utilizes a convolutional neural network (CNN) to make final diagnosis decisions. The COV19-CT-DB dataset, which contains a large number of CT scans with various types of slices and rigorously annotated for COVID-19 detection, was utilized for classification. The proposed pipeline outperformed many other alternatives on the dataset.

Keywords: classification, computed tomography, lung extraction, macro F1 score, UNet segmentation

Procedia PDF Downloads 132
1065 Automated Pothole Detection Using Convolution Neural Networks and 3D Reconstruction Using Stereovision

Authors: Eshta Ranyal, Kamal Jain, Vikrant Ranyal

Abstract:

Potholes are a severe threat to road safety and a major contributing factor towards road distress. In the Indian context, they are a major road hazard. Timely detection of potholes and subsequent repair can prevent the roads from deteriorating. To facilitate the roadway authorities in the timely detection and repair of potholes, we propose a pothole detection methodology using convolutional neural networks. The YOLOv3 model is used as it is fast and accurate in comparison to other state-of-the-art models. You only look once v3 (YOLOv3) is a state-of-the-art, real-time object detection system that features multi-scale detection. A mean average precision(mAP) of 73% was obtained on a training dataset of 200 images. The dataset was then increased to 500 images, resulting in an increase in mAP. We further calculated the depth of the potholes using stereoscopic vision by reconstruction of 3D potholes. This enables calculating pothole volume, its extent, which can then be used to evaluate the pothole severity as low, moderate, high.

Keywords: CNN, pothole detection, pothole severity, YOLO, stereovision

Procedia PDF Downloads 136
1064 Botnet Detection with ML Techniques by Using the BoT-IoT Dataset

Authors: Adnan Baig, Ishteeaq Naeem, Saad Mansoor

Abstract:

The Internet of Things (IoT) gadgets have advanced quickly in recent years, and their use is steadily rising daily. However, cyber-attackers can target these gadgets due to their distributed nature. Additionally, many IoT devices have significant security flaws in their implementation and design, making them vulnerable to security threats. Hence, these threats can cause important data security and privacy loss from a single attack on network devices or systems. Botnets are a significant security risk that can harm the IoT network; hence, sophisticated techniques are required to mitigate the risk. This work uses a machine learning-based method to identify IoT orchestrated by botnets. The proposed technique identifies the net attack by distinguishing between legitimate and malicious traffic. This article proposes a hyperparameter tuning model to improvise the method to improve the accuracy of existing processes. The results demonstrated an improved and more accurate indication of botnet-based cyber-attacks.

Keywords: Internet of Things, Botnet, BoT-IoT dataset, ML techniques

Procedia PDF Downloads 13
1063 Automatic Detection of Sugarcane Diseases: A Computer Vision-Based Approach

Authors: Himanshu Sharma, Karthik Kumar, Harish Kumar

Abstract:

The major problem in crop cultivation is the occurrence of multiple crop diseases. During the growth stage, timely identification of crop diseases is paramount to ensure the high yield of crops, lower production costs, and minimize pesticide usage. In most cases, crop diseases produce observable characteristics and symptoms. The Surveyors usually diagnose crop diseases when they walk through the fields. However, surveyor inspections tend to be biased and error-prone due to the nature of the monotonous task and the subjectivity of individuals. In addition, visual inspection of each leaf or plant is costly, time-consuming, and labour-intensive. Furthermore, the plant pathologists and experts who can often identify the disease within the plant according to their symptoms in early stages are not readily available in remote regions. Therefore, this study specifically addressed early detection of leaf scald, red rot, and eyespot types of diseases within sugarcane plants. The study proposes a computer vision-based approach using a convolutional neural network (CNN) for automatic identification of crop diseases. To facilitate this, firstly, images of sugarcane diseases were taken from google without modifying the scene, background, or controlling the illumination to build the training dataset. Then, the testing dataset was developed based on the real-time collected images from the sugarcane field from India. Then, the image dataset is pre-processed for feature extraction and selection. Finally, the CNN-based Visual Geometry Group (VGG) model was deployed on the training and testing dataset to classify the images into diseased and healthy sugarcane plants and measure the model's performance using various parameters, i.e., accuracy, sensitivity, specificity, and F1-score. The promising result of the proposed model lays the groundwork for the automatic early detection of sugarcane disease. The proposed research directly sustains an increase in crop yield.

Keywords: automatic classification, computer vision, convolutional neural network, image processing, sugarcane disease, visual geometry group

Procedia PDF Downloads 116
1062 Clothes Identification Using Inception ResNet V2 and MobileNet V2

Authors: Subodh Chandra Shakya, Badal Shrestha, Suni Thapa, Ashutosh Chauhan, Saugat Adhikari

Abstract:

To tackle our problem of clothes identification, we used different architectures of Convolutional Neural Networks. Among different architectures, the outcome from Inception ResNet V2 and MobileNet V2 seemed promising. On comparison of the metrices, we observed that the Inception ResNet V2 slightly outperforms MobileNet V2 for this purpose. So this paper of ours proposes the cloth identifier using Inception ResNet V2 and also contains the comparison between the outcome of ResNet V2 and MobileNet V2. The document here contains the results and findings of the research that we performed on the DeepFashion Dataset. To improve the dataset, we used different image preprocessing techniques like image shearing, image rotation, and denoising. The whole experiment was conducted with the intention of testing the efficiency of convolutional neural networks on cloth identification so that we could develop a reliable system that is good enough in identifying the clothes worn by the users. The whole system can be integrated with some kind of recommendation system.

Keywords: inception ResNet, convolutional neural net, deep learning, confusion matrix, data augmentation, data preprocessing

Procedia PDF Downloads 187
1061 A Hybrid System for Boreholes Soil Sample

Authors: Ali Ulvi Uzer

Abstract:

Data reduction is an important topic in the field of pattern recognition applications. The basic concept is the reduction of multitudinous amounts of data down to the meaningful parts. The Principal Component Analysis (PCA) method is frequently used for data reduction. The Support Vector Machine (SVM) method is a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data, the algorithm outputs an optimal hyperplane which categorizes new examples. This study offers a hybrid approach that uses the PCA for data reduction and Support Vector Machines (SVM) for classification. In order to detect the accuracy of the suggested system, two boreholes taken from the soil sample was used. The classification accuracies for this dataset were obtained through using ten-fold cross-validation method. As the results suggest, this system, which is performed through size reduction, is a feasible system for faster recognition of dataset so our study result appears to be very promising.

Keywords: feature selection, sequential forward selection, support vector machines, soil sample

Procedia PDF Downloads 455
1060 Continual Learning Using Data Generation for Hyperspectral Remote Sensing Scene Classification

Authors: Samiah Alammari, Nassim Ammour

Abstract:

When providing a massive number of tasks successively to a deep learning process, a good performance of the model requires preserving the previous tasks data to retrain the model for each upcoming classification. Otherwise, the model performs poorly due to the catastrophic forgetting phenomenon. To overcome this shortcoming, we developed a successful continual learning deep model for remote sensing hyperspectral image regions classification. The proposed neural network architecture encapsulates two trainable subnetworks. The first module adapts its weights by minimizing the discrimination error between the land-cover classes during the new task learning, and the second module tries to learn how to replicate the data of the previous tasks by discovering the latent data structure of the new task dataset. We conduct experiments on HSI dataset Indian Pines. The results confirm the capability of the proposed method.

Keywords: continual learning, data reconstruction, remote sensing, hyperspectral image segmentation

Procedia PDF Downloads 266
1059 Machine Learning-Enabled Classification of Climbing Using Small Data

Authors: Nicholas Milburn, Yu Liang, Dalei Wu

Abstract:

Athlete performance scoring within the climbing do-main presents interesting challenges as the sport does not have an objective way to assign skill. Assessing skill levels within any sport is valuable as it can be used to mark progress while training, and it can help an athlete choose appropriate climbs to attempt. Machine learning-based methods are popular for complex problems like this. The dataset available was composed of dynamic force data recorded during climbing; however, this dataset came with challenges such as data scarcity, imbalance, and it was temporally heterogeneous. Investigated solutions to these challenges include data augmentation, temporal normalization, conversion of time series to the spectral domain, and cross validation strategies. The investigated solutions to the classification problem included light weight machine classifiers KNN and SVM as well as the deep learning with CNN. The best performing model had an 80% accuracy. In conclusion, there seems to be enough information within climbing force data to accurately categorize climbers by skill.

Keywords: classification, climbing, data imbalance, data scarcity, machine learning, time sequence

Procedia PDF Downloads 143
1058 A Web-Based Systems Immunology Toolkit Allowing the Visualization and Comparative Analysis of Publically Available Collective Data to Decipher Immune Regulation in Early Life

Authors: Mahbuba Rahman, Sabri Boughorbel, Scott Presnell, Charlie Quinn, Darawan Rinchai, Damien Chaussabel, Nico Marr

Abstract:

Collections of large-scale datasets made available in public repositories can be used to identify and fill gaps in biomedical knowledge. But first, these data need to be made readily accessible to researchers for analysis and interpretation. Here a collection of transcriptome datasets was made available to investigate the functional programming of human hematopoietic cells in early life. Thirty two datasets were retrieved from the NCBI Gene Expression Omnibus (GEO) and loaded in a custom, interactive web application called the Gene Expression browser (GXB), designed for visualization and query of integrated large-scale data. Multiple sample groupings and gene rank lists were created based on the study design and variables in each dataset. Web links to customized graphical views can be generated by users and subsequently be used to graphically present data in manuscripts for publication. The GXB tool also enables browsing of a single gene across datasets, which can provide information on the role of a given molecule across biological systems. The dataset collection is available online. As a proof-of-principle, one of the datasets (GSE25087) was re-analyzed to identify genes that are differentially expressed by regulatory T cells in early life. Re-analysis of this dataset and a cross-study comparison using multiple other datasets in the above mentioned collection revealed that PMCH, a gene encoding a precursor of melanin-concentrating hormone (MCH), a cyclic neuropeptide, is highly expressed in a variety of other hematopoietic cell types, including neonatal erythroid cells as well as plasmacytoid dendritic cells upon viral infection. Our findings suggest an as yet unrecognized role of MCH in immune regulation, thereby highlighting the unique potential of the curated dataset collection and systems biology approach to generate new hypotheses which can be tested in future mechanistic studies.

Keywords: early-life, GEO datasets, PMCH, interactive query, systems biology

Procedia PDF Downloads 296
1057 A Non-parametric Clustering Approach for Multivariate Geostatistical Data

Authors: Francky Fouedjio

Abstract:

Multivariate geostatistical data have become omnipresent in the geosciences and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations within the same cluster are more similar while clusters are different from each other, in some sense. Spatially contiguous clusters can significantly improve the interpretation that turns the resulting clusters into meaningful geographical subregions. In this paper, we develop an agglomerative hierarchical clustering approach that takes into account the spatial dependency between observations. It relies on a dissimilarity matrix built from a non-parametric kernel estimator of the spatial dependence structure of data. It integrates existing methods to find the optimal cluster number and to evaluate the contribution of variables to the clustering. The capability of the proposed approach to provide spatially compact, connected and meaningful clusters is assessed using bivariate synthetic dataset and multivariate geochemical dataset. The proposed clustering method gives satisfactory results compared to other similar geostatistical clustering methods.

Keywords: clustering, geostatistics, multivariate data, non-parametric

Procedia PDF Downloads 477
1056 Using Machine Learning Techniques for Autism Spectrum Disorder Analysis and Detection in Children

Authors: Norah Mohammed Alshahrani, Abdulaziz Almaleh

Abstract:

Autism Spectrum Disorder (ASD) is a condition related to issues with brain development that affects how a person recognises and communicates with others which results in difficulties with interaction and communication socially and it is constantly growing. Early recognition of ASD allows children to lead safe and healthy lives and helps doctors with accurate diagnoses and management of conditions. Therefore, it is crucial to develop a method that will achieve good results and with high accuracy for the measurement of ASD in children. In this paper, ASD datasets of toddlers and children have been analyzed. We employed the following machine learning techniques to attempt to explore ASD and they are Random Forest (RF), Decision Tree (DT), Na¨ıve Bayes (NB) and Support Vector Machine (SVM). Then Feature selection was used to provide fewer attributes from ASD datasets while preserving model performance. As a result, we found that the best result has been provided by the Support Vector Machine (SVM), achieving 0.98% in the toddler dataset and 0.99% in the children dataset.

Keywords: autism spectrum disorder, machine learning, feature selection, support vector machine

Procedia PDF Downloads 152
1055 Architectural Adaptation for Road Humps Detection in Adverse Light Scenario

Authors: Padmini S. Navalgund, Manasi Naik, Ujwala Patil

Abstract:

Road hump is a semi-cylindrical elevation on the road made across specific locations of the road. The vehicle needs to maneuver the hump by reducing the speed to avoid car damage and pass over the road hump safely. Road Humps on road surfaces, if identified in advance, help to maintain the security and stability of vehicles, especially in adverse visibility conditions, viz. night scenarios. We have proposed a deep learning architecture adaptation by implementing the MISH activation function and developing a new classification loss function called "Effective Focal Loss" for Indian road humps detection in adverse light scenarios. We captured images comprising of marked and unmarked road humps from two different types of cameras across South India to build a heterogeneous dataset. A heterogeneous dataset enabled the algorithm to train and improve the accuracy of detection. The images were pre-processed, annotated for two classes viz, marked hump and unmarked hump. The dataset from these images was used to train the single-stage object detection algorithm. We utilised an algorithm to synthetically generate reduced visible road humps scenarios. We observed that our proposed framework effectively detected the marked and unmarked hump in the images in clear and ad-verse light environments. This architectural adaptation sets up an option for early detection of Indian road humps in reduced visibility conditions, thereby enhancing the autonomous driving technology to handle a wider range of real-world scenarios.

Keywords: Indian road hump, reduced visibility condition, low light condition, adverse light condition, marked hump, unmarked hump, YOLOv9

Procedia PDF Downloads 27
1054 Propagation of DEM Varying Accuracy into Terrain-Based Analysis

Authors: Wassim Katerji, Mercedes Farjas, Carmen Morillo

Abstract:

Terrain-Based Analysis results in derived products from an input DEM and these products are needed to perform various analyses. To efficiently use these products in decision-making, their accuracies must be estimated systematically. This paper proposes a procedure to assess the accuracy of these derived products, by calculating the accuracy of the slope dataset and its significance, taking as an input the accuracy of the DEM. Based on the output of previously published research on modeling the relative accuracy of a DEM, specifically ASTER and SRTM DEMs with Lebanon coverage as the area of study, analysis have showed that ASTER has a low significance in the majority of the area where only 2% of the modeled terrain has 50% or more significance. On the other hand, SRTM showed a better significance, where 37% of the modeled terrain has 50% or more significance. Statistical analysis deduced that the accuracy of the slope dataset, calculated on a cell-by-cell basis, is highly correlated to the accuracy of the input DEM. However, this correlation becomes lower between the slope accuracy and the slope significance, whereas it becomes much higher between the modeled slope and the slope significance.

Keywords: terrain-based analysis, slope, accuracy assessment, Digital Elevation Model (DEM)

Procedia PDF Downloads 446
1053 Towards Real-Time Classification of Finger Movement Direction Using Encephalography Independent Components

Authors: Mohamed Mounir Tellache, Hiroyuki Kambara, Yasuharu Koike, Makoto Miyakoshi, Natsue Yoshimura

Abstract:

This study explores the practicality of using electroencephalographic (EEG) independent components to predict eight-direction finger movements in pseudo-real-time. Six healthy participants with individual-head MRI images performed finger movements in eight directions with two different arm configurations. The analysis was performed in two stages. The first stage consisted of using independent component analysis (ICA) to separate the signals representing brain activity from non-brain activity signals and to obtain the unmixing matrix. The resulting independent components (ICs) were checked, and those reflecting brain-activity were selected. Finally, the time series of the selected ICs were used to predict eight finger-movement directions using Sparse Logistic Regression (SLR). The second stage consisted of using the previously obtained unmixing matrix, the selected ICs, and the model obtained by applying SLR to classify a different EEG dataset. This method was applied to two different settings, namely the single-participant level and the group-level. For the single-participant level, the EEG dataset used in the first stage and the EEG dataset used in the second stage originated from the same participant. For the group-level, the EEG datasets used in the first stage were constructed by temporally concatenating each combination without repetition of the EEG datasets of five participants out of six, whereas the EEG dataset used in the second stage originated from the remaining participants. The average test classification results across datasets (mean ± S.D.) were 38.62 ± 8.36% for the single-participant, which was significantly higher than the chance level (12.50 ± 0.01%), and 27.26 ± 4.39% for the group-level which was also significantly higher than the chance level (12.49% ± 0.01%). The classification accuracy within [–45°, 45°] of the true direction is 70.03 ± 8.14% for single-participant and 62.63 ± 6.07% for group-level which may be promising for some real-life applications. Clustering and contribution analyses further revealed the brain regions involved in finger movement and the temporal aspect of their contribution to the classification. These results showed the possibility of using the ICA-based method in combination with other methods to build a real-time system to control prostheses.

Keywords: brain-computer interface, electroencephalography, finger motion decoding, independent component analysis, pseudo real-time motion decoding

Procedia PDF Downloads 138
1052 Floodnet: Classification for Post Flood Scene with a High-Resolution Aerial Imaginary Dataset

Authors: Molakala Mourya Vardhan Reddy, Kandimala Revanth, Koduru Sumanth, Beena B. M.

Abstract:

Emergency response and recovery operations are severely hampered by natural catastrophes, especially floods. Understanding post-flood scenarios is essential to disaster management because it facilitates quick evaluation and decision-making. To this end, we introduce FloodNet, a brand-new high-resolution aerial picture collection created especially for comprehending post-flood scenes. A varied collection of excellent aerial photos taken during and after flood occurrences make up FloodNet, which offers comprehensive representations of flooded landscapes, damaged infrastructure, and changed topographies. The dataset provides a thorough resource for training and assessing computer vision models designed to handle the complexity of post-flood scenarios, including a variety of environmental conditions and geographic regions. Pixel-level semantic segmentation masks are used to label the pictures in FloodNet, allowing for a more detailed examination of flood-related characteristics, including debris, water bodies, and damaged structures. Furthermore, temporal and positional metadata improve the dataset's usefulness for longitudinal research and spatiotemporal analysis. For activities like flood extent mapping, damage assessment, and infrastructure recovery projection, we provide baseline standards and evaluation metrics to promote research and development in the field of post-flood scene comprehension. By integrating FloodNet into machine learning pipelines, it will be easier to create reliable algorithms that will help politicians, urban planners, and first responders make choices both before and after floods. The goal of the FloodNet dataset is to support advances in computer vision, remote sensing, and disaster response technologies by providing a useful resource for researchers. FloodNet helps to create creative solutions for boosting communities' resilience in the face of natural catastrophes by tackling the particular problems presented by post-flood situations.

Keywords: image classification, segmentation, computer vision, nature disaster, unmanned arial vehicle(UAV), machine learning.

Procedia PDF Downloads 79