Search results for: interpretable machine learning models
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 13699

Search results for: interpretable machine learning models

13219 Next Generation Radiation Risk Assessment and Prediction Tools Generation Applying AI-Machine (Deep) Learning Algorithms

Authors: Selim M. Khan

Abstract:

Indoor air quality is strongly influenced by the presence of radioactive radon (222Rn) gas. Indeed, exposure to high 222Rn concentrations is unequivocally linked to DNA damage and lung cancer and is a worsening issue in North American and European built environments, having increased over time within newer housing stocks as a function of as yet unclear variables. Indoor air radon concentration can be influenced by a wide range of environmental, structural, and behavioral factors. As some of these factors are quantitative while others are qualitative, no single statistical model can determine indoor radon level precisely while simultaneously considering all these variables across a complex and highly diverse dataset. The ability of AI- machine (deep) learning to simultaneously analyze multiple quantitative and qualitative features makes it suitable to predict radon with a high degree of precision. Using Canadian and Swedish long-term indoor air radon exposure data, we are using artificial deep neural network models with random weights and polynomial statistical models in MATLAB to assess and predict radon health risk to human as a function of geospatial, human behavioral, and built environmental metrics. Our initial artificial neural network with random weights model run by sigmoid activation tested different combinations of variables and showed the highest prediction accuracy (>96%) within the reasonable iterations. Here, we present details of these emerging methods and discuss strengths and weaknesses compared to the traditional artificial neural network and statistical methods commonly used to predict indoor air quality in different countries. We propose an artificial deep neural network with random weights as a highly effective method for assessing and predicting indoor radon.

Keywords: radon, radiation protection, lung cancer, aI-machine deep learnng, risk assessment, risk prediction, Europe, North America

Procedia PDF Downloads 81
13218 Using Machine-Learning Methods for Allergen Amino Acid Sequence's Permutations

Authors: Kuei-Ling Sun, Emily Chia-Yu Su

Abstract:

Allergy is a hypersensitive overreaction of the immune system to environmental stimuli, and a major health problem. These overreactions include rashes, sneezing, fever, food allergies, anaphylaxis, asthmatic, shock, or other abnormal conditions. Allergies can be caused by food, insect stings, pollen, animal wool, and other allergens. Their development of allergies is due to both genetic and environmental factors. Allergies involve immunoglobulin E antibodies, a part of the body’s immune system. Immunoglobulin E antibodies will bind to an allergen and then transfer to a receptor on mast cells or basophils triggering the release of inflammatory chemicals such as histamine. Based on the increasingly serious problem of environmental change, changes in lifestyle, air pollution problem, and other factors, in this study, we both collect allergens and non-allergens from several databases and use several machine learning methods for classification, including logistic regression (LR), stepwise regression, decision tree (DT) and neural networks (NN) to do the model comparison and determine the permutations of allergen amino acid’s sequence.

Keywords: allergy, classification, decision tree, logistic regression, machine learning

Procedia PDF Downloads 285
13217 Autism Disease Detection Using Transfer Learning Techniques: Performance Comparison between Central Processing Unit vs. Graphics Processing Unit Functions for Neural Networks

Authors: Mst Shapna Akter, Hossain Shahriar

Abstract:

Neural network approaches are machine learning methods used in many domains, such as healthcare and cyber security. Neural networks are mostly known for dealing with image datasets. While training with the images, several fundamental mathematical operations are carried out in the Neural Network. The operation includes a number of algebraic and mathematical functions, including derivative, convolution, and matrix inversion and transposition. Such operations require higher processing power than is typically needed for computer usage. Central Processing Unit (CPU) is not appropriate for a large image size of the dataset as it is built with serial processing. While Graphics Processing Unit (GPU) has parallel processing capabilities and, therefore, has higher speed. This paper uses advanced Neural Network techniques such as VGG16, Resnet50, Densenet, Inceptionv3, Xception, Mobilenet, XGBOOST-VGG16, and our proposed models to compare CPU and GPU resources. A system for classifying autism disease using face images of an autistic and non-autistic child was used to compare performance during testing. We used evaluation matrices such as Accuracy, F1 score, Precision, Recall, and Execution time. It has been observed that GPU runs faster than the CPU in all tests performed. Moreover, the performance of the Neural Network models in terms of accuracy increases on GPU compared to CPU.

Keywords: autism disease, neural network, CPU, GPU, transfer learning

Procedia PDF Downloads 96
13216 An Assessment of Floodplain Vegetation Response to Groundwater Changes Using the Soil & Water Assessment Tool Hydrological Model, Geographic Information System, and Machine Learning in the Southeast Australian River Basin

Authors: Newton Muhury, Armando A. Apan, Tek N. Marasani, Gebiaw T. Ayele

Abstract:

The changing climate has degraded freshwater availability in Australia that influencing vegetation growth to a great extent. This study assessed the vegetation responses to groundwater using Terra’s moderate resolution imaging spectroradiometer (MODIS), Normalised Difference Vegetation Index (NDVI), and soil water content (SWC). A hydrological model, SWAT, has been set up in a southeast Australian river catchment for groundwater analysis. The model was calibrated and validated against monthly streamflow from 2001 to 2006 and 2007 to 2010, respectively. The SWAT simulated soil water content for 43 sub-basins and monthly MODIS NDVI data for three different types of vegetation (forest, shrub, and grass) were applied in the machine learning tool, Waikato Environment for Knowledge Analysis (WEKA), using two supervised machine learning algorithms, i.e., support vector machine (SVM) and random forest (RF). The assessment shows that different types of vegetation response and soil water content vary in the dry and wet seasons. The WEKA model generated high positive relationships (r = 0.76, 0.73, and 0.81) between NDVI values of all vegetation in the sub-basins against soil water content (SWC), the groundwater flow (GW), and the combination of these two variables, respectively, during the dry season. However, these responses were reduced by 36.8% (r = 0.48) and 13.6% (r = 0.63) against GW and SWC, respectively, in the wet season. Although the rainfall pattern is highly variable in the study area, the summer rainfall is very effective for the growth of the grass vegetation type. This study has enriched our knowledge of vegetation responses to groundwater in each season, which will facilitate better floodplain vegetation management.

Keywords: ArcSWAT, machine learning, floodplain vegetation, MODIS NDVI, groundwater

Procedia PDF Downloads 81
13215 The Use of Boosted Multivariate Trees in Medical Decision-Making for Repeated Measurements

Authors: Ebru Turgal, Beyza Doganay Erdogan

Abstract:

Machine learning aims to model the relationship between the response and features. Medical decision-making researchers would like to make decisions about patients’ course and treatment, by examining the repeated measurements over time. Boosting approach is now being used in machine learning area for these aims as an influential tool. The aim of this study is to show the usage of multivariate tree boosting in this field. The main reason for utilizing this approach in the field of decision-making is the ease solutions of complex relationships. To show how multivariate tree boosting method can be used to identify important features and feature-time interaction, we used the data, which was collected retrospectively from Ankara University Chest Diseases Department records. Dataset includes repeated PF ratio measurements. The follow-up time is planned for 120 hours. A set of different models is tested. In conclusion, main idea of classification with weighed combination of classifiers is a reliable method which was shown with simulations several times. Furthermore, time varying variables will be taken into consideration within this concept and it could be possible to make accurate decisions about regression and survival problems.

Keywords: boosted multivariate trees, longitudinal data, multivariate regression tree, panel data

Procedia PDF Downloads 188
13214 Machine Learning Predictive Models for Hydroponic Systems: A Case Study Nutrient Film Technique and Deep Flow Technique

Authors: Kritiyaporn Kunsook

Abstract:

Machine learning algorithms (MLAs) such us artificial neural networks (ANNs), decision tree, support vector machines (SVMs), Naïve Bayes, and ensemble classifier by voting are powerful data driven methods that are relatively less widely used in the mapping of technique of system, and thus have not been comparatively evaluated together thoroughly in this field. The performances of a series of MLAs, ANNs, decision tree, SVMs, Naïve Bayes, and ensemble classifier by voting in technique of hydroponic systems prospectively modeling are compared based on the accuracy of each model. Classification of hydroponic systems only covers the test samples from vegetables grown with Nutrient film technique (NFT) and Deep flow technique (DFT). The feature, which are the characteristics of vegetables compose harvesting height width, temperature, require light and color. The results indicate that the classification performance of the ANNs is 98%, decision tree is 98%, SVMs is 97.33%, Naïve Bayes is 96.67%, and ensemble classifier by voting is 98.96% algorithm respectively.

Keywords: artificial neural networks, decision tree, support vector machines, naïve Bayes, ensemble classifier by voting

Procedia PDF Downloads 345
13213 The Outcome of Using Machine Learning in Medical Imaging

Authors: Adel Edwar Waheeb Louka

Abstract:

Purpose AI-driven solutions are at the forefront of many pathology and medical imaging methods. Using algorithms designed to better the experience of medical professionals within their respective fields, the efficiency and accuracy of diagnosis can improve. In particular, X-rays are a fast and relatively inexpensive test that can diagnose diseases. In recent years, X-rays have not been widely used to detect and diagnose COVID-19. The under use of Xrays is mainly due to the low diagnostic accuracy and confounding with pneumonia, another respiratory disease. However, research in this field has expressed a possibility that artificial neural networks can successfully diagnose COVID-19 with high accuracy. Models and Data The dataset used is the COVID-19 Radiography Database. This dataset includes images and masks of chest X-rays under the labels of COVID-19, normal, and pneumonia. The classification model developed uses an autoencoder and a pre-trained convolutional neural network (DenseNet201) to provide transfer learning to the model. The model then uses a deep neural network to finalize the feature extraction and predict the diagnosis for the input image. This model was trained on 4035 images and validated on 807 separate images from the ones used for training. The images used to train the classification model include an important feature: the pictures are cropped beforehand to eliminate distractions when training the model. The image segmentation model uses an improved U-Net architecture. This model is used to extract the lung mask from the chest X-ray image. The model is trained on 8577 images and validated on a validation split of 20%. These models are calculated using the external dataset for validation. The models’ accuracy, precision, recall, f1-score, IOU, and loss are calculated. Results The classification model achieved an accuracy of 97.65% and a loss of 0.1234 when differentiating COVID19-infected, pneumonia-infected, and normal lung X-rays. The segmentation model achieved an accuracy of 97.31% and an IOU of 0.928. Conclusion The models proposed can detect COVID-19, pneumonia, and normal lungs with high accuracy and derive the lung mask from a chest X-ray with similarly high accuracy. The hope is for these models to elevate the experience of medical professionals and provide insight into the future of the methods used.

Keywords: artificial intelligence, convolutional neural networks, deeplearning, image processing, machine learningSarapin, intraarticular, chronic knee pain, osteoarthritisFNS, trauma, hip, neck femur fracture, minimally invasive surgery

Procedia PDF Downloads 45
13212 Machine Learning Assisted Performance Optimization in Memory Tiering

Authors: Derssie Mebratu

Abstract:

As a large variety of micro services, web services, social graphic applications, and media applications are continuously developed, it is substantially vital to design and build a reliable, efficient, and faster memory tiering system. Despite limited design, implementation, and deployment in the last few years, several techniques are currently developed to improve a memory tiering system in a cloud. Some of these techniques are to develop an optimal scanning frequency; improve and track pages movement; identify pages that recently accessed; store pages across each tiering, and then identify pages as a hot, warm, and cold so that hot pages can store in the first tiering Dynamic Random Access Memory (DRAM) and warm pages store in the second tiering Compute Express Link(CXL) and cold pages store in the third tiering Non-Volatile Memory (NVM). Apart from the current proposal and implementation, we also develop a new technique based on a machine learning algorithm in that the throughput produced 25% improved performance compared to the performance produced by the baseline as well as the latency produced 95% improved performance compared to the performance produced by the baseline.

Keywords: machine learning, bayesian optimization, memory tiering, CXL, DRAM

Procedia PDF Downloads 79
13211 Consumer Load Profile Determination with Entropy-Based K-Means Algorithm

Authors: Ioannis P. Panapakidis, Marios N. Moschakis

Abstract:

With the continuous increment of smart meter installations across the globe, the need for processing of the load data is evident. Clustering-based load profiling is built upon the utilization of unsupervised machine learning tools for the purpose of formulating the typical load curves or load profiles. The most commonly used algorithm in the load profiling literature is the K-means. While the algorithm has been successfully tested in a variety of applications, its drawback is the strong dependence in the initialization phase. This paper proposes a novel modified form of the K-means that addresses the aforementioned problem. Simulation results indicate the superiority of the proposed algorithm compared to the K-means.

Keywords: clustering, load profiling, load modeling, machine learning, energy efficiency and quality

Procedia PDF Downloads 144
13210 Critical Evaluation of Groundwater Monitoring Networks for Machine Learning Applications

Authors: Pedro Martinez-Santos, Víctor Gómez-Escalonilla, Silvia Díaz-Alcaide, Esperanza Montero, Miguel Martín-Loeches

Abstract:

Groundwater monitoring networks are critical in evaluating the vulnerability of groundwater resources to depletion and contamination, both in space and time. Groundwater monitoring networks typically grow over decades, often in organic fashion, with relatively little overall planning. The groundwater monitoring networks in the Madrid area, Spain, were reviewed for the purpose of identifying gaps and opportunities for improvement. Spatial analysis reveals the presence of various monitoring networks belonging to different institutions, with several hundred observation wells in an area of approximately 4000 km2. This represents several thousand individual data entries, some going back to the early 1970s. Major issues included overlap between the networks, unknown screen depth/vertical distribution for many observation boreholes, uneven time series, uneven monitored species, and potentially suboptimal locations. Results also reveal there is sufficient information to carry out a spatial and temporal analysis of groundwater vulnerability based on machine learning applications. These can contribute to improve the overall planning of monitoring networks’ expansion into the future.

Keywords: groundwater monitoring, observation networks, machine learning, madrid

Procedia PDF Downloads 62
13209 Self-Supervised Pretraining on Sequences of Functional Magnetic Resonance Imaging Data for Transfer Learning to Brain Decoding Tasks

Authors: Sean Paulsen, Michael Casey

Abstract:

In this work we present a self-supervised pretraining framework for transformers on functional Magnetic Resonance Imaging (fMRI) data. First, we pretrain our architecture on two self-supervised tasks simultaneously to teach the model a general understanding of the temporal and spatial dynamics of human auditory cortex during music listening. Our pretraining results are the first to suggest a synergistic effect of multitask training on fMRI data. Second, we finetune the pretrained models and train additional fresh models on a supervised fMRI classification task. We observe significantly improved accuracy on held-out runs with the finetuned models, which demonstrates the ability of our pretraining tasks to facilitate transfer learning. This work contributes to the growing body of literature on transformer architectures for pretraining and transfer learning with fMRI data, and serves as a proof of concept for our pretraining tasks and multitask pretraining on fMRI data.

Keywords: transfer learning, fMRI, self-supervised, brain decoding, transformer, multitask training

Procedia PDF Downloads 71
13208 Dynamic Measurement System Modeling with Machine Learning Algorithms

Authors: Changqiao Wu, Guoqing Ding, Xin Chen

Abstract:

In this paper, ways of modeling dynamic measurement systems are discussed. Specially, for linear system with single-input single-output, it could be modeled with shallow neural network. Then, gradient based optimization algorithms are used for searching the proper coefficients. Besides, method with normal equation and second order gradient descent are proposed to accelerate the modeling process, and ways of better gradient estimation are discussed. It shows that the mathematical essence of the learning objective is maximum likelihood with noises under Gaussian distribution. For conventional gradient descent, the mini-batch learning and gradient with momentum contribute to faster convergence and enhance model ability. Lastly, experimental results proved the effectiveness of second order gradient descent algorithm, and indicated that optimization with normal equation was the most suitable for linear dynamic models.

Keywords: dynamic system modeling, neural network, normal equation, second order gradient descent

Procedia PDF Downloads 110
13207 Ensemble Methods in Machine Learning: An Algorithmic Approach to Derive Distinctive Behaviors of Criminal Activity Applied to the Poaching Domain

Authors: Zachary Blanks, Solomon Sonya

Abstract:

Poaching presents a serious threat to endangered animal species, environment conservations, and human life. Additionally, some poaching activity has even been linked to supplying funds to support terrorist networks elsewhere around the world. Consequently, agencies dedicated to protecting wildlife habitats have a near intractable task of adequately patrolling an entire area (spanning several thousand kilometers) given limited resources, funds, and personnel at their disposal. Thus, agencies need predictive tools that are both high-performing and easily implementable by the user to help in learning how the significant features (e.g. animal population densities, topography, behavior patterns of the criminals within the area, etc) interact with each other in hopes of abating poaching. This research develops a classification model using machine learning algorithms to aid in forecasting future attacks that is both easy to train and performs well when compared to other models. In this research, we demonstrate how data imputation methods (specifically predictive mean matching, gradient boosting, and random forest multiple imputation) can be applied to analyze data and create significant predictions across a varied data set. Specifically, we apply these methods to improve the accuracy of adopted prediction models (Logistic Regression, Support Vector Machine, etc). Finally, we assess the performance of the model and the accuracy of our data imputation methods by learning on a real-world data set constituting four years of imputed data and testing on one year of non-imputed data. This paper provides three main contributions. First, we extend work done by the Teamcore and CREATE (Center for Risk and Economic Analysis of Terrorism Events) research group at the University of Southern California (USC) working in conjunction with the Department of Homeland Security to apply game theory and machine learning algorithms to develop more efficient ways of reducing poaching. This research introduces ensemble methods (Random Forests and Stochastic Gradient Boosting) and applies it to real-world poaching data gathered from the Ugandan rain forest park rangers. Next, we consider the effect of data imputation on both the performance of various algorithms and the general accuracy of the method itself when applied to a dependent variable where a large number of observations are missing. Third, we provide an alternate approach to predict the probability of observing poaching both by season and by month. The results from this research are very promising. We conclude that by using Stochastic Gradient Boosting to predict observations for non-commercial poaching by season, we are able to produce statistically equivalent results while being orders of magnitude faster in computation time and complexity. Additionally, when predicting potential poaching incidents by individual month vice entire seasons, boosting techniques produce a mean area under the curve increase of approximately 3% relative to previous prediction schedules by entire seasons.

Keywords: ensemble methods, imputation, machine learning, random forests, statistical analysis, stochastic gradient boosting, wildlife protection

Procedia PDF Downloads 270
13206 Determination of Klebsiella Pneumoniae Susceptibility to Antibiotics Using Infrared Spectroscopy and Machine Learning Algorithms

Authors: Manal Suleiman, George Abu-Aqil, Uraib Sharaha, Klaris Riesenberg, Itshak Lapidot, Ahmad Salman, Mahmoud Huleihel

Abstract:

Klebsiella pneumoniae is one of the most aggressive multidrug-resistant bacteria associated with human infections resulting in high mortality and morbidity. Thus, for an effective treatment, it is important to diagnose both the species of infecting bacteria and their susceptibility to antibiotics. Current used methods for diagnosing the bacterial susceptibility to antibiotics are time-consuming (about 24h following the first culture). Thus, there is a clear need for rapid methods to determine the bacterial susceptibility to antibiotics. Infrared spectroscopy is a well-known method that is known as sensitive and simple which is able to detect minor biomolecular changes in biological samples associated with developing abnormalities. The main goal of this study is to evaluate the potential of infrared spectroscopy in tandem with Random Forest and XGBoost machine learning algorithms to diagnose the susceptibility of Klebsiella pneumoniae to antibiotics within approximately 20 minutes following the first culture. In this study, 1190 Klebsiella pneumoniae isolates were obtained from different patients with urinary tract infections. The isolates were measured by the infrared spectrometer, and the spectra were analyzed by machine learning algorithms Random Forest and XGBoost to determine their susceptibility regarding nine specific antibiotics. Our results confirm that it was possible to classify the isolates into sensitive and resistant to specific antibiotics with a success rate range of 80%-85% for the different tested antibiotics. These results prove the promising potential of infrared spectroscopy as a powerful diagnostic method for determining the Klebsiella pneumoniae susceptibility to antibiotics.

Keywords: urinary tract infection (UTI), Klebsiella pneumoniae, bacterial susceptibility, infrared spectroscopy, machine learning

Procedia PDF Downloads 144
13205 Day Ahead and Intraday Electricity Demand Forecasting in Himachal Region using Machine Learning

Authors: Milan Joshi, Harsh Agrawal, Pallaw Mishra, Sanand Sule

Abstract:

Predicting electricity usage is a crucial aspect of organizing and controlling sustainable energy systems. The task of forecasting electricity load is intricate and requires a lot of effort due to the combined impact of social, economic, technical, environmental, and cultural factors on power consumption in communities. As a result, it is important to create strong models that can handle the significant non-linear and complex nature of the task. The objective of this study is to create and compare three machine learning techniques for predicting electricity load for both the day ahead and intraday, taking into account various factors such as meteorological data and social events including holidays and festivals. The proposed methods include a LightGBM, FBProphet, combination of FBProphet and LightGBM for day ahead and Motifs( Stumpy) based on Mueens algorithm for similarity search for intraday. We utilize these techniques to predict electricity usage during normal days and social events in the Himachal Region. We then assess their performance by measuring the MSE, RMSE, and MAPE values. The outcomes demonstrate that the combination of FBProphet and LightGBM method is the most accurate for day ahead and Motifs for intraday forecasting of electricity usage, surpassing other models in terms of MAPE, RMSE, and MSE. Moreover, the FBProphet - LightGBM approach proves to be highly effective in forecasting electricity load during social events, exhibiting precise day ahead predictions. In summary, our proposed electricity forecasting techniques display excellent performance in predicting electricity usage during normal days and special events in the Himachal Region.

Keywords: feature engineering, FBProphet, LightGBM, MASS, Motifs, MAPE

Procedia PDF Downloads 51
13204 Reinforcement Learning for Quality-Oriented Production Process Parameter Optimization Based on Predictive Models

Authors: Akshay Paranjape, Nils Plettenberg, Robert Schmitt

Abstract:

Producing faulty products can be costly for manufacturing companies and wastes resources. To reduce scrap rates in manufacturing, process parameters can be optimized using machine learning. Thus far, research mainly focused on optimizing specific processes using traditional algorithms. To develop a framework that enables real-time optimization based on a predictive model for an arbitrary production process, this study explores the application of reinforcement learning (RL) in this field. Based on a thorough review of literature about RL and process parameter optimization, a model based on maximum a posteriori policy optimization that can handle both numerical and categorical parameters is proposed. A case study compares the model to state–of–the–art traditional algorithms and shows that RL can find optima of similar quality while requiring significantly less time. These results are confirmed in a large-scale validation study on data sets from both production and other fields. Finally, multiple ways to improve the model are discussed.

Keywords: reinforcement learning, production process optimization, evolutionary algorithms, policy optimization, actor critic approach

Procedia PDF Downloads 79
13203 The Factors Affecting the Use of Massive Open Online Courses in Blended Learning by Lecturers in Universities

Authors: Taghreed Alghamdi, Wendy Hall, David Millard

Abstract:

Massive Open Online Courses (MOOCs) have recently gained widespread interest in the academic world, starting a wide range of discussion of a number of issues. One of these issues, using MOOCs in teaching and learning in the higher education by integrating MOOCs’ contents with traditional face-to-face activities in blended learning format, is called blended MOOCs (bMOOCs) and is intended not to replace traditional learning but to enhance students learning. Most research on MOOCs has focused on students’ perception and institutional threats whereas there is a lack of published research on academics’ experiences and practices. Thus, the first aim of the study is to develop a classification of blended MOOCs models by conducting a systematic literature review, classifying 19 different case studies, and identifying the broad types of bMOOCs models namely: Supplementary Model and Integrated Model. Thus, the analyses phase will emphasize on these different types of bMOOCs models in terms of adopting MOOCs by lecturers. The second aim of the study is to improve the understanding of lecturers’ acceptance of bMOOCs by investigate the factors that influence academics’ acceptance of using MOOCs in traditional learning by distributing an online survey to lecturers who participate in MOOCs platforms. These factors can help institutions to encourage their lecturers to integrate MOOCs with their traditional courses in universities.

Keywords: acceptance, blended learning, blended MOOCs, higher education, lecturers, MOOCs, professors

Procedia PDF Downloads 119
13202 Ontology-Driven Knowledge Discovery and Validation from Admission Databases: A Structural Causal Model Approach for Polytechnic Education in Nigeria

Authors: Bernard Igoche Igoche, Olumuyiwa Matthew, Peter Bednar, Alexander Gegov

Abstract:

This study presents an ontology-driven approach for knowledge discovery and validation from admission databases in Nigerian polytechnic institutions. The research aims to address the challenges of extracting meaningful insights from vast amounts of admission data and utilizing them for decision-making and process improvement. The proposed methodology combines the knowledge discovery in databases (KDD) process with a structural causal model (SCM) ontological framework. The admission database of Benue State Polytechnic Ugbokolo (Benpoly) is used as a case study. The KDD process is employed to mine and distill knowledge from the database, while the SCM ontology is designed to identify and validate the important features of the admission process. The SCM validation is performed using the conditional independence test (CIT) criteria, and an algorithm is developed to implement the validation process. The identified features are then used for machine learning (ML) modeling and prediction of admission status. The results demonstrate the adequacy of the SCM ontological framework in representing the admission process and the high predictive accuracies achieved by the ML models, with k-nearest neighbors (KNN) and support vector machine (SVM) achieving 92% accuracy. The study concludes that the proposed ontology-driven approach contributes to the advancement of educational data mining and provides a foundation for future research in this domain.

Keywords: admission databases, educational data mining, machine learning, ontology-driven knowledge discovery, polytechnic education, structural causal model

Procedia PDF Downloads 38
13201 Two Concurrent Convolution Neural Networks TC*CNN Model for Face Recognition Using Edge

Authors: T. Alghamdi, G. Alaghband

Abstract:

In this paper we develop a model that couples Two Concurrent Convolution Neural Network with different filters (TC*CNN) for face recognition and compare its performance to an existing sequential CNN (base model). We also test and compare the quality and performance of the models on three datasets with various levels of complexity (easy, moderate, and difficult) and show that for the most complex datasets, edges will produce the most accurate and efficient results. We further show that in such cases while Support Vector Machine (SVM) models are fast, they do not produce accurate results.

Keywords: Convolution Neural Network, Edges, Face Recognition , Support Vector Machine.

Procedia PDF Downloads 133
13200 Machine Learning Prediction of Compressive Damage and Energy Absorption in Carbon Fiber-Reinforced Polymer Tubular Structures

Authors: Milad Abbasi

Abstract:

Carbon fiber-reinforced polymer (CFRP) composite structures are increasingly being utilized in the automotive industry due to their lightweight and specific energy absorption capabilities. Although it is impossible to predict composite mechanical properties directly using theoretical methods, various research has been conducted so far in the literature for accurate simulation of CFRP structures' energy-absorbing behavior. In this research, axial compression experiments were carried out on hand lay-up unidirectional CFRP composite tubes. The fabrication method allowed the authors to extract the material properties of the CFRPs using ASTM D3039, D3410, and D3518 standards. A neural network machine learning algorithm was then utilized to build a robust prediction model to forecast the axial compressive properties of CFRP tubes while reducing high-cost experimental efforts. The predicted results have been compared with the experimental outcomes in terms of load-carrying capacity and energy absorption capability. The results showed high accuracy and precision in the prediction of the energy-absorption capacity of the CFRP tubes. This research also demonstrates the effectiveness and challenges of machine learning techniques in the robust simulation of composites' energy-absorption behavior. Interestingly, the proposed method considerably condensed numerical and experimental efforts in the simulation and calibration of CFRP composite tubes subjected to compressive loading.

Keywords: CFRP composite tubes, energy absorption, crushing behavior, machine learning, neural network

Procedia PDF Downloads 126
13199 Smart Textiles Integration for Monitoring Real-time Air Pollution

Authors: Akshay Dirisala

Abstract:

Humans had developed a highly organized and efficient civilization to live in by improving the basic needs of humans like housing, transportation, and utilities. These developments have made a huge impact on major environmental factors. Air pollution is one prominent environmental factor that needs to be addressed to maintain a sustainable and healthier lifestyle. Textiles have always been at the forefront of helping humans shield from environmental conditions. With the growth in the field of electronic textiles, we now have the capability of monitoring the atmosphere in real time to understand and analyze the environment that a particular person is mostly spending their time at. Integrating textiles with the particulate matter sensors that measure air quality and pollutants that have a direct impact on human health will help to understand what type of air we are breathing. This research idea aims to develop a textile product and a process of collecting the pollutants through particulate matter sensors, which are equipped inside a smart textile product and store the data to develop a machine learning model to analyze the health conditions of the person wearing the garment and periodically notifying them not only will help to be cautious of airborne diseases but will help to regulate the diseases and could also help to take care of skin conditions.

Keywords: air pollution, e-textiles, particulate matter sensors, environment, machine learning models

Procedia PDF Downloads 87
13198 Comparing the Detection of Autism Spectrum Disorder within Males and Females Using Machine Learning Techniques

Authors: Joseph Wolff, Jeffrey Eilbott

Abstract:

Autism Spectrum Disorders (ASD) are a spectrum of social disorders characterized by deficits in social communication, verbal ability, and interaction that can vary in severity. In recent years, researchers have used magnetic resonance imaging (MRI) to help detect how neural patterns in individuals with ASD differ from those of neurotypical (NT) controls for classification purposes. This study analyzed the classification of ASD within males and females using functional MRI data. Functional connectivity (FC) correlations among brain regions were used as feature inputs for machine learning algorithms. Analysis was performed on 558 cases from the Autism Brain Imaging Data Exchange (ABIDE) I dataset. When trained specifically on females, the algorithm underperformed in classifying the ASD subset of our testing population. Although the subject size was relatively smaller in the female group, the manual matching of both male and female training groups helps explain the algorithm’s bias, indicating the altered sex abnormalities in functional brain networks compared to typically developing peers. These results highlight the importance of taking sex into account when considering how generalizations of findings on males with ASD apply to females.

Keywords: autism spectrum disorder, machine learning, neuroimaging, sex differences

Procedia PDF Downloads 191
13197 Methods for Enhancing Ensemble Learning or Improving Classifiers of This Technique in the Analysis and Classification of Brain Signals

Authors: Seyed Mehdi Ghezi, Hesam Hasanpoor

Abstract:

This scientific article explores enhancement methods for ensemble learning with the aim of improving the performance of classifiers in the analysis and classification of brain signals. The research approach in this field consists of two main parts, each with its own strengths and weaknesses. The choice of approach depends on the specific research question and available resources. By combining these approaches and leveraging their respective strengths, researchers can enhance the accuracy and reliability of classification results, consequently advancing our understanding of the brain and its functions. The first approach focuses on utilizing machine learning methods to identify the best features among the vast array of features present in brain signals. The selection of features varies depending on the research objective, and different techniques have been employed for this purpose. For instance, the genetic algorithm has been used in some studies to identify the best features, while optimization methods have been utilized in others to identify the most influential features. Additionally, machine learning techniques have been applied to determine the influential electrodes in classification. Ensemble learning plays a crucial role in identifying the best features that contribute to learning, thereby improving the overall results. The second approach concentrates on designing and implementing methods for selecting the best classifier or utilizing meta-classifiers to enhance the final results in ensemble learning. In a different section of the research, a single classifier is used instead of multiple classifiers, employing different sets of features to improve the results. The article provides an in-depth examination of each technique, highlighting their advantages and limitations. By integrating these techniques, researchers can enhance the performance of classifiers in the analysis and classification of brain signals. This advancement in ensemble learning methodologies contributes to a better understanding of the brain and its functions, ultimately leading to improved accuracy and reliability in brain signal analysis and classification.

Keywords: ensemble learning, brain signals, classification, feature selection, machine learning, genetic algorithm, optimization methods, influential features, influential electrodes, meta-classifiers

Procedia PDF Downloads 59
13196 Enhance Engineering Learning Using Cognitive Simulator

Authors: Lior Davidovitch

Abstract:

Traditional training based on static models and case studies is the backbone of most teaching and training programs of engineering education. However, project management learning is characterized by dynamics models that requires new and enhanced learning method. The results of empirical experiments evaluating the effectiveness and efficiency of using cognitive simulator as a new training technique are reported. The empirical findings are focused on the impact of keeping and reviewing learning history in a dynamic and interactive simulation environment of engineering education. The cognitive simulator for engineering project management learning had two learning history keeping modes: manual (student-controlled), automatic (simulator-controlled) and a version with no history keeping. A group of industrial engineering students performed four simulation-runs divided into three identical simple scenarios and one complicated scenario. The performances of participants running the simulation with the manual history mode were significantly better than users running the simulation with the automatic history mode. Moreover, the effects of using the undo enhanced further the learning process. The findings indicate an enhancement of engineering students’ learning and decision making when they use the record functionality of the history during their engineering training process. Furthermore, the cognitive simulator as educational innovation improves students learning and training. The practical implications of using simulators in the field of engineering education are discussed.

Keywords: cognitive simulator, decision making, engineering learning, project management

Procedia PDF Downloads 235
13195 Hull Detection from Handwritten Digit Image

Authors: Sriraman Kothuri, Komal Teja Mattupalli

Abstract:

In this paper we proposed a novel algorithm for recognizing hulls in a hand written digits. This is an extension to the work on “Digit Recognition Using Freeman Chain code”. In order to find out the hulls in a user given digit it is necessary to follow three steps. Those are pre-processing, Boundary Extraction and at last apply the Hull Detection system in a way to attain the better results. The detection of Hull Regions is mainly intended to increase the machine learning capability in detection of characters or digits. This can also extend this in order to get the hull regions and their intensities in Black Holes in Space Exploration.

Keywords: chain code, machine learning, hull regions, hull recognition system, SASK algorithm

Procedia PDF Downloads 387
13194 Exploring Data Leakage in EEG Based Brain-Computer Interfaces: Overfitting Challenges

Authors: Khalida Douibi, Rodrigo Balp, Solène Le Bars

Abstract:

In the medical field, applications related to human experiments are frequently linked to reduced samples size, which makes the training of machine learning models quite sensitive and therefore not very robust nor generalizable. This is notably the case in Brain-Computer Interface (BCI) studies, where the sample size rarely exceeds 20 subjects or a few number of trials. To address this problem, several resampling approaches are often used during the data preparation phase, which is an overly critical step in a data science analysis process. One of the naive approaches that is usually applied by data scientists consists in the transformation of the entire database before the resampling phase. However, this can cause model’ s performance to be incorrectly estimated when making predictions on unseen data. In this paper, we explored the effect of data leakage observed during our BCI experiments for device control through the real-time classification of SSVEPs (Steady State Visually Evoked Potentials). We also studied potential ways to ensure optimal validation of the classifiers during the calibration phase to avoid overfitting. The results show that the scaling step is crucial for some algorithms, and it should be applied after the resampling phase to avoid data leackage and improve results.

Keywords: data leackage, data science, machine learning, SSVEP, BCI, overfitting

Procedia PDF Downloads 138
13193 Supervised/Unsupervised Mahalanobis Algorithm for Improving Performance for Cyberattack Detection over Communications Networks

Authors: Radhika Ranjan Roy

Abstract:

Deployment of machine learning (ML)/deep learning (DL) algorithms for cyberattack detection in operational communications networks (wireless and/or wire-line) is being delayed because of low-performance parameters (e.g., recall, precision, and f₁-score). If datasets become imbalanced, which is the usual case for communications networks, the performance tends to become worse. Complexities in handling reducing dimensions of the feature sets for increasing performance are also a huge problem. Mahalanobis algorithms have been widely applied in scientific research because Mahalanobis distance metric learning is a successful framework. In this paper, we have investigated the Mahalanobis binary classifier algorithm for increasing cyberattack detection performance over communications networks as a proof of concept. We have also found that high-dimensional information in intermediate features that are not utilized as much for classification tasks in ML/DL algorithms are the main contributor to the state-of-the-art of improved performance of the Mahalanobis method, even for imbalanced and sparse datasets. With no feature reduction, MD offers uniform results for precision, recall, and f₁-score for unbalanced and sparse NSL-KDD datasets.

Keywords: Mahalanobis distance, machine learning, deep learning, NS-KDD, local intrinsic dimensionality, chi-square, positive semi-definite, area under the curve

Procedia PDF Downloads 59
13192 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach

Authors: Rajvir Kaur, Jeewani Anupama Ginige

Abstract:

With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is at the stage of its transition from clinician oriented to technology oriented. Many people around the world die of cancer because the diagnosis of disease was not done at an early stage. Nowadays, the computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out the comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out based on standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results based on the evaluation metrics show that ANN showed the highest-level accuracy (99.4%) when tested with breast cancer dataset. On the other hand, when these ML classifiers are tested with the cervical cancer dataset, Ensemble (Bagged Tree) technique gave better accuracy (93.1%) in comparison to other classifiers.

Keywords: artificial neural networks, breast cancer, classifiers, cervical cancer, f-score, machine learning, precision, recall

Procedia PDF Downloads 263
13191 Decision Support System for Diagnosis of Breast Cancer

Authors: Oluwaponmile D. Alao

Abstract:

In this paper, two models have been developed to ascertain the best network needed for diagnosis of breast cancer. Breast cancer has been a disease that required the attention of the medical practitioner. Experience has shown that misdiagnose of the disease has been a major challenge in the medical field. Therefore, designing a system with adequate performance for will help in making diagnosis of the disease faster and accurate. In this paper, two models: backpropagation neural network and support vector machine has been developed. The performance obtained is also compared with other previously obtained algorithms to ascertain the best algorithms.

Keywords: breast cancer, data mining, neural network, support vector machine

Procedia PDF Downloads 325
13190 Comparison of Different Machine Learning Algorithms for Solubility Prediction

Authors: Muhammet Baldan, Emel Timuçin

Abstract:

Molecular solubility prediction plays a crucial role in various fields, such as drug discovery, environmental science, and material science. In this study, we compare the performance of five machine learning algorithms—linear regression, support vector machines (SVM), random forests, gradient boosting machines (GBM), and neural networks—for predicting molecular solubility using the AqSolDB dataset. The dataset consists of 9981 data points with their corresponding solubility values. MACCS keys (166 bits), RDKit properties (20 properties), and structural properties(3) features are extracted for every smile representation in the dataset. A total of 189 features were used for training and testing for every molecule. Each algorithm is trained on a subset of the dataset and evaluated using metrics accuracy scores. Additionally, computational time for training and testing is recorded to assess the efficiency of each algorithm. Our results demonstrate that random forest model outperformed other algorithms in terms of predictive accuracy, achieving an 0.93 accuracy score. Gradient boosting machines and neural networks also exhibit strong performance, closely followed by support vector machines. Linear regression, while simpler in nature, demonstrates competitive performance but with slightly higher errors compared to ensemble methods. Overall, this study provides valuable insights into the performance of machine learning algorithms for molecular solubility prediction, highlighting the importance of algorithm selection in achieving accurate and efficient predictions in practical applications.

Keywords: random forest, machine learning, comparison, feature extraction

Procedia PDF Downloads 20