Search results for: explainable machine learning
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8604

8184 Liquid Biopsy Based Microbial Biomarker in Coronary Artery Disease Diagnosis

Authors: Eyup Ozkan, Ozkan U. Nalbantoglu, Aycan Gundogdu, Mehmet Hora, A. Emre Onuk

Abstract:

The human microbiome has been associated with cardiological conditions, and this relationship is increasingly being characterized beyond the gastrointestinal tract. In this study, we investigate alterations in the circulatory microbiota in the context of Coronary Artery Disease (CAD). We obtained circulatory blood samples from suspected CAD patients and performed 16S ribosomal RNA sequencing to identify each patient’s microbiome. The Corynebacterium and Methanobacteria genera showed statistically significant differences between healthy and CAD patients. Machine learning classification models revealed that overall biodiversity differed between the groups. We also demonstrate the performance of a diagnostic method based on circulatory blood microbiome estimation.

Keywords: coronary artery disease, blood microbiome, machine learning, angiography, next-generation sequencing

Procedia PDF Downloads 159
8183 Short-Term Forecast of Wind Turbine Production with Machine Learning Methods: Direct Approach and Indirect Approach

Authors: Mamadou Dione, Eric Matzner-lober, Philippe Alexandre

Abstract:

The Energy Transition Act defined by the French State has precise implications for renewable energies, in particular for their remuneration mechanism. Until now, a purchase obligation contract permitted the sale of wind-generated electricity at a fixed rate. In the future, it will be necessary to sell this electricity on the market (at variable rates) before obtaining additional compensation intended to reduce the risk. Selling on the market requires announcing in advance (about 48 hours ahead) the production that will be delivered to the network, and therefore being able to predict this production in the short term. The fundamental problem remains the variability of the wind, accentuated by the geographical situation. The objective of the project is to provide, every day, short-term forecasts (48-hour horizon) of wind production using weather data. The predictions of the GFS model and those of the ECMWF model are used as explanatory variables. The variable to be predicted is the production of a wind farm. We compare two approaches: a direct approach that predicts wind generation directly from weather data, and an indirect (integrated) approach that estimates wind from weather data and converts it into wind power using power curves. We used machine learning techniques to predict this production. The models tested are random forests, CART + Bagging, CART + Boosting, and SVM (Support Vector Machine). The application is made on a 22 MW wind farm (11 wind turbines) of the Compagnie du Vent (which became Engie Green France). Our results are very conclusive compared to the literature.
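
The conversion step of the indirect approach can be sketched as follows. This is only an illustration: the cut-in, rated, and cut-out speeds and the cubic ramp are assumed values and a common simplification, not the power curves used by the authors, and the 2 MW per-turbine rating assumes the 22 MW are split equally across the 11 turbines.

```python
# Illustrative power curve. The three speed thresholds are ASSUMED values,
# not parameters reported in the abstract.
CUT_IN = 3.0          # m/s, assumed
RATED_SPEED = 12.0    # m/s, assumed
CUT_OUT = 25.0        # m/s, assumed
RATED_POWER_MW = 2.0  # 22 MW / 11 turbines, assuming identical turbines
N_TURBINES = 11       # from the abstract


def turbine_power_mw(wind_speed: float) -> float:
    """Map a wind speed (m/s) to the output of one turbine (MW)."""
    if wind_speed < CUT_IN or wind_speed >= CUT_OUT:
        return 0.0  # too little wind, or turbine shut down for safety
    if wind_speed >= RATED_SPEED:
        return RATED_POWER_MW
    # Cubic ramp between cut-in and rated speed (a simplification).
    frac = (wind_speed**3 - CUT_IN**3) / (RATED_SPEED**3 - CUT_IN**3)
    return RATED_POWER_MW * frac


def farm_power_mw(predicted_speeds):
    """Convert a sequence of forecast wind speeds into farm output (MW)."""
    return [N_TURBINES * turbine_power_mw(v) for v in predicted_speeds]
```

In the direct approach, by contrast, the model would be trained to output farm production from the weather variables without this intermediate wind-speed estimate.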

Keywords: forecast aggregation, machine learning, spatio-temporal dynamics modeling, wind power forecast

Procedia PDF Downloads 220
8182 Enhancing Fall Detection Accuracy with a Transfer Learning-Aided Transformer Model Using Computer Vision

Authors: Sheldon McCall, Miao Yu, Liyun Gong, Shigang Yue, Stefanos Kollias

Abstract:

Falls are a significant health concern for older adults globally, and prompt identification is critical to providing necessary healthcare support. Our study proposes a new fall detection method using computer vision based on modern deep learning techniques. Our approach involves training a transformer model on a large 2D pose dataset for general action recognition, followed by transfer learning. Specifically, we freeze the first few layers of the trained transformer model and train only the last two layers for fall detection. Our experimental results demonstrate that our proposed method outperforms both classical machine learning and deep learning approaches in fall/non-fall classification. Overall, our study suggests that our proposed methodology could be a valuable tool for identifying falls.

Keywords: healthcare, fall detection, transformer, transfer learning

Procedia PDF Downloads 152
8181 Reinforcement Learning Optimization: Unraveling Trends and Advancements in Metaheuristic Algorithms

Authors: Rahul Paul, Kedar Nath Das

Abstract:

The field of machine learning (ML) is developing rapidly, resulting in a multitude of theoretical advancements and extensive practical implementations across various disciplines. The objective of ML is to enable machines to perform cognitive tasks by leveraging knowledge gained from prior experience and to address complex problems effectively, even in situations that deviate from previously encountered instances. Reinforcement Learning (RL) has emerged as a prominent subfield within ML and has recently gained considerable attention from researchers. This surge in interest can be attributed to the practical applications of RL, the increasing availability of data, and the rapid advancements in computing power. At the same time, optimization algorithms play a pivotal role in ML and have also attracted considerable interest. A multitude of proposals have been put forth to address optimization problems or improve optimization techniques within the domain of ML. A thorough examination and implementation of optimization algorithms within the context of ML is therefore essential to guide research in both optimization and ML. This article provides a comprehensive overview of the application of metaheuristic evolutionary optimization algorithms in conjunction with RL to address a diverse range of scientific challenges. Furthermore, this article delves into the various challenges and unresolved issues pertaining to the optimization of RL models.

Keywords: machine learning, reinforcement learning, loss function, evolutionary optimization techniques

Procedia PDF Downloads 77
8180 Feature Weighting Comparison Based on Clustering Centers in the Detection of Diabetic Retinopathy

Authors: Kemal Polat

Abstract:

In this paper, three feature weighting methods are used to improve the classification performance for diabetic retinopathy (DR). To classify diabetic retinopathy, features extracted from the output of several retinal image processing algorithms, such as image-level, lesion-specific, and anatomical components, were fed into the classifier algorithms. The dataset used in this study was taken from the University of California, Irvine (UCI) machine learning repository. Three feature weighting methods, namely fuzzy c-means clustering based feature weighting, subtractive clustering based feature weighting, and Gaussian mixture clustering based feature weighting, were used and compared with each other in the classification of DR. After feature weighting, five classifier algorithms were used: multi-layer perceptron (MLP), k-nearest neighbor (k-NN), decision tree, support vector machine (SVM), and Naïve Bayes. The hybrid method combining subtractive clustering based feature weighting with a decision tree classifier achieved a classification accuracy of 100% in the screening of DR. These results demonstrate that the proposed hybrid scheme is very promising for medical data set classification.

Keywords: machine learning, data weighting, classification, data mining

Procedia PDF Downloads 328
8179 One-Class Support Vector Machine for Sentiment Analysis of Movie Review Documents

Authors: Chothmal, Basant Agarwal

Abstract:

Sentiment analysis classifies a given review document as positive or negative in polarity. Research on sentiment analysis has increased tremendously in recent times due to its large number of applications in industry and academia. Sentiment analysis models can be used to determine the opinion of a user towards any entity or product. E-commerce companies can use a sentiment analysis model to improve their products on the basis of users’ opinions. In this paper, we propose a new One-class Support Vector Machine (One-class SVM) based sentiment analysis model for movie review documents. In the proposed approach, we first extract features from one class of documents and then use the one-class SVM model to test whether a given new document lies within the model or is an outlier. Experimental results show the effectiveness of the proposed sentiment analysis model.

Keywords: feature selection methods, machine learning, NB, one-class SVM, sentiment analysis, support vector machine

Procedia PDF Downloads 520
8178 Normal and Peaberry Coffee Beans Classification from Green Coffee Bean Images Using Convolutional Neural Networks and Support Vector Machine

Authors: Hira Lal Gope, Hidekazu Fukai

Abstract:

The aim of this study is to develop a system that can identify and sort peaberries automatically at low cost for coffee producers in developing countries. In this paper, the focus is on the classification of peaberries and normal coffee beans using image processing and machine learning techniques. The peaberry is neither a defective bean nor a normal bean. It forms as a single, relatively round seed in a coffee cherry instead of the usual flat-sided pair of beans, and it has a different value and flavor. To improve the taste of the coffee, it is necessary to separate peaberries from normal beans before the green coffee beans are roasted. Otherwise, the flavors of the beans will be mixed, and the taste will suffer. During roasting, all the beans must be uniform in shape, size, and weight; otherwise, the larger beans will take more time to roast through. The peaberry has a different size and shape even though it has the same weight as normal beans, and it roasts more slowly than normal beans. Therefore, neither technique provides a good option for selecting the peaberries. Defective beans, e.g., sour, broken, black, and faded beans, are easy to check and pick out by hand. Picking out peaberries, on the other hand, is very difficult even for trained specialists because the shape and color of the peaberry are similar to those of normal beans. In this study, we use image processing and machine learning techniques to discriminate between normal and peaberry beans as part of the sorting system. As the first step, we applied Deep Convolutional Neural Networks (CNN) and Support Vector Machine (SVM) as machine learning techniques to discriminate between peaberries and normal beans. Better performance was obtained with CNN than with SVM for the discrimination of the peaberry. The neural network, trained in this work on a high-performance CPU and GPU, will simply be installed on the inexpensive, low-compute Raspberry Pi system. We assume that this system will be used in underdeveloped countries. The study evaluates and compares the feasibility of the methods in terms of classification accuracy and processing speed.

Keywords: convolutional neural networks, coffee bean, peaberry, sorting, support vector machine

Procedia PDF Downloads 146
8177 Land Suitability Prediction Modelling for Agricultural Crops Using Machine Learning Approach: A Case Study of Khuzestan Province, Iran

Authors: Saba Gachpaz, Hamid Reza Heidari

Abstract:

The sharp increase in population growth puts more pressure on agricultural areas to satisfy the food supply. Meeting this demand consumes more resources and, alongside other environmental concerns, highlights the need for sustainable agricultural development. Land-use management is a crucial factor in obtaining optimum productivity. Machine learning is a widely used technique in the agricultural sector, from yield prediction to customer behavior. This method focuses on learning and provides patterns and correlations from our data set. In this study, nine physical control factors, namely, soil classification, electrical conductivity, normalized difference water index (NDWI), groundwater level, elevation, annual precipitation, pH of water, annual mean temperature, and slope in the alluvial plain in Khuzestan (an agricultural hotspot in Iran) are used to decide the best agricultural land use for both rainfed and irrigated agriculture for ten different crops. For this purpose, each variable was imported into ArcGIS, and a raster layer was obtained. In the next step, using training samples, all layers were imported into the Python environment. A random forest model was applied, and the weight of each variable was specified. In the final step, results were visualized using a digital elevation model, and the importance of all factors for each of the crops was obtained. Our results show that although 62% of the study area is allocated to agricultural purposes, only 42.9% of these areas can be classified as suitable for cultivation.

Keywords: land suitability, machine learning, random forest, sustainable agriculture

Procedia PDF Downloads 86
8176 Market Index Trend Prediction using Deep Learning and Risk Analysis

Authors: Shervin Alaei, Reza Moradi

Abstract:

Trading in financial markets is subject to risk due to their high volatility. Here, using an LSTM neural network and risk-based feature engineering, we developed a method that can accurately predict trends of the Tehran Stock Exchange market index from the preceding few days. Our test results show that the proposed method, with an average prediction accuracy of more than 94%, is superior to other common machine learning algorithms. To the best of our knowledge, this is the first work incorporating deep learning and risk factors to accurately predict market trends.

Keywords: deep learning, LSTM, trend prediction, risk management, artificial neural networks

Procedia PDF Downloads 157
8175 Leveraging Learning Analytics to Inform Learning Design in Higher Education

Authors: Mingming Jiang

Abstract:

This literature review aims to offer an overview of existing research on learning analytics and learning design, the alignment between the two, and how learning analytics has been leveraged to inform learning design in higher education. Current research suggests a need to create more alignment and integration between learning analytics and learning design in order to not only ground learning analytics on learning sciences but also enable data-driven decisions in learning design to improve learning outcomes. In addition, multiple conceptual frameworks have been proposed to enhance the synergy and alignment between learning analytics and learning design. Future research should explore this synergy further in the unique context of higher education, identifying learning analytics metrics in higher education that can offer insight into learning processes, evaluating the effect of learning analytics outcomes on learning design decision-making in higher education, and designing learning environments in higher education that make the capturing and deployment of learning analytics outcomes more efficient.

Keywords: learning analytics, learning design, big data in higher education, online learning environments

Procedia PDF Downloads 177
8174 Study on Dynamic Stiffness Matching and Optimization Design Method of a Machine Tool

Authors: Lu Xi, Li Pan, Wen Mengmeng

Abstract:

The stiffness of each component influences the stiffness of the machine tool differently. Taking a five-axis gantry machining center as an example, we performed a modal analysis of the machine tool and then raised and lowered the stiffness of the pillar, slide plate, beam, ram, and saddle to study the stiffness matching among these components, using as the criterion whether the stiffness of the modified machine tool changes by more than 50% relative to that of the original machine tool. The structure of the machine tool can be optimized by changing the stiffness of the components whose stiffness is mismatched. For example, the stiffness of the beam is mismatched. The first six natural frequencies of the beam increased by 7.70%, 0.38%, 6.82%, 7.96%, 18.72%, and 23.13%, with a weight increase of 28 kg; as a result, the natural frequencies of the several orders that strongly influence the dynamic performance of the whole machine increased by 1.44%, 0.43%, and 0.065%, which verifies the correctness of the stiffness-matching optimization method proposed in this paper.

Keywords: machine tool, optimization, modal analysis, stiffness matching

Procedia PDF Downloads 104
8173 Software Defect Analysis- Eclipse Dataset

Authors: Amrane Meriem, Oukid Salyha

Abstract:

The presence of defects or bugs in software can lead to costly setbacks, operational inefficiencies, and compromised user experiences. The integration of Machine Learning (ML) techniques has emerged as a way to predict and preemptively address software defects. ML represents a proactive strategy aimed at identifying potential anomalies, errors, or vulnerabilities within code before they manifest as operational issues. By analyzing historical data, such as code changes, feature implementations, and defect occurrences, ML enables development teams to anticipate and mitigate these issues, thus enhancing software quality, reducing maintenance costs, and ensuring smoother user interactions. In this work, we used a recommendation system to improve the performance of ML models in predicting code severity and estimating effort.

Keywords: software engineering, machine learning, bugs detection, effort estimation

Procedia PDF Downloads 89
8172 Integrating Natural Language Processing (NLP) and Machine Learning in Lung Cancer Diagnosis

Authors: Mehrnaz Mostafavi

Abstract:

The assessment and categorization of incidental lung nodules present a considerable challenge in healthcare, often necessitating resource-intensive multiple computed tomography (CT) scans for growth confirmation. This research addresses this issue by introducing a distinct computational approach leveraging radiomics and deep-learning methods. However, understanding local services is essential before implementing these advancements. With diverse tracking methods in place, there is a need for efficient and accurate identification approaches, especially in the context of managing lung nodules alongside pre-existing cancer scenarios. This study explores the integration of text-based algorithms in medical data curation, indicating their efficacy in conjunction with machine learning and deep-learning models for identifying lung nodules. Combining medical images with text data has demonstrated superior data retrieval compared to using each modality independently. While deep learning and text analysis show potential in detecting previously missed nodules, challenges persist, such as increased false positives. The presented research introduces a Structured-Query-Language (SQL) algorithm designed for identifying pulmonary nodules in a tertiary cancer center, externally validated at another hospital. Leveraging natural language processing (NLP) and machine learning, the algorithm categorizes lung nodule reports based on sentence features, aiming to facilitate research and assess clinical pathways. The hypothesis posits that the algorithm can accurately identify lung nodule CT scans and predict concerning nodule features using machine-learning classifiers. Through a retrospective observational study spanning a decade, CT scan reports were collected, and an algorithm was developed to extract and classify data. Results underscore the complexity of lung nodule cohorts in cancer centers, emphasizing the importance of careful evaluation before assuming a metastatic origin. The SQL and NLP algorithms demonstrated high accuracy in identifying lung nodule sentences, indicating potential for local service evaluation and research dataset creation. Machine-learning models exhibited strong accuracy in predicting concerning changes in lung nodule scan reports. While limitations include variability in disease group attribution, the potential for correlation rather than causality in clinical findings, and the need for further external validation, the algorithm's accuracy and potential to support clinical decision-making and healthcare automation represent a significant stride in lung nodule management and research.

Keywords: lung cancer diagnosis, structured-query-language (SQL), natural language processing (NLP), machine learning, CT scans

Procedia PDF Downloads 106
8171 Comparison of Deep Learning and Machine Learning Algorithms to Diagnose and Predict Breast Cancer

Authors: F. Ghazalnaz Sharifonnasabi, Iman Makhdoom

Abstract:

Breast cancer is a serious health concern that affects many people around the world. According to a study published in the Breast journal, the global burden of breast cancer is expected to increase significantly over the next few decades. The number of deaths from breast cancer has been increasing over the years, but the age-standardized mortality rate has decreased in some countries. It’s important to be aware of the risk factors for breast cancer and to get regular check-ups to catch it early if it does occur. Machine learning techniques have been used to aid in the early detection and diagnosis of breast cancer. These techniques, which have been shown to be effective in predicting and diagnosing the disease, have become a research hotspot. In this study, we consider two deep learning approaches: Multi-Layer Perceptron (MLP) and Convolutional Neural Network (CNN). We also consider five machine learning algorithms: Decision Tree (C4.5), Naïve Bayesian (NB), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and XGBoost (eXtreme Gradient Boosting), applied to the Breast Cancer Wisconsin Diagnostic dataset. Evaluating and comparing the classifiers involved selecting appropriate metrics for classifier performance and an appropriate tool to quantify that performance. The main purpose of the study is to predict and diagnose breast cancer by applying the mentioned algorithms and to identify the most effective one with respect to the confusion matrix, accuracy, and precision. We found that CNN outperformed all other classifiers and achieved the highest accuracy (0.982456). The work is implemented in the Anaconda environment using the Python programming language.
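
The three evaluation criteria named above (confusion matrix, accuracy, precision) can be computed as in the following minimal sketch for a binary classifier. The function names and the label convention (1 = malignant, 0 = benign) are assumptions made for illustration; the abstract does not describe the authors' evaluation code.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, with 1 as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Share of positive predictions that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical labels, purely for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, fp, tn, fn), precision(tp, fp))
```

Accuracy alone can be misleading when one class is rare, which is one reason to report precision and the full confusion matrix alongside it.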

Keywords: breast cancer, multi-layer perceptron, Naïve Bayesian, SVM, decision tree, convolutional neural network, XGBoost, KNN

Procedia PDF Downloads 79
8170 Machine Learning and Metaheuristic Algorithms in Short Femoral Stem Custom Design to Reduce Stress Shielding

Authors: Isabel Moscol, Carlos J. Díaz, Ciro Rodríguez

Abstract:

Hip replacement becomes necessary when a person suffers severe pain or considerable functional limitations and the best option to enhance their quality of life is the replacement of the damaged joint. One of the main components in femoral prostheses is the stem, which distributes the loads from the joint to the proximal femur. To preserve more bone stock and avoid weakening of the diaphysis, a short starting stem was selected, generated from the intramedullary morphology of the patient's femur. This ensures the implantability of the design and provides geometric bounds for personalized optimization with machine learning (ML) and metaheuristic algorithms. The present study attempts to design a cementless short stem that brings the strain deviation before and after implantation close to zero, promoting its fixation and durability. Regression models developed to estimate the percentage change of maximum principal stresses were used as objective functions by the metaheuristic algorithm. The latter evaluated different geometries of the short stem by modifying certain parameters in oblique sections from the osteotomy plane. The optimized geometry reached a global stress shielding (SS) of 18.37% with a coefficient of determination (R²) of 0.667. The predicted results favour integrating implantability into the short stem optimization to effectively reduce SS in the proximal femur.

Keywords: machine learning techniques, metaheuristic algorithms, short-stem design, stress shielding, hip replacement

Procedia PDF Downloads 197
8169 Automatic Method for Classification of Informative and Noninformative Images in Colonoscopy Video

Authors: Nidhal K. Azawi, John M. Gauch

Abstract:

Colorectal cancer is one of the leading causes of cancer death in the US and the world, which is why millions of colonoscopy examinations are performed annually. Unfortunately, noise, specular highlights, and motion artifacts corrupt many images in a typical colonoscopy exam. The goal of our research is to produce automated techniques to detect and correct or remove these noninformative images from colonoscopy videos, so physicians can focus their attention on informative images. In this research, we first automatically extract features from images. Then we use machine learning and a deep neural network to classify colonoscopy images as either informative or noninformative. Our results show that we achieve image classification accuracy between 92% and 98%. We also show how the removal of noninformative images together with image alignment can aid in the creation of image panoramas and other visualizations of colonoscopy images.

Keywords: colonoscopy classification, feature extraction, image alignment, machine learning

Procedia PDF Downloads 253
8168 Machine Learning Facing Behavioral Noise Problem in an Imbalanced Data Using One Side Behavioral Noise Reduction: Application to a Fraud Detection

Authors: Salma El Hajjami, Jamal Malki, Alain Bouju, Mohammed Berrada

Abstract:

With the expansion of machine learning and data mining in the context of Big Data analytics, a common problem that affects data is class imbalance. It refers to an imbalanced distribution of instances belonging to each class. This problem is present in many real-world applications such as fraud detection, network intrusion detection, and medical diagnostics. In these cases, data instances labeled negatively are significantly more numerous than the instances labeled positively. When this difference is too large, the learning system may struggle, since it is initially designed to work in relatively balanced class distribution scenarios. Another important problem, which usually accompanies imbalanced data, is the overlap of instances between the two classes. It is commonly referred to as noise or overlapping data. In this article, we propose an approach called One Side Behavioral Noise Reduction (OSBNR). This approach presents a way to deal with the problem of class imbalance in the presence of a high noise level. OSBNR is based on two steps. First, a cluster analysis is applied to group similar instances from the minority class into several behavior clusters. Second, we select and eliminate the instances of the majority class, considered as behavioral noise, which overlap with behavior clusters of the minority class. The results of experiments carried out on a representative public dataset confirm that the proposed approach is efficient for the treatment of class imbalance in the presence of noise.
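
The two steps described above can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm or the overlap criterion, so both choices here are assumptions: a tiny deterministic k-means stands in for the cluster analysis, and a majority instance counts as overlapping when it lies within a minority cluster's radius (the largest member-to-centroid distance). All function names are hypothetical.

```python
import math


def _mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, iters=20):
    """Tiny deterministic k-means: the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [_mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


def reduce_majority_noise(minority, majority, k=2):
    """Step 1: cluster the minority class. Step 2: drop overlapping majority points."""
    centroids, clusters = kmeans(minority, k)
    # Radius of each non-empty behavior cluster (assumed overlap criterion).
    regions = [(c, max(math.dist(p, c) for p in members))
               for c, members in zip(centroids, clusters) if members]
    return [m for m in majority
            if all(math.dist(m, c) > r for c, r in regions)]


# Hypothetical 2-D data, purely for illustration.
minority = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
majority = [(0.0, 0.6), (5.0, 5.0), (10.0, 10.4), (20.0, 20.0)]
print(reduce_majority_noise(minority, majority))
```

Only the majority class is thinned, which is the "one side" in the method's name; the minority instances are left untouched.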

Keywords: machine learning, imbalanced data, data mining, big data

Procedia PDF Downloads 133
8167 Using Swarm Intelligence to Forecast Outcomes of English Premier League Matches

Authors: Hans Schumann, Colin Domnauer, Louis Rosenberg

Abstract:

In this study, machine learning techniques were deployed on real-time human swarm data to forecast the likelihood of outcomes for English Premier League matches in the 2020/21 season. These techniques included ensemble models in combination with neural networks and were tested against an industry standard of Vegas Oddsmakers. Predictions made from the collective intelligence of human swarm participants managed to achieve a positive return on investment over a full season on matches, empirically proving the usefulness of a new artificial intelligence valuing human instinct and intelligence.

Keywords: artificial intelligence, data science, English Premier League, human swarming, machine learning, sports betting, swarm intelligence

Procedia PDF Downloads 214
8166 Optimizing Machine Vision System Setup Accuracy by Six-Sigma DMAIC Approach

Authors: Joseph C. Chen

Abstract:

Machine vision systems provide automatic inspection that reduces manufacturing costs considerably. However, only a few principles have been found for optimizing a machine vision system so that it functions more accurately in industrial practice. Most existing design techniques for improving the accuracy of machine vision systems are complicated and impractical. This paper discusses implementing the Six Sigma Define, Measure, Analyze, Improve, and Control (DMAIC) approach to optimize the setup parameters of a machine vision system used as a direct measurement technique. This research follows a case study showing how the Six Sigma DMAIC methodology has been put into use.

Keywords: DMAIC, machine vision system, process capability, Taguchi Parameter Design

Procedia PDF Downloads 440
8165 Disease Level Assessment in Wheat Plots Using a Residual Deep Learning Algorithm

Authors: Felipe A. Guth, Shane Ward, Kevin McDonnell

Abstract:

The assessment of disease levels in crop fields is an important and time-consuming task that generally relies on the expert knowledge of trained individuals. Image classification in agriculture has historically been based on classical machine learning strategies that use hand-engineered features on top of a classification algorithm. This approach tends not to achieve high accuracy or to generalize well across classes when the elements being classified vary significantly. The advent of deep convolutional neural networks has revolutionized the field of machine learning, especially in computer vision tasks. These networks have a great capacity for learning and have been applied successfully to image classification and object detection tasks in recent years. The objective of this work was to propose a new method based on deep convolutional neural networks for the task of disease level monitoring. Common RGB images of winter wheat were obtained during a growing season. Five categories of disease level were defined, in collaboration with agronomists, for the algorithm classification. Disease level assessments performed by experts provided ground truth data for the disease score of the same winter wheat plots where the RGB images were acquired. The system had an overall accuracy of 84% in discriminating the disease level classes.

Keywords: crop disease assessment, deep learning, precision agriculture, residual neural networks

Procedia PDF Downloads 335
8164 Machine Learning Approaches to Water Usage Prediction in Kocaeli: A Comparative Study

Authors: Kasim Görenekli, Ali Gülbağ

Abstract:

This study presents a comprehensive analysis of water consumption patterns in Kocaeli province, Turkey, utilizing various machine learning approaches. We analyzed data from 5,000 water subscribers across residential, commercial, and official categories over an 80-month period from January 2016 to August 2022, resulting in a total of 400,000 records. The dataset encompasses water consumption records, weather information, weekends and holidays, previous months' consumption, and the influence of the COVID-19 pandemic. We implemented and compared several machine learning models, including Linear Regression, Random Forest, Support Vector Regression (SVR), XGBoost, Artificial Neural Networks (ANN), Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU). Particle Swarm Optimization (PSO) was applied to optimize hyperparameters for all models. Our results demonstrate varying performance across subscriber types and models. For official subscribers, Random Forest achieved the highest R² of 0.699 with PSO optimization. For commercial subscribers, Linear Regression performed best with an R² of 0.730 with PSO. Residential water usage proved more challenging to predict, with XGBoost achieving the highest R² of 0.572 with PSO. The study identified key factors influencing water consumption, with previous months' consumption, meter diameter, and weather conditions being among the most significant predictors. The impact of the COVID-19 pandemic on consumption patterns was also observed, particularly in residential usage. This research provides valuable insights for effective water resource management in Kocaeli and similar regions, considering Turkey's high water loss rate and below-average per capita water supply. The comparative analysis of different machine learning approaches offers a comprehensive framework for selecting appropriate models for water consumption prediction in urban settings.
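
For readers unfamiliar with the R² figures quoted above, the coefficient of determination can be computed as in this short sketch. It uses the standard definition, 1 minus the ratio of the residual sum of squares to the total sum of squares; the function name and the sample values are hypothetical and are not taken from the study's data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical monthly consumption values and predictions.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]
print(r_squared(actual, predicted))
```

A value of 1 means the predictions match the observations exactly, while 0 means the model does no better than always predicting the mean, so an R² of 0.572 for residential usage indicates that a substantial share of the variance remains unexplained.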

Keywords: machine learning, water consumption prediction, particle swarm optimization, COVID-19, water resource management

Procedia PDF Downloads 20
8163 Using Machine Learning to Classify Human Fetal Health and Analyze Feature Importance

Authors: Yash Bingi, Yiqiao Yin

Abstract:

Reduction of child mortality is an ongoing struggle and a commonly used measure of progress in the medical field. The under-5 mortality number is around 5 million worldwide, with many of the deaths being preventable. In light of this issue, cardiotocograms (CTGs) have emerged as a leading tool for determining fetal health. By using ultrasound pulses and reading the responses, CTGs help healthcare professionals assess the overall health of the fetus to determine the risk of child mortality. However, interpreting CTG results is time-consuming and inefficient, especially in underdeveloped areas where an expert obstetrician is hard to come by. Using a support vector machine (SVM) and oversampling, this paper proposes a model that classifies fetal health with an accuracy of 99.59%. To further explain the CTG measurements, an algorithm based on Randomized Input Sampling for Explanation (RISE) of Black-box Models, called Feature Alteration for explanation of Black Box Models (FAB), was created, and its findings were compared to Shapley Additive Explanations (SHAP) and Local Interpretable Model Agnostic Explanations (LIME). This allows doctors and medical professionals to classify fetal health with high accuracy and determine which features were most influential in the process.
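The abstract does not describe FAB itself, so the sketch below only illustrates the general random-masking idea behind RISE, adapted to tabular features: sample random keep/drop masks, replace dropped features with a baseline value, and credit each feature with the model outputs observed while it was kept. The function names, the toy black-box model, and the baseline of zeros are assumptions made for the example.

```python
import random


def random_mask_importance(model, x, baseline, n_masks=5000, keep_prob=0.5, seed=0):
    """RISE-style importance: average model output weighted by each feature's mask."""
    rng = random.Random(seed)
    scores = [0.0] * len(x)
    for _ in range(n_masks):
        # 1 keeps the original feature value, 0 swaps in the baseline value.
        mask = [1 if rng.random() < keep_prob else 0 for _ in x]
        masked = [xi if m else bi for xi, bi, m in zip(x, baseline, mask)]
        output = model(masked)
        for i, m in enumerate(mask):
            scores[i] += output * m
    # Normalize by the expected number of masks that kept each feature.
    return [s / (n_masks * keep_prob) for s in scores]


def toy_model(features):
    # Hypothetical black box: strong dependence on feature 0, weak on feature 1,
    # none on feature 2.
    return 0.8 * features[0] + 0.2 * features[1]


importance = random_mask_importance(
    toy_model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
```

Only the ranking of the scores is meaningful here. A feature the model ignores still receives a nonzero score, because the other features are kept in about half of the masks.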

Keywords: machine learning, fetal health, gradient boosting, support vector machine, Shapley values, local interpretable model agnostic explanations

Procedia PDF Downloads 145
8162 Data-Driven Market Segmentation in Hospitality Using Unsupervised Machine Learning

Authors: Rik van Leeuwen, Ger Koole

Abstract:

Within hospitality, marketing departments use segmentation to create tailored strategies to ensure personalized marketing. This study provides a data-driven approach by segmenting guest profiles via hierarchical clustering based on an extensive set of features. The industry requires understandable outcomes that contribute to adaptability, so that marketing departments can make data-driven decisions and ultimately drive profit. A marketing department specified a business question that guides the unsupervised machine learning algorithm. Features of guests change over time; therefore, there is a probability that guests transition from one segment to another. The purpose of the study is to provide steps in the process from raw data to actionable insights, which serve as a guideline for how hospitality companies can adopt an algorithmic approach.
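For readers unfamiliar with hierarchical clustering, a minimal agglomerative sketch follows. The average-linkage rule, the two guest features (nights per stay, spend per night), and the data values are hypothetical; the abstract does not state which linkage or features the study used.

```python
import math


def agglomerative_clusters(points, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per point

    def average_linkage(a, b):
        # Mean pairwise Euclidean distance between members of two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical guest profiles: (nights per stay, spend per night).
guests = [(1, 80), (2, 90), (1, 85), (7, 300), (6, 280), (8, 310)]
segments = agglomerative_clusters(guests, n_clusters=2)
```

In practice the features would usually be scaled first so that no single one, such as spend, dominates the distance.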

Keywords: hierarchical cluster analysis, hospitality, market segmentation

Procedia PDF Downloads 109
8161 Analyzing Tools and Techniques for Classification In Educational Data Mining: A Survey

Authors: D. I. George Amalarethinam, A. Emima

Abstract:

Educational Data Mining (EDM) is one of the newest topics to emerge in recent years, and it is concerned with developing methods for analyzing various types of data gathered from educational settings. EDM methods and techniques, together with machine learning algorithms, are used to extract meaningful and usable information from huge databases. For scientists and researchers, realistic applications of machine learning in EDM offer new frontiers and present new problems. One of the most important research areas in EDM is predicting student success. Prediction algorithms and techniques must be developed to forecast students' performance, which helps tutors and institutions raise the level of that performance. This paper examines various classification techniques in prediction methods and data mining tools used in EDM.

Keywords: classification technique, data mining, EDM methods, prediction methods

Procedia PDF Downloads 119
8160 Prediction of Music Track Popularity: A Machine Learning Approach

Authors: Syed Atif Hassan, Luv Mehta, Syed Asif Hassan

Abstract:

Hit song science is a field of investigation wherein machine learning techniques are applied to music tracks in order to extract features from audio signals that capture information that could explain the popularity of the respective tracks. Record companies invest huge amounts of money into recruiting fresh talent and churning out new music each year. Gaining insight into why a song becomes popular will result in tremendous benefits for the music industry. This paper aims to extract basic musical and more advanced acoustic features from songs while also taking into account external factors that play a role in making a particular song popular. We use a dataset derived from popular Spotify playlists divided by genre. We use ten genres (blues, classical, country, disco, hip-hop, jazz, metal, pop, reggae, rock), chosen on the basis of clear to ambiguous delineation in the typical sound of their genres. We feed these features into three different classifiers, namely an SVM with RBF kernel, a deep neural network, and a recurrent neural network, to build separate predictive models and choose the best-performing model at the end. Predicting song popularity is particularly important for the music industry, as it would allow record companies to produce better content for the masses, resulting in a more competitive market.

Keywords: classifier, machine learning, music tracks, popularity, prediction

Procedia PDF Downloads 667
8159 Designing Energy Efficient Buildings for Seasonal Climates Using Machine Learning Techniques

Authors: Kishor T. Zingre, Seshadhri Srinivasan

Abstract:

Energy consumption by the building sector is increasing at an alarming rate throughout the world, leading to more building-related CO₂ emissions into the environment. In buildings, the main contributors to energy consumption are heating, ventilation, and air-conditioning (HVAC) systems, lighting, and electrical appliances. It is hypothesised that energy efficiency in buildings can be achieved by implementing sustainable technologies such as i) enhancing the thermal resistance of fabric materials to reduce heat gain (in hotter climates) and heat loss (in colder climates), ii) enhancing daylight and lighting systems, iii) HVAC systems, and iv) occupant localization. The energy performance of various sustainable technologies is highly dependent on climatic conditions. This paper investigated the use of machine learning techniques for accurate prediction of air-conditioning energy in seasonal climates. The data required to train the machine learning techniques is obtained from computational simulations performed on a 3-story commercial building using the EnergyPlus program plugged in with OpenStudio and Google SketchUp. The EnergyPlus model was calibrated against experimental measurements of surface temperatures and heat flux before being used for the simulations. It has been observed from the simulations that the performance of sustainable fabric materials (for walls, roof, and windows) such as phase change materials, insulation, cool roof, etc. varies with the climate conditions. Various renewable technologies were also used for the building flat roofs in various climates to investigate the potential for electricity generation. It has been observed that the proposed technique overcomes the shortcomings of existing approaches, such as local linearization or over-simplifying assumptions. In addition, the proposed method can be used for real-time estimation of building air-conditioning energy.

Keywords: building energy efficiency, energyplus, machine learning techniques, seasonal climates

Procedia PDF Downloads 115
8158 An Automated R-Peak Detection Method Using Common Vector Approach

Authors: Ali Kirkbas

Abstract:

R peaks in an electrocardiogram (ECG) are signs of cardiac activity that reveal valuable information about cardiac abnormalities, which can lead to mortality in some cases. This paper examines the problem of detecting R-peaks in ECG signals, which is in fact a two-class pattern classification problem. To handle this problem with reliably high accuracy, we propose to use the common vector approach, which is a successful machine learning algorithm. The dataset used in the proposed method is obtained from MIT-BIH, which is publicly available. The results are compared with other popular methods under the performance metrics. The obtained results show that the proposed method performs better than the other methods compared in terms of diagnostic accuracy and simplicity, and that it can be operated on wearable devices.
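The abstract does not spell out the algorithm, so the following is only a sketch of the core step of the common vector approach as it is usually described: the differences between a class's training samples span a "difference subspace", and removing that subspace's component from any sample leaves the same common vector. The function names and the toy four-dimensional samples are assumptions; the sketch also assumes fewer samples than dimensions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def common_vector(samples, start_index=0, tol=1e-10):
    """Project samples[start_index] onto the complement of the difference subspace."""
    reference = samples[0]
    basis = []  # orthonormal basis of span{sample_i - reference}
    for sample in samples[1:]:
        diff = [s - r for s, r in zip(sample, reference)]
        # Gram-Schmidt: strip components along the basis vectors found so far.
        for u in basis:
            proj = dot(diff, u)
            diff = [d - proj * ui for d, ui in zip(diff, u)]
        norm = dot(diff, diff) ** 0.5
        if norm > tol:  # skip differences that add no new direction
            basis.append([d / norm for d in diff])

    start = samples[start_index]
    common = list(start)
    for u in basis:
        proj = dot(start, u)
        common = [c - proj * ui for c, ui in zip(common, u)]
    return common


# Hypothetical feature vectors for one class (e.g. short windows around R-peaks).
r_peak_windows = [[1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 3.0, 5.0], [1.0, 3.0, 4.0, 4.0]]
from_first = common_vector(r_peak_windows, start_index=0)
from_second = common_vector(r_peak_windows, start_index=1)
```

Starting from any sample of the class gives the same vector, which is why it can represent the class. A classifier would then compare a test window against each class's common vector.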

Keywords: ECG, R-peak classification, common vector approach, machine learning

Procedia PDF Downloads 66
8157 Predicting Football Player Performance: Integrating Data Visualization and Machine Learning

Authors: Saahith M. S., Sivakami R.

Abstract:

In the realm of football analytics, particularly focusing on predicting football player performance, the ability to forecast player success accurately is of paramount importance for teams, managers, and fans. This study introduces an elaborate examination of predicting football player performance through the integration of data visualization methods and machine learning algorithms. The research entails the compilation of an extensive dataset comprising player attributes, conducting data preprocessing, feature selection, model selection, and model training to construct predictive models. The analysis within this study will involve delving into feature significance using methodologies like Select Best and Recursive Feature Elimination (RFE) to pinpoint pertinent attributes for predicting player performance. Various machine learning algorithms, including Random Forest, Decision Tree, Linear Regression, Support Vector Regression (SVR), and Artificial Neural Networks (ANN), will be explored to develop predictive models. The evaluation of each model's performance utilizing metrics such as Mean Squared Error (MSE) and R-squared will be executed to gauge their efficacy in predicting player performance. Furthermore, this investigation will encompass a top player analysis to recognize the top-performing players based on the anticipated overall performance scores. Nationality analysis will entail scrutinizing the player distribution based on nationality and investigating potential correlations between nationality and player performance. Positional analysis will concentrate on examining the player distribution across various positions and assessing the average performance of players in each position. Age analysis will evaluate the influence of age on player performance and identify any discernible trends or patterns associated with player age groups. 
The primary objective is to predict a football player's overall performance accurately based on their individual attributes, leveraging data-driven insights to enrich the comprehension of player success on the field. By amalgamating data visualization and machine learning methodologies, the aim is to furnish valuable tools for teams, managers, and fans to effectively analyze and forecast player performance. This research contributes to the progression of sports analytics by showcasing the potential of machine learning in predicting football player performance and offering actionable insights for diverse stakeholders in the football industry.
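The two evaluation metrics named above can be computed directly from their standard definitions. The sketch below is illustrative only; the player ratings are made-up numbers, not data from the study.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot


# Hypothetical overall ratings for three players and a model's predictions.
actual = [70.0, 80.0, 90.0]
predicted = [72.0, 79.0, 88.0]
mse = mean_squared_error(actual, predicted)  # (4 + 1 + 4) / 3 = 3.0
r2 = r_squared(actual, predicted)            # 1 - 9 / 200 = 0.955
```

A lower MSE and an R-squared closer to 1 both indicate a better fit. R-squared can be negative when a model does worse than always predicting the mean.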

Keywords: football analytics, player performance prediction, data visualization, machine learning algorithms, random forest, decision tree, linear regression, support vector regression, artificial neural networks, model evaluation, top player analysis, nationality analysis, positional analysis

Procedia PDF Downloads 40
8156 Unsupervised Echocardiogram View Detection via Autoencoder-Based Representation Learning

Authors: Andrea Treviño Gavito, Diego Klabjan, Sanjiv J. Shah

Abstract:

Echocardiograms serve as pivotal resources for clinicians in diagnosing cardiac conditions, offering non-invasive insights into a heart’s structure and function. When echocardiographic studies are conducted, no standardized labeling of the acquired views is performed. Employing machine learning algorithms for automated echocardiogram view detection has emerged as a promising solution to enhance efficiency in echocardiogram use for diagnosis. However, existing approaches predominantly rely on supervised learning, necessitating labor-intensive expert labeling. In this paper, we introduce a fully unsupervised echocardiographic view detection framework that leverages convolutional autoencoders to obtain lower dimensional representations and the K-means algorithm for clustering them into view-related groups. Our approach focuses on discriminative patches from echocardiographic frames. Additionally, we propose a trainable inverse average layer to optimize decoding of average operations. By integrating both public and proprietary datasets, we obtain a marked improvement in model performance when compared to utilizing a proprietary dataset alone. Our experiments show boosts of 15.5% in accuracy and 9.0% in the F-1 score for frame-based clustering, and 25.9% in accuracy and 19.8% in the F-1 score for view-based clustering. Our research highlights the potential of unsupervised learning methodologies and the utilization of open-sourced data in addressing the complexities of echocardiogram interpretation, paving the way for more accurate and efficient cardiac diagnoses.
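The clustering stage of such a pipeline can be sketched with a plain K-means loop. The two-dimensional points below are hypothetical stand-ins for autoencoder embeddings, and the initial centroids are passed in explicitly to keep the example deterministic; neither detail comes from the paper.

```python
import math


def kmeans(points, initial_centroids, n_iters=20):
    """Alternate nearest-centroid assignment and centroid recomputation."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = [0] * len(points)
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical low-dimensional embeddings of six echocardiogram frames.
embeddings = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(embeddings, initial_centroids=[embeddings[0], embeddings[3]])
```

The resulting cluster indices carry no inherent meaning; mapping them to named views would need a separate step.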

Keywords: artificial intelligence, echocardiographic view detection, echocardiography, machine learning, self-supervised representation learning, unsupervised learning

Procedia PDF Downloads 40
8155 Machine Learning Based Approach for Measuring Promotion Effectiveness in Multiple Parallel Promotions’ Scenarios

Authors: Revoti Prasad Bora, Nikita Katyal

Abstract:

Promotion is a key element in the retail business. Thus, analysis of promotions to quantify their effectiveness in terms of Revenue and/or Margin is an essential activity in the retail industry. However, measuring the sales/revenue uplift is based on estimations, as the actual sales/revenue without the promotion is not present. Further, the presence of Halo and Cannibalization in a multiple parallel promotions’ scenario complicates the problem. Calculating Baseline by considering inter-brand/competitor items or using Halo and Cannibalization's impact on Revenue calculations by considering Baseline as an interpretation of items’ unit sales in neighboring nonpromotional weeks individually may not capture the overall Revenue uplift in the case of multiple parallel promotions. Hence, this paper proposes a Machine Learning based method for calculating the Revenue uplift by considering the Halo and Cannibalization impact on the Baseline and the Revenue. In the first section of the proposed methodology, the Baseline of an item is calculated by incorporating the impact of the promotions on its related items. In the later section, the Revenue of an item is calculated by considering both Halo and Cannibalization impacts. Hence, this methodology enables correct calculation of the overall Revenue uplift due to a given promotion.
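The accounting idea can be illustrated with simple arithmetic, assuming the baselines are already known. This is not the paper's machine learning method: the function name, the item names, and the revenue figures are hypothetical. A related item selling above its baseline is counted as Halo, and one selling below it as Cannibalization.

```python
def revenue_uplift(promoted, related):
    """Split a promotion's revenue effect into direct, halo and cannibalization parts.

    promoted: (actual_revenue, baseline_revenue) for the promoted item.
    related:  dict of item name -> (actual_revenue, baseline_revenue).
    """
    direct = promoted[0] - promoted[1]
    # Related items above baseline gained from the promotion (halo).
    halo = sum(max(actual - base, 0.0) for actual, base in related.values())
    # Related items below baseline lost sales to it (cannibalization, negative).
    cannibalization = sum(min(actual - base, 0.0) for actual, base in related.values())
    return {
        "direct": direct,
        "halo": halo,
        "cannibalization": cannibalization,
        "total": direct + halo + cannibalization,
    }


# Hypothetical promotion week: a cola on promotion, chips as a complement,
# a rival cola as a substitute.
result = revenue_uplift(
    promoted=(1500.0, 1000.0),
    related={"chips": (420.0, 400.0), "rival_cola": (250.0, 300.0)},
)
```

Here the direct uplift of 500 overstates the promotion's effect; after adding the halo of 20 and the cannibalization of -50, the overall uplift is 470.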

Keywords: Halo, Cannibalization, promotion, Baseline, temporary price reduction, retail, elasticity, cross price elasticity, machine learning, random forest, linear regression

Procedia PDF Downloads 182