Search results for: machine learning invariants
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8207

7817 Automatic Method for Classification of Informative and Noninformative Images in Colonoscopy Video

Authors: Nidhal K. Azawi, John M. Gauch

Abstract:

Colorectal cancer is one of the leading causes of cancer death in the US and the world, which is why millions of colonoscopy examinations are performed annually. Unfortunately, noise, specular highlights, and motion artifacts corrupt many images in a typical colonoscopy exam. The goal of our research is to produce automated techniques to detect and correct or remove these noninformative images from colonoscopy videos, so physicians can focus their attention on informative images. In this research, we first automatically extract features from images. Then we use machine learning and a deep neural network to classify colonoscopy images as either informative or noninformative. Our results show that we achieve image classification accuracy between 92% and 98%. We also show how the removal of noninformative images together with image alignment can aid in the creation of image panoramas and other visualizations of colonoscopy images.

Keywords: colonoscopy classification, feature extraction, image alignment, machine learning

Procedia PDF Downloads 230
7816 Using Swarm Intelligence to Forecast Outcomes of English Premier League Matches

Authors: Hans Schumann, Colin Domnauer, Louis Rosenberg

Abstract:

In this study, machine learning techniques were deployed on real-time human swarm data to forecast the likelihood of outcomes for English Premier League matches in the 2020/21 season. These techniques included ensemble models in combination with neural networks and were tested against an industry standard of Vegas Oddsmakers. Predictions made from the collective intelligence of human swarm participants managed to achieve a positive return on investment over a full season on matches, empirically proving the usefulness of a new artificial intelligence valuing human instinct and intelligence.

Keywords: artificial intelligence, data science, English Premier League, human swarming, machine learning, sports betting, swarm intelligence

Procedia PDF Downloads 184
7815 Data-Driven Market Segmentation in Hospitality Using Unsupervised Machine Learning

Authors: Rik van Leeuwen, Ger Koole

Abstract:

Within hospitality, marketing departments use segmentation to create tailored strategies to ensure personalized marketing. This study provides a data-driven approach by segmenting guest profiles via hierarchical clustering based on an extensive set of features. The industry requires understandable outcomes that contribute to adaptability for marketing departments to make data-driven decisions and ultimately drive profit. A marketing department specified a business question that guides the unsupervised machine learning algorithm. Features of guests change over time; therefore, there is a probability that guests transition from one segment to another. The purpose of the study is to provide steps in the process from raw data to actionable insights, which serve as a guideline for how hospitality companies can adopt an algorithmic approach.
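
The hierarchical clustering step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the linkage rule (average linkage), the Euclidean distance, the two guest features, and the function names are all assumptions chosen for illustration, and real features would normally be scaled first.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean pairwise Euclidean distance between two clusters of point indices."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerative_clusters(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per guest
    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest average-linkage distance.
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]], points),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because combinations() guarantees x < y
    return [sorted(c) for c in clusters]


# Hypothetical guest features: (nights per stay, spend per night in arbitrary units).
guests = [(1, 90), (2, 95), (1, 100), (7, 300), (8, 310), (6, 290)]
segments = agglomerative_clusters(guests, n_clusters=2)
```

Here `segments` holds lists of guest indices, one list per segment. Re-running the clustering on updated features is one simple way to observe guests moving between segments over time.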

Keywords: hierarchical cluster analysis, hospitality, market segmentation

Procedia PDF Downloads 80
7814 Analyzing Tools and Techniques for Classification In Educational Data Mining: A Survey

Authors: D. I. George Amalarethinam, A. Emima

Abstract:

Educational Data Mining (EDM) is one of the newest topics to emerge in recent years, and it is concerned with developing methods for analyzing various types of data gathered from educational settings. EDM methods and techniques with machine learning algorithms are used to extract meaningful and usable information from huge databases. For scientists and researchers, realistic applications of machine learning in the EDM sector offer new frontiers and present new problems. One of the most important research areas in EDM is predicting student success. Prediction algorithms and techniques must be developed to forecast students' performance, which helps tutors and institutions raise the level of student performance. This paper examines various classification techniques in prediction methods and data mining tools used in EDM.

Keywords: classification technique, data mining, EDM methods, prediction methods

Procedia PDF Downloads 101
7813 Study on Dynamic Stiffness Matching and Optimization Design Method of a Machine Tool

Authors: Lu Xi, Li Pan, Wen Mengmeng

Abstract:

The stiffness of each component has a different influence on the stiffness of the machine tool. Taking a five-axis gantry machining center as an example, we performed a modal analysis of the machine tool and then raised and lowered the stiffness of the pillar, slide plate, beam, ram, and saddle to study the stiffness matching among these components, using as the criterion whether the stiffness of the modified machine tool changes by more than 50% relative to that of the original machine tool. The structural optimization of the machine tool can be realized by changing the stiffness of the components whose stiffness is mismatched. For example, the stiffness of the beam is mismatched. The natural frequencies of the first six orders of the beam increased by 7.70%, 0.38%, 6.82%, 7.96%, 18.72%, and 23.13%, with the weight increased by 28 kg. As a result, the natural frequencies of several orders that have a great influence on the dynamic performance of the whole machine increased by 1.44%, 0.43%, and 0.065%, which verifies the correctness of the optimization method based on stiffness matching proposed in this paper.

Keywords: machine tool, optimization, modal analysis, stiffness matching

Procedia PDF Downloads 70
7812 Prediction of Music Track Popularity: A Machine Learning Approach

Authors: Syed Atif Hassan, Luv Mehta, Syed Asif Hassan

Abstract:

Hit song science is a field of investigation wherein machine learning techniques are applied to music tracks in order to extract features from audio signals that capture information that could explain the popularity of the respective tracks. Record companies invest huge amounts of money into recruiting fresh talent and churning out new music each year. Gaining insight into why a song becomes popular will result in tremendous benefits for the music industry. This paper aims to extract basic musical features and more advanced acoustic features from songs while also taking into account external factors that play a role in making a particular song popular. We use a dataset derived from popular Spotify playlists divided by genre. We use ten genres (blues, classical, country, disco, hip-hop, jazz, metal, pop, reggae, rock), chosen on the basis of clear to ambiguous delineation in the typical sound of their genres. We feed these features into three different classifiers, namely, an SVM with RBF kernel, a deep neural network, and a recurrent neural network, to build separate predictive models and choose the best-performing model at the end. Predicting song popularity is particularly important for the music industry as it would allow record companies to produce better content for the masses, resulting in a more competitive market.

Keywords: classifier, machine learning, music tracks, popularity, prediction

Procedia PDF Downloads 623
7811 Using Machine Learning to Classify Human Fetal Health and Analyze Feature Importance

Authors: Yash Bingi, Yiqiao Yin

Abstract:

Reduction of child mortality is an ongoing struggle and a commonly used factor in determining progress in the medical field. The under-5 mortality number is around 5 million around the world, with many of the deaths being preventable. In light of this issue, Cardiotocograms (CTGs) have emerged as a leading tool to determine fetal health. By using ultrasound pulses and reading the responses, CTGs help healthcare professionals assess the overall health of the fetus to determine the risk of child mortality. However, interpreting the results of the CTGs is time-consuming and inefficient, especially in underdeveloped areas where an expert obstetrician is hard to come by. Using a support vector machine (SVM) and oversampling, this paper proposed a model that classifies fetal health with an accuracy of 99.59%. To further explain the CTG measurements, an algorithm based on Randomized Input Sampling for Explanation (RISE) of Black-box Models was created, called Feature Alteration for explanation of Black Box Models (FAB), and its findings were compared to Shapley Additive Explanations (SHAP) and Local Interpretable Model Agnostic Explanations (LIME). This allows doctors and medical professionals to classify fetal health with high accuracy and determine which features were most influential in the process.
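
The abstract does not say which oversampling method was used. As a minimal sketch, assuming plain random oversampling (duplicating minority-class records until every class matches the largest one), the balancing step could look like this. The function name, the toy feature values, and the class labels are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: one feature per record, three imbalanced classes.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.9], [1.5]]
y = ["normal", "normal", "normal", "normal", "normal", "suspect", "pathological"]
X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
```

Oversampling is normally applied to the training split only, so that duplicated records cannot leak into the evaluation data and inflate the reported accuracy.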

Keywords: machine learning, fetal health, gradient boosting, support vector machine, Shapley values, local interpretable model agnostic explanations

Procedia PDF Downloads 121
7810 Designing Energy Efficient Buildings for Seasonal Climates Using Machine Learning Techniques

Authors: Kishor T. Zingre, Seshadhri Srinivasan

Abstract:

Energy consumption by the building sector is increasing at an alarming rate throughout the world and leading to more building-related CO₂ emissions into the environment. In buildings, the main contributors to energy consumption are heating, ventilation, and air-conditioning (HVAC) systems, lighting, and electrical appliances. It is hypothesised that energy efficiency in buildings can be achieved by implementing sustainable technologies such as i) enhancing the thermal resistance of fabric materials for reducing heat gain (in hotter climates) and heat loss (in colder climates), ii) enhancing daylight and lighting system, iii) HVAC system and iv) occupant localization. The energy performance of various sustainable technologies is highly dependent on climatic conditions. This paper investigated the use of machine learning techniques for accurate prediction of air-conditioning energy in seasonal climates. The data required to train the machine learning techniques is obtained from computational simulations performed on a 3-story commercial building using the EnergyPlus program plugged in with OpenStudio and Google SketchUp. The EnergyPlus model was calibrated against experimental measurements of surface temperatures and heat flux before being used for the simulations. It has been observed from the simulations that the performance of sustainable fabric materials (for walls, roof, and windows) such as phase change materials, insulation, cool roof, etc. varies with the climate conditions. Various renewable technologies were also used for the building flat roofs in various climates to investigate the potential for electricity generation. It has been observed that the proposed technique overcomes the shortcomings of existing approaches, such as local linearization or over-simplifying assumptions. In addition, the proposed method can be used for real-time estimation of building air-conditioning energy.

Keywords: building energy efficiency, energyplus, machine learning techniques, seasonal climates

Procedia PDF Downloads 96
7809 An Automated R-Peak Detection Method Using Common Vector Approach

Authors: Ali Kirkbas

Abstract:

R-peaks in an electrocardiogram (ECG) are signs of cardiac activity in individuals that reveal valuable information about cardiac abnormalities, which can lead to mortality in some cases. This paper examines the problem of detecting R-peaks in ECG signals, which is in fact a two-class pattern classification problem. To handle this problem with reliably high accuracy, we propose to use the common vector approach, which is a successful machine learning algorithm. The dataset used in the proposed method is obtained from MIT-BIH, which is publicly available. The results are compared with other popular methods under the performance metrics. The obtained results show that the proposed method performs better than the other methods compared in terms of diagnostic accuracy and simplicity, and that it can be operated on wearable devices.

Keywords: ECG, R-peak classification, common vector approach, machine learning

Procedia PDF Downloads 31
7808 Optimizing Machine Vision System Setup Accuracy by Six-Sigma DMAIC Approach

Authors: Joseph C. Chen

Abstract:

Machine vision systems provide automatic inspection to reduce manufacturing costs considerably. However, only a few principles have been found to optimize a machine vision system and help it function more accurately in industrial practice. Most existing design techniques for improving the accuracy of a machine vision system are complicated and impractical. This paper discusses implementing the Six Sigma Define, Measure, Analyze, Improve, and Control (DMAIC) approach to optimize the setup parameters of a machine vision system when it is used as a direct measurement technique. This research follows a case study showing how the Six Sigma DMAIC methodology has been put into use.

Keywords: DMAIC, machine vision system, process capability, Taguchi Parameter Design

Procedia PDF Downloads 404
7807 Online Learning Versus Face to Face Learning: A Sentiment Analysis on General Education Mathematics in the Modern World of University of San Carlos School of Arts and Sciences Students Using Natural Language Processing

Authors: Derek Brandon G. Yu, Clyde Vincent O. Pilapil, Christine F. Peña

Abstract:

College students of Cebu province have been indoors since March 2020, and one challenge they have encountered is the sudden shift from face to face to online learning, compounded by the lack of empirical data on online learning in Higher Education Institutions (HEIs) in the Philippines. Sentiments on face to face and online learning will be collected from University of San Carlos (USC), School of Arts and Sciences (SAS) students regarding Mathematics in the Modern World (MMW), a General Education (GE) course. Natural Language Processing with machine learning algorithms will be used to classify the sentiments of the students. Results of the research study are the themes identified through topic modelling and the overall sentiments of the students in USC SAS.

Keywords: natural language processing, online learning, sentiment analysis, topic modelling

Procedia PDF Downloads 211
7806 Predicting Football Player Performance: Integrating Data Visualization and Machine Learning

Authors: Saahith M. S., Sivakami R.

Abstract:

In the realm of football analytics, particularly focusing on predicting football player performance, the ability to forecast player success accurately is of paramount importance for teams, managers, and fans. This study introduces an elaborate examination of predicting football player performance through the integration of data visualization methods and machine learning algorithms. The research entails the compilation of an extensive dataset comprising player attributes, conducting data preprocessing, feature selection, model selection, and model training to construct predictive models. The analysis within this study will involve delving into feature significance using methodologies like Select Best and Recursive Feature Elimination (RFE) to pinpoint pertinent attributes for predicting player performance. Various machine learning algorithms, including Random Forest, Decision Tree, Linear Regression, Support Vector Regression (SVR), and Artificial Neural Networks (ANN), will be explored to develop predictive models. The evaluation of each model's performance utilizing metrics such as Mean Squared Error (MSE) and R-squared will be executed to gauge their efficacy in predicting player performance. Furthermore, this investigation will encompass a top player analysis to recognize the top-performing players based on the anticipated overall performance scores. Nationality analysis will entail scrutinizing the player distribution based on nationality and investigating potential correlations between nationality and player performance. Positional analysis will concentrate on examining the player distribution across various positions and assessing the average performance of players in each position. Age analysis will evaluate the influence of age on player performance and identify any discernible trends or patterns associated with player age groups. 
The primary objective is to predict a football player's overall performance accurately based on their individual attributes, leveraging data-driven insights to enrich the comprehension of player success on the field. By amalgamating data visualization and machine learning methodologies, the aim is to furnish valuable tools for teams, managers, and fans to effectively analyze and forecast player performance. This research contributes to the progression of sports analytics by showcasing the potential of machine learning in predicting football player performance and offering actionable insights for diverse stakeholders in the football industry.
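
The two evaluation metrics named in the abstract, Mean Squared Error (MSE) and R-squared, can be computed directly from their standard definitions. The sketch below uses made-up performance scores; the function names and data are illustrative assumptions, not values from the study.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 minus the ratio of residual variation to total variation around the mean."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical overall performance scores for three players.
actual = [70, 80, 90]
predicted = [72, 78, 91]
mse = mean_squared_error(actual, predicted)  # (4 + 4 + 1) / 3
r2 = r_squared(actual, predicted)            # 1 - 9 / 200
```

A lower MSE and an R-squared closer to 1 both indicate a better fit, which is how the candidate regressors listed in the abstract would be ranked against each other.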

Keywords: football analytics, player performance prediction, data visualization, machine learning algorithms, random forest, decision tree, linear regression, support vector regression, artificial neural networks, model evaluation, top player analysis, nationality analysis, positional analysis

Procedia PDF Downloads 18
7805 Machine Learning Based Approach for Measuring Promotion Effectiveness in Multiple Parallel Promotions’ Scenarios

Authors: Revoti Prasad Bora, Nikita Katyal

Abstract:

Promotion is a key element in the retail business. Thus, analysis of promotions to quantify their effectiveness in terms of Revenue and/or Margin is an essential activity in the retail industry. However, measuring the sales/revenue uplift is based on estimations, as the actual sales/revenue without the promotion is not present. Further, the presence of Halo and Cannibalization in a multiple parallel promotions’ scenario complicates the problem. Calculating Baseline by considering inter-brand/competitor items or using Halo and Cannibalization's impact on Revenue calculations by considering Baseline as an interpretation of items’ unit sales in neighboring nonpromotional weeks individually may not capture the overall Revenue uplift in the case of multiple parallel promotions. Hence, this paper proposes a Machine Learning based method for calculating the Revenue uplift by considering the Halo and Cannibalization impact on the Baseline and the Revenue. In the first section of the proposed methodology, Baseline of an item is calculated by incorporating the impact of the promotions on its related items. In the later section, the Revenue of an item is calculated by considering both Halo and Cannibalization impacts. Hence, this methodology enables correct calculation of the overall Revenue uplift due to a given promotion.
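
For contrast with the proposed method, the simple per-item baseline that the abstract describes as insufficient can be sketched as follows. This is an assumption-laden illustration, not the paper's Machine Learning approach: the window size, the pricing convention (promotional revenue minus baseline units at the regular price), and all names and numbers are hypothetical.

```python
def naive_baseline(unit_sales, promo_weeks, week, window=2):
    """Average unit sales over non-promotional weeks within `window` weeks of `week`."""
    neighbours = [
        unit_sales[w]
        for w in range(max(0, week - window), min(len(unit_sales), week + window + 1))
        if w != week and w not in promo_weeks
    ]
    if not neighbours:
        raise ValueError("no non-promotional neighbouring weeks available")
    return sum(neighbours) / len(neighbours)


def revenue_uplift(unit_sales, promo_weeks, week, promo_price, regular_price, window=2):
    """Promotional revenue minus the revenue expected from baseline units at regular price."""
    baseline_units = naive_baseline(unit_sales, promo_weeks, week, window)
    return unit_sales[week] * promo_price - baseline_units * regular_price


# Hypothetical weekly unit sales for one item; week 2 is the promoted week.
weekly_units = [100, 110, 180, 90, 100]
baseline = naive_baseline(weekly_units, promo_weeks={2}, week=2)
uplift = revenue_uplift(weekly_units, {2}, 2, promo_price=8.0, regular_price=10.0)
```

Because this baseline looks at one item in isolation, sales gained or lost by related items (Halo and Cannibalization) never enter the calculation, which is the gap the abstract sets out to close.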

Keywords: Halo, Cannibalization, promotion, Baseline, temporary price reduction, retail, elasticity, cross price elasticity, machine learning, random forest, linear regression

Procedia PDF Downloads 144
7804 Automatic Detection of Suicidal Behaviors Using an RGB-D Camera: Azure Kinect

Authors: Maha Jazouli

Abstract:

Suicide is one of the most important causes of death in the prison environment, both in Canada and internationally. Rates of attempted suicide and self-harm have been on the rise in recent years, with hanging being the most frequent method. The objective of this article is to propose a method to automatically detect suicidal behaviors in real time. We present a gesture recognition system that consists of three modules: model-based movement tracking, feature extraction, and gesture recognition using machine learning algorithms (MLA). Our proposed system gives satisfactory results. This smart video surveillance system can assist staff responsible for the safety and health of inmates by alerting them when suicidal behavior is detected, which helps reduce mortality rates and save lives.

Keywords: suicide detection, Azure Kinect, RGB-D camera, SVM, machine learning, gesture recognition

Procedia PDF Downloads 157
7803 A Monte Carlo Fuzzy Logistic Regression Framework against Imbalance and Separation

Authors: Georgios Charizanos, Haydar Demirhan, Duygu Icen

Abstract:

Two of the most impactful issues in classical logistic regression are class imbalance and complete separation. These can result in model predictions heavily leaning towards the imbalanced class on the binary response variable or over-fitting issues. Fuzzy methodology offers key solutions for handling these problems. However, most studies propose the transformation of the binary responses into a continuous format limited within [0,1]. This is called the possibilistic approach within fuzzy logistic regression. Following this approach is more aligned with straightforward regression since a logit-link function is not utilized, and fuzzy probabilities are not generated. In contrast, we propose a method of fuzzifying binary response variables that allows for the use of the logit-link function; hence, a probabilistic fuzzy logistic regression model with the Monte Carlo method. The fuzzy probabilities are then classified by selecting a fuzzy threshold. Different combinations of fuzzy and crisp input, output, and coefficients are explored, aiming to understand which of these perform better under different conditions of imbalance and separation. We conduct numerical experiments using both synthetic and real datasets to demonstrate the performance of the fuzzy logistic regression framework against seven crisp machine learning methods. The proposed framework shows better performance irrespective of the degree of imbalance and presence of separation in the data, while the considered machine learning methods are significantly impacted.

Keywords: fuzzy logistic regression, fuzzy, logistic, machine learning

Procedia PDF Downloads 43
7802 A Non-Destructive Estimation Method for Internal Time in Perilla Leaf Using Hyperspectral Data

Authors: Shogo Nagano, Yusuke Tanigaki, Hirokazu Fukuda

Abstract:

Vegetables harvested early in the morning or late in the afternoon are valued in plant production, and so the time of harvest is important. The biological functions known as circadian clocks have a significant effect on this harvest timing. The purpose of this study was to non-destructively estimate the circadian clock and so construct a method for determining a suitable harvest time. We took eight samples of green busil (Perilla frutescens var. crispa) every 4 hours, six times for 1 day and analyzed all samples at the same time. A hyperspectral camera was used to collect spectrum intensities at 141 different wavelengths (350–1050 nm). Calculation of correlations between spectrum intensity of each wavelength and harvest time suggested the suitability of the hyperspectral camera for non-destructive estimation. However, even the highest correlated wavelength had a weak correlation, so we used machine learning to raise the accuracy of estimation and constructed a machine learning model to estimate the internal time of the circadian clock. Artificial neural networks (ANN) were used for machine learning because this is an effective analysis method for large amounts of data. Using the estimation model resulted in an error between estimated and real times of 3 min. The estimations were made in less than 2 hours. Thus, we successfully demonstrated this method of non-destructively estimating internal time.
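
The per-wavelength correlation screening mentioned above can be sketched with a plain Pearson coefficient. The wavelengths, intensity values, and variable names below are invented for illustration and are not the study's measurements.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Six sampling times, 4 hours apart over one day, as in the abstract.
harvest_hours = [0, 4, 8, 12, 16, 20]

# Hypothetical mean spectrum intensity per wavelength (nm) at each sampling time.
intensities = {
    550: [0.20, 0.24, 0.28, 0.32, 0.36, 0.40],  # rises steadily with time
    700: [0.50, 0.42, 0.55, 0.47, 0.52, 0.45],  # no clear trend
}

correlations = {wl: pearson(vals, harvest_hours) for wl, vals in intensities.items()}
best_wavelength = max(correlations, key=lambda wl: abs(correlations[wl]))
```

Ranking wavelengths by absolute correlation shows which single bands carry timing information. When even the best band correlates weakly, as the abstract reports, combining all bands in a learned model is the natural next step.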

Keywords: artificial neural network (ANN), circadian clock, green busil, hyperspectral camera, non-destructive evaluation

Procedia PDF Downloads 275
7801 Using Combination of Different Sets of Features of Molecules for Improved Prediction of Solubility

Authors: Muhammet Baldan, Emel Timuçin

Abstract:

Generally, absorption and bioavailability increase if solubility increases; therefore, it is crucial to predict solubility in drug discovery applications. Molecular descriptors and molecular properties are traditionally used for the prediction of water solubility. There are various key descriptors that are used for this purpose, namely Dragon descriptors, Morgan descriptors, MACCS keys, etc., and each has different prediction capabilities, with varying success across different data sets. Structural features are another commonly used source for the prediction of solubility. However, there are few to no studies that combine three or more properties or descriptors to produce a more powerful prediction model. Unlike available models, we used a combination of those features in a random forest machine learning model for improved solubility prediction and, therefore, to contribute to drug discovery systems.

Keywords: solubility, molecular descriptors, machine learning, random forest

Procedia PDF Downloads 28
7800 Vibration-Based Data-Driven Model for Road Health Monitoring

Authors: Guru Prakash, Revanth Dugalam

Abstract:

A road’s condition often deteriorates due to harsh loading, such as overload from trucks, and severe environmental conditions, such as heavy rain, snow load, and cyclic loading. In the absence of proper maintenance planning, this results in potholes, wide cracks, bumps, and increased roughness of roads. In this paper, a data-driven model will be developed to detect these damages using vibration and image signals. The key idea of the proposed methodology is that the road anomaly manifests in these signals, which can be detected by training a machine learning algorithm. The use of various machine learning techniques such as the support vector machine and Random Forest method will be investigated. The proposed model will first be trained and tested with artificially simulated data, and the model architecture will be finalized by comparing the accuracies of various models. Once a model is fixed, the field study will be performed, and data will be collected. The field data will be used to validate the proposed model and to predict the road’s future health condition. The proposed model will help to automate the road condition monitoring process, repair cost estimation, and maintenance planning process.

Keywords: SVM, data-driven, road health monitoring, pot-hole

Procedia PDF Downloads 61
7799 Machine Learning for Exoplanetary Habitability Assessment

Authors: King Kumire, Amos Kubeka

Abstract:

The synergy of machine learning and astronomical technology advancement is giving rise to the new space age, which is pronounced by better habitability assessments. To initiate this discussion, it should be recorded for definition purposes that the symbiotic relationship between astronomy and improved computing has been code-named the Cis-Astro gateway concept. The cosmological fate of this phrase has been unashamedly plagiarized from the cis-lunar gateway template and its associated Lagrange points, which act as an orbital bridge to the moon from our planet Earth. However, for this study, the scientific audience is invited to bridge toward the discovery of new habitable planets. It is imperative to state that cosmic probes of this magnitude can be utilized as the starting nodes of the astrobiological search for galactic life. This research can also assist by acting as the navigation system for future space telescope launches through the delimitation of target exoplanets. The findings and the associated platforms can be harnessed as building blocks for the modeling of climate change on planet Earth. The notion that the human genus may exhaust the resources of planet Earth, or that a bug of some sort may make the Earth uninhabitable for humans, explains the need to find an alternative planet to inhabit. The scientific community, through interdisciplinary discussions of the International Astronautical Federation so far, has the common position that engineers can reduce space mission costs by constructing a stable cis-lunar orbit infrastructure for refilling and carrying out other associated in-orbit servicing activities. Similarly, the Cis-Astro gateway can be envisaged as a budget optimization technique that models extra-solar bodies and can facilitate the scoping of future mission rendezvous. It should be registered as well that this broad and voluminous catalog of exoplanets shall be narrowed along the way using machine learning filters.
The gist of this topic revolves around the indirect economic rationale of establishing a habitability scoping platform.

Keywords: machine-learning, habitability, exoplanets, supercomputing

Procedia PDF Downloads 65
7797 Sentiment Analysis: Comparative Analysis of Multilingual Sentiment and Opinion Classification Techniques

Authors: Sannikumar Patel, Brian Nolan, Markus Hofmann, Philip Owende, Kunjan Patel

Abstract:

Sentiment analysis and opinion mining have become emerging topics of research in recent years, but most of the work is focused on data in the English language. Comprehensive research and analysis that considers multiple languages, machine translation techniques, and different classifiers is essential. This paper presents a comparative analysis of different approaches for multilingual sentiment analysis. These approaches are divided into two parts: the first classifies text without language translation, and the second translates the testing data to a target language, such as English, before classification. The presented research and results are useful for understanding whether machine translation should be used for multilingual sentiment analysis or whether building language-specific sentiment classification systems is a better approach. The effects of language translation techniques, features, and the accuracy of various classifiers for multilingual sentiment analysis are also discussed in this study.

Keywords: cross-language analysis, machine learning, machine translation, sentiment analysis

Procedia PDF Downloads 683
7796 Develop a Conceptual Data Model of Geotechnical Risk Assessment in Underground Coal Mining Using a Cloud-Based Machine Learning Platform

Authors: Reza Mohammadzadeh

Abstract:

The major challenges in geotechnical engineering in underground spaces arise from uncertainties and different probabilities. The collection, collation, and collaboration of existing data to incorporate them in analysis and design for a given prospect evaluation would be a reliable, practical problem-solving method under uncertainty. Machine learning (ML) is a subfield of artificial intelligence in statistical science which applies different techniques (e.g., regression, neural networks, support vector machines, decision trees, random forests, genetic programming, etc.) to data in order to automatically learn and improve from them without being explicitly programmed, and to make decisions and predictions. In this paper, a conceptual database schema of geotechnical risks in underground coal mining based on a cloud system architecture has been designed. A new approach to risk assessment using a three-dimensional risk matrix supported by the level of knowledge (LoK) has been proposed in this model. Subsequently, the stages of the model workflow methodology have been described. To train on the data and deploy the LoK models, an ML platform has been implemented. IBM Watson Studio, as a leading data science tool and data-driven cloud integration ML platform, is employed in this study. As a use case, a data set of geotechnical hazards and risk assessment in underground coal mining was prepared to demonstrate the performance of the model, and the results have been outlined accordingly.

Keywords: data model, geotechnical risks, machine learning, underground coal mining

Procedia PDF Downloads 247
7795 Artificial Intelligence-Based Thermal Management of Battery System for Electric Vehicles

Authors: Raghunandan Gurumurthy, Aricson Pereira, Sandeep Patil

Abstract:

The escalating adoption of electric vehicles (EVs) across the globe has underscored the critical importance of advancing battery system technologies. This has catalyzed a shift towards the design and development of battery systems that not only exhibit higher energy efficiency but also boast enhanced thermal performance and sophisticated multi-material enclosures. A significant leap in this domain has been the incorporation of simulation-based design optimization for battery packs and Battery Management Systems (BMS), a move further enriched by integrating artificial intelligence/machine learning (AI/ML) approaches. These strategies are pivotal in refining the design, manufacturing, and operational processes for electric vehicles and energy storage systems. By leveraging AI/ML, stakeholders can now predict battery performance metrics—such as State of Health, State of Charge, and State of Power—with unprecedented accuracy. Furthermore, as Li-ion batteries (LIBs) become more prevalent in urban settings, the imperative for bolstering thermal and fire resilience has intensified. This has propelled Battery Thermal Management Systems (BTMs) to the forefront of energy storage research, highlighting the role of machine learning and AI not just as tools for enhanced safety management through accurate temperature forecasts and diagnostics but also as indispensable allies in the early detection and warning of potential battery fires.

Keywords: electric vehicles, battery thermal management, industrial engineering, machine learning, artificial intelligence, manufacturing

Procedia PDF Downloads 51
7794 A Generalized Framework for Adaptive Machine Learning Deployments in Algorithmic Trading

Authors: Robert Caulk

Abstract:

A generalized framework for adaptive machine learning deployments in algorithmic trading is introduced, tested, and released as open-source code. The presented software aims to test the hypothesis that recent data contains enough information to form a probabilistically favorable short-term price prediction. Further, the framework contains various adaptive machine learning techniques that are geared toward generating profit during strong trends and minimizing losses during trend changes. Results demonstrate that this adaptive machine learning approach is capable of capturing trends and generating profit. The presentation also discusses the importance of defining the parameter space associated with the dynamic training data-set and using the parameter space to identify and remove outliers from prediction data points. Meanwhile, the generalized architecture enables common users to exploit the powerful machinery while focusing on high-level feature engineering and model testing. The presentation also highlights common strengths and weaknesses associated with the presented technique and presents a broad range of well-tested starting points for feature set construction, target setting, and statistical methods for enforcing risk management and maintaining probabilistically favorable entry and exit points. The presentation also describes the end-to-end data processing tools associated with FreqAI, including automatic data fetching, data aggregation, feature engineering, safe and robust data pre-processing, outlier detection, custom machine learning and statistical tools, data post-processing, adaptive training backtest emulation, and deployment of adaptive training in live environments. Finally, the generalized user interface is also discussed in the presentation. Feature engineering is simplified so that users can seed their feature sets with common indicator libraries (e.g., TA-Lib, pandas-ta).
The user also feeds data expansion parameters to fill out a large feature set for the model, which can contain more than 10,000 features. The presentation describes the various object-oriented programming techniques employed to make FreqAI agnostic to third-party libraries and external data sources. In other words, the back-end is constructed in such a way that users can leverage a broad range of common regression libraries (Catboost, LightGBM, Sklearn, etc.) as well as common neural network libraries (TensorFlow, PyTorch) without worrying about the logistical complexities associated with data handling and API interactions. The presentation finishes by drawing conclusions about the most important parameters associated with a live deployment of the adaptive learning framework and provides the road map for future development in FreqAI.

Keywords: machine learning, market trend detection, open-source, adaptive learning, parameter space exploration

Procedia PDF Downloads 65
7793 Infrared Spectroscopy in Tandem with Machine Learning for Simultaneous Rapid Identification of Bacteria Isolated Directly from Patients' Urine Samples and Determination of Their Susceptibility to Antibiotics

Authors: Mahmoud Huleihel, George Abu-Aqil, Manal Suleiman, Klaris Riesenberg, Itshak Lapidot, Ahmad Salman

Abstract:

Urinary tract infections (UTIs) are considered to be the most common bacterial infections worldwide and are caused mainly by Escherichia (E.) coli (about 80%), Klebsiella pneumoniae (about 10%), and Pseudomonas aeruginosa (about 6%). Although antibiotics are considered the most effective treatment for bacterial infectious diseases, most bacteria have unfortunately already developed resistance to the majority of the commonly available antibiotics. Therefore, it is crucial to identify the infecting bacteria and to determine their susceptibility to antibiotics in order to prescribe effective treatment. Classical methods are time-consuming, requiring ~48 hours to determine bacterial susceptibility. Thus, there is an urgent need for a new method that can significantly reduce the time required both to identify the infecting bacterium at the species level and to determine its susceptibility to antibiotics. Fourier-Transform Infrared (FTIR) spectroscopy is well known as a sensitive and rapid method, which can detect minor molecular changes in the bacterial genome associated with the development of resistance to antibiotics. The main goal of this study is to examine the potential of FTIR spectroscopy, in tandem with machine learning algorithms, to identify the infecting bacteria at the species level and to determine E. coli susceptibility to different antibiotics directly from patients' urine in about 30 minutes. For this goal, 1600 different E. coli isolates were isolated from different patients' urine samples, measured by FTIR, and analyzed using different machine learning algorithms such as Random Forest, XGBoost, and CNN. We achieved 98% success in isolate-level identification and 89% accuracy in susceptibility determination.

Keywords: urinary tract infections (UTIs), E. coli, Klebsiella pneumoniae, Pseudomonas aeruginosa, bacterial, susceptibility to antibiotics, infrared microscopy, machine learning

Procedia PDF Downloads 142
7792 Comparison of Machine Learning and Deep Learning Algorithms for Automatic Classification of 80 Different Pollen Species

Authors: Endrick Barnacin, Jean-Luc Henry, Jimmy Nagau, Jack Molinie

Abstract:

Palynology is a field of interest in many disciplines due to its multiple applications: chronological dating, climatology, allergy treatment, and honey characterization. Unfortunately, the analysis of a pollen slide is a complicated and time-consuming task that requires the intervention of experts in the field, who are becoming increasingly rare due to economic and social conditions. That is why the need for automation of this task is urgent. Many studies have investigated the subject using different standard image processing descriptors and sometimes hand-crafted ones. In this work, we make a comparative study between classical feature extraction methods (Shape, GLCM, LBP, and others) and Deep Learning (CNN, Autoencoders, Transfer Learning) to perform a recognition task over 80 regional pollen species. It has been found that the use of Transfer Learning seems to be more precise than the other approaches.

Keywords: pollens identification, features extraction, pollens classification, automated palynology

Procedia PDF Downloads 109
7791 A Review on Intelligent Systems for Geoscience

Authors: R. Palson Kennedy, P. Kiran Sai

Abstract:

This article introduces machine learning (ML) researchers to the hurdles that geoscience problems present, as well as the opportunities for improvement in both ML and the geosciences. To meet that need, this article presents a review from the data life cycle perspective. Numerous facets of the geosciences present unique difficulties for the study of intelligent systems. Geoscience data is notoriously difficult to analyze since it is frequently unpredictable, intermittent, sparse, multi-resolution, and multi-scale. The first half addresses data science’s essential concepts and theoretical underpinnings, while the second half presents key themes and shared experiences from current publications focused on each stage of the data life cycle. Finally, themes such as open science, smart data, and team science are considered.

Keywords: Data science, intelligent system, machine learning, big data, data life cycle, recent development, geo science

Procedia PDF Downloads 117
7790 Investigating the performance of machine learning models on PM2.5 forecasts: A case study in the city of Thessaloniki

Authors: Alexandros Pournaras, Anastasia Papadopoulou, Serafim Kontos, Anastasios Karakostas

Abstract:

The air quality of modern cities is an important concern, as poor air quality contributes to human health and environmental issues. Reliable air quality forecasting has, thus, gained scientific and governmental attention as an essential tool that enables authorities to take proactive measures for public safety. In this study, the potential of Machine Learning (ML) models to forecast PM2.5 at local scale is investigated in the city of Thessaloniki, the second largest city in Greece, which has been struggling with the persistent issue of air pollution. ML models with proven ability to address time-series forecasting are employed to predict the PM2.5 concentrations and the respective Air Quality Index (AQI) 5 days ahead by learning from daily historical air quality and meteorological data from 2014 to 2016, gathered from two stations with different land use characteristics in the urban fabric of Thessaloniki. The performance of the ML models on PM2.5 concentrations is evaluated with common statistical measures, such as R squared (R²) and Root Mean Squared Error (RMSE), utilizing a portion of the stations’ measurements as the test set. A multi-categorical evaluation is utilized for the assessment of their performance on the respective AQIs. Several conclusions were drawn from the experiments conducted. Experimenting with the ML models’ configuration revealed a moderate effect of various parameters and training schemas on the models’ predictions. All of these models were found to produce satisfactory results on PM2.5 concentrations. In addition, their application to stations not used in training showed that these models can perform well, indicating a generalized behavior. Moreover, their performance on AQI was even better, showing that the ML models can be used as predictors for AQI, which is the direct information provided to the general public.

Keywords: Air Quality, AQ Forecasting, AQI, Machine Learning, PM2.5

Procedia PDF Downloads 46
7789 A Comparative Time-Series Analysis and Deep Learning Projection of Innate Radon Gas Risk in Canadian and Swedish Residential Buildings

Authors: Selim M. Khan, Dustin D. Pearson, Tryggve Rönnqvist, Markus E. Nielsen, Joshua M. Taron, Aaron A. Goodarzi

Abstract:

Accumulation of radioactive radon gas in indoor air poses a serious risk to human health by increasing the lifetime risk of lung cancer and is classified by IARC as a category one carcinogen. Radon exposure risks are a function of geologic, geographic, design, and human behavioural variables and can change over time. Using time series and deep machine learning modelling, we analyzed long-term radon test outcomes as a function of building metrics from 25,489 Canadian and 38,596 Swedish residential properties constructed between 1945 and 2020. While Canadian and Swedish properties built between 1970 and 1980 are comparable (96–103 Bq/m³), innate radon risks subsequently diverge, rising in Canada and falling in Sweden such that 21st Century Canadian houses show 467% greater average radon (131 Bq/m³) relative to Swedish equivalents (28 Bq/m³). These trends are consistent across housing types and regions within each country. The introduction of energy efficiency measures within Canadian and Swedish building codes coincided with opposing radon level trajectories in each nation. Deep machine learning modelling predicts that, without intervention, average Canadian residential radon levels will increase to 176 Bq/m³ by 2050, emphasizing the importance and urgency of future building code intervention to achieve systemic radon reduction in Canada.

Keywords: radon health risk, time-series, deep machine learning, lung cancer, Canada, Sweden

Procedia PDF Downloads 59
7788 A Case-Based Reasoning-Decision Tree Hybrid System for Stock Selection

Authors: Yaojun Wang, Yaoqing Wang

Abstract:

Stock selection is an important decision-making problem. Many machine learning and data mining technologies are employed to build automatic stock-selection systems. A profitable stock-selection system should consider both the stock’s investment value and the market timing. In this paper, we present a hybrid system that takes both into account for stock selection. This system uses a case-based reasoning (CBR) model to perform the stock classification and a decision-tree model to help with market timing and stock selection. The experiments show that this hybrid system performs better than other techniques with regard to classification accuracy, average return, and the Sharpe ratio.

Keywords: case-based reasoning, decision tree, stock selection, machine learning

Procedia PDF Downloads 387