Search results for: science-informed machine learning
8143 Predicting Low Birth Weight Using Machine Learning: A Study on 53,637 Ethiopian Birth Data
Authors: Kehabtimer Shiferaw Kotiso, Getachew Hailemariam, Abiy Seifu Estifanos
Abstract:
Introduction: Although low birth weight (LBW) accounts for the largest share of neonatal mortality and morbidity, predicting LBW births to better prepare interventions remains challenging. This study aims to predict LBW using a dataset of 53,637 births collected from 36 primary hospitals across seven regions in Ethiopia from February 2022 to June 2024. Methods: We identified ten explanatory variables related to maternal and neonatal characteristics to predict LBW: maternal education, age, residence, history of miscarriage or abortion, history of preterm birth, type of pregnancy, number of live births, number of stillbirths, antenatal care frequency, and sex of the fetus. Using WEKA 3.8.2, we developed and compared seven machine learning algorithms. Data preprocessing included handling missing values, outlier detection, and ensuring data integrity in birth weight records. Model performance was evaluated with accuracy, precision, recall, F1-score, and area under the Receiver Operating Characteristic curve (ROC AUC) using 10-fold cross-validation. Results: The decision tree, J48, logistic regression, and gradient boosted trees models achieved the highest accuracy (94.5% to 94.6%), with a precision of 93.1% to 93.3%, an F1-score of 92.7% to 93.1%, and a ROC AUC of 71.8% to 76.6%. Conclusion: This study demonstrates the effectiveness of machine learning models in predicting LBW. The high accuracy and recall rates achieved indicate that these models can serve as valuable tools for healthcare policymakers and providers in identifying at-risk newborns and implementing timely interventions to achieve the Sustainable Development Goal (SDG) related to neonatal mortality.
Keywords: low birth weight, machine learning, classification, neonatal mortality, Ethiopia
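The metrics named in this abstract are all derived from the same four confusion-matrix counts. The following is a minimal sketch of those definitions, not the study's WEKA pipeline; the counts are invented for illustration and are not taken from the dataset. It also shows why accuracy is usually read alongside recall and ROC AUC: when the positive class is rare, accuracy can stay high even if most positives are missed.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts (assumption): 1000 births, 60 of them in the positive
# class, and a classifier that finds only 10 of those 60.
m = classification_metrics(tp=10, fp=5, fn=50, tn=935)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the accuracy is 94.5% while recall for the positive class is below 17%, which is the kind of gap that per-class metrics are meant to expose.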
Procedia PDF Downloads 21
8142 Implementation of Data Science in Field of Homologation
Authors: Shubham Bhonde, Nekzad Doctor, Shashwat Gawande
Abstract:
In many countries, homologation is required for the use and import of keys and ID transmitters, as well as body control modules with radio transmission. The final deliverables in homologating a product are certificates. There are approximately 200 certificates per product, most of them in local languages. It is challenging to manually inspect each certificate and extract relevant data from it, such as the expiry date and approval date. Accurate data extraction is essential, because inaccuracy may lead to a missed re-homologation, which results in non-compliance. There is therefore scope for automating the reading of certificate data in the field of homologation. We use deep learning as the tool for this automation. We first trained a model using machine learning, providing basic data for all countries. The model was trained only once, by feeding it PDF and JPG files through an ETL process. Over time, the trained model is expected to give more accurate results. As an outcome, the expiry date and approval date of a certificate can be obtained with a single click. This will eventually help to implement automation features more broadly in the database where certificates are stored, and will reduce human error to an almost negligible level.
Keywords: homologation, re-homologation, data science, deep learning, machine learning, ETL (extract transform loading)
Procedia PDF Downloads 163
8141 Internet-Based Architecture for Machine-to-Machine Communication of a Public Security Network
Authors: Ogwueleka Francisca Nonyelum, Jiya Muhammad
Abstract:
Poor communication between victims of burglaries, road accidents, and fires and the responding agencies, together with the agencies' lack of quick emergency response, can be addressed through Machine-to-Machine (M2M) communication. A distress caller is expected to call the relevant agency over a network for emergency response, but because of various challenges this often becomes arduous and futile. This research puts forth an Internet-based architecture for M2M communication to enhance information dissemination in the National Public Security Communication System (NPSCS) network. M2M enables the flow of data between machines, and ultimately between machines and people, with information flowing from a machine over a network and then through a gateway to a system where it is reviewed and acted on. The research findings showed that an Internet-based architecture for M2M communication is most suitable for deploying a public security network, allowing machines to use the Internet to talk to each other.
Keywords: machine-to-machine (M2M), internet-based architecture, network, gateway
Procedia PDF Downloads 482
8140 Validating Condition-Based Maintenance Algorithms through Simulation
Authors: Marcel Chevalier, Léo Dupont, Sylvain Marié, Frédérique Roffet, Elena Stolyarova, William Templier, Costin Vasile
Abstract:
Industrial end-users are currently facing an increasing need to reduce the risk of unexpected failures and optimize their maintenance. This calls for both short-term analysis and long-term ageing anticipation. At Schneider Electric, we tackle those two issues using both machine learning and first principles models. Machine learning models are incrementally trained from normal data to predict expected values and detect statistically significant short-term deviations. Ageing models are constructed by breaking down physical systems into sub-assemblies, then determining relevant degradation modes and associating each one to the right kinetic law. Validating such anomaly detection and maintenance models is challenging, both because actual incident and ageing data are rare and distorted by human interventions, and because incremental learning depends on human feedback. To overcome these difficulties, we propose to simulate physics, systems, and humans (including asset maintenance operations) in order to validate the overall approaches in accelerated time and possibly choose between algorithmic alternatives.
Keywords: degradation models, ageing, anomaly detection, soft sensor, incremental learning
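The idea of a model that is incrementally trained on normal data and flags statistically significant short-term deviations can be sketched with a running mean and variance. This is only an illustrative stand-in, not the authors' method: it uses Welford's online update and a z-score test, and the class name, threshold, warm-up length, and sensor readings are all assumptions.

```python
import math


class IncrementalDetector:
    """Running mean/variance (Welford's algorithm) with a z-score test."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # flag when |x - mean| > threshold * std
        self.warmup = warmup        # samples to learn from before flagging
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # sum of squared deviations from the mean

    def update(self, x):
        """Return True if x is flagged; only unflagged values update the model."""
        flagged = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) > self.threshold * std:
                flagged = True
        if not flagged:
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)
        return flagged


# Hypothetical sensor readings (assumption) with one obvious outlier.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1, 15.0, 10.0]
detector = IncrementalDetector()
flags = [detector.update(r) for r in readings]
print(flags)
```

Skipping the update for flagged values keeps an anomaly from contaminating the learned baseline, which is one simple way to train on normal data only; a real system would also need the human feedback loop the abstract mentions.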
Procedia PDF Downloads 126
8139 Correlation Analysis to Quantify Learning Outcomes for Different Teaching Pedagogies
Authors: Kanika Sood, Sijie Shang
Abstract:
A fundamental goal of education is to prepare students to become part of the global workforce and make beneficial contributions to society. In this paper, we analyze student performance in multiple courses that involve different teaching pedagogies: a cooperative learning technique and an inquiry-based learning strategy. Student performance includes student engagement, grades, and attendance records. We perform this study in the Computer Science department for online and in-person courses with 450 students. We perform correlation analysis to study the relationship between student scores and other parameters such as gender and mode of learning. We use natural language processing and machine learning to analyze student feedback data and performance data. We assess the learning outcomes of the two teaching pedagogies for undergraduate and graduate courses to showcase the impact of pedagogical adoption and learning outcome as determinants of academic achievement. Early findings suggest that when using the specified pedagogies, students become experts on their topics and show enhanced engagement with peers.
Keywords: bag-of-words, cooperative learning, education, inquiry-based learning, in-person learning, natural language processing, online learning, sentiment analysis, teaching pedagogy
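The abstract does not say which correlation measure is used. As one common choice, the Pearson coefficient between two numeric columns can be computed as in this minimal sketch; the function name and the score and attendance values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data (assumption): attendance rate and final score per student.
attendance = [0.60, 0.75, 0.80, 0.90, 0.95]
scores = [62, 70, 74, 85, 88]
print(round(pearson_r(attendance, scores), 3))
```

Categorical parameters such as gender or mode of learning would first need a numeric encoding (for example 0/1), in which case the same formula gives the point-biserial correlation.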
Procedia PDF Downloads 77
8138 Application of Machine Learning Techniques in Forest Cover-Type Prediction
Authors: Saba Ebrahimi, Hedieh Ashrafi
Abstract:
Predicting the cover type of forests is a challenge for natural resource managers. In this project, we aim to perform a comprehensive comparative study of two well-known classification methods, support vector machine (SVM) and decision tree (DT). The comparison is first performed among different types of each classifier, and then the best of each classifier will be compared by considering different evaluation metrics. The effect of boosting and bagging for decision trees is also explored. Furthermore, the effect of principal component analysis (PCA) and feature selection is also investigated. During the project, the forest cover-type dataset from the remote sensing and GIS program is used in all computations.
Keywords: classification methods, support vector machine, decision tree, forest cover-type dataset
Procedia PDF Downloads 217
8137 Distributed Cyber Physical Secure Framework for DC Microgrids: DC Ship Power System Applications
Authors: Grace karimi Muriithi, Behnaz Papari, Ali Arsalan, Christopher Shannon Edrington
Abstract:
The complexity and nonlinearity of control system design for DC microgrid applications increase when cyber aspects and their associated technology constraints are added to the picture. Controller functionality during critical operating modes must be guaranteed, particularly for high-profile applications such as the Navy DC ship power system (SPS), a small-scale DC microgrid. The SPS is susceptible to cyber-attacks, which can have disastrous effects. In this study, a machine learning (ML) approach is demonstrated that offers promising performance for the SPS, providing effective and robust functionality during attacks. Analysis of the simulation results demonstrates that the proposed method can successfully improve controllability.
Keywords: controllability, cyber attacks, distributed control, machine learning
Procedia PDF Downloads 114
8136 Comparison Study of Machine Learning Classifiers for Speech Emotion Recognition
Authors: Aishwarya Ravindra Fursule, Shruti Kshirsagar
Abstract:
At the intersection of artificial intelligence and human-centered computing, this paper examines speech emotion recognition (SER). It presents a comparative analysis of machine learning models applied to SER: K-Nearest Neighbors (KNN), logistic regression, support vector machines (SVM), decision trees, ensemble classifiers, and random forests. The research employs four datasets: Crema D, SAVEE, TESS, and RAVDESS. It focuses on extracting salient audio signal features such as Zero Crossing Rate (ZCR), Chroma_stft, Mel Frequency Cepstral Coefficients (MFCC), root mean square (RMS) value, and Mel spectrogram. These features are used to train the models and evaluate their ability to recognize eight types of emotions from speech: happy, sad, neutral, angry, calm, disgust, fear, and surprise. Among the models, the Random Forest algorithm demonstrated superior performance, achieving approximately 79% accuracy. This suggests its suitability for SER within the parameters of this study. The research contributes to SER by showcasing the effectiveness of various machine learning algorithms and feature extraction techniques. The findings hold promise for the development of more precise emotion recognition systems in the future.
Keywords: comparison, ML classifiers, KNN, decision tree, SVM, random forest, logistic regression, ensemble classifiers
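Two of the listed features, zero crossing rate and RMS value, are simple enough to compute by hand. The sketch below uses one common textbook definition of each on a single frame of samples; audio libraries may normalize or pad differently, and the function names and sample frames are assumptions for illustration.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        raise ValueError("frame needs at least two samples")
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def rms(frame):
    """Root mean square amplitude of the frame."""
    if not frame:
        raise ValueError("frame must not be empty")
    return math.sqrt(sum(x * x for x in frame) / len(frame))


# Hypothetical frames (assumption): a rapidly alternating signal and a slow one.
fast = [1.0, -1.0, 1.0, -1.0]
slow = [0.5, 0.5, -0.5, -0.5]
print(zero_crossing_rate(fast), rms(fast))
print(zero_crossing_rate(slow), rms(slow))
```

In a full pipeline, such values would be computed per frame and then summarized (for example, averaged) before being passed to a classifier together with the spectral features.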
Procedia PDF Downloads 45
8135 Cognition of Driving Context for Driving Assistance
Authors: Manolo Dulva Hina, Clement Thierry, Assia Soukane, Amar Ramdane-Cherif
Abstract:
In this paper, we present our innovative way of determining the driving context for a driving assistance system. We fuse all parameters that describe the context of the environment, the vehicle, and the driver to obtain the driving context. We created a training set that stores driving situation patterns, which the system consults to determine the driving situation. A machine-learning algorithm predicts the driving situation. The driving situation is an input to the fission process, which yields the action that must be taken when the driver needs to be informed or assisted in the given driving situation. The action may be directed towards the driver, the vehicle, or both. This is ongoing work whose goal is to offer an alternative driving assistance system for safe, green, and comfortable driving. Here, ontologies are used for knowledge representation.
Keywords: cognitive driving, intelligent transportation system, multimodal system, ontology, machine learning
Procedia PDF Downloads 367
8134 Liquid Biopsy Based Microbial Biomarker in Coronary Artery Disease Diagnosis
Authors: Eyup Ozkan, Ozkan U. Nalbantoglu, Aycan Gundogdu, Mehmet Hora, A. Emre Onuk
Abstract:
The human microbiome has been associated with cardiological conditions, and this relationship is beginning to be defined beyond the gastrointestinal tract. In this study, we investigate alterations in the circulatory microbiota in the context of Coronary Artery Disease (CAD). We received circulatory blood samples from suspected CAD patients and performed 16S ribosomal RNA sequencing to identify each patient’s microbiome. The Corynebacterium and Methanobacteria genera were found to show statistically significant differences between healthy individuals and CAD patients. Machine learning classification models revealed that overall biodiversity differed between the groups. We also demonstrate the performance of a diagnostic method based on estimation from the circulatory blood microbiome.
Keywords: coronary artery disease, blood microbiome, machine learning, angiography, next-generation sequencing
Procedia PDF Downloads 156
8133 Short-Term Forecast of Wind Turbine Production with Machine Learning Methods: Direct Approach and Indirect Approach
Authors: Mamadou Dione, Eric Matzner-lober, Philippe Alexandre
Abstract:
The Energy Transition Act defined by the French State has precise implications for renewable energies, in particular for their remuneration mechanism. Until now, a purchase obligation contract permitted the sale of wind-generated electricity at a fixed rate. In the future, it will be necessary to sell this electricity on the market (at variable rates) before obtaining additional compensation intended to reduce the risk. Selling on the market requires announcing in advance (about 48 hours before) the production that will be delivered to the network, and therefore being able to predict this production in the short term. The fundamental problem remains the variability of the wind, accentuated by the geographical situation. The objective of the project is to provide, every day, short-term forecasts (48-hour horizon) of wind production using weather data. The predictions of the GFS model and those of the ECMWF model are used as explanatory variables. The variable to be predicted is the production of a wind farm. We use two approaches: a direct approach that predicts wind generation directly from weather data, and an integrated approach that estimates wind from weather data and converts it into wind power using power curves. We used machine learning techniques to predict this production. The models tested are random forests, CART + Bagging, CART + Boosting, and SVM (Support Vector Machine). The application is made on a 22 MW wind farm (11 wind turbines) of the Compagnie du Vent (which became Engie Green France). Our results are very conclusive compared to the literature.
Keywords: forecast aggregation, machine learning, spatio-temporal dynamics modeling, wind power forecast
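In the second approach, a predicted wind speed is turned into power through a power curve. The sketch below illustrates that conversion step only, using a generic idealized curve rather than the manufacturer curves the study would rely on. The rated power of 2000 kW per turbine follows from the 22 MW and 11 turbines stated above; the cut-in, rated, and cut-out speeds and the cubic ramp are assumptions.

```python
def power_curve_kw(wind_speed, cut_in=3.0, rated_speed=12.0,
                   cut_out=25.0, rated_kw=2000.0):
    """Idealized turbine power curve (speeds in m/s, hypothetical values)."""
    if wind_speed < cut_in or wind_speed >= cut_out:
        return 0.0                      # too little wind, or safety shutdown
    if wind_speed >= rated_speed:
        return rated_kw                 # output is capped at rated power
    # Between cut-in and rated speed, ramp with the cube of the wind speed.
    ramp = (wind_speed ** 3 - cut_in ** 3) / (rated_speed ** 3 - cut_in ** 3)
    return rated_kw * ramp


def farm_forecast_kw(forecast_speeds, n_turbines=11):
    """Convert forecast wind speeds into farm-level power, one value per step."""
    return [n_turbines * power_curve_kw(v) for v in forecast_speeds]


# Hypothetical 48-hour forecast sampled every 12 hours (assumption).
speeds = [2.0, 8.0, 15.0, 30.0]
print(farm_forecast_kw(speeds))
```

The direct approach skips this step and lets the learning model map weather variables straight to measured production, so it can absorb site effects that a fixed curve would miss.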
Procedia PDF Downloads 217
8132 Enhancing Fall Detection Accuracy with a Transfer Learning-Aided Transformer Model Using Computer Vision
Authors: Sheldon McCall, Miao Yu, Liyun Gong, Shigang Yue, Stefanos Kollias
Abstract:
Falls are a significant health concern for older adults globally, and prompt identification is critical to providing necessary healthcare support. Our study proposes a new fall detection method using computer vision based on modern deep learning techniques. Our approach involves training a transformer model on a large 2D pose dataset for general action recognition, followed by transfer learning. Specifically, we freeze the first few layers of the trained transformer model and train only the last two layers for fall detection. Our experimental results demonstrate that our proposed method outperforms both classical machine learning and deep learning approaches in fall/non-fall classification. Overall, our study suggests that our proposed methodology could be a valuable tool for identifying falls.
Keywords: healthcare, fall detection, transformer, transfer learning
Procedia PDF Downloads 146
8131 Reinforcement Learning Optimization: Unraveling Trends and Advancements in Metaheuristic Algorithms
Authors: Rahul Paul, Kedar Nath Das
Abstract:
The field of machine learning (ML) is experiencing rapid development, resulting in a multitude of theoretical advancements and extensive practical implementations across various disciplines. The objective of ML is to facilitate the ability of machines to perform cognitive tasks by leveraging knowledge gained from prior experiences and effectively addressing complex problems, even in situations that deviate from previously encountered instances. Reinforcement Learning (RL) has emerged as a prominent subfield within ML and has gained considerable attention in recent times from researchers. This surge in interest can be attributed to the practical applications of RL, the increasing availability of data, and the rapid advancements in computing power. At the same time, optimization algorithms play a pivotal role in the field of ML and have attracted considerable interest from researchers. A multitude of proposals have been put forth to address optimization problems or improve optimization techniques within the domain of ML. The necessity of a thorough examination and implementation of optimization algorithms within the context of ML is of utmost importance in order to provide guidance for the advancement of research in both optimization and ML. This article provides a comprehensive overview of the application of metaheuristic evolutionary optimization algorithms in conjunction with RL to address a diverse range of scientific challenges. Furthermore, this article delves into the various challenges and unresolved issues pertaining to the optimization of RL models.
Keywords: machine learning, reinforcement learning, loss function, evolutionary optimization techniques
Procedia PDF Downloads 74
8130 Feature Weighting Comparison Based on Clustering Centers in the Detection of Diabetic Retinopathy
Authors: Kemal Polat
Abstract:
In this paper, three feature weighting methods are used to improve the classification performance for diabetic retinopathy (DR). To classify diabetic retinopathy, features extracted from the output of several retinal image processing algorithms, such as image-level, lesion-specific, and anatomical components, were used and fed into the classifier algorithms. The dataset used in this study was taken from the University of California, Irvine (UCI) machine learning repository. Three feature weighting methods, namely fuzzy c-means clustering based feature weighting, subtractive clustering based feature weighting, and Gaussian mixture clustering based feature weighting, were used and compared with each other in the classification of DR. After feature weighting, five different classifier algorithms were used: multi-layer perceptron (MLP), k-nearest neighbor (k-NN), decision tree, support vector machine (SVM), and Naïve Bayes. The hybrid method combining subtractive clustering based feature weighting with a decision tree classifier achieved a classification accuracy of 100% in the screening of DR. These results demonstrate that the proposed hybrid scheme is very promising for medical data set classification.
Keywords: machine learning, data weighting, classification, data mining
Procedia PDF Downloads 325
8129 Normal and Peaberry Coffee Beans Classification from Green Coffee Bean Images Using Convolutional Neural Networks and Support Vector Machine
Authors: Hira Lal Gope, Hidekazu Fukai
Abstract:
The aim of this study is to develop a system that can identify and sort peaberries automatically at low cost for coffee producers in developing countries. In this paper, the focus is on the classification of peaberries and normal coffee beans using image processing and machine learning techniques. The peaberry is neither a defective bean nor a normal bean. A peaberry forms as a single, relatively round seed in a coffee cherry instead of the usual flat-sided pair of beans. It has its own value and flavor. To improve the taste of the coffee, peaberries and normal beans must be separated before the green coffee beans are roasted. Otherwise, the flavors of the beans will be mixed and the taste will suffer. During roasting, all beans must be uniform in shape, size, and weight; otherwise, larger beans take more time to roast through. The peaberry has a different size and shape even when it has the same weight as a normal bean, and it roasts more slowly than normal beans. Therefore, neither technique provides a good way to select the peaberries. Defective beans (e.g., sour, broken, black, and faded beans) are easy to check and pick out by hand. Picking out peaberries, on the other hand, is very difficult even for trained specialists, because the shape and color of the peaberry are similar to those of normal beans. In this study, we use image processing and machine learning techniques to discriminate between normal and peaberry beans as part of the sorting system. As the first step, we applied Deep Convolutional Neural Networks (CNN) and Support Vector Machine (SVM) as machine learning techniques to discriminate between peaberries and normal beans. As a result, better performance was obtained with CNN than with SVM for the discrimination of the peaberry. The neural network, trained in this work on a high-performance CPU and GPU, will simply be installed on an inexpensive Raspberry Pi system with limited computing power.
We assume that this system will be used in developing countries. The study evaluates and compares the feasibility of the methods in terms of classification accuracy and processing speed.
Keywords: convolutional neural networks, coffee bean, peaberry, sorting, support vector machine
Procedia PDF Downloads 144
8128 One-Class Support Vector Machine for Sentiment Analysis of Movie Review Documents
Authors: Chothmal, Basant Agarwal
Abstract:
Sentiment analysis classifies a given review document as positive or negative. Sentiment analysis research has increased tremendously in recent times because of its large number of applications in industry and academia. Sentiment analysis models can be used to determine the opinion of a user towards any entity or product. E-commerce companies can use a sentiment analysis model to improve their products on the basis of users’ opinions. In this paper, we propose a new One-class Support Vector Machine (One-class SVM) based sentiment analysis model for movie review documents. In the proposed approach, we initially extract features from one class of documents, and then use the one-class SVM model to test whether a new document lies within the model or is an outlier. Experimental results show the effectiveness of the proposed sentiment analysis model.
Keywords: feature selection methods, machine learning, NB, one-class SVM, sentiment analysis, support vector machine
Procedia PDF Downloads 517
8127 Land Suitability Prediction Modelling for Agricultural Crops Using Machine Learning Approach: A Case Study of Khuzestan Province, Iran
Authors: Saba Gachpaz, Hamid Reza Heidari
Abstract:
The sharp increase in population growth puts more pressure on agricultural areas to satisfy the demand for food. Meeting this demand consumes more resources and, together with other environmental concerns, highlights the need for sustainable agricultural development. Land-use management is a crucial factor in obtaining optimum productivity. Machine learning is a widely used technique in the agricultural sector, from yield prediction to customer behavior. This method focuses on learning and provides patterns and correlations from our data set. In this study, nine physical control factors, namely soil classification, electrical conductivity, normalized difference water index (NDWI), groundwater level, elevation, annual precipitation, pH of water, annual mean temperature, and slope, in the alluvial plain in Khuzestan (an agricultural hotspot in Iran) are used to decide the best agricultural land use for both rainfed and irrigated agriculture for ten different crops. For this purpose, each variable was imported into ArcGIS, and a raster layer was obtained. Next, using training samples, all layers were imported into the Python environment. A random forest model was applied, and the weight of each variable was specified. In the final step, results were visualized using a digital elevation model, and the importance of all factors for each of the crops was obtained. Our results show that although 62% of the study area is allocated to agricultural purposes, only 42.9% of these areas can be classified as suitable for cultivation.
Keywords: land suitability, machine learning, random forest, sustainable agriculture
Procedia PDF Downloads 84
8126 Leveraging Learning Analytics to Inform Learning Design in Higher Education
Authors: Mingming Jiang
Abstract:
This literature review aims to offer an overview of existing research on learning analytics and learning design, the alignment between the two, and how learning analytics has been leveraged to inform learning design in higher education. Current research suggests a need to create more alignment and integration between learning analytics and learning design in order to not only ground learning analytics on learning sciences but also enable data-driven decisions in learning design to improve learning outcomes. In addition, multiple conceptual frameworks have been proposed to enhance the synergy and alignment between learning analytics and learning design. Future research should explore this synergy further in the unique context of higher education, identifying learning analytics metrics in higher education that can offer insight into learning processes, evaluating the effect of learning analytics outcomes on learning design decision-making in higher education, and designing learning environments in higher education that make the capturing and deployment of learning analytics outcomes more efficient.
Keywords: learning analytics, learning design, big data in higher education, online learning environments
Procedia PDF Downloads 170
8125 Market Index Trend Prediction using Deep Learning and Risk Analysis
Authors: Shervin Alaei, Reza Moradi
Abstract:
Trading in financial markets is subject to risks due to their high volatility. Here, using an LSTM neural network and some risk-based feature engineering, we developed a method that can accurately predict trends of the Tehran stock exchange market index from data of the preceding few days. Our test results show that the proposed method, with an average prediction accuracy of more than 94%, is superior to other common machine learning algorithms. To the best of our knowledge, this is the first work incorporating deep learning and risk factors to accurately predict market trends.
Keywords: deep learning, LSTM, trend prediction, risk management, artificial neural networks
Procedia PDF Downloads 156
8124 Software Defect Analysis- Eclipse Dataset
Authors: Amrane Meriem, Oukid Salyha
Abstract:
The presence of defects or bugs in software can lead to costly setbacks, operational inefficiencies, and compromised user experiences. The integration of Machine Learning (ML) techniques has emerged as a way to predict and preemptively address software defects. ML represents a proactive strategy aimed at identifying potential anomalies, errors, or vulnerabilities within code before they manifest as operational issues. By analyzing historical data, such as code changes, feature implementations, and defect occurrences, ML enables development teams to anticipate and mitigate these issues, thus enhancing software quality, reducing maintenance costs, and ensuring smoother user interactions. In this work, we used a recommendation system to improve the performance of ML models in predicting code severity and estimating effort.
Keywords: software engineering, machine learning, bugs detection, effort estimation
Procedia PDF Downloads 86
8123 Study on Dynamic Stiffness Matching and Optimization Design Method of a Machine Tool
Authors: Lu Xi, Li Pan, Wen Mengmeng
Abstract:
The stiffness of each component has a different influence on the stiffness of the machine tool. Taking a five-axis gantry machining center as an example, we carried out a modal analysis of the machine tool and then raised and lowered the stiffness of the pillar, slide plate, beam, ram, and saddle in order to study the stiffness matching among these components. The criterion was whether the stiffness of the modified machine tool changes by more than 50% relative to that of the original machine tool. Structural optimization of the machine tool can be realized by changing the stiffness of the components whose stiffness is mismatched. For example, the stiffness of the beam is mismatched. After optimization, the first six natural frequencies of the beam increased by 7.70%, 0.38%, 6.82%, 7.96%, 18.72%, and 23.13%, with the weight increased by 28 kg. As a result, the natural frequencies of the several orders that have a great influence on the dynamic performance of the whole machine increased by 1.44%, 0.43%, and 0.065%, which verifies the correctness of the optimization method based on stiffness matching proposed in this paper.
Keywords: machine tool, optimization, modal analysis, stiffness matching
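The link between stiffness, added mass, and natural frequency can be made concrete with the single-degree-of-freedom relation f = sqrt(k/m) / (2π). This is a deliberate simplification and not the finite-element modal analysis used in the study: a real beam has many modes, each with its own effective stiffness and mass. The function names and the numeric ratios below are assumptions for illustration.

```python
import math


def natural_frequency_hz(stiffness, mass):
    """Undamped natural frequency of a 1-DOF spring-mass system (N/m, kg)."""
    return math.sqrt(stiffness / mass) / (2.0 * math.pi)


def relative_frequency_change(stiffness_ratio, mass_ratio=1.0):
    """Fractional change in frequency when k and m are scaled by given ratios."""
    return math.sqrt(stiffness_ratio / mass_ratio) - 1.0


# Hypothetical modification (assumption): stiffness up 20%, mass up 3%.
change = relative_frequency_change(stiffness_ratio=1.20, mass_ratio=1.03)
print(f"frequency change: {change:+.2%}")
```

Because frequency scales with the square root of k/m, a stiffening change that also adds weight yields a smaller frequency gain than the stiffness increase alone would suggest, which is why the abstract reports the added mass next to the frequency gains.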
Procedia PDF Downloads 101
8122 Integrating Natural Language Processing (NLP) and Machine Learning in Lung Cancer Diagnosis
Authors: Mehrnaz Mostafavi
Abstract:
The assessment and categorization of incidental lung nodules present a considerable challenge in healthcare, often necessitating resource-intensive multiple computed tomography (CT) scans for growth confirmation. This research addresses this issue by introducing a distinct computational approach leveraging radiomics and deep-learning methods. However, understanding local services is essential before implementing these advancements. With diverse tracking methods in place, there is a need for efficient and accurate identification approaches, especially in the context of managing lung nodules alongside pre-existing cancer scenarios. This study explores the integration of text-based algorithms in medical data curation, indicating their efficacy in conjunction with machine learning and deep-learning models for identifying lung nodules. Combining medical images with text data has demonstrated superior data retrieval compared to using each modality independently. While deep learning and text analysis show potential in detecting previously missed nodules, challenges persist, such as increased false positives. The presented research introduces a Structured-Query-Language (SQL) algorithm designed for identifying pulmonary nodules in a tertiary cancer center, externally validated at another hospital. Leveraging natural language processing (NLP) and machine learning, the algorithm categorizes lung nodule reports based on sentence features, aiming to facilitate research and assess clinical pathways. The hypothesis posits that the algorithm can accurately identify lung nodule CT scans and predict concerning nodule features using machine-learning classifiers. Through a retrospective observational study spanning a decade, CT scan reports were collected, and an algorithm was developed to extract and classify data. Results underscore the complexity of lung nodule cohorts in cancer centers, emphasizing the importance of careful evaluation before assuming a metastatic origin. 
The SQL and NLP algorithms demonstrated high accuracy in identifying lung nodule sentences, indicating potential for local service evaluation and research dataset creation. Machine-learning models exhibited strong accuracy in predicting concerning changes in lung nodule scan reports. While limitations include variability in disease group attribution, the potential for correlation rather than causality in clinical findings, and the need for further external validation, the algorithm's accuracy and potential to support clinical decision-making and healthcare automation represent a significant stride in lung nodule management and research.
Keywords: lung cancer diagnosis, structured-query-language (SQL), natural language processing (NLP), machine learning, CT scans
Procedia PDF Downloads 100
8121 Comparison of Deep Learning and Machine Learning Algorithms to Diagnose and Predict Breast Cancer
Authors: F. Ghazalnaz Sharifonnasabi, Iman Makhdoom
Abstract:
Breast cancer is a serious health concern that affects many people around the world. According to a study published in the Breast journal, the global burden of breast cancer is expected to increase significantly over the next few decades. The number of deaths from breast cancer has been increasing over the years, but the age-standardized mortality rate has decreased in some countries. It is important to be aware of the risk factors for breast cancer and to get regular check-ups to catch it early if it does occur. Machine learning techniques have been used to aid in the early detection and diagnosis of breast cancer. These techniques, which have been shown to be effective in predicting and diagnosing the disease, have become a research hotspot. In this study, we consider two deep learning approaches: Multi-Layer Perceptron (MLP) and Convolutional Neural Network (CNN). We also consider five machine learning algorithms on the Breast Cancer Wisconsin Diagnostic dataset: Decision Tree (C4.5), Naïve Bayesian (NB), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and XGBoost (eXtreme Gradient Boosting). We evaluate and compare the classifiers by selecting appropriate metrics for classifier performance and an appropriate tool to quantify that performance. The main purpose of the study is to predict and diagnose breast cancer by applying these algorithms and to identify the most effective one with respect to the confusion matrix, accuracy, and precision. CNN outperformed all other classifiers and achieved the highest accuracy (0.982456). The work is implemented in the Anaconda environment using the Python programming language.
Keywords: breast cancer, multi-layer perceptron, Naïve Bayesian, SVM, decision tree, convolutional neural network, XGBoost, KNN
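The evaluation criteria named in this abstract (confusion matrix, accuracy, precision) can be illustrated with a minimal Python sketch. The labels below are made-up toy values, not results from the study, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if truth == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Share of positive predictions that are correct (0.0 if none were made)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy labels for illustration only (1 = malignant, 0 = benign is an assumption).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("confusion counts (tp, fp, tn, fn):", (tp, fp, tn, fn))
print("accuracy:", accuracy(tp, fp, tn, fn))
print("precision:", precision(tp, fp))
```

Accuracy alone can hide poor performance on the rarer class, which is one reason to report precision and the full confusion matrix alongside it.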
Procedia PDF Downloads 75
8120 Machine Learning and Metaheuristic Algorithms in Short Femoral Stem Custom Design to Reduce Stress Shielding
Authors: Isabel Moscol, Carlos J. Díaz, Ciro Rodríguez
Abstract:
Hip replacement becomes necessary when a person suffers severe pain or considerable functional limitations and the best option to enhance their quality of life is to replace the damaged joint. One of the main components of a femoral prosthesis is the stem, which distributes the loads from the joint to the proximal femur. To preserve more bone stock and avoid weakening of the diaphysis, a short starting stem was selected, generated from the intramedullary morphology of the patient's femur. This ensures the implantability of the design and provides the geometric delimitation for personalized optimization with machine learning (ML) and metaheuristic algorithms. The present study attempts to design a cementless short stem that keeps the strain deviation before and after implantation close to zero, promoting its fixation and durability. Regression models developed to estimate the percentage change of maximum principal stresses were used as objective functions by the metaheuristic algorithm. The latter evaluated different geometries of the short stem by modifying certain parameters in oblique sections from the osteotomy plane. The optimized geometry reached a global stress shielding (SS) of 18.37% with a coefficient of determination (R²) of 0.667. The predicted results favour integrating implantability into the short stem optimization to effectively reduce SS in the proximal femur.
Keywords: machine learning techniques, metaheuristic algorithms, short-stem design, stress shielding, hip replacement
Procedia PDF Downloads 195
8119 Machine Learning Facing Behavioral Noise Problem in an Imbalanced Data Using One Side Behavioral Noise Reduction: Application to a Fraud Detection
Authors: Salma El Hajjami, Jamal Malki, Alain Bouju, Mohammed Berrada
Abstract:
With the expansion of machine learning and data mining in the context of Big Data analytics, a common problem that affects data is class imbalance. It refers to an imbalanced distribution of instances belonging to each class. This problem is present in many real-world applications such as fraud detection, network intrusion detection, medical diagnostics, etc. In these cases, data instances labeled negatively are significantly more numerous than the instances labeled positively. When this difference is too large, the learning system may face difficulty, since it is initially designed to work in relatively balanced class distribution scenarios. Another important problem, which usually accompanies imbalanced data, is the overlap of instances between the two classes. It is commonly referred to as noise or overlapping data. In this article, we propose an approach called One Side Behavioral Noise Reduction (OSBNR). This approach presents a way to deal with the problem of class imbalance in the presence of a high noise level. OSBNR is based on two steps. Firstly, a cluster analysis is applied to group similar instances from the minority class into several behavior clusters. Secondly, we select and eliminate the instances of the majority class, considered as behavioral noise, which overlap with behavior clusters of the minority class. The results of experiments carried out on a representative public dataset confirm that the proposed approach is efficient for the treatment of class imbalances in the presence of noise.
Keywords: machine learning, imbalanced data, data mining, big data
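The two OSBNR steps described above (cluster the minority class, then drop overlapping majority instances) might be sketched as follows. The abstract does not state which clustering algorithm or overlap criterion the authors use, so this sketch assumes a tiny k-means and treats a majority point as overlapping when it falls within a cluster's radius; all function names and data are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def assign(points, centroids):
    """Group points by their nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        idx = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        clusters[idx].append(p)
    return clusters


def kmeans(points, k, iters=20):
    """Tiny k-means; centroids start at the first k points (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        clusters = assign(points, centroids)
        # An empty cluster keeps its previous centroid.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, assign(points, centroids)


def osbnr_sketch(majority, minority, k=2):
    """Step 1: cluster the minority class. Step 2: drop overlapping majority points."""
    centroids, clusters = kmeans(minority, k)
    # Assumed overlap rule: a cluster's radius is its farthest member's distance.
    radii = [max((dist(p, c) for p in members), default=0.0)
             for c, members in zip(centroids, clusters)]
    return [m for m in majority
            if not any(dist(m, c) <= r for c, r in zip(centroids, radii))]


# Toy data: two minority "behavior" groups and a few majority points.
minority = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
majority = [(0.5, 0.5), (5, 5), (10.5, 10.5), (3, 3), (20, 20)]
print(osbnr_sketch(majority, minority, k=2))
```

Only the majority class is reduced here, which matches the "one side" idea in the method's name; the minority instances are left untouched.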
Procedia PDF Downloads 130
8118 Automatic Method for Classification of Informative and Noninformative Images in Colonoscopy Video
Authors: Nidhal K. Azawi, John M. Gauch
Abstract:
Colorectal cancer is one of the leading causes of cancer death in the US and the world, which is why millions of colonoscopy examinations are performed annually. Unfortunately, noise, specular highlights, and motion artifacts corrupt many images in a typical colonoscopy exam. The goal of our research is to produce automated techniques to detect and correct or remove these noninformative images from colonoscopy videos, so physicians can focus their attention on informative images. In this research, we first automatically extract features from images. Then we use machine learning and deep neural networks to classify colonoscopy images as either informative or noninformative. Our results show that we achieve image classification accuracy between 92% and 98%. We also show how the removal of noninformative images together with image alignment can aid in the creation of image panoramas and other visualizations of colonoscopy images.
Keywords: colonoscopy classification, feature extraction, image alignment, machine learning
Procedia PDF Downloads 253
8117 Disease Level Assessment in Wheat Plots Using a Residual Deep Learning Algorithm
Authors: Felipe A. Guth, Shane Ward, Kevin McDonnell
Abstract:
The assessment of disease levels in crop fields is an important and time-consuming task that generally relies on the expert knowledge of trained individuals. Image classification in agricultural problems has historically been based on classical machine learning strategies that use hand-engineered features on top of a classification algorithm. This approach tends not to produce results with high accuracy and generalization across the classes handled by the system when the elements being classified vary significantly. The advent of deep convolutional neural networks has revolutionized the field of machine learning, especially in computer vision tasks. These networks have a great capacity for learning and have been applied successfully to image classification and object detection tasks in recent years. The objective of this work was to propose a new method based on deep convolutional neural networks for the task of disease level monitoring. Common RGB images of winter wheat were obtained during a growing season. Five categories of disease level were defined, in collaboration with agronomists, for the algorithm's classification. Disease level assessments performed by experts provided ground truth data for the disease score of the same winter wheat plots where RGB images were acquired. The system had an overall accuracy of 84% in discriminating the disease level classes.
Keywords: crop disease assessment, deep learning, precision agriculture, residual neural networks
Procedia PDF Downloads 331
8116 Using Swarm Intelligence to Forecast Outcomes of English Premier League Matches
Authors: Hans Schumann, Colin Domnauer, Louis Rosenberg
Abstract:
In this study, machine learning techniques were deployed on real-time human swarm data to forecast the likelihood of outcomes for English Premier League matches in the 2020/21 season. These techniques included ensemble models in combination with neural networks and were tested against an industry standard of Vegas Oddsmakers. Predictions made from the collective intelligence of human swarm participants managed to achieve a positive return on investment over a full season on matches, empirically proving the usefulness of a new artificial intelligence valuing human instinct and intelligence.
Keywords: artificial intelligence, data science, English Premier League, human swarming, machine learning, sports betting, swarm intelligence
Procedia PDF Downloads 212
8115 Machine Learning Approaches to Water Usage Prediction in Kocaeli: A Comparative Study
Authors: Kasim Görenekli, Ali Gülbağ
Abstract:
This study presents a comprehensive analysis of water consumption patterns in Kocaeli province, Turkey, utilizing various machine learning approaches. We analyzed data from 5,000 water subscribers across residential, commercial, and official categories over an 80-month period from January 2016 to August 2022, resulting in a total of 400,000 records. The dataset encompasses water consumption records, weather information, weekends and holidays, previous months' consumption, and the influence of the COVID-19 pandemic. We implemented and compared several machine learning models, including Linear Regression, Random Forest, Support Vector Regression (SVR), XGBoost, Artificial Neural Networks (ANN), Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU). Particle Swarm Optimization (PSO) was applied to optimize hyperparameters for all models. Our results demonstrate varying performance across subscriber types and models. For official subscribers, Random Forest achieved the highest R² of 0.699 with PSO optimization. For commercial subscribers, Linear Regression performed best with an R² of 0.730 with PSO. Residential water usage proved more challenging to predict, with XGBoost achieving the highest R² of 0.572 with PSO. The study identified key factors influencing water consumption, with previous months' consumption, meter diameter, and weather conditions being among the most significant predictors. The impact of the COVID-19 pandemic on consumption patterns was also observed, particularly in residential usage. This research provides valuable insights for effective water resource management in Kocaeli and similar regions, considering Turkey's high water loss rate and below-average per capita water supply. The comparative analysis of different machine learning approaches offers a comprehensive framework for selecting appropriate models for water consumption prediction in urban settings.
Keywords: machine learning, water consumption prediction, particle swarm optimization, COVID-19, water resource management
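The R² values reported in this abstract are coefficients of determination. As a reminder of what that score measures, here is a minimal Python sketch of the standard definition, R² = 1 − SS_res / SS_tot. The numbers are toy values and not data from the study; the function name is hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return 1 - ss_res / ss_tot


# Toy monthly consumption values (illustrative only).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(actual, predicted), 4))
```

A score of 1 means the predictions match the observations exactly, while a score of 0 means the model does no better than always predicting the mean of the observed values.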
Procedia PDF Downloads 15
8114 Optimizing Machine Vision System Setup Accuracy by Six-Sigma DMAIC Approach
Authors: Joseph C. Chen
Abstract:
Machine vision systems provide automatic inspection that can reduce manufacturing costs considerably. However, only a few principles have been found to optimize machine vision systems and help them function more accurately in industrial practice. Most existing design techniques for improving the accuracy of machine vision systems are complicated and impractical. This paper discusses implementing the Six Sigma Define, Measure, Analyze, Improve, and Control (DMAIC) approach to optimize the setup parameters of a machine vision system when it is used as a direct measurement technique. This research follows a case study showing how the Six Sigma DMAIC methodology has been put into use.
Keywords: DMAIC, machine vision system, process capability, Taguchi Parameter Design
Procedia PDF Downloads 436