Search results for: machine learning methods
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 21299

Search results for: machine learning methods

21119 Machine Learning Techniques in Seismic Risk Assessment of Structures

Authors: Farid Khosravikia, Patricia Clayton

Abstract:

The main objective of this work is to evaluate the advantages and disadvantages of various machine learning techniques in two key steps of seismic hazard and risk assessment of different types of structures. The first step is the development of ground-motion models, which are used for forecasting ground-motion intensity measures (IM) given source characteristics, source-to-site distance, and local site condition for future events. IMs such as peak ground acceleration and velocity (PGA and PGV, respectively) as well as 5% damped elastic pseudospectral accelerations at different periods (PSA), are indicators of the strength of shaking at the ground surface. Typically, linear regression-based models, with pre-defined equations and coefficients, are used in ground motion prediction. However, due to the restrictions of the linear regression methods, such models may not capture more complex nonlinear behaviors that exist in the data. Thus, this study comparatively investigates potential benefits from employing other machine learning techniques as statistical method in ground motion prediction such as Artificial Neural Network, Random Forest, and Support Vector Machine. The results indicate the algorithms satisfy some physically sound characteristics such as magnitude scaling distance dependency without requiring pre-defined equations or coefficients. Moreover, it is shown that, when sufficient data is available, all the alternative algorithms tend to provide more accurate estimates compared to the conventional linear regression-based method, and particularly, Random Forest outperforms the other algorithms. However, the conventional method is a better tool when limited data is available. Second, it is investigated how machine learning techniques could be beneficial for developing probabilistic seismic demand models (PSDMs), which provide the relationship between the structural demand responses (e.g., component deformations, accelerations, internal forces, etc.) and the ground motion IMs. In the risk framework, such models are used to develop fragility curves estimating exceeding probability of damage for pre-defined limit states, and therefore, control the reliability of the predictions in the risk assessment. In this study, machine learning algorithms like artificial neural network, random forest, and support vector machine are adopted and trained on the demand parameters to derive PSDMs for them. It is observed that such models can provide more accurate estimates of prediction in relatively shorter about of time compared to conventional methods. Moreover, they can be used for sensitivity analysis of fragility curves with respect to many modeling parameters without necessarily requiring more intense numerical response-history analysis.

Keywords: artificial neural network, machine learning, random forest, seismic risk analysis, seismic hazard analysis, support vector machine

Procedia PDF Downloads 92
21118 Real-Time Network Anomaly Detection Systems Based on Machine-Learning Algorithms

Authors: Zahra Ramezanpanah, Joachim Carvallo, Aurelien Rodriguez

Abstract:

This paper aims to detect anomalies in streaming data using machine learning algorithms. In this regard, we designed two separate pipelines and evaluated the effectiveness of each separately. The first pipeline, based on supervised machine learning methods, consists of two phases. In the first phase, we trained several supervised models using the UNSW-NB15 data-set. We measured the efficiency of each using different performance metrics and selected the best model for the second phase. At the beginning of the second phase, we first, using Argus Server, sniffed a local area network. Several types of attacks were simulated and then sent the sniffed data to a running algorithm at short intervals. This algorithm can display the results of each packet of received data in real-time using the trained model. The second pipeline presented in this paper is based on unsupervised algorithms, in which a Temporal Graph Network (TGN) is used to monitor a local network. The TGN is trained to predict the probability of future states of the network based on its past behavior. Our contribution in this section is introducing an indicator to identify anomalies from these predicted probabilities.

Keywords: temporal graph network, anomaly detection, cyber security, IDS

Procedia PDF Downloads 90
21117 Missing Link Data Estimation with Recurrent Neural Network: An Application Using Speed Data of Daegu Metropolitan Area

Authors: JaeHwan Yang, Da-Woon Jeong, Seung-Young Kho, Dong-Kyu Kim

Abstract:

In terms of ITS, information on link characteristic is an essential factor for plan or operation. But in practical cases, not every link has installed sensors on it. The link that does not have data on it is called “Missing Link”. The purpose of this study is to impute data of these missing links. To get these data, this study applies the machine learning method. With the machine learning process, especially for the deep learning process, missing link data can be estimated from present link data. For deep learning process, this study uses “Recurrent Neural Network” to take time-series data of road. As input data, Dedicated Short-range Communications (DSRC) data of Dalgubul-daero of Daegu Metropolitan Area had been fed into the learning process. Neural Network structure has 17 links with present data as input, 2 hidden layers, for 1 missing link data. As a result, forecasted data of target link show about 94% of accuracy compared with actual data.

Keywords: data estimation, link data, machine learning, road network

Procedia PDF Downloads 501
21116 Predicting the Diagnosis of Alzheimer’s Disease: Development and Validation of Machine Learning Models

Authors: Jay L. Fu

Abstract:

Patients with Alzheimer's disease progressively lose their memory and thinking skills and, eventually, the ability to carry out simple daily tasks. The disease is irreversible, but early detection and treatment can slow down the disease progression. In this research, publicly available MRI data and demographic data from 373 MRI imaging sessions were utilized to build models to predict dementia. Various machine learning models, including logistic regression, k-nearest neighbor, support vector machine, random forest, and neural network, were developed. Data were divided into training and testing sets, where training sets were used to build the predictive model, and testing sets were used to assess the accuracy of prediction. Key risk factors were identified, and various models were compared to come forward with the best prediction model. Among these models, the random forest model appeared to be the best model with an accuracy of 90.34%. MMSE, nWBV, and gender were the three most important contributing factors to the detection of Alzheimer’s. Among all the models used, the percent in which at least 4 of the 5 models shared the same diagnosis for a testing input was 90.42%. These machine learning models allow early detection of Alzheimer’s with good accuracy, which ultimately leads to early treatment of these patients.

Keywords: Alzheimer's disease, clinical diagnosis, magnetic resonance imaging, machine learning prediction

Procedia PDF Downloads 132
21115 Machine Learning Techniques to Predict Cyberbullying and Improve Social Work Interventions

Authors: Oscar E. Cariceo, Claudia V. Casal

Abstract:

Machine learning offers a set of techniques to promote social work interventions and can lead to support decisions of practitioners in order to predict new behaviors based on data produced by the organizations, services agencies, users, clients or individuals. Machine learning techniques include a set of generalizable algorithms that are data-driven, which means that rules and solutions are derived by examining data, based on the patterns that are present within any data set. In other words, the goal of machine learning is teaching computers through 'examples', by training data to test specifics hypothesis and predict what would be a certain outcome, based on a current scenario and improve that experience. Machine learning can be classified into two general categories depending on the nature of the problem that this technique needs to tackle. First, supervised learning involves a dataset that is already known in terms of their output. Supervising learning problems are categorized, into regression problems, which involve a prediction from quantitative variables, using a continuous function; and classification problems, which seek predict results from discrete qualitative variables. For social work research, machine learning generates predictions as a key element to improving social interventions on complex social issues by providing better inference from data and establishing more precise estimated effects, for example in services that seek to improve their outcomes. This paper exposes the results of a classification algorithm to predict cyberbullying among adolescents. Data were retrieved from the National Polyvictimization Survey conducted by the government of Chile in 2017. A logistic regression model was created to predict if an adolescent would experience cyberbullying based on the interaction and behavior of gender, age, grade, type of school, and self-esteem sentiments. The model can predict with an accuracy of 59.8% if an adolescent will suffer cyberbullying. These results can help to promote programs to avoid cyberbullying at schools and improve evidence based practice.

Keywords: cyberbullying, evidence based practice, machine learning, social work research

Procedia PDF Downloads 159
21114 Developing a Hybrid Method to Diagnose and Predict Sports Related Concussions with Machine Learning

Authors: Melody Yin

Abstract:

Concussions impact a large amount of adolescents; they make up as much as half of the diagnosed concussions in America. This research proposes a hybrid machine learning model based on the combination of human/knowledge-based domains and computer-generated feature rankings to improve the accuracy of diagnosing sports related concussion (SRC). Using a data set of symptoms collected on the sideline post-SRC events, the symptom selection criteria method has been developed by using Google AutoML's important score function to identify the top 10 symptom features. In addition, symptom domains have been introduced as another parameter, categorizing the symptoms into physical, cognitive, sleep, and emotional domains. The hybrid machine learning model has been trained with a combination of the top 10 symptoms and 4 domains. From the results, the hybrid model was the best performer for symptom resolution time prediction in 2 and 4-week thresholds. This research is a proof of concept study in the use of domains along with machine learning in order to improve concussion prediction accuracy. It is also possible that the use of domains can make the model more efficient due to reduced training time. This research examines the use of a hybrid method in predicting sports-related concussion. This achievement is based on data preprocessing, using a hybrid method to select criteria to achieve high performance.

Keywords: hybrid model, machine learning, sports related concussion, symptom resolution time

Procedia PDF Downloads 156
21113 Developing an Out-of-Distribution Generalization Model Selection Framework through Impurity and Randomness Measurements and a Bias Index

Authors: Todd Zhou, Mikhail Yurochkin

Abstract:

Out-of-distribution (OOD) detection is receiving increasing amounts of attention in the machine learning research community, boosted by recent technologies, such as autonomous driving and image processing. This newly-burgeoning field has called for the need for more effective and efficient methods for out-of-distribution generalization methods. Without accessing the label information, deploying machine learning models to out-of-distribution domains becomes extremely challenging since it is impossible to evaluate model performance on unseen domains. To tackle this out-of-distribution detection difficulty, we designed a model selection pipeline algorithm and developed a model selection framework with different impurity and randomness measurements to evaluate and choose the best-performing models for out-of-distribution data. By exploring different randomness scores based on predicted probabilities, we adopted the out-of-distribution entropy and developed a custom-designed score, ”CombinedScore,” as the evaluation criterion. This proposed score was created by adding labeled source information into the judging space of the uncertainty entropy score using harmonic mean. Furthermore, the prediction bias was explored through the equality of opportunity violation measurement. We also improved machine learning model performance through model calibration. The effectiveness of the framework with the proposed evaluation criteria was validated on the Folktables American Community Survey (ACS) datasets.

Keywords: model selection, domain generalization, model fairness, randomness measurements, bias index

Procedia PDF Downloads 114
21112 Machine Learning Data Architecture

Authors: Neerav Kumar, Naumaan Nayyar, Sharath Kashyap

Abstract:

Most companies see an increase in the adoption of machine learning (ML) applications across internal and external-facing use cases. ML applications vend output either in batch or real-time patterns. A complete batch ML pipeline architecture comprises data sourcing, feature engineering, model training, model deployment, model output vending into a data store for downstream application. Due to unclear role expectations, we have observed that scientists specializing in building and optimizing models are investing significant efforts into building the other components of the architecture, which we do not believe is the best use of scientists’ bandwidth. We propose a system architecture created using AWS services that bring industry best practices to managing the workflow and simplifies the process of model deployment and end-to-end data integration for an ML application. This narrows down the scope of scientists’ work to model building and refinement while specialized data engineers take over the deployment, pipeline orchestration, data quality, data permission system, etc. The pipeline infrastructure is built and deployed as code (using terraform, cdk, cloudformation, etc.) which makes it easy to replicate and/or extend the architecture to other models that are used in an organization.

Keywords: data pipeline, machine learning, AWS, architecture, batch machine learning

Procedia PDF Downloads 51
21111 Physics-Informed Machine Learning for Displacement Estimation in Solid Mechanics Problem

Authors: Feng Yang

Abstract:

Machine learning (ML), especially deep learning (DL), has been extensively applied to many applications in recently years and gained great success in solving different problems, including scientific problems. However, conventional ML/DL methodologies are purely data-driven which have the limitations, such as need of ample amount of labelled training data, lack of consistency to physical principles, and lack of generalizability to new problems/domains. Recently, there is a growing consensus that ML models need to further take advantage of prior knowledge to deal with these limitations. Physics-informed machine learning, aiming at integration of physics/domain knowledge into ML, has been recognized as an emerging area of research, especially in the recent 2 to 3 years. In this work, physics-informed ML, specifically physics-informed neural network (NN), is employed and implemented to estimate the displacements at x, y, z directions in a solid mechanics problem that is controlled by equilibrium equations with boundary conditions. By incorporating the physics (i.e. the equilibrium equations) into the learning process of NN, it is showed that the NN can be trained very efficiently with a small set of labelled training data. Experiments with different settings of the NN model and the amount of labelled training data were conducted, and the results show that very high accuracy can be achieved in fulfilling the equilibrium equations as well as in predicting the displacements, e.g. in setting the overall displacement of 0.1, a root mean square error (RMSE) of 2.09 × 10−4 was achieved.

Keywords: deep learning, neural network, physics-informed machine learning, solid mechanics

Procedia PDF Downloads 138
21110 Machine Learning for Classifying Risks of Death and Length of Stay of Patients in Intensive Unit Care Beds

Authors: Itamir de Morais Barroca Filho, Cephas A. S. Barreto, Ramon Malaquias, Cezar Miranda Paula de Souza, Arthur Costa Gorgônio, João C. Xavier-Júnior, Mateus Firmino, Fellipe Matheus Costa Barbosa

Abstract:

Information and Communication Technologies (ICT) in healthcare are crucial for efficiently delivering medical healthcare services to patients. These ICTs are also known as e-health and comprise technologies such as electronic record systems, telemedicine systems, and personalized devices for diagnosis. The focus of e-health is to improve the quality of health information, strengthen national health systems, and ensure accessible, high-quality health care for all. All the data gathered by these technologies make it possible to help clinical staff with automated decisions using machine learning. In this context, we collected patient data, such as heart rate, oxygen saturation (SpO2), blood pressure, respiration, and others. With this data, we were able to develop machine learning models for patients’ risk of death and estimate the length of stay in ICU beds. Thus, this paper presents the methodology for applying machine learning techniques to develop these models. As a result, although we implemented these models on an IoT healthcare platform, helping clinical staff in healthcare in an ICU, it is essential to create a robust clinical validation process and monitoring of the proposed models.

Keywords: ICT, e-health, machine learning, ICU, healthcare

Procedia PDF Downloads 86
21109 How Unicode Glyphs Revolutionized the Way We Communicate

Authors: Levi Corallo

Abstract:

Typed language made by humans on computers and cell phones has made a significant distinction from previous modes of written language exchanges. While acronyms remain one of the most predominant markings of typed language, another and perhaps more recent revolution in the way humans communicate has been with the use of symbols or glyphs, primarily Emojis—globally introduced on the iPhone keyboard by Apple in 2008. This paper seeks to analyze the use of symbols in typed communication from both a linguistic and machine learning perspective. The Unicode system will be explored and methods of encoding will be juxtaposed with the current machine and human perception. Topics in how typed symbol usage exists in conversation will be explored as well as topics across current research methods dealing with Emojis like sentiment analysis, predictive text models, and so on. This study proposes that sequential analysis is a significant feature for analyzing unicode characters in a corpus with machine learning. Current models that are trying to learn or translate the meaning of Emojis should be starting to learn using bi- and tri-grams of Emoji, as well as observing the relationship between combinations of different Emoji in tandem. The sociolinguistics of an entire new vernacular of language referred to here as ‘typed language’ will also be delineated across my analysis with unicode glyphs from both a semantic and technical perspective.

Keywords: unicode, text symbols, emojis, glyphs, communication

Procedia PDF Downloads 185
21108 A Radiomics Approach to Predict the Evolution of Prostate Imaging Reporting and Data System Score 3/5 Prostate Areas in Multiparametric Magnetic Resonance

Authors: Natascha C. D'Amico, Enzo Grossi, Giovanni Valbusa, Ala Malasevschi, Gianpiero Cardone, Sergio Papa

Abstract:

Purpose: To characterize, through a radiomic approach, the nature of areas classified PI-RADS (Prostate Imaging Reporting and Data System) 3/5, recognized in multiparametric prostate magnetic resonance with T2-weighted (T2w), diffusion and perfusion sequences with paramagnetic contrast. Methods and Materials: 24 cases undergoing multiparametric prostate MR and biopsy were admitted to this pilot study. Clinical outcome of the PI-RADS 3/5 was found through biopsy, finding 8 malignant tumours. The analysed images were acquired with a Philips achieva 1.5T machine with a CE- T2-weighted sequence in the axial plane. Semi-automatic tumour segmentation was carried out on MR images using 3DSlicer image analysis software. 45 shape-based, intensity-based and texture-based features were extracted and represented the input for preprocessing. An evolutionary algorithm (a TWIST system based on KNN algorithm) was used to subdivide the dataset into training and testing set and select features yielding the maximal amount of information. After this pre-processing 20 input variables were selected and different machine learning systems were used to develop a predictive model based on a training testing crossover procedure. Results: The best machine learning system (three-layers feed-forward neural network) obtained a global accuracy of 90% ( 80 % sensitivity and 100% specificity ) with a ROC of 0.82. Conclusion: Machine learning systems coupled with radiomics show a promising potential in distinguishing benign from malign tumours in PI-RADS 3/5 areas.

Keywords: machine learning, MR prostate, PI-Rads 3, radiomics

Procedia PDF Downloads 176
21107 Glucose Monitoring System Using Machine Learning Algorithms

Authors: Sangeeta Palekar, Neeraj Rangwani, Akash Poddar, Jayu Kalambe

Abstract:

The bio-medical analysis is an indispensable procedure for identifying health-related diseases like diabetes. Monitoring the glucose level in our body regularly helps us identify hyperglycemia and hypoglycemia, which can cause severe medical problems like nerve damage or kidney diseases. This paper presents a method for predicting the glucose concentration in blood samples using image processing and machine learning algorithms. The glucose solution is prepared by the glucose oxidase (GOD) and peroxidase (POD) method. An experimental database is generated based on the colorimetric technique. The image of the glucose solution is captured by the raspberry pi camera and analyzed using image processing by extracting the RGB, HSV, LUX color space values. Regression algorithms like multiple linear regression, decision tree, RandomForest, and XGBoost were used to predict the unknown glucose concentration. The multiple linear regression algorithm predicts the results with 97% accuracy. The image processing and machine learning-based approach reduce the hardware complexities of existing platforms.

Keywords: artificial intelligence glucose detection, glucose oxidase, peroxidase, image processing, machine learning

Procedia PDF Downloads 187
21106 A Text Classification Approach Based on Natural Language Processing and Machine Learning Techniques

Authors: Rim Messaoudi, Nogaye-Gueye Gning, François Azelart

Abstract:

Automatic text classification applies mostly natural language processing (NLP) and other AI-guided techniques to automatically classify text in a faster and more accurate manner. This paper discusses the subject of using predictive maintenance to manage incident tickets inside the sociality. It focuses on proposing a tool that treats and analyses comments and notes written by administrators after resolving an incident ticket. The goal here is to increase the quality of these comments. Additionally, this tool is based on NLP and machine learning techniques to realize the textual analytics of the extracted data. This approach was tested using real data taken from the French National Railways (SNCF) company and was given a high-quality result.

Keywords: machine learning, text classification, NLP techniques, semantic representation

Procedia PDF Downloads 84
21105 A Less Complexity Deep Learning Method for Drones Detection

Authors: Mohamad Kassab, Amal El Fallah Seghrouchni, Frederic Barbaresco, Raed Abu Zitar

Abstract:

Detecting objects such as drones is a challenging task as their relative size and maneuvering capabilities deceive machine learning models and cause them to misclassify drones as birds or other objects. In this work, we investigate applying several deep learning techniques to benchmark real data sets of flying drones. A deep learning paradigm is proposed for the purpose of mitigating the complexity of those systems. The proposed paradigm consists of a hybrid between the AdderNet deep learning paradigm and the Single Shot Detector (SSD) paradigm. The goal was to minimize multiplication operations numbers in the filtering layers within the proposed system and, hence, reduce complexity. Some standard machine learning technique, such as SVM, is also tested and compared to other deep learning systems. The data sets used for training and testing were either complete or filtered in order to remove the images with mall objects. The types of data were RGB or IR data. Comparisons were made between all these types, and conclusions were presented.

Keywords: drones detection, deep learning, birds versus drones, precision of detection, AdderNet

Procedia PDF Downloads 164
21104 Machine Learning in Momentum Strategies

Authors: Yi-Min Lan, Hung-Wen Cheng, Hsuan-Ling Chang, Jou-Ping Yu

Abstract:

The study applies machine learning models to construct momentum strategies and utilizes the information coefficient as an indicator for selecting stocks with strong and weak momentum characteristics. Through this approach, the study has built investment portfolios capable of generating superior returns and conducted a thorough analysis. Compared to existing research on momentum strategies, machine learning is incorporated to capture non-linear interactions. This approach enhances the conventional stock selection process, which is often impeded by difficulties associated with timeliness, accuracy, and efficiency due to market risk factors. The study finds that implementing bidirectional momentum strategies outperforms unidirectional ones, and momentum factors with longer observation periods exhibit stronger correlations with returns. Optimizing the number of stocks in the portfolio while staying within a certain threshold leads to the highest level of excess returns. The study presents a novel framework for momentum strategies that enhances and improves the operational aspects of asset management. By introducing innovative financial technology applications to traditional investment strategies, this paper can demonstrate significant effectiveness.

Keywords: information coefficient, machine learning, momentum, portfolio, return prediction

Procedia PDF Downloads 44
21103 Automated Machine Learning Algorithm Using Recurrent Neural Network to Perform Long-Term Time Series Forecasting

Authors: Ying Su, Morgan C. Wang

Abstract:

Long-term time series forecasting is an important research area for automated machine learning (AutoML). Currently, forecasting based on either machine learning or statistical learning is usually built by experts, and it requires significant manual effort, from model construction, feature engineering, and hyper-parameter tuning to the construction of the time series model. Automation is not possible since there are too many human interventions. To overcome these limitations, this article proposed to use recurrent neural networks (RNN) through the memory state of RNN to perform long-term time series prediction. We have shown that this proposed approach is better than the traditional Autoregressive Integrated Moving Average (ARIMA). In addition, we also found it is better than other network systems, including Fully Connected Neural Networks (FNN), Convolutional Neural Networks (CNN), and Nonpooling Convolutional Neural Networks (NPCNN).

Keywords: automated machines learning, autoregressive integrated moving average, neural networks, time series analysis

Procedia PDF Downloads 86
21102 Machine Learning Approach to Project Control Threshold Reliability Evaluation

Authors: Y. Kim, H. Lee, M. Park, B. Lee

Abstract:

Planning is understood as the determination of what has to be performed, how, in which sequence, when, what resources are needed, and their cost within the organization before execution. In most construction project, it is evident that the inherent nature of planning is dynamic, and initial planning is subject to be changed due to various uncertain conditions of construction project. Planners take a continuous revision process during the course of a project and until the very end of project. However, current practice lacks reliable, systematic tool for setting variance thresholds to determine when and what corrective actions to be taken. Rather it is heavily dependent on the level of experience and knowledge of the planner. Thus, this paper introduces a machine learning approach to evaluate project control threshold reliability incorporating project-specific data and presents a method to automate the process. The results have shown that the model improves the efficiency and accuracy of the monitoring process as an early warning.

Keywords: machine learning, project control, project progress monitoring, schedule

Procedia PDF Downloads 234
21101 Evaluating Models Through Feature Selection Methods Using Data Driven Approach

Authors: Shital Patil, Surendra Bhosale

Abstract:

Cardiac diseases are the leading causes of mortality and morbidity in the world, from recent few decades accounting for a large number of deaths have emerged as the most life-threatening disorder globally. Machine learning and Artificial intelligence have been playing key role in predicting the heart diseases. A relevant set of feature can be very helpful in predicting the disease accurately. In this study, we proposed a comparative analysis of 4 different features selection methods and evaluated their performance with both raw (Unbalanced dataset) and sampled (Balanced) dataset. The publicly available Z-Alizadeh Sani dataset have been used for this study. Four feature selection methods: Data Analysis, minimum Redundancy maximum Relevance (mRMR), Recursive Feature Elimination (RFE), Chi-squared are used in this study. These methods are tested with 8 different classification models to get the best accuracy possible. Using balanced and unbalanced dataset, the study shows promising results in terms of various performance metrics in accurately predicting heart disease. Experimental results obtained by the proposed method with the raw data obtains maximum AUC of 100%, maximum F1 score of 94%, maximum Recall of 98%, maximum Precision of 93%. While with the balanced dataset obtained results are, maximum AUC of 100%, F1-score 95%, maximum Recall of 95%, maximum Precision of 97%.

Keywords: cardio vascular diseases, machine learning, feature selection, SMOTE

Procedia PDF Downloads 105
21100 Using AI for Analysing Political Leaders

Authors: Shuai Zhao, Shalendra D. Sharma, Jin Xu

Abstract:

This research uses advanced machine learning models to learn a number of hypotheses regarding political executives. Specifically, it analyses the impact these powerful leaders have on economic growth by using leaders’ data from the Archigos database from 1835 to the end of 2015. The data is processed by the AutoGluon, which was developed by Amazon. Automated Machine Learning (AutoML) and AutoGluon can automatically extract features from the data and then use multiple classifiers to train the data. Use a linear regression model and classification model to establish the relationship between leaders and economic growth (GDP per capita growth), and to clarify the relationship between their characteristics and economic growth from a machine learning perspective. Our work may show as a model or signal for collaboration between the fields of statistics and artificial intelligence (AI) that can light up the way for political researchers and economists.

Keywords: comparative politics, political executives, leaders’ characteristics, artificial intelligence

Procedia PDF Downloads 76
21099 Using Machine Learning to Enhance Win Ratio for College Ice Hockey Teams

Authors: Sadixa Sanjel, Ahmed Sadek, Naseef Mansoor, Zelalem Denekew

Abstract:

Collegiate ice hockey (NCAA) sports analytics is different from the national level hockey (NHL). We apply and compare multiple machine learning models such as Linear Regression, Random Forest, and Neural Networks to predict the win ratio for a team based on their statistics. Data exploration helps determine which statistics are most useful in increasing the win ratio, which would be beneficial to coaches and team managers. We ran experiments to select the best model and chose Random Forest as the best performing. We conclude with how to bridge the gap between the college and national levels of sports analytics and the use of machine learning to enhance team performance despite not having a lot of metrics or budget for automatic tracking.

Keywords: NCAA, NHL, sports analytics, random forest, regression, neural networks, game predictions

Procedia PDF Downloads 103
21098 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach

Authors: Rajvir Kaur, Jeewani Anupama Ginige

Abstract:

With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is at the stage of its transition from clinician oriented to technology oriented. Many people around the world die of cancer because the diagnosis of disease was not done at an early stage. Nowadays, the computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out the comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out based on standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results based on the evaluation metrics show that ANN showed the highest-level accuracy (99.4%) when tested with breast cancer dataset. On the other hand, when these ML classifiers are tested with the cervical cancer dataset, Ensemble (Bagged Tree) technique gave better accuracy (93.1%) in comparison to other classifiers.

Keywords: artificial neural networks, breast cancer, classifiers, cervical cancer, f-score, machine learning, precision, recall

Procedia PDF Downloads 266
21097 The Impact of Experiential Learning on the Success of Upper Division Mechanical Engineering Students

Authors: Seyedali Seyedkavoosi, Mohammad Obadat, Seantorrion Boyle

Abstract:

The purpose of this study is to assess the effectiveness of a nontraditional experiential learning strategy in improving the success and interest of mechanical engineering students, using the Kinematics/Dynamics of Machine course as a case study. This upper-division technical course covers a wide range of topics, including mechanism and machine system analysis and synthesis, yet the complexities of ideas like acceleration, motion, and machine component relationships are hard to explain using standard teaching techniques. To solve this problem, a thorough design project was created that gave students hands-on experience developing, manufacturing, and testing their inventions. The main goals of the project were to improve students' grasp of machine design and kinematics, to develop problem-solving and presenting abilities, and to familiarize them with professional software. A questionnaire survey was done to evaluate the effect of this technique on students' performance and interest in mechanical engineering. The outcomes of the study shed light on the usefulness of nontraditional experiential learning approaches in engineering education.

Keywords: experiential learning, nontraditional teaching, hands-on design project, engineering education

Procedia PDF Downloads 78
21096 A Support Vector Machine Learning Prediction Model of Evapotranspiration Using Real-Time Sensor Node Data

Authors: Waqas Ahmed Khan Afridi, Subhas Chandra Mukhopadhyay, Bandita Mainali

Abstract:

The research paper presents a unique approach to evapotranspiration (ET) prediction using a Support Vector Machine (SVM) learning algorithm. The study leverages real-time sensor node data to develop an accurate and adaptable prediction model, addressing the inherent challenges of traditional ET estimation methods. The integration of the SVM algorithm with real-time sensor node data offers great potential to improve spatial and temporal resolution in ET predictions. In the model development, key input features are measured and computed using mathematical equations such as Penman-Monteith (FAO56) and soil water balance (SWB), which include soil-environmental parameters such as; solar radiation (Rs), air temperature (T), atmospheric pressure (P), relative humidity (RH), wind speed (u2), rain (R), deep percolation (DP), soil temperature (ST), and change in soil moisture (∆SM). The one-year field data are split into combinations of three proportions i.e. train, test, and validation sets. While kernel functions with tuning hyperparameters have been used to train and improve the accuracy of the prediction model with multiple iterations. This paper also outlines the existing methods and the machine learning techniques to determine Evapotranspiration, data collection and preprocessing, model construction, and evaluation metrics, highlighting the significance of SVM in advancing the field of ET prediction. The results demonstrate the robustness and high predictability of the developed model on the basis of performance evaluation metrics (R2, RMSE, MAE). The effectiveness of the proposed model in capturing complex relationships within soil and environmental parameters provide insights into its potential applications for water resource management and hydrological ecosystem.

Keywords: evapotranspiration, FAO56, KNIME, machine learning, RStudio, SVM, sensors

Procedia PDF Downloads 52
21095 Machine Learning Prediction of Compressive Damage and Energy Absorption in Carbon Fiber-Reinforced Polymer Tubular Structures

Authors: Milad Abbasi

Abstract:

Carbon fiber-reinforced polymer (CFRP) composite structures are increasingly being utilized in the automotive industry due to their lightweight and specific energy absorption capabilities. Although it is impossible to predict composite mechanical properties directly using theoretical methods, various research has been conducted so far in the literature for accurate simulation of CFRP structures' energy-absorbing behavior. In this research, axial compression experiments were carried out on hand lay-up unidirectional CFRP composite tubes. The fabrication method allowed the authors to extract the material properties of the CFRPs using ASTM D3039, D3410, and D3518 standards. A neural network machine learning algorithm was then utilized to build a robust prediction model to forecast the axial compressive properties of CFRP tubes while reducing high-cost experimental efforts. The predicted results have been compared with the experimental outcomes in terms of load-carrying capacity and energy absorption capability. The results showed high accuracy and precision in the prediction of the energy-absorption capacity of the CFRP tubes. This research also demonstrates the effectiveness and challenges of machine learning techniques in the robust simulation of composites' energy-absorption behavior. Interestingly, the proposed method considerably condensed numerical and experimental efforts in the simulation and calibration of CFRP composite tubes subjected to compressive loading.

Keywords: CFRP composite tubes, energy absorption, crushing behavior, machine learning, neural network

Procedia PDF Downloads 138
21094 Spontaneous and Posed Smile Detection: Deep Learning, Traditional Machine Learning, and Human Performance

Authors: Liang Wang, Beste F. Yuksel, David Guy Brizan

Abstract:

A computational model of affect that can distinguish between spontaneous and posed smiles with no errors on a large, popular data set using deep learning techniques is presented in this paper. A Long Short-Term Memory (LSTM) classifier, a type of Recurrent Neural Network, is utilized and compared to human classification. Results showed that while human classification (mean of 0.7133) was above chance, the LSTM model was more accurate than human classification and other comparable state-of-the-art systems. Additionally, a high accuracy rate was maintained with small amounts of training videos (70 instances). The derivation of important features to further understand the success of our computational model were analyzed, and it was inferred that thousands of pairs of points within the eyes and mouth are important throughout all time segments in a smile. This suggests that distinguishing between a posed and spontaneous smile is a complex task, one which may account for the difficulty and lower accuracy of human classification compared to machine learning models.

Keywords: affective computing, affect detection, computer vision, deep learning, human-computer interaction, machine learning, posed smile detection, spontaneous smile detection

Procedia PDF Downloads 113
21093 Identification of Hepatocellular Carcinoma Using Supervised Learning Algorithms

Authors: Sagri Sharma

Abstract:

Analysis of diseases integrating multi-factors increases the complexity of the problem and therefore, development of frameworks for the analysis of diseases is an issue that is currently a topic of intense research. Due to the inter-dependence of the various parameters, the use of traditional methodologies has not been very effective. Consequently, newer methodologies are being sought to deal with the problem. Supervised Learning Algorithms are commonly used for performing the prediction on previously unseen data. These algorithms are commonly used for applications in fields ranging from image analysis to protein structure and function prediction and they get trained using a known dataset to come up with a predictor model that generates reasonable predictions for the response to new data. Gene expression profiles generated by DNA analysis experiments can be quite complex since these experiments can involve hypotheses involving entire genomes. The application of well-known machine learning algorithm - Support Vector Machine - to analyze the expression levels of thousands of genes simultaneously in a timely, automated and cost effective way is thus used. The objectives to undertake the presented work are development of a methodology to identify genes relevant to Hepatocellular Carcinoma (HCC) from gene expression dataset utilizing supervised learning algorithms and statistical evaluations along with development of a predictive framework that can perform classification tasks on new, unseen data.

Keywords: artificial intelligence, biomarker, gene expression datasets, hepatocellular carcinoma, machine learning, supervised learning algorithms, support vector machine

Procedia PDF Downloads 418
21092 Comparison of Different Machine Learning Algorithms for Solubility Prediction

Authors: Muhammet Baldan, Emel Timuçin

Abstract:

Molecular solubility prediction plays a crucial role in various fields, such as drug discovery, environmental science, and material science. In this study, we compare the performance of five machine learning algorithms—linear regression, support vector machines (SVM), random forests, gradient boosting machines (GBM), and neural networks—for predicting molecular solubility using the AqSolDB dataset. The dataset consists of 9981 data points with their corresponding solubility values. MACCS keys (166 bits), RDKit properties (20 properties), and structural properties(3) features are extracted for every smile representation in the dataset. A total of 189 features were used for training and testing for every molecule. Each algorithm is trained on a subset of the dataset and evaluated using metrics accuracy scores. Additionally, computational time for training and testing is recorded to assess the efficiency of each algorithm. Our results demonstrate that random forest model outperformed other algorithms in terms of predictive accuracy, achieving an 0.93 accuracy score. Gradient boosting machines and neural networks also exhibit strong performance, closely followed by support vector machines. Linear regression, while simpler in nature, demonstrates competitive performance but with slightly higher errors compared to ensemble methods. Overall, this study provides valuable insights into the performance of machine learning algorithms for molecular solubility prediction, highlighting the importance of algorithm selection in achieving accurate and efficient predictions in practical applications.

Keywords: random forest, machine learning, comparison, feature extraction

Procedia PDF Downloads 23
21091 Machine Learning Approach for Predicting Students’ Academic Performance and Study Strategies Based on Their Motivation

Authors: Fidelia A. Orji, Julita Vassileva

Abstract:

This research aims to develop machine learning models for students' academic performance and study strategy prediction, which could be generalized to all courses in higher education. Key learning attributes (intrinsic, extrinsic, autonomy, relatedness, competence, and self-esteem) used in building the models are chosen based on prior studies, which revealed that the attributes are essential in students’ learning process. Previous studies revealed the individual effects of each of these attributes on students’ learning progress. However, few studies have investigated the combined effect of the attributes in predicting student study strategy and academic performance to reduce the dropout rate. To bridge this gap, we used Scikit-learn in python to build five machine learning models (Decision Tree, K-Nearest Neighbour, Random Forest, Linear/Logistic Regression, and Support Vector Machine) for both regression and classification tasks to perform our analysis. The models were trained, evaluated, and tested for accuracy using 924 university dentistry students' data collected by Chilean authors through quantitative research design. A comparative analysis of the models revealed that the tree-based models such as the random forest (with prediction accuracy of 94.9%) and decision tree show the best results compared to the linear, support vector, and k-nearest neighbours. The models built in this research can be used in predicting student performance and study strategy so that appropriate interventions could be implemented to improve student learning progress. Thus, incorporating strategies that could improve diverse student learning attributes in the design of online educational systems may increase the likelihood of students continuing with their learning tasks as required. Moreover, the results show that the attributes could be modelled together and used to adapt/personalize the learning process.

Keywords: classification models, learning strategy, predictive modeling, regression models, student academic performance, student motivation, supervised machine learning

Procedia PDF Downloads 111
21090 A Machine Learning Approach for Intelligent Transportation System Management on Urban Roads

Authors: Ashish Dhamaniya, Vineet Jain, Rajesh Chouhan

Abstract:

Traffic management is one of the gigantic issue in most of the urban roads in al-most all metropolitan cities in India. Speed is one of the critical traffic parameters for effective Intelligent Transportation System (ITS) implementation as it decides the arrival rate of vehicles on an intersection which are majorly the point of con-gestions. The study aimed to leverage Machine Learning (ML) models to produce precise predictions of speed on urban roadway links. The research objective was to assess how categorized traffic volume and road width, serving as variables, in-fluence speed prediction. Four tree-based regression models namely: Decision Tree (DT), Random Forest (RF), Extra Tree (ET), and Extreme Gradient Boost (XGB)are employed for this purpose. The models' performances were validated using test data, and the results demonstrate that Random Forest surpasses other machine learning techniques and a conventional utility theory-based model in speed prediction. The study is useful for managing the urban roadway network performance under mixed traffic conditions and effective implementation of ITS.

Keywords: stream speed, urban roads, machine learning, traffic flow

Procedia PDF Downloads 52