Search results for: machine learning algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9456

Search results for: machine learning algorithms

9156 Improved Particle Swarm Optimization with Cellular Automata and Fuzzy Cellular Automata

Authors: Ramin Javadzadeh

Abstract:

The particle swarm optimization are Meta heuristic optimization method, which are used for clustering and pattern recognition applications are abundantly. These algorithms in multimodal optimization problems are more efficient than genetic algorithms. A major drawback in these algorithms is their slow convergence to global optimum and their weak stability can be considered in various running of these algorithms. In this paper, improved Particle swarm optimization is introduced for the first time to overcome its problems. The fuzzy cellular automata is used for improving the algorithm efficiently. The credibility of the proposed approach is evaluated by simulations, and it is shown that the proposed approach achieves better results can be achieved compared to the Particle swarm optimization algorithms.

Keywords: cellular automata, cellular learning automata, local search, optimization, particle swarm optimization

Procedia PDF Downloads 576
9155 Machine Learning Data Architecture

Authors: Neerav Kumar, Naumaan Nayyar, Sharath Kashyap

Abstract:

Most companies see an increase in the adoption of machine learning (ML) applications across internal and external-facing use cases. ML applications vend output either in batch or real-time patterns. A complete batch ML pipeline architecture comprises data sourcing, feature engineering, model training, model deployment, model output vending into a data store for downstream application. Due to unclear role expectations, we have observed that scientists specializing in building and optimizing models are investing significant efforts into building the other components of the architecture, which we do not believe is the best use of scientists’ bandwidth. We propose a system architecture created using AWS services that bring industry best practices to managing the workflow and simplifies the process of model deployment and end-to-end data integration for an ML application. This narrows down the scope of scientists’ work to model building and refinement while specialized data engineers take over the deployment, pipeline orchestration, data quality, data permission system, etc. The pipeline infrastructure is built and deployed as code (using terraform, cdk, cloudformation, etc.) which makes it easy to replicate and/or extend the architecture to other models that are used in an organization.

Keywords: data pipeline, machine learning, AWS, architecture, batch machine learning

Procedia PDF Downloads 35
9154 Physics-Informed Machine Learning for Displacement Estimation in Solid Mechanics Problem

Authors: Feng Yang

Abstract:

Machine learning (ML), especially deep learning (DL), has been extensively applied to many applications in recently years and gained great success in solving different problems, including scientific problems. However, conventional ML/DL methodologies are purely data-driven which have the limitations, such as need of ample amount of labelled training data, lack of consistency to physical principles, and lack of generalizability to new problems/domains. Recently, there is a growing consensus that ML models need to further take advantage of prior knowledge to deal with these limitations. Physics-informed machine learning, aiming at integration of physics/domain knowledge into ML, has been recognized as an emerging area of research, especially in the recent 2 to 3 years. In this work, physics-informed ML, specifically physics-informed neural network (NN), is employed and implemented to estimate the displacements at x, y, z directions in a solid mechanics problem that is controlled by equilibrium equations with boundary conditions. By incorporating the physics (i.e. the equilibrium equations) into the learning process of NN, it is showed that the NN can be trained very efficiently with a small set of labelled training data. Experiments with different settings of the NN model and the amount of labelled training data were conducted, and the results show that very high accuracy can be achieved in fulfilling the equilibrium equations as well as in predicting the displacements, e.g. in setting the overall displacement of 0.1, a root mean square error (RMSE) of 2.09 × 10−4 was achieved.

Keywords: deep learning, neural network, physics-informed machine learning, solid mechanics

Procedia PDF Downloads 114
9153 Hybrid Feature Selection Method for Sentiment Classification of Movie Reviews

Authors: Vishnu Goyal, Basant Agarwal

Abstract:

Sentiment analysis research provides methods for identifying the people’s opinion written in blogs, reviews, social networking websites etc. Sentiment analysis is to understand what opinion people have about any given entity, object or thing. Sentiment analysis research can be broadly categorised into three types of approaches i.e. semantic orientation, machine learning and lexicon based approaches. Feature selection methods improve the performance of the machine learning algorithms by eliminating the irrelevant features. Information gain feature selection method has been considered best method for sentiment analysis; however, it has the drawback of selection of threshold. Therefore, in this paper, we propose a hybrid feature selection methods comprising of information gain and proposed feature selection method. Initially, features are selected using Information Gain (IG) and further more noisy features are eliminated using the proposed feature selection method. Experimental results show the efficiency of the proposed feature selection methods.

Keywords: feature selection, sentiment analysis, hybrid feature selection

Procedia PDF Downloads 307
9152 Machine Learning for Classifying Risks of Death and Length of Stay of Patients in Intensive Unit Care Beds

Authors: Itamir de Morais Barroca Filho, Cephas A. S. Barreto, Ramon Malaquias, Cezar Miranda Paula de Souza, Arthur Costa Gorgônio, João C. Xavier-Júnior, Mateus Firmino, Fellipe Matheus Costa Barbosa

Abstract:

Information and Communication Technologies (ICT) in healthcare are crucial for efficiently delivering medical healthcare services to patients. These ICTs are also known as e-health and comprise technologies such as electronic record systems, telemedicine systems, and personalized devices for diagnosis. The focus of e-health is to improve the quality of health information, strengthen national health systems, and ensure accessible, high-quality health care for all. All the data gathered by these technologies make it possible to help clinical staff with automated decisions using machine learning. In this context, we collected patient data, such as heart rate, oxygen saturation (SpO2), blood pressure, respiration, and others. With this data, we were able to develop machine learning models for patients’ risk of death and estimate the length of stay in ICU beds. Thus, this paper presents the methodology for applying machine learning techniques to develop these models. As a result, although we implemented these models on an IoT healthcare platform, helping clinical staff in healthcare in an ICU, it is essential to create a robust clinical validation process and monitoring of the proposed models.

Keywords: ICT, e-health, machine learning, ICU, healthcare

Procedia PDF Downloads 75
9151 IoT Based Soil Moisture Monitoring System for Indoor Plants

Authors: Gul Rahim Rahimi

Abstract:

The IoT-based soil moisture monitoring system for indoor plants is designed to address the challenges of maintaining optimal moisture levels in soil for plant growth and health. The system utilizes sensor technology to collect real-time data on soil moisture levels, which is then processed and analyzed using machine learning algorithms. This allows for accurate and timely monitoring of soil moisture levels, ensuring plants receive the appropriate amount of water to thrive. The main objectives of the system are twofold: to keep plants fresh and healthy by preventing water deficiency and to provide users with comprehensive insights into the water content of the soil on a daily and hourly basis. By monitoring soil moisture levels, users can identify patterns and trends in water consumption, allowing for more informed decision-making regarding watering schedules and plant care. The scope of the system extends to the agriculture industry, where it can be utilized to minimize the efforts required by farmers to monitor soil moisture levels manually. By automating the process of soil moisture monitoring, farmers can optimize water usage, improve crop yields, and reduce the risk of plant diseases associated with over or under-watering. Key technologies employed in the system include the Capacitive Soil Moisture Sensor V1.2 for accurate soil moisture measurement, the Node MCU ESP8266-12E Board for data transmission and communication, and the Arduino framework for programming and development. Additionally, machine learning algorithms are utilized to analyze the collected data and provide actionable insights. Cloud storage is utilized to store and manage the data collected from multiple sensors, allowing for easy access and retrieval of information. Overall, the IoT-based soil moisture monitoring system offers a scalable and efficient solution for indoor plant care, with potential applications in agriculture and beyond. By harnessing the power of IoT and machine learning, the system empowers users to make informed decisions about plant watering, leading to healthier and more vibrant indoor environments.

Keywords: IoT-based, soil moisture monitoring, indoor plants, water management

Procedia PDF Downloads 24
9150 Life Prediction Method of Lithium-Ion Battery Based on Grey Support Vector Machines

Authors: Xiaogang Li, Jieqiong Miao

Abstract:

As for the problem of the grey forecasting model prediction accuracy is low, an improved grey prediction model is put forward. Firstly, use trigonometric function transform the original data sequence in order to improve the smoothness of data , this model called SGM( smoothness of grey prediction model), then combine the improved grey model with support vector machine , and put forward the grey support vector machine model (SGM - SVM).Before the establishment of the model, we use trigonometric functions and accumulation generation operation preprocessing data in order to enhance the smoothness of the data and weaken the randomness of the data, then use support vector machine (SVM) to establish a prediction model for pre-processed data and select model parameters using genetic algorithms to obtain the optimum value of the global search. Finally, restore data through the "regressive generate" operation to get forecasting data. In order to prove that the SGM-SVM model is superior to other models, we select the battery life data from calce. The presented model is used to predict life of battery and the predicted result was compared with that of grey model and support vector machines.For a more intuitive comparison of the three models, this paper presents root mean square error of this three different models .The results show that the effect of grey support vector machine (SGM-SVM) to predict life is optimal, and the root mean square error is only 3.18%. Keywords: grey forecasting model, trigonometric function, support vector machine, genetic algorithms, root mean square error

Keywords: Grey prediction model, trigonometric functions, support vector machines, genetic algorithms, root mean square error

Procedia PDF Downloads 434
9149 Yawning Computing Using Bayesian Networks

Authors: Serge Tshibangu, Turgay Celik, Zenzo Ncube

Abstract:

Road crashes kill nearly over a million people every year, and leave millions more injured or permanently disabled. Various annual reports reveal that the percentage of fatal crashes due to fatigue/driver falling asleep comes directly after the percentage of fatal crashes due to intoxicated drivers. This percentage is higher than the combined percentage of fatal crashes due to illegal/Un-Safe U-turn and illegal/Un-Safe reversing. Although a relatively small percentage of police reports on road accidents highlights drowsiness and fatigue, the importance of these factors is greater than we might think, hidden by the undercounting of their events. Some scenarios show that these factors are significant in accidents with killed and injured people. Thus the need for an automatic drivers fatigue detection system in order to considerably reduce the number of accidents owing to fatigue.This research approaches the drivers fatigue detection problem in an innovative way by combining cues collected from both temporal analysis of drivers’ faces and environment. Monotony in driving environment is inter-related with visual symptoms of fatigue on drivers’ faces to achieve fatigue detection. Optical and infrared (IR) sensors are used to analyse the monotony in driving environment and to detect the visual symptoms of fatigue on human face. Internal cues from drivers faces and external cues from environment are combined together using machine learning algorithms to automatically detect fatigue.

Keywords: intelligent transportation systems, bayesian networks, yawning computing, machine learning algorithms

Procedia PDF Downloads 436
9148 Detection and Classification of Rubber Tree Leaf Diseases Using Machine Learning

Authors: Kavyadevi N., Kaviya G., Gowsalya P., Janani M., Mohanraj S.

Abstract:

Hevea brasiliensis, also known as the rubber tree, is one of the foremost assets of crops in the world. One of the most significant advantages of the Rubber Plant in terms of air oxygenation is its capacity to reduce the likelihood of an individual developing respiratory allergies like asthma. To construct such a system that can properly identify crop diseases and pests and then create a database of insecticides for each pest and disease, we must first give treatment for the illness that has been detected. We shall primarily examine three major leaf diseases since they are economically deficient in this article, which is Bird's eye spot, algal spot and powdery mildew. And the recommended work focuses on disease identification on rubber tree leaves. It will be accomplished by employing one of the superior algorithms. Input, Preprocessing, Image Segmentation, Extraction Feature, and Classification will be followed by the processing technique. We will use time-consuming procedures that they use to detect the sickness. As a consequence, the main ailments, underlying causes, and signs and symptoms of diseases that harm the rubber tree are covered in this study.

Keywords: image processing, python, convolution neural network (CNN), machine learning

Procedia PDF Downloads 52
9147 Development of an Automatic Computational Machine Learning Pipeline to Process Confocal Fluorescence Images for Virtual Cell Generation

Authors: Miguel Contreras, David Long, Will Bachman

Abstract:

Background: Microscopy plays a central role in cell and developmental biology. In particular, fluorescence microscopy can be used to visualize specific cellular components and subsequently quantify their morphology through development of virtual-cell models for study of effects of mechanical forces on cells. However, there are challenges with these imaging experiments, which can make it difficult to quantify cell morphology: inconsistent results, time-consuming and potentially costly protocols, and limitation on number of labels due to spectral overlap. To address these challenges, the objective of this project is to develop an automatic computational machine learning pipeline to predict cellular components morphology for virtual-cell generation based on fluorescence cell membrane confocal z-stacks. Methods: Registered confocal z-stacks of nuclei and cell membrane of endothelial cells, consisting of 20 images each, were obtained from fluorescence confocal microscopy and normalized through software pipeline for each image to have a mean pixel intensity value of 0.5. An open source machine learning algorithm, originally developed to predict fluorescence labels on unlabeled transmitted light microscopy cell images, was trained using this set of normalized z-stacks on a single CPU machine. Through transfer learning, the algorithm used knowledge acquired from its previous training sessions to learn the new task. Once trained, the algorithm was used to predict morphology of nuclei using normalized cell membrane fluorescence images as input. Predictions were compared to the ground truth fluorescence nuclei images. Results: After one week of training, using one cell membrane z-stack (20 images) and corresponding nuclei label, results showed qualitatively good predictions on training set. The algorithm was able to accurately predict nuclei locations as well as shape when fed only fluorescence membrane images. Similar training sessions with improved membrane image quality, including clear lining and shape of the membrane, clearly showing the boundaries of each cell, proportionally improved nuclei predictions, reducing errors relative to ground truth. Discussion: These results show the potential of pre-trained machine learning algorithms to predict cell morphology using relatively small amounts of data and training time, eliminating the need of using multiple labels in immunofluorescence experiments. With further training, the algorithm is expected to predict different labels (e.g., focal-adhesion sites, cytoskeleton), which can be added to the automatic machine learning pipeline for direct input into Principal Component Analysis (PCA) for generation of virtual-cell mechanical models.

Keywords: cell morphology prediction, computational machine learning, fluorescence microscopy, virtual-cell models

Procedia PDF Downloads 179
9146 Predictive Maintenance of Industrial Shredders: Efficient Operation through Real-Time Monitoring Using Statistical Machine Learning

Authors: Federico Pittino, Thomas Arnold

Abstract:

The shredding of waste materials is a key step in the recycling process towards the circular economy. Industrial shredders for waste processing operate in very harsh operating conditions, leading to the need for frequent maintenance of critical components. Maintenance optimization is particularly important also to increase the machine’s efficiency, thereby reducing the operational costs. In this work, a monitoring system has been developed and deployed on an industrial shredder located at a waste recycling plant in Austria. The machine has been monitored for one year, and methods for predictive maintenance have been developed for two key components: the cutting knives and the drive belt. The large amount of collected data is leveraged by statistical machine learning techniques, thereby not requiring very detailed knowledge of the machine or its live operating conditions. The results show that, despite the wide range of operating conditions, a reliable estimate of the optimal time for maintenance can be derived. Moreover, the trade-off between the cost of maintenance and the increase in power consumption due to the wear state of the monitored components of the machine is investigated. This work proves the benefits of real-time monitoring system for the efficient operation of industrial shredders.

Keywords: predictive maintenance, circular economy, industrial shredder, cost optimization, statistical machine learning

Procedia PDF Downloads 103
9145 Comprehensive Machine Learning-Based Glucose Sensing from Near-Infrared Spectra

Authors: Bitewulign Mekonnen

Abstract:

Context: This scientific paper focuses on the use of near-infrared (NIR) spectroscopy to determine glucose concentration in aqueous solutions accurately and rapidly. The study compares six different machine learning methods for predicting glucose concentration and also explores the development of a deep learning model for classifying NIR spectra. The objective is to optimize the detection model and improve the accuracy of glucose prediction. This research is important because it provides a comprehensive analysis of various machine-learning techniques for estimating aqueous glucose concentrations. Research Aim: The aim of this study is to compare and evaluate different machine-learning methods for predicting glucose concentration from NIR spectra. Additionally, the study aims to develop and assess a deep-learning model for classifying NIR spectra. Methodology: The research methodology involves the use of machine learning and deep learning techniques. Six machine learning regression models, including support vector machine regression, partial least squares regression, extra tree regression, random forest regression, extreme gradient boosting, and principal component analysis-neural network, are employed to predict glucose concentration. The NIR spectra data is randomly divided into train and test sets, and the process is repeated ten times to increase generalization ability. In addition, a convolutional neural network is developed for classifying NIR spectra. Findings: The study reveals that the SVMR, ETR, and PCA-NN models exhibit excellent performance in predicting glucose concentration, with correlation coefficients (R) > 0.99 and determination coefficients (R²)> 0.985. The deep learning model achieves high macro-averaging scores for precision, recall, and F1-measure. These findings demonstrate the effectiveness of machine learning and deep learning methods in optimizing the detection model and improving glucose prediction accuracy. Theoretical Importance: This research contributes to the field by providing a comprehensive analysis of various machine-learning techniques for estimating glucose concentrations from NIR spectra. It also explores the use of deep learning for the classification of indistinguishable NIR spectra. The findings highlight the potential of machine learning and deep learning in enhancing the prediction accuracy of glucose-relevant features. Data Collection and Analysis Procedures: The NIR spectra and corresponding references for glucose concentration are measured in increments of 20 mg/dl. The data is randomly divided into train and test sets, and the models are evaluated using regression analysis and classification metrics. The performance of each model is assessed based on correlation coefficients, determination coefficients, precision, recall, and F1-measure. Question Addressed: The study addresses the question of whether machine learning and deep learning methods can optimize the detection model and improve the accuracy of glucose prediction from NIR spectra. Conclusion: The research demonstrates that machine learning and deep learning methods can effectively predict glucose concentration from NIR spectra. The SVMR, ETR, and PCA-NN models exhibit superior performance, while the deep learning model achieves high classification scores. These findings suggest that machine learning and deep learning techniques can be used to improve the prediction accuracy of glucose-relevant features. Further research is needed to explore their clinical utility in analyzing complex matrices, such as blood glucose levels.

Keywords: machine learning, signal processing, near-infrared spectroscopy, support vector machine, neural network

Procedia PDF Downloads 64
9144 Internet of Things Networks: Denial of Service Detection in Constrained Application Protocol Using Machine Learning Algorithm

Authors: Adamu Abdullahi, On Francisca, Saidu Isah Rambo, G. N. Obunadike, D. T. Chinyio

Abstract:

The paper discusses the potential threat of Denial of Service (DoS) attacks in the Internet of Things (IoT) networks on constrained application protocols (CoAP). As billions of IoT devices are expected to be connected to the internet in the coming years, the security of these devices is vulnerable to attacks, disrupting their functioning. This research aims to tackle this issue by applying mixed methods of qualitative and quantitative for feature selection, extraction, and cluster algorithms to detect DoS attacks in the Constrained Application Protocol (CoAP) using the Machine Learning Algorithm (MLA). The main objective of the research is to enhance the security scheme for CoAP in the IoT environment by analyzing the nature of DoS attacks and identifying a new set of features for detecting them in the IoT network environment. The aim is to demonstrate the effectiveness of the MLA in detecting DoS attacks and compare it with conventional intrusion detection systems for securing the CoAP in the IoT environment. Findings: The research identifies the appropriate node to detect DoS attacks in the IoT network environment and demonstrates how to detect the attacks through the MLA. The accuracy detection in both classification and network simulation environments shows that the k-means algorithm scored the highest percentage in the training and testing of the evaluation. The network simulation platform also achieved the highest percentage of 99.93% in overall accuracy. This work reviews conventional intrusion detection systems for securing the CoAP in the IoT environment. The DoS security issues associated with the CoAP are discussed.

Keywords: algorithm, CoAP, DoS, IoT, machine learning

Procedia PDF Downloads 48
9143 DNA Methylation Score Development for In utero Exposure to Paternal Smoking Using a Supervised Machine Learning Approach

Authors: Cristy Stagnar, Nina Hubig, Diana Ivankovic

Abstract:

The epigenome is a compelling candidate for mediating long-term responses to environmental effects modifying disease risk. The main goal of this research is to develop a machine learning-based DNA methylation score, which will be valuable in delineating the unique contribution of paternal epigenetic modifications to the germline impacting childhood health outcomes. It will also be a useful tool in validating self-reports of nonsmoking and in adjusting epigenome-wide DNA methylation association studies for this early-life exposure. Using secondary data from two population-based methylation profiling studies, our DNA methylation score is based on CpG DNA methylation measurements from cord blood gathered from children whose fathers smoked pre- and peri-conceptually. Each child’s mother and father fell into one of three class labels in the accompanying questionnaires -never smoker, former smoker, or current smoker. By applying different machine learning algorithms to the accessible resource for integrated epigenomic studies (ARIES) sub-study of the Avon longitudinal study of parents and children (ALSPAC) data set, which we used for training and testing of our model, the best-performing algorithm for classifying the father smoker and mother never smoker was selected based on Cohen’s κ. Error in the model was identified and optimized. The final DNA methylation score was further tested and validated in an independent data set. This resulted in a linear combination of methylation values of selected probes via a logistic link function that accurately classified each group and contributed the most towards classification. The result is a unique, robust DNA methylation score which combines information on DNA methylation and early life exposure of offspring to paternal smoking during pregnancy and which may be used to examine the paternal contribution to offspring health outcomes.

Keywords: epigenome, health outcomes, paternal preconception environmental exposures, supervised machine learning

Procedia PDF Downloads 168
9142 Facilitating Written Biology Assessment in Large-Enrollment Courses Using Machine Learning

Authors: Luanna B. Prevost, Kelli Carter, Margaurete Romero, Kirsti Martinez

Abstract:

Writing is an essential scientific practice, yet, in several countries, the increasing university science class-size limits the use of written assessments. Written assessments allow students to demonstrate their learning in their own words and permit the faculty to evaluate students’ understanding. However, the time and resources required to grade written assessments prohibit their use in large-enrollment science courses. This study examined the use of machine learning algorithms to automatically analyze student writing and provide timely feedback to the faculty about students' writing in biology. Written responses to questions about matter and energy transformation were collected from large-enrollment undergraduate introductory biology classrooms. Responses were analyzed using the LightSide text mining and classification software. Cohen’s Kappa was used to measure agreement between the LightSide models and human raters. Predictive models achieved agreement with human coding of 0.7 Cohen’s Kappa or greater. Models captured that when writing about matter-energy transformation at the ecosystem level, students focused on primarily on the concepts of heat loss, recycling of matter, and conservation of matter and energy. Models were also produced to capture writing about processes such as decomposition and biochemical cycling. The models created in this study can be used to provide automatic feedback about students understanding of these concepts to biology faculty who desire to use formative written assessments in larger enrollment biology classes, but do not have the time or personnel for manual grading.

Keywords: machine learning, written assessment, biology education, text mining

Procedia PDF Downloads 252
9141 A Text Classification Approach Based on Natural Language Processing and Machine Learning Techniques

Authors: Rim Messaoudi, Nogaye-Gueye Gning, François Azelart

Abstract:

Automatic text classification applies mostly natural language processing (NLP) and other AI-guided techniques to automatically classify text in a faster and more accurate manner. This paper discusses the subject of using predictive maintenance to manage incident tickets inside the sociality. It focuses on proposing a tool that treats and analyses comments and notes written by administrators after resolving an incident ticket. The goal here is to increase the quality of these comments. Additionally, this tool is based on NLP and machine learning techniques to realize the textual analytics of the extracted data. This approach was tested using real data taken from the French National Railways (SNCF) company and was given a high-quality result.

Keywords: machine learning, text classification, NLP techniques, semantic representation

Procedia PDF Downloads 67
9140 A Less Complexity Deep Learning Method for Drones Detection

Authors: Mohamad Kassab, Amal El Fallah Seghrouchni, Frederic Barbaresco, Raed Abu Zitar

Abstract:

Detecting objects such as drones is a challenging task as their relative size and maneuvering capabilities deceive machine learning models and cause them to misclassify drones as birds or other objects. In this work, we investigate applying several deep learning techniques to benchmark real data sets of flying drones. A deep learning paradigm is proposed for the purpose of mitigating the complexity of those systems. The proposed paradigm consists of a hybrid between the AdderNet deep learning paradigm and the Single Shot Detector (SSD) paradigm. The goal was to minimize multiplication operations numbers in the filtering layers within the proposed system and, hence, reduce complexity. Some standard machine learning technique, such as SVM, is also tested and compared to other deep learning systems. The data sets used for training and testing were either complete or filtered in order to remove the images with mall objects. The types of data were RGB or IR data. Comparisons were made between all these types, and conclusions were presented.

Keywords: drones detection, deep learning, birds versus drones, precision of detection, AdderNet

Procedia PDF Downloads 152
9139 Short Text Classification Using Part of Speech Feature to Analyze Students' Feedback of Assessment Components

Authors: Zainab Mutlaq Ibrahim, Mohamed Bader-El-Den, Mihaela Cocea

Abstract:

Students' textual feedback can hold unique patterns and useful information about learning process, it can hold information about advantages and disadvantages of teaching methods, assessment components, facilities, and other aspects of teaching. The results of analysing such a feedback can form a key point for institutions’ decision makers to advance and update their systems accordingly. This paper proposes a data mining framework for analysing end of unit general textual feedback using part of speech feature (PoS) with four machine learning algorithms: support vector machines, decision tree, random forest, and naive bays. The proposed framework has two tasks: first, to use the above algorithms to build an optimal model that automatically classifies the whole data set into two subsets, one subset is tailored to assessment practices (assessment related), and the other one is the non-assessment related data. Second task to use the same algorithms to build an optimal model for whole data set, and the new data subsets to automatically detect their sentiment. The significance of this paper is to compare the performance of the above four algorithms using part of speech feature to the performance of the same algorithms using n-grams feature. The paper follows Knowledge Discovery and Data Mining (KDDM) framework to construct the classification and sentiment analysis models, which is understanding the assessment domain, cleaning and pre-processing the data set, selecting and running the data mining algorithm, interpreting mined patterns, and consolidating the discovered knowledge. The results of this paper experiments show that both models which used both features performed very well regarding first task. But regarding the second task, models that used part of speech feature has underperformed in comparison with models that used unigrams and bigrams.

Keywords: assessment, part of speech, sentiment analysis, student feedback

Procedia PDF Downloads 113
9138 Machine Learning in Momentum Strategies

Authors: Yi-Min Lan, Hung-Wen Cheng, Hsuan-Ling Chang, Jou-Ping Yu

Abstract:

The study applies machine learning models to construct momentum strategies and utilizes the information coefficient as an indicator for selecting stocks with strong and weak momentum characteristics. Through this approach, the study has built investment portfolios capable of generating superior returns and conducted a thorough analysis. Compared to existing research on momentum strategies, machine learning is incorporated to capture non-linear interactions. This approach enhances the conventional stock selection process, which is often impeded by difficulties associated with timeliness, accuracy, and efficiency due to market risk factors. The study finds that implementing bidirectional momentum strategies outperforms unidirectional ones, and momentum factors with longer observation periods exhibit stronger correlations with returns. Optimizing the number of stocks in the portfolio while staying within a certain threshold leads to the highest level of excess returns. The study presents a novel framework for momentum strategies that enhances and improves the operational aspects of asset management. By introducing innovative financial technology applications to traditional investment strategies, this paper can demonstrate significant effectiveness.

Keywords: information coefficient, machine learning, momentum, portfolio, return prediction

Procedia PDF Downloads 32
9137 Automated Machine Learning Algorithm Using Recurrent Neural Network to Perform Long-Term Time Series Forecasting

Authors: Ying Su, Morgan C. Wang

Abstract:

Long-term time series forecasting is an important research area for automated machine learning (AutoML). Currently, forecasting based on either machine learning or statistical learning is usually built by experts, and it requires significant manual effort, from model construction, feature engineering, and hyper-parameter tuning to the construction of the time series model. Automation is not possible since there are too many human interventions. To overcome these limitations, this article proposed to use recurrent neural networks (RNN) through the memory state of RNN to perform long-term time series prediction. We have shown that this proposed approach is better than the traditional Autoregressive Integrated Moving Average (ARIMA). In addition, we also found it is better than other network systems, including Fully Connected Neural Networks (FNN), Convolutional Neural Networks (CNN), and Nonpooling Convolutional Neural Networks (NPCNN).

Keywords: automated machines learning, autoregressive integrated moving average, neural networks, time series analysis

Procedia PDF Downloads 57
9136 Machine Learning Approach to Project Control Threshold Reliability Evaluation

Authors: Y. Kim, H. Lee, M. Park, B. Lee

Abstract:

Planning is understood as the determination of what has to be performed, how, in which sequence, when, what resources are needed, and their cost within the organization before execution. In most construction project, it is evident that the inherent nature of planning is dynamic, and initial planning is subject to be changed due to various uncertain conditions of construction project. Planners take a continuous revision process during the course of a project and until the very end of project. However, current practice lacks reliable, systematic tool for setting variance thresholds to determine when and what corrective actions to be taken. Rather it is heavily dependent on the level of experience and knowledge of the planner. Thus, this paper introduces a machine learning approach to evaluate project control threshold reliability incorporating project-specific data and presents a method to automate the process. The results have shown that the model improves the efficiency and accuracy of the monitoring process as an early warning.

Keywords: machine learning, project control, project progress monitoring, schedule

Procedia PDF Downloads 226
9135 Classification of Health Risk Factors to Predict the Risk of Falling in Older Adults

Authors: L. Lindsay, S. A. Coleman, D. Kerr, B. J. Taylor, A. Moorhead

Abstract:

Cognitive decline and frailty is apparent in older adults leading to an increased likelihood of the risk of falling. Currently health care professionals have to make professional decisions regarding such risks, and hence make difficult decisions regarding the future welfare of the ageing population. This study uses health data from The Irish Longitudinal Study on Ageing (TILDA), focusing on adults over the age of 50 years, in order to analyse health risk factors and predict the likelihood of falls. This prediction is based on the use of machine learning algorithms whereby health risk factors are used as inputs to predict the likelihood of falling. Initial results show that health risk factors such as long-term health issues contribute to the number of falls. The identification of such health risk factors has the potential to inform health and social care professionals, older people and their family members in order to mitigate daily living risks.

Keywords: classification, falls, health risk factors, machine learning, older adults

Procedia PDF Downloads 121
9134 Chinese Sentence Level Lip Recognition

Authors: Peng Wang, Tigang Jiang

Abstract:

The computer based lip reading method of different languages cannot be universal. At present, for the research of Chinese lip reading, whether the work on data sets or recognition algorithms, is far from mature. In this paper, we study the Chinese lipreading method based on machine learning, and propose a Chinese Sentence-level lip-reading network (CNLipNet) model which consists of spatio-temporal convolutional neural network(CNN), recurrent neural network(RNN) and Connectionist Temporal Classification (CTC) loss function. This model can map variable-length sequence of video frames to Chinese Pinyin sequence and is trained end-to-end. More over, We create CNLRS, a Chinese Lipreading Dataset, which contains 5948 samples and can be shared through github. The evaluation of CNLipNet on this dataset yielded a 41% word correct rate and a 70.6% character correct rate. This evaluation result is far superior to the professional human lip readers, indicating that CNLipNet performs well in lipreading.

Keywords: lipreading, machine learning, spatio-temporal, convolutional neural network, recurrent neural network

Procedia PDF Downloads 101
9133 Feature Analysis of Predictive Maintenance Models

Authors: Zhaoan Wang

Abstract:

Research in predictive maintenance modeling has improved in the recent years to predict failures and needed maintenance with high accuracy, saving cost and improving manufacturing efficiency. However, classic prediction models provide little valuable insight towards the most important features contributing to the failure. By analyzing and quantifying feature importance in predictive maintenance models, cost saving can be optimized based on business goals. First, multiple classifiers are evaluated with cross-validation to predict the multi-class of failures. Second, predictive performance with features provided by different feature selection algorithms are further analyzed. Third, features selected by different algorithms are ranked and combined based on their predictive power. Finally, linear explainer SHAP (SHapley Additive exPlanations) is applied to interpret classifier behavior and provide further insight towards the specific roles of features in both local predictions and global model behavior. The results of the experiments suggest that certain features play dominant roles in predictive models while others have significantly less impact on the overall performance. Moreover, for multi-class prediction of machine failures, the most important features vary with type of machine failures. The results may lead to improved productivity and cost saving by prioritizing sensor deployment, data collection, and data processing of more important features over less importance features.

Keywords: automated supply chain, intelligent manufacturing, predictive maintenance machine learning, feature engineering, model interpretation

Procedia PDF Downloads 106
9132 Using AI for Analysing Political Leaders

Authors: Shuai Zhao, Shalendra D. Sharma, Jin Xu

Abstract:

This research uses advanced machine learning models to learn a number of hypotheses regarding political executives. Specifically, it analyses the impact these powerful leaders have on economic growth by using leaders’ data from the Archigos database from 1835 to the end of 2015. The data is processed by the AutoGluon, which was developed by Amazon. Automated Machine Learning (AutoML) and AutoGluon can automatically extract features from the data and then use multiple classifiers to train the data. Use a linear regression model and classification model to establish the relationship between leaders and economic growth (GDP per capita growth), and to clarify the relationship between their characteristics and economic growth from a machine learning perspective. Our work may show as a model or signal for collaboration between the fields of statistics and artificial intelligence (AI) that can light up the way for political researchers and economists.

Keywords: comparative politics, political executives, leaders’ characteristics, artificial intelligence

Procedia PDF Downloads 57
9131 A Selection Approach: Discriminative Model for Nominal Attributes-Based Distance Measures

Authors: Fang Gong

Abstract:

Distance measures are an indispensable part of many instance-based learning (IBL) and machine learning (ML) algorithms. The value difference metrics (VDM) and inverted specific-class distance measure (ISCDM) are among the top-performing distance measures that address nominal attributes. VDM performs well in some domains owing to its simplicity and poorly in others that exist missing value and non-class attribute noise. ISCDM, however, typically works better than VDM on such domains. To maximize their advantages and avoid disadvantages, in this paper, a selection approach: a discriminative model for nominal attributes-based distance measures is proposed. More concretely, VDM and ISCDM are built independently on a training dataset at the training stage, and the most credible one is recorded for each training instance. At the test stage, its nearest neighbor for each test instance is primarily found by any of VDM and ISCDM and then chooses the most reliable model of its nearest neighbor to predict its class label. It is simply denoted as a discriminative distance measure (DDM). Experiments are conducted on the 34 University of California at Irvine (UCI) machine learning repository datasets, and it shows DDM retains the interpretability and simplicity of VDM and ISCDM but significantly outperforms the original VDM and ISCDM and other state-of-the-art competitors in terms of accuracy.

Keywords: distance measure, discriminative model, nominal attributes, nearest neighbor

Procedia PDF Downloads 90
9130 Using Machine Learning to Enhance Win Ratio for College Ice Hockey Teams

Authors: Sadixa Sanjel, Ahmed Sadek, Naseef Mansoor, Zelalem Denekew

Abstract:

Collegiate ice hockey (NCAA) sports analytics is different from the national level hockey (NHL). We apply and compare multiple machine learning models such as Linear Regression, Random Forest, and Neural Networks to predict the win ratio for a team based on their statistics. Data exploration helps determine which statistics are most useful in increasing the win ratio, which would be beneficial to coaches and team managers. We ran experiments to select the best model and chose Random Forest as the best performing. We conclude with how to bridge the gap between the college and national levels of sports analytics and the use of machine learning to enhance team performance despite not having a lot of metrics or budget for automatic tracking.

Keywords: NCAA, NHL, sports analytics, random forest, regression, neural networks, game predictions

Procedia PDF Downloads 88
9129 Neural Network and Support Vector Machine for Prediction of Foot Disorders Based on Foot Analysis

Authors: Monireh Ahmadi Bani, Adel Khorramrouz, Lalenoor Morvarid, Bagheri Mahtab

Abstract:

Background:- Foot disorders are common in musculoskeletal problems. Plantar pressure distribution measurement is one the most important part of foot disorders diagnosis for quantitative analysis. However, the association of plantar pressure and foot disorders is not clear. With the growth of dataset and machine learning methods, the relationship between foot disorders and plantar pressures can be detected. Significance of the study:- The purpose of this study was to predict the probability of common foot disorders based on peak plantar pressure distribution and center of pressure during walking. Methodologies:- 2323 participants were assessed in a foot therapy clinic between 2015 and 2021. Foot disorders were diagnosed by an experienced physician and then they were asked to walk on a force plate scanner. After the data preprocessing, due to the difference in walking time and foot size, we normalized the samples based on time and foot size. Some of force plate variables were selected as input to a deep neural network (DNN), and the probability of any each foot disorder was measured. In next step, we used support vector machine (SVM) and run dataset for each foot disorder (classification of yes or no). We compared DNN and SVM for foot disorders prediction based on plantar pressure distributions and center of pressure. Findings:- The results demonstrated that the accuracy of deep learning architecture is sufficient for most clinical and research applications in the study population. In addition, the SVM approach has more accuracy for predictions, enabling applications for foot disorders diagnosis. The detection accuracy was 71% by the deep learning algorithm and 78% by the SVM algorithm. Moreover, when we worked with peak plantar pressure distribution, it was more accurate than center of pressure dataset. Conclusion:- Both algorithms- deep learning and SVM will help therapist and patients to improve the data pool and enhance foot disorders prediction with less expense and error after removing some restrictions properly.

Keywords: deep neural network, foot disorder, plantar pressure, support vector machine

Procedia PDF Downloads 319
9128 Ensemble Methods in Machine Learning: An Algorithmic Approach to Derive Distinctive Behaviors of Criminal Activity Applied to the Poaching Domain

Authors: Zachary Blanks, Solomon Sonya

Abstract:

Poaching presents a serious threat to endangered animal species, environment conservations, and human life. Additionally, some poaching activity has even been linked to supplying funds to support terrorist networks elsewhere around the world. Consequently, agencies dedicated to protecting wildlife habitats have a near intractable task of adequately patrolling an entire area (spanning several thousand kilometers) given limited resources, funds, and personnel at their disposal. Thus, agencies need predictive tools that are both high-performing and easily implementable by the user to help in learning how the significant features (e.g. animal population densities, topography, behavior patterns of the criminals within the area, etc) interact with each other in hopes of abating poaching. This research develops a classification model using machine learning algorithms to aid in forecasting future attacks that is both easy to train and performs well when compared to other models. In this research, we demonstrate how data imputation methods (specifically predictive mean matching, gradient boosting, and random forest multiple imputation) can be applied to analyze data and create significant predictions across a varied data set. Specifically, we apply these methods to improve the accuracy of adopted prediction models (Logistic Regression, Support Vector Machine, etc). Finally, we assess the performance of the model and the accuracy of our data imputation methods by learning on a real-world data set constituting four years of imputed data and testing on one year of non-imputed data. This paper provides three main contributions. First, we extend work done by the Teamcore and CREATE (Center for Risk and Economic Analysis of Terrorism Events) research group at the University of Southern California (USC) working in conjunction with the Department of Homeland Security to apply game theory and machine learning algorithms to develop more efficient ways of reducing poaching. This research introduces ensemble methods (Random Forests and Stochastic Gradient Boosting) and applies it to real-world poaching data gathered from the Ugandan rain forest park rangers. Next, we consider the effect of data imputation on both the performance of various algorithms and the general accuracy of the method itself when applied to a dependent variable where a large number of observations are missing. Third, we provide an alternate approach to predict the probability of observing poaching both by season and by month. The results from this research are very promising. We conclude that by using Stochastic Gradient Boosting to predict observations for non-commercial poaching by season, we are able to produce statistically equivalent results while being orders of magnitude faster in computation time and complexity. Additionally, when predicting potential poaching incidents by individual month vice entire seasons, boosting techniques produce a mean area under the curve increase of approximately 3% relative to previous prediction schedules by entire seasons.

Keywords: ensemble methods, imputation, machine learning, random forests, statistical analysis, stochastic gradient boosting, wildlife protection

Procedia PDF Downloads 265
9127 The Impact of Experiential Learning on the Success of Upper Division Mechanical Engineering Students

Authors: Seyedali Seyedkavoosi, Mohammad Obadat, Seantorrion Boyle

Abstract:

The purpose of this study is to assess the effectiveness of a nontraditional experiential learning strategy in improving the success and interest of mechanical engineering students, using the Kinematics/Dynamics of Machine course as a case study. This upper-division technical course covers a wide range of topics, including mechanism and machine system analysis and synthesis, yet the complexities of ideas like acceleration, motion, and machine component relationships are hard to explain using standard teaching techniques. To solve this problem, a thorough design project was created that gave students hands-on experience developing, manufacturing, and testing their inventions. The main goals of the project were to improve students' grasp of machine design and kinematics, to develop problem-solving and presenting abilities, and to familiarize them with professional software. A questionnaire survey was done to evaluate the effect of this technique on students' performance and interest in mechanical engineering. The outcomes of the study shed light on the usefulness of nontraditional experiential learning approaches in engineering education.

Keywords: experiential learning, nontraditional teaching, hands-on design project, engineering education

Procedia PDF Downloads 69