Search results for: machine learning control
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 18282

Search results for: machine learning control

18162 Intrusion Detection Based on Graph Oriented Big Data Analytics

Authors: Ahlem Abid, Farah Jemili

Abstract:

Intrusion detection has been the subject of numerous studies in industry and academia, but cyber security analysts always want greater precision and global threat analysis to secure their systems in cyberspace. To improve intrusion detection system, the visualisation of the security events in form of graphs and diagrams is important to improve the accuracy of alerts. In this paper, we propose an approach of an IDS based on cloud computing, big data technique and using a machine learning graph algorithm which can detect in real time different attacks as early as possible. We use the MAWILab intrusion detection dataset . We choose Microsoft Azure as a unified cloud environment to load our dataset on. We implement the k2 algorithm which is a graphical machine learning algorithm to classify attacks. Our system showed a good performance due to the graphical machine learning algorithm and spark structured streaming engine.

Keywords: Apache Spark Streaming, Graph, Intrusion detection, k2 algorithm, Machine Learning, MAWILab, Microsoft Azure Cloud

Procedia PDF Downloads 142
18161 Heart Attack Prediction Using Several Machine Learning Methods

Authors: Suzan Anwar, Utkarsh Goyal

Abstract:

Heart rate (HR) is a predictor of cardiovascular, cerebrovascular, and all-cause mortality in the general population, as well as in patients with cardio and cerebrovascular diseases. Machine learning (ML) significantly improves the accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment while avoiding unnecessary treatment of others. This research examines relationship between the individual's various heart health inputs like age, sex, cp, trestbps, thalach, oldpeaketc, and the likelihood of developing heart disease. Machine learning techniques like logistic regression and decision tree, and Python are used. The results of testing and evaluating the model using the Heart Failure Prediction Dataset show the chance of a person having a heart disease with variable accuracy. Logistic regression has yielded an accuracy of 80.48% without data handling. With data handling (normalization, standardscaler), the logistic regression resulted in improved accuracy of 87.80%, decision tree 100%, random forest 100%, and SVM 100%.

Keywords: heart rate, machine learning, SVM, decision tree, logistic regression, random forest

Procedia PDF Downloads 135
18160 Multi-Level Air Quality Classification in China Using Information Gain and Support Vector Machine

Authors: Bingchun Liu, Pei-Chann Chang, Natasha Huang, Dun Li

Abstract:

Machine Learning and Data Mining are the two important tools for extracting useful information and knowledge from large datasets. In machine learning, classification is a wildly used technique to predict qualitative variables and is generally preferred over regression from an operational point of view. Due to the enormous increase in air pollution in various countries especially China, Air Quality Classification has become one of the most important topics in air quality research and modelling. This study aims at introducing a hybrid classification model based on information theory and Support Vector Machine (SVM) using the air quality data of four cities in China namely Beijing, Guangzhou, Shanghai and Tianjin from Jan 1, 2014 to April 30, 2016. China's Ministry of Environmental Protection has classified the daily air quality into 6 levels namely Serious Pollution, Severe Pollution, Moderate Pollution, Light Pollution, Good and Excellent based on their respective Air Quality Index (AQI) values. Using the information theory, information gain (IG) is calculated and feature selection is done for both categorical features and continuous numeric features. Then SVM Machine Learning algorithm is implemented on the selected features with cross-validation. The final evaluation reveals that the IG and SVM hybrid model performs better than SVM (alone), Artificial Neural Network (ANN) and K-Nearest Neighbours (KNN) models in terms of accuracy as well as complexity.

Keywords: machine learning, air quality classification, air quality index, information gain, support vector machine, cross-validation

Procedia PDF Downloads 232
18159 Vector Control of Two Five Phase PMSM Connected in Series Powered by Matrix Converter Application to the Rail Traction

Authors: S. Meguenni, A. Djahbar, K. Tounsi

Abstract:

Electric railway traction systems are complex; they have electrical couplings, magnetic and solid mechanics. These couplings impose several constraints that complicate the modeling and analysis of these systems. An example of drive systems, which combine the advantages of the use of multiphase machines, power electronics and computing means, is mono convert isseur multi-machine system which can control a fully decoupled so many machines whose electric windings are connected in series. In this approach, our attention especially on modeling and independent control of two five phase synchronous machine with permanent magnet connected in series and fed by a matrix converter application to the rail traction (bogie of a locomotive BB 36000).

Keywords: synchronous machine, vector control Multi-machine/ Multi-inverter, matrix inverter, Railway traction

Procedia PDF Downloads 367
18158 A Study on Big Data Analytics, Applications and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, Healthcare, and business intelligence contain voluminous and incremental data, which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organization's decision-making strategy can be enhanced using big data analytics and applying different machine learning techniques and statistical tools on such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates on various frameworks in the process of Analysis using different machine-learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 78
18157 A Study on Big Data Analytics, Applications, and Challenges

Authors: Chhavi Rana

Abstract:

The aim of the paper is to highlight the existing development in the field of big data analytics. Applications like bioinformatics, smart infrastructure projects, healthcare, and business intelligence contain voluminous and incremental data which is hard to organise and analyse and can be dealt with using the framework and model in this field of study. An organisation decision-making strategy can be enhanced by using big data analytics and applying different machine learning techniques and statistical tools to such complex data sets that will consequently make better things for society. This paper reviews the current state of the art in this field of study as well as different application domains of big data analytics. It also elaborates various frameworks in the process of analysis using different machine learning techniques. Finally, the paper concludes by stating different challenges and issues raised in existing research.

Keywords: big data, big data analytics, machine learning, review

Procedia PDF Downloads 90
18156 Insider Theft Detection in Organizations Using Keylogger and Machine Learning

Authors: Shamatha Shetty, Sakshi Dhabadi, Prerana M., Indushree B.

Abstract:

About 66% of firms claim that insider attacks are more likely to happen. The frequency of insider incidents has increased by 47% in the last two years. The goal of this work is to prevent dangerous employee behavior by using keyloggers and the Machine Learning (ML) model. Every keystroke that the user enters is recorded by the keylogging program, also known as keystroke logging. Keyloggers are used to stop improper use of the system. This enables us to collect all textual data, save it in a CSV file, and analyze it using an ML algorithm and the VirusTotal API. Many large companies use it to methodically monitor how their employees use computers, the internet, and email. We are utilizing the SVM algorithm and the VirusTotal API to improve overall efficiency and accuracy in identifying specific patterns and words to automate and offer the report for improved monitoring.

Keywords: cyber security, machine learning, cyclic process, email notification

Procedia PDF Downloads 51
18155 Review on Implementation of Artificial Intelligence and Machine Learning for Controlling Traffic and Avoiding Accidents

Authors: Neha Singh, Shristi Singh

Abstract:

Accidents involving motor vehicles are more likely to cause serious injuries and fatalities. It also has a host of other perpetual issues, such as the regular loss of life and goods in accidents. To solve these issues, appropriate measures must be implemented, such as establishing an autonomous incident detection system that makes use of machine learning and artificial intelligence. In order to reduce traffic accidents, this article examines the overview of artificial intelligence and machine learning in autonomous event detection systems. The paper explores the major issues, prospective solutions, and use of artificial intelligence and machine learning in road transportation systems for minimising traffic accidents. There is a lot of discussion on additional, fresh, and developing approaches that less frequent accidents in the transportation industry. The study structured the following subtopics specifically: traffic management using machine learning and artificial intelligence and an incident detector with these two technologies. The internet of vehicles and vehicle ad hoc networks, as well as the use of wireless communication technologies like 5G wireless networks and the use of machine learning and artificial intelligence for the planning of road transportation systems, are elaborated. In addition, safety is the primary concern of road transportation. Route optimization, cargo volume forecasting, predictive fleet maintenance, real-time vehicle tracking, and traffic management, according to the review's key conclusions, are essential for ensuring the safety of road transportation networks. In addition to highlighting research trends, unanswered problems, and key research conclusions, the study also discusses the difficulties in applying artificial intelligence to road transport systems. Planning and managing the road transportation system might use the work as a resource.

Keywords: artificial intelligence, machine learning, incident detector, road transport systems, traffic management, automatic incident detection, deep learning

Procedia PDF Downloads 105
18154 Predictive Modeling of Student Behavior in Virtual Reality: A Machine Learning Approach

Authors: Gayathri Sadanala, Shibam Pokhrel, Owen Murphy

Abstract:

In the ever-evolving landscape of education, Virtual Reality (VR) environments offer a promising avenue for enhancing student engagement and learning experiences. However, understanding and predicting student behavior within these immersive settings remain challenging tasks. This paper presents a comprehensive study on the predictive modeling of student behavior in VR using machine learning techniques. We introduce a rich data set capturing student interactions, movements, and progress within a VR orientation program. The dataset is divided into training and testing sets, allowing us to develop and evaluate predictive models for various aspects of student behavior, including engagement levels, task completion, and performance. Our machine learning approach leverages a combination of feature engineering and model selection to reveal hidden patterns in the data. We employ regression and classification models to predict student outcomes, and the results showcase promising accuracy in forecasting behavior within VR environments. Furthermore, we demonstrate the practical implications of our predictive models for personalized VR-based learning experiences and early intervention strategies. By uncovering the intricate relationship between student behavior and VR interactions, we provide valuable insights for educators, designers, and developers seeking to optimize virtual learning environments.

Keywords: interaction, machine learning, predictive modeling, virtual reality

Procedia PDF Downloads 137
18153 Advanced Machine Learning Algorithm for Credit Card Fraud Detection

Authors: Manpreet Kaur

Abstract:

When legitimate credit card users are mistakenly labelled as fraudulent in numerous financial delated applications, there are numerous ethical problems. The innovative machine learning approach we have suggested in this research outperforms the current models and shows how to model a data set for credit card fraud detection while minimizing false positives. As a result, we advise using random forests as the best machine learning method for predicting and identifying credit card transaction fraud. The majority of victims of these fraudulent transactions were discovered to be credit card users over the age of 60, with a higher percentage of fraudulent transactions taking place between the specific hours.

Keywords: automated fraud detection, isolation forest method, local outlier factor, ML algorithm, credit card

Procedia PDF Downloads 105
18152 Optimization of 3D Printing Parameters Using Machine Learning to Enhance Mechanical Properties in Fused Deposition Modeling (FDM) Technology

Authors: Darwin Junnior Sabino Diego, Brando Burgos Guerrero, Diego Arroyo Villanueva

Abstract:

Additive manufacturing, commonly known as 3D printing, has revolutionized modern manufacturing by enabling the agile creation of complex objects. However, challenges persist in the consistency and quality of printed parts, particularly in their mechanical properties. This study focuses on addressing these challenges through the optimization of printing parameters in FDM technology, using Machine Learning techniques. Our aim is to improve the mechanical properties of printed objects by optimizing parameters such as speed, temperature, and orientation. We implement a methodology that combines experimental data collection with Machine Learning algorithms to identify relationships between printing parameters and mechanical properties. The results demonstrate the potential of this methodology to enhance the quality and consistency of 3D printed products, with significant applications across various industrial fields. This research not only advances understanding of additive manufacturing but also opens new avenues for practical implementation in industrial settings.

Keywords: 3D printing, additive manufacturing, machine learning, mechanical properties

Procedia PDF Downloads 45
18151 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict the success of students in an introductory physics course. Data having 140 rows pertaining to the performance of two batches of students was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data was used to train machine learning models using the PyCaret package. For the CTGAN data, the Ada Boost Classifier (ADA) was found to be the ML model with the best fit, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy with the real data. ROC-AUC analysis was performed for all the ten classes of the target variable (Grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, but the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each Grade value separately. The LR model with Gaussian data showed consistently better AUC scores compared to the ADA model with CTGAN data, except in two cases of the Grade value, C- and A-.

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 41
18150 Learning Predictive Models for Efficient Energy Management of Exhibition Hall

Authors: Jeongmin Kim, Eunju Lee, Kwang Ryel Ryu

Abstract:

This paper addresses the problem of predictive control for energy management of large-scaled exhibition halls, where a lot of energy is consumed to maintain internal atmosphere under certain required conditions. Predictive control achieves better energy efficiency by optimizing the operation of air-conditioning facilities with not only the current but also some future status taken into account. In this paper, we propose to use predictive models learned from past sensor data of hall environment, for use in optimizing the operating plan for the air-conditioning facilities by simulating future environmental change. We have implemented an emulator of an exhibition hall by using EnergyPlus, a widely used building energy emulation tool, to collect data for learning environment-change models. Experimental results show that the learned models predict future change highly accurately on a short-term basis.

Keywords: predictive control, energy management, machine learning, optimization

Procedia PDF Downloads 266
18149 Injury Prediction for Soccer Players Using Machine Learning

Authors: Amiel Satvedi, Richard Pyne

Abstract:

Injuries in professional sports occur on a regular basis. Some may be minor, while others can cause huge impact on a player's career and earning potential. In soccer, there is a high risk of players picking up injuries during game time. This research work seeks to help soccer players reduce the risk of getting injured by predicting the likelihood of injury while playing in the near future and then providing recommendations for intervention. The injury prediction tool will use a soccer player's number of minutes played on the field, number of appearances, distance covered and performance data for the current and previous seasons as variables to conduct statistical analysis and provide injury predictive results using a machine learning linear regression model.

Keywords: injury predictor, soccer injury prevention, machine learning in soccer, big data in soccer

Procedia PDF Downloads 176
18148 Fake News Detection for Korean News Using Machine Learning Techniques

Authors: Tae-Uk Yun, Pullip Chung, Kee-Young Kwahk, Hyunchul Ahn

Abstract:

Fake news is defined as the news articles that are intentionally and verifiably false, and could mislead readers. Spread of fake news may provoke anxiety, chaos, fear, or irrational decisions of the public. Thus, detecting fake news and preventing its spread has become very important issue in our society. However, due to the huge amount of fake news produced every day, it is almost impossible to identify it by a human. Under this context, researchers have tried to develop automated fake news detection using machine learning techniques over the past years. But, there have been no prior studies proposed an automated fake news detection method for Korean news to our best knowledge. In this study, we aim to detect Korean fake news using text mining and machine learning techniques. Our proposed method consists of two steps. In the first step, the news contents to be analyzed is convert to quantified values using various text mining techniques (topic modeling, TF-IDF, and so on). After that, in step 2, classifiers are trained using the values produced in step 1. As the classifiers, machine learning techniques such as logistic regression, backpropagation network, support vector machine, and deep neural network can be applied. To validate the effectiveness of the proposed method, we collected about 200 short Korean news from Seoul National University’s FactCheck. which provides with detailed analysis reports from 20 media outlets and links to source documents for each case. Using this dataset, we will identify which text features are important as well as which classifiers are effective in detecting Korean fake news.

Keywords: fake news detection, Korean news, machine learning, text mining

Procedia PDF Downloads 273
18147 A Survey in Techniques for Imbalanced Intrusion Detection System Datasets

Authors: Najmeh Abedzadeh, Matthew Jacobs

Abstract:

An intrusion detection system (IDS) is a software application that monitors malicious activities and generates alerts if any are detected. However, most network activities in IDS datasets are normal, and the relatively few numbers of attacks make the available data imbalanced. Consequently, cyber-attacks can hide inside a large number of normal activities, and machine learning algorithms have difficulty learning and classifying the data correctly. In this paper, a comprehensive literature review is conducted on different types of algorithms for both implementing the IDS and methods in correcting the imbalanced IDS dataset. The most famous algorithms are machine learning (ML), deep learning (DL), synthetic minority over-sampling technique (SMOTE), and reinforcement learning (RL). Most of the research use the CSE-CIC-IDS2017, CSE-CIC-IDS2018, and NSL-KDD datasets for evaluating their algorithms.

Keywords: IDS, imbalanced datasets, sampling algorithms, big data

Procedia PDF Downloads 319
18146 Prediction of MicroRNA-Target Gene by Machine Learning Algorithms in Lung Cancer Study

Authors: Nilubon Kurubanjerdjit, Nattakarn Iam-On, Ka-Lok Ng

Abstract:

MicroRNAs are small non-coding RNA found in many different species. They play crucial roles in cancer such as biological processes of apoptosis and proliferation. The identification of microRNA-target genes can be an essential first step towards to reveal the role of microRNA in various cancer types. In this paper, we predict miRNA-target genes for lung cancer by integrating prediction scores from miRanda and PITA algorithms used as a feature vector of miRNA-target interaction. Then, machine-learning algorithms were implemented for making a final prediction. The approach developed in this study should be of value for future studies into understanding the role of miRNAs in molecular mechanisms enabling lung cancer formation.

Keywords: microRNA, miRNAs, lung cancer, machine learning, Naïve Bayes, SVM

Procedia PDF Downloads 396
18145 Improved Performance in Content-Based Image Retrieval Using Machine Learning Approach

Authors: B. Ramesh Naik, T. Venugopal

Abstract:

This paper presents a novel approach which improves the high-level semantics of images based on machine learning approach. The contemporary approaches for image retrieval and object recognition includes Fourier transforms, Wavelets, SIFT and HoG. Though these descriptors helpful in a wide range of applications, they exploit zero order statistics, and this lacks high descriptiveness of image features. These descriptors usually take benefit of primitive visual features such as shape, color, texture and spatial locations to describe images. These features do not adequate to describe high-level semantics of the images. This leads to a gap in semantic content caused to unacceptable performance in image retrieval system. A novel method has been proposed referred as discriminative learning which is derived from machine learning approach that efficiently discriminates image features. The analysis and results of proposed approach were validated thoroughly on WANG and Caltech-101 Databases. The results proved that this approach is very competitive in content-based image retrieval.

Keywords: CBIR, discriminative learning, region weight learning, scale invariant feature transforms

Procedia PDF Downloads 178
18144 DeClEx-Processing Pipeline for Tumor Classification

Authors: Gaurav Shinde, Sai Charan Gongiguntla, Prajwal Shirur, Ahmed Hambaba

Abstract:

Health issues are significantly increasing, putting a substantial strain on healthcare services. This has accelerated the integration of machine learning in healthcare, particularly following the COVID-19 pandemic. The utilization of machine learning in healthcare has grown significantly. We introduce DeClEx, a pipeline that ensures that data mirrors real-world settings by incorporating Gaussian noise and blur and employing autoencoders to learn intermediate feature representations. Subsequently, our convolutional neural network, paired with spatial attention, provides comparable accuracy to state-of-the-art pre-trained models while achieving a threefold improvement in training speed. Furthermore, we provide interpretable results using explainable AI techniques. We integrate denoising and deblurring, classification, and explainability in a single pipeline called DeClEx.

Keywords: machine learning, healthcare, classification, explainability

Procedia PDF Downloads 50
18143 Uplink Throughput Prediction in Cellular Mobile Networks

Authors: Engin Eyceyurt, Josko Zec

Abstract:

The current and future cellular mobile communication networks generate enormous amounts of data. Networks have become extremely complex with extensive space of parameters, features and counters. These networks are unmanageable with legacy methods and an enhanced design and optimization approach is necessary that is increasingly reliant on machine learning. This paper proposes that machine learning as a viable approach for uplink throughput prediction. LTE radio metric, such as Reference Signal Received Power (RSRP), Reference Signal Received Quality (RSRQ), and Signal to Noise Ratio (SNR) are used to train models to estimate expected uplink throughput. The prediction accuracy with high determination coefficient of 91.2% is obtained from measurements collected with a simple smartphone application.

Keywords: drive test, LTE, machine learning, uplink throughput prediction

Procedia PDF Downloads 152
18142 Predicting the Diagnosis of Alzheimer’s Disease: Development and Validation of Machine Learning Models

Authors: Jay L. Fu

Abstract:

Patients with Alzheimer's disease progressively lose their memory and thinking skills and, eventually, the ability to carry out simple daily tasks. The disease is irreversible, but early detection and treatment can slow down the disease progression. In this research, publicly available MRI data and demographic data from 373 MRI imaging sessions were utilized to build models to predict dementia. Various machine learning models, including logistic regression, k-nearest neighbor, support vector machine, random forest, and neural network, were developed. Data were divided into training and testing sets, where training sets were used to build the predictive model, and testing sets were used to assess the accuracy of prediction. Key risk factors were identified, and various models were compared to come forward with the best prediction model. Among these models, the random forest model appeared to be the best model with an accuracy of 90.34%. MMSE, nWBV, and gender were the three most important contributing factors to the detection of Alzheimer’s. Among all the models used, the percent in which at least 4 of the 5 models shared the same diagnosis for a testing input was 90.42%. These machine learning models allow early detection of Alzheimer’s with good accuracy, which ultimately leads to early treatment of these patients.

Keywords: Alzheimer's disease, clinical diagnosis, magnetic resonance imaging, machine learning prediction

Procedia PDF Downloads 141
18141 Missing Link Data Estimation with Recurrent Neural Network: An Application Using Speed Data of Daegu Metropolitan Area

Authors: JaeHwan Yang, Da-Woon Jeong, Seung-Young Kho, Dong-Kyu Kim

Abstract:

In terms of ITS, information on link characteristic is an essential factor for plan or operation. But in practical cases, not every link has installed sensors on it. The link that does not have data on it is called “Missing Link”. The purpose of this study is to impute data of these missing links. To get these data, this study applies the machine learning method. With the machine learning process, especially for the deep learning process, missing link data can be estimated from present link data. For deep learning process, this study uses “Recurrent Neural Network” to take time-series data of road. As input data, Dedicated Short-range Communications (DSRC) data of Dalgubul-daero of Daegu Metropolitan Area had been fed into the learning process. Neural Network structure has 17 links with present data as input, 2 hidden layers, for 1 missing link data. As a result, forecasted data of target link show about 94% of accuracy compared with actual data.

Keywords: data estimation, link data, machine learning, road network

Procedia PDF Downloads 508
18140 Resilient Machine Learning in the Nuclear Industry: Crack Detection as a Case Study

Authors: Anita Khadka, Gregory Epiphaniou, Carsten Maple

Abstract:

There is a dramatic surge in the adoption of machine learning (ML) techniques in many areas, including the nuclear industry (such as fault diagnosis and fuel management in nuclear power plants), autonomous systems (including self-driving vehicles), space systems (space debris recovery, for example), medical surgery, network intrusion detection, malware detection, to name a few. With the application of learning methods in such diverse domains, artificial intelligence (AI) has become a part of everyday modern human life. To date, the predominant focus has been on developing underpinning ML algorithms that can improve accuracy, while factors such as resiliency and robustness of algorithms have been largely overlooked. If an adversarial attack is able to compromise the learning method or data, the consequences can be fatal, especially but not exclusively in safety-critical applications. In this paper, we present an in-depth analysis of five adversarial attacks and three defence methods on a crack detection ML model. Our analysis shows that it can be dangerous to adopt machine learning techniques in security-critical areas such as the nuclear industry without rigorous testing since they may be vulnerable to adversarial attacks. While common defence methods can effectively defend against different attacks, none of the three considered can provide protection against all five adversarial attacks analysed.

Keywords: adversarial machine learning, attacks, defences, nuclear industry, crack detection

Procedia PDF Downloads 152
18139 Fine-Grained Sentiment Analysis: Recent Progress

Authors: Jie Liu, Xudong Luo, Pingping Lin, Yifan Fan

Abstract:

Facebook, Twitter, Weibo, and other social media and significant e-commerce sites generate a massive amount of online texts, which can be used to analyse people’s opinions or sentiments for better decision-making. So, sentiment analysis, especially fine-grained sentiment analysis, is a very active research topic. In this paper, we survey various methods for fine-grained sentiment analysis, including traditional sentiment lexicon-based methods, machine learning-based methods, and deep learning-based methods in aspect/target/attribute-based sentiment analysis tasks. Besides, we discuss their advantages and problems worthy of careful studies in the future.

Keywords: sentiment analysis, fine-grained, machine learning, deep learning

Procedia PDF Downloads 256
18138 Machine Learning Techniques to Predict Cyberbullying and Improve Social Work Interventions

Authors: Oscar E. Cariceo, Claudia V. Casal

Abstract:

Machine learning offers a set of techniques to promote social work interventions and can lead to support decisions of practitioners in order to predict new behaviors based on data produced by the organizations, services agencies, users, clients or individuals. Machine learning techniques include a set of generalizable algorithms that are data-driven, which means that rules and solutions are derived by examining data, based on the patterns that are present within any data set. In other words, the goal of machine learning is teaching computers through 'examples', by training data to test specifics hypothesis and predict what would be a certain outcome, based on a current scenario and improve that experience. Machine learning can be classified into two general categories depending on the nature of the problem that this technique needs to tackle. First, supervised learning involves a dataset that is already known in terms of their output. Supervising learning problems are categorized, into regression problems, which involve a prediction from quantitative variables, using a continuous function; and classification problems, which seek predict results from discrete qualitative variables. For social work research, machine learning generates predictions as a key element to improving social interventions on complex social issues by providing better inference from data and establishing more precise estimated effects, for example in services that seek to improve their outcomes. This paper exposes the results of a classification algorithm to predict cyberbullying among adolescents. Data were retrieved from the National Polyvictimization Survey conducted by the government of Chile in 2017. A logistic regression model was created to predict if an adolescent would experience cyberbullying based on the interaction and behavior of gender, age, grade, type of school, and self-esteem sentiments. The model can predict with an accuracy of 59.8% if an adolescent will suffer cyberbullying. These results can help to promote programs to avoid cyberbullying at schools and improve evidence based practice.

Keywords: cyberbullying, evidence based practice, machine learning, social work research

Procedia PDF Downloads 166
18137 Optimizing Machine Vision System Setup Accuracy by Six-Sigma DMAIC Approach

Authors: Joseph C. Chen

Abstract:

Machine vision system provides automatic inspection to reduce manufacturing costs considerably. However, only a few principles have been found to optimize machine vision system and help it function more accurately in industrial practice. Mostly, there were complicated and impractical design techniques to improve the accuracy of machine vision system. This paper discusses implementing the Six Sigma Define, Measure, Analyze, Improve, and Control (DMAIC) approach to optimize the setup parameters of machine vision system when it is used as a direct measurement technique. This research follows a case study showing how Six Sigma DMAIC methodology has been put into use.

Keywords: DMAIC, machine vision system, process capability, Taguchi Parameter Design

Procedia PDF Downloads 432
18136 Developing a Hybrid Method to Diagnose and Predict Sports Related Concussions with Machine Learning

Authors: Melody Yin

Abstract:

Concussions impact a large amount of adolescents; they make up as much as half of the diagnosed concussions in America. This research proposes a hybrid machine learning model based on the combination of human/knowledge-based domains and computer-generated feature rankings to improve the accuracy of diagnosing sports related concussion (SRC). Using a data set of symptoms collected on the sideline post-SRC events, the symptom selection criteria method has been developed by using Google AutoML's important score function to identify the top 10 symptom features. In addition, symptom domains have been introduced as another parameter, categorizing the symptoms into physical, cognitive, sleep, and emotional domains. The hybrid machine learning model has been trained with a combination of the top 10 symptoms and 4 domains. From the results, the hybrid model was the best performer for symptom resolution time prediction in 2 and 4-week thresholds. This research is a proof of concept study in the use of domains along with machine learning in order to improve concussion prediction accuracy. It is also possible that the use of domains can make the model more efficient due to reduced training time. This research examines the use of a hybrid method in predicting sports-related concussion. This achievement is based on data preprocessing, using a hybrid method to select criteria to achieve high performance.

Keywords: hybrid model, machine learning, sports related concussion, symptom resolution time

Procedia PDF Downloads 163
18135 Machine Learning Data Architecture

Authors: Neerav Kumar, Naumaan Nayyar, Sharath Kashyap

Abstract:

Most companies see an increase in the adoption of machine learning (ML) applications across internal and external-facing use cases. ML applications vend output either in batch or real-time patterns. A complete batch ML pipeline architecture comprises data sourcing, feature engineering, model training, model deployment, model output vending into a data store for downstream application. Due to unclear role expectations, we have observed that scientists specializing in building and optimizing models are investing significant efforts into building the other components of the architecture, which we do not believe is the best use of scientists’ bandwidth. We propose a system architecture created using AWS services that bring industry best practices to managing the workflow and simplifies the process of model deployment and end-to-end data integration for an ML application. This narrows down the scope of scientists’ work to model building and refinement while specialized data engineers take over the deployment, pipeline orchestration, data quality, data permission system, etc. The pipeline infrastructure is built and deployed as code (using terraform, cdk, cloudformation, etc.) which makes it easy to replicate and/or extend the architecture to other models that are used in an organization.

Keywords: data pipeline, machine learning, AWS, architecture, batch machine learning

Procedia PDF Downloads 60
18134 Machine Learning for Classifying Risks of Death and Length of Stay of Patients in Intensive Unit Care Beds

Authors: Itamir de Morais Barroca Filho, Cephas A. S. Barreto, Ramon Malaquias, Cezar Miranda Paula de Souza, Arthur Costa Gorgônio, João C. Xavier-Júnior, Mateus Firmino, Fellipe Matheus Costa Barbosa

Abstract:

Information and Communication Technologies (ICT) in healthcare are crucial for efficiently delivering medical healthcare services to patients. These ICTs are also known as e-health and comprise technologies such as electronic record systems, telemedicine systems, and personalized devices for diagnosis. The focus of e-health is to improve the quality of health information, strengthen national health systems, and ensure accessible, high-quality health care for all. All the data gathered by these technologies make it possible to help clinical staff with automated decisions using machine learning. In this context, we collected patient data, such as heart rate, oxygen saturation (SpO2), blood pressure, respiration, and others. With this data, we were able to develop machine learning models for patients’ risk of death and estimate the length of stay in ICU beds. Thus, this paper presents the methodology for applying machine learning techniques to develop these models. As a result, although we implemented these models on an IoT healthcare platform, helping clinical staff in healthcare in an ICU, it is essential to create a robust clinical validation process and monitoring of the proposed models.

Keywords: ICT, e-health, machine learning, ICU, healthcare

Procedia PDF Downloads 99
18133 Physics-Informed Machine Learning for Displacement Estimation in Solid Mechanics Problem

Authors: Feng Yang

Abstract:

Machine learning (ML), especially deep learning (DL), has been extensively applied to many applications in recently years and gained great success in solving different problems, including scientific problems. However, conventional ML/DL methodologies are purely data-driven which have the limitations, such as need of ample amount of labelled training data, lack of consistency to physical principles, and lack of generalizability to new problems/domains. Recently, there is a growing consensus that ML models need to further take advantage of prior knowledge to deal with these limitations. Physics-informed machine learning, aiming at integration of physics/domain knowledge into ML, has been recognized as an emerging area of research, especially in the recent 2 to 3 years. In this work, physics-informed ML, specifically physics-informed neural network (NN), is employed and implemented to estimate the displacements at x, y, z directions in a solid mechanics problem that is controlled by equilibrium equations with boundary conditions. By incorporating the physics (i.e. the equilibrium equations) into the learning process of NN, it is showed that the NN can be trained very efficiently with a small set of labelled training data. Experiments with different settings of the NN model and the amount of labelled training data were conducted, and the results show that very high accuracy can be achieved in fulfilling the equilibrium equations as well as in predicting the displacements, e.g. in setting the overall displacement of 0.1, a root mean square error (RMSE) of 2.09 × 10−4 was achieved.

Keywords: deep learning, neural network, physics-informed machine learning, solid mechanics

Procedia PDF Downloads 145