Search results for: automated machine learning
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8791

Search results for: automated machine learning

8581 Human Digital Twin for Personal Conversation Automation Using Supervised Machine Learning Approaches

Authors: Aya Salama

Abstract:

Digital Twin is an emerging research topic that attracted researchers in the last decade. It is used in many fields, such as smart manufacturing and smart healthcare because it saves time and money. It is usually related to other technologies such as Data Mining, Artificial Intelligence, and Machine Learning. However, Human digital twin (HDT), in specific, is still a novel idea that still needs to prove its feasibility. HDT expands the idea of Digital Twin to human beings, which are living beings and different from the inanimate physical entities. The goal of this research was to create a Human digital twin that is responsible for real-time human replies automation by simulating human behavior. For this reason, clustering, supervised classification, topic extraction, and sentiment analysis were studied in this paper. The feasibility of the HDT for personal replies generation on social messaging applications was proved in this work. The overall accuracy of the proposed approach in this paper was 63% which is a very promising result that can open the way for researchers to expand the idea of HDT. This was achieved by using Random Forest for clustering the question data base and matching new questions. K-nearest neighbor was also applied for sentiment analysis.

Keywords: human digital twin, sentiment analysis, topic extraction, supervised machine learning, unsupervised machine learning, classification, clustering

Procedia PDF Downloads 65
8580 Quantum Statistical Machine Learning and Quantum Time Series

Authors: Omar Alzeley, Sergey Utev

Abstract:

Minimizing a constrained multivariate function is the fundamental of Machine learning, and these algorithms are at the core of data mining and data visualization techniques. The decision function that maps input points to output points is based on the result of optimization. This optimization is the central of learning theory. One approach to complex systems where the dynamics of the system is inferred by a statistical analysis of the fluctuations in time of some associated observable is time series analysis. The purpose of this paper is a mathematical transition from the autoregressive model of classical time series to the matrix formalization of quantum theory. Firstly, we have proposed a quantum time series model (QTS). Although Hamiltonian technique becomes an established tool to detect a deterministic chaos, other approaches emerge. The quantum probabilistic technique is used to motivate the construction of our QTS model. The QTS model resembles the quantum dynamic model which was applied to financial data. Secondly, various statistical methods, including machine learning algorithms such as the Kalman filter algorithm, are applied to estimate and analyses the unknown parameters of the model. Finally, simulation techniques such as Markov chain Monte Carlo have been used to support our investigations. The proposed model has been examined by using real and simulated data. We establish the relation between quantum statistical machine and quantum time series via random matrix theory. It is interesting to note that the primary focus of the application of QTS in the field of quantum chaos was to find a model that explain chaotic behaviour. Maybe this model will reveal another insight into quantum chaos.

Keywords: machine learning, simulation techniques, quantum probability, tensor product, time series

Procedia PDF Downloads 437
8579 Automated Manual Handling Risk Assessments: Practitioner Experienced Determinants of Automated Risk Analysis and Reporting Being a Benefit or Distraction

Authors: S. Cowley, M. Lawrance, D. Bick, R. McCord

Abstract:

Technology that automates manual handling (musculoskeletal disorder or MSD) risk assessments is increasingly available to ergonomists, engineers, generalist health and safety practitioners alike. The risk assessment process is generally based on the use of wearable motion sensors that capture information about worker movements for real-time or for posthoc analysis. Traditionally, MSD risk assessment is undertaken with the assistance of a checklist such as that from the SafeWork Australia code of practice, the expert assessor observing the task and ideally engaging with the worker in a discussion about the detail. Automation enables the non-expert to complete assessments and does not always require the assessor to be there. This clearly has cost and time benefits for the practitioner but is it an improvement on the assessment by the human. Human risk assessments draw on the knowledge and expertise of the assessor but, like all risk assessments, are highly subjective. The complexity of the checklists and models used in the process can be off-putting and sometimes will lead to the assessment becoming the focus and the end rather than a means to an end; the focus on risk control is lost. Automated risk assessment handles the complexity of the assessment for the assessor and delivers a simple risk score that enables decision-making regarding risk control. Being machine-based, they are objective and will deliver the same each time they assess an identical task. However, the WHS professional needs to know that this emergent technology asks the right questions and delivers the right answers. Whether it improves the risk assessment process and results or simply distances the professional from the task and the worker. They need clarity as to whether automation of manual task risk analysis and reporting leads to risk control or to a focus on the worker. Critically, they need evidence as to whether automation in this area of hazard management leads to better risk control or just a bigger collection of assessments. Practitioner experienced determinants of this automated manual task risk analysis and reporting being a benefit or distraction will address an understanding of emergent risk assessment technology, its use and things to consider when making decisions about adopting and applying these technologies.

Keywords: automated, manual-handling, risk-assessment, machine-based

Procedia PDF Downloads 96
8578 Segmentation Using Multi-Thresholded Sobel Images: Application to the Separation of Stuck Pollen Grains

Authors: Endrick Barnacin, Jean-Luc Henry, Jimmy Nagau, Jack Molinie

Abstract:

Being able to identify biological particles such as spores, viruses, or pollens is important for health care professionals, as it allows for appropriate therapeutic management of patients. Optical microscopy is a technology widely used for the analysis of these types of microorganisms, because, compared to other types of microscopy, it is not expensive. The analysis of an optical microscope slide is a tedious and time-consuming task when done manually. However, using machine learning and computer vision, this process can be automated. The first step of an automated microscope slide image analysis process is segmentation. During this step, the biological particles are localized and extracted. Very often, the use of an automatic thresholding method is sufficient to locate and extract the particles. However, in some cases, the particles are not extracted individually because they are stuck to other biological elements. In this paper, we propose a stuck particles separation method based on the use of the Sobel operator and thresholding. We illustrate it by applying it to the separation of 813 images of adjacent pollen grains. The method correctly separated 95.4% of these images.

Keywords: image segmentation, stuck particles separation, Sobel operator, thresholding

Procedia PDF Downloads 107
8577 A Machine Learning Model for Dynamic Prediction of Chronic Kidney Disease Risk Using Laboratory Data, Non-Laboratory Data, and Metabolic Indices

Authors: Amadou Wurry Jallow, Adama N. S. Bah, Karamo Bah, Shih-Ye Wang, Kuo-Chung Chu, Chien-Yeh Hsu

Abstract:

Chronic kidney disease (CKD) is a major public health challenge with high prevalence, rising incidence, and serious adverse consequences. Developing effective risk prediction models is a cost-effective approach to predicting and preventing complications of chronic kidney disease (CKD). This study aimed to develop an accurate machine learning model that can dynamically identify individuals at risk of CKD using various kinds of diagnostic data, with or without laboratory data, at different follow-up points. Creatinine is a key component used to predict CKD. These models will enable affordable and effective screening for CKD even with incomplete patient data, such as the absence of creatinine testing. This retrospective cohort study included data on 19,429 adults provided by a private research institute and screening laboratory in Taiwan, gathered between 2001 and 2015. Univariate Cox proportional hazard regression analyses were performed to determine the variables with high prognostic values for predicting CKD. We then identified interacting variables and grouped them according to diagnostic data categories. Our models used three types of data gathered at three points in time: non-laboratory, laboratory, and metabolic indices data. Next, we used subgroups of variables within each category to train two machine learning models (Random Forest and XGBoost). Our machine learning models can dynamically discriminate individuals at risk for developing CKD. All the models performed well using all three kinds of data, with or without laboratory data. Using only non-laboratory-based data (such as age, sex, body mass index (BMI), and waist circumference), both models predict chronic kidney disease as accurately as models using laboratory and metabolic indices data. Our machine learning models have demonstrated the use of different categories of diagnostic data for CKD prediction, with or without laboratory data. The machine learning models are simple to use and flexible because they work even with incomplete data and can be applied in any clinical setting, including settings where laboratory data is difficult to obtain.

Keywords: chronic kidney disease, glomerular filtration rate, creatinine, novel metabolic indices, machine learning, risk prediction

Procedia PDF Downloads 75
8576 Development of a Decision-Making Method by Using Machine Learning Algorithms in the Early Stage of School Building Design

Authors: Pegah Eshraghi, Zahra Sadat Zomorodian, Mohammad Tahsildoost

Abstract:

Over the past decade, energy consumption in educational buildings has steadily increased. The purpose of this research is to provide a method to quickly predict the energy consumption of buildings using separate evaluation of zones and decomposing the building to eliminate the complexity of geometry at the early design stage. To produce this framework, machine learning algorithms such as Support vector regression (SVR) and Artificial neural network (ANN) are used to predict energy consumption and thermal comfort metrics in a school as a case. The database consists of more than 55000 samples in three climates of Iran. Cross-validation evaluation and unseen data have been used for validation. In a specific label, cooling energy, it can be said the accuracy of prediction is at least 84% and 89% in SVR and ANN, respectively. The results show that the SVR performed much better than the ANN.

Keywords: early stage of design, energy, thermal comfort, validation, machine learning

Procedia PDF Downloads 58
8575 Use of Machine Learning in Data Quality Assessment

Authors: Bruno Pinto Vieira, Marco Antonio Calijorne Soares, Armando Sérgio de Aguiar Filho

Abstract:

Nowadays, a massive amount of information has been produced by different data sources, including mobile devices and transactional systems. In this scenario, concerns arise on how to maintain or establish data quality, which is now treated as a product to be defined, measured, analyzed, and improved to meet consumers' needs, which is the one who uses these data in decision making and companies strategies. Information that reaches low levels of quality can lead to issues that can consume time and money, such as missed business opportunities, inadequate decisions, and bad risk management actions. The step of selecting, identifying, evaluating, and selecting data sources with significant quality according to the need has become a costly task for users since the sources do not provide information about their quality. Traditional data quality control methods are based on user experience or business rules limiting performance and slowing down the process with less than desirable accuracy. Using advanced machine learning algorithms, it is possible to take advantage of computational resources to overcome challenges and add value to companies and users. In this study, machine learning is applied to data quality analysis on different datasets, seeking to compare the performance of the techniques according to the dimensions of quality assessment. As a result, we could create a ranking of approaches used, besides a system that is able to carry out automatically, data quality assessment.

Keywords: machine learning, data quality, quality dimension, quality assessment

Procedia PDF Downloads 120
8574 Performance Analysis of Traffic Classification with Machine Learning

Authors: Htay Htay Yi, Zin May Aye

Abstract:

Network security is role of the ICT environment because malicious users are continually growing that realm of education, business, and then related with ICT. The network security contravention is typically described and examined centrally based on a security event management system. The firewalls, Intrusion Detection System (IDS), and Intrusion Prevention System are becoming essential to monitor or prevent of potential violations, incidents attack, and imminent threats. In this system, the firewall rules are set only for where the system policies are needed. Dataset deployed in this system are derived from the testbed environment. The traffic as in DoS and PortScan traffics are applied in the testbed with firewall and IDS implementation. The network traffics are classified as normal or attacks in the existing testbed environment based on six machine learning classification methods applied in the system. It is required to be tested to get datasets and applied for DoS and PortScan. The dataset is based on CICIDS2017 and some features have been added. This system tested 26 features from the applied dataset. The system is to reduce false positive rates and to improve accuracy in the implemented testbed design. The system also proves good performance by selecting important features and comparing existing a dataset by machine learning classifiers.

Keywords: false negative rate, intrusion detection system, machine learning methods, performance

Procedia PDF Downloads 99
8573 Machine Learning Approach for Anomaly Detection in the Simulated Iec-60870-5-104 Traffic

Authors: Stepan Grebeniuk, Ersi Hodo, Henri Ruotsalainen, Paul Tavolato

Abstract:

Substation security plays an important role in the power delivery system. During the past years, there has been an increase in number of attacks on automation networks of the substations. In spite of that, there hasn’t been enough focus dedicated to the protection of such networks. Aiming to design a specialized anomaly detection system based on machine learning, in this paper we will discuss the IEC 60870-5-104 protocol that is used for communication between substation and control station and focus on the simulation of the substation traffic. Firstly, we will simulate the communication between substation slave and server. Secondly, we will compare the system's normal behavior and its behavior under the attack, in order to extract the right features which will be needed for building an anomaly detection system. Lastly, based on the features we will suggest the anomaly detection system for the asynchronous protocol IEC 60870-5-104.

Keywords: Anomaly detection, IEC-60870-5-104, Machine learning, Man-in-the-Middle attacks, Substation security

Procedia PDF Downloads 336
8572 Managing Data from One Hundred Thousand Internet of Things Devices Globally for Mining Insights

Authors: Julian Wise

Abstract:

Newcrest Mining is one of the world’s top five gold and rare earth mining organizations by production, reserves and market capitalization in the world. This paper elaborates on the data acquisition processes employed by Newcrest in collaboration with Fortune 500 listed organization, Insight Enterprises, to standardize machine learning solutions which process data from over a hundred thousand distributed Internet of Things (IoT) devices located at mine sites globally. Through the utilization of software architecture cloud technologies and edge computing, the technological developments enable for standardized processes of machine learning applications to influence the strategic optimization of mineral processing. Target objectives of the machine learning optimizations include time savings on mineral processing, production efficiencies, risk identification, and increased production throughput. The data acquired and utilized for predictive modelling is processed through edge computing by resources collectively stored within a data lake. Being involved in the digital transformation has necessitated the standardization software architecture to manage the machine learning models submitted by vendors, to ensure effective automation and continuous improvements to the mineral process models. Operating at scale, the system processes hundreds of gigabytes of data per day from distributed mine sites across the globe, for the purposes of increased improved worker safety, and production efficiency through big data applications.

Keywords: mineral technology, big data, machine learning operations, data lake

Procedia PDF Downloads 87
8571 Supervised Learning for Cyber Threat Intelligence

Authors: Jihen Bennaceur, Wissem Zouaghi, Ali Mabrouk

Abstract:

The major aim of cyber threat intelligence (CTI) is to provide sophisticated knowledge about cybersecurity threats to ensure internal and external safeguards against modern cyberattacks. Inaccurate, incomplete, outdated, and invaluable threat intelligence is the main problem. Therefore, data analysis based on AI algorithms is one of the emergent solutions to overcome the threat of information-sharing issues. In this paper, we propose a supervised machine learning-based algorithm to improve threat information sharing by providing a sophisticated classification of cyber threats and data. Extensive simulations investigate the accuracy, precision, recall, f1-score, and support overall to validate the designed algorithm and to compare it with several supervised machine learning algorithms.

Keywords: threat information sharing, supervised learning, data classification, performance evaluation

Procedia PDF Downloads 119
8570 A Comprehensive Review of Artificial Intelligence Applications in Sustainable Building

Authors: Yazan Al-Kofahi, Jamal Alqawasmi.

Abstract:

In this study, a comprehensive literature review (SLR) was conducted, with the main goal of assessing the existing literature about how artificial intelligence (AI), machine learning (ML), deep learning (DL) models are used in sustainable architecture applications and issues including thermal comfort satisfaction, energy efficiency, cost prediction and many others issues. For this reason, the search strategy was initiated by using different databases, including Scopus, Springer and Google Scholar. The inclusion criteria were used by two research strings related to DL, ML and sustainable architecture. Moreover, the timeframe for the inclusion of the papers was open, even though most of the papers were conducted in the previous four years. As a paper filtration strategy, conferences and books were excluded from database search results. Using these inclusion and exclusion criteria, the search was conducted, and a sample of 59 papers was selected as the final included papers in the analysis. The data extraction phase was basically to extract the needed data from these papers, which were analyzed and correlated. The results of this SLR showed that there are many applications of ML and DL in Sustainable buildings, and that this topic is currently trendy. It was found that most of the papers focused their discussions on addressing Environmental Sustainability issues and factors using machine learning predictive models, with a particular emphasis on the use of Decision Tree algorithms. Moreover, it was found that the Random Forest repressor demonstrates strong performance across all feature selection groups in terms of cost prediction of the building as a machine-learning predictive model.

Keywords: machine learning, deep learning, artificial intelligence, sustainable building

Procedia PDF Downloads 35
8569 A Machine Learning Approach for Anomaly Detection in Environmental IoT-Driven Wastewater Purification Systems

Authors: Giovanni Cicceri, Roberta Maisano, Nathalie Morey, Salvatore Distefano

Abstract:

The main goal of this paper is to present a solution for a water purification system based on an Environmental Internet of Things (EIoT) platform to monitor and control water quality and machine learning (ML) models to support decision making and speed up the processes of purification of water. A real case study has been implemented by deploying an EIoT platform and a network of devices, called Gramb meters and belonging to the Gramb project, on wastewater purification systems located in Calabria, south of Italy. The data thus collected are used to control the wastewater quality, detect anomalies and predict the behaviour of the purification system. To this extent, three different statistical and machine learning models have been adopted and thus compared: Autoregressive Integrated Moving Average (ARIMA), Long Short Term Memory (LSTM) autoencoder, and Facebook Prophet (FP). The results demonstrated that the ML solution (LSTM) out-perform classical statistical approaches (ARIMA, FP), in terms of both accuracy, efficiency and effectiveness in monitoring and controlling the wastewater purification processes.

Keywords: environmental internet of things, EIoT, machine learning, anomaly detection, environment monitoring

Procedia PDF Downloads 124
8568 Machine Learning Approach for Automating Electronic Component Error Classification and Detection

Authors: Monica Racha, Siva Chandrasekaran, Alex Stojcevski

Abstract:

The engineering programs focus on promoting students' personal and professional development by ensuring that students acquire technical and professional competencies during four-year studies. The traditional engineering laboratory provides an opportunity for students to "practice by doing," and laboratory facilities aid them in obtaining insight and understanding of their discipline. Due to rapid technological advancements and the current COVID-19 outbreak, the traditional labs were transforming into virtual learning environments. Aim: To better understand the limitations of the physical laboratory, this research study aims to use a Machine Learning (ML) algorithm that interfaces with the Augmented Reality HoloLens and predicts the image behavior to classify and detect the electronic components. The automated electronic components error classification and detection automatically detect and classify the position of all components on a breadboard by using the ML algorithm. This research will assist first-year undergraduate engineering students in conducting laboratory practices without any supervision. With the help of HoloLens, and ML algorithm, students will reduce component placement error on a breadboard and increase the efficiency of simple laboratory practices virtually. Method: The images of breadboards, resistors, capacitors, transistors, and other electrical components will be collected using HoloLens 2 and stored in a database. The collected image dataset will then be used for training a machine learning model. The raw images will be cleaned, processed, and labeled to facilitate further analysis of components error classification and detection. For instance, when students conduct laboratory experiments, the HoloLens captures images of students placing different components on a breadboard. The images are forwarded to the server for detection in the background. A hybrid Convolutional Neural Networks (CNNs) and Support Vector Machines (SVMs) algorithm will be used to train the dataset for object recognition and classification. The convolution layer extracts image features, which are then classified using Support Vector Machine (SVM). By adequately labeling the training data and classifying, the model will predict, categorize, and assess students in placing components correctly. As a result, the data acquired through HoloLens includes images of students assembling electronic components. It constantly checks to see if students appropriately position components in the breadboard and connect the components to function. When students misplace any components, the HoloLens predicts the error before the user places the components in the incorrect proportion and fosters students to correct their mistakes. This hybrid Convolutional Neural Networks (CNNs) and Support Vector Machines (SVMs) algorithm automating electronic component error classification and detection approach eliminates component connection problems and minimizes the risk of component damage. Conclusion: These augmented reality smart glasses powered by machine learning provide a wide range of benefits to supervisors, professionals, and students. It helps customize the learning experience, which is particularly beneficial in large classes with limited time. It determines the accuracy with which machine learning algorithms can forecast whether students are making the correct decisions and completing their laboratory tasks.

Keywords: augmented reality, machine learning, object recognition, virtual laboratories

Procedia PDF Downloads 112
8567 A Machine Learning Pipeline for Real-Time Activity Detection on Low Computational Power Devices for Metaverse Applications

Authors: Amit Kumar, Amanpreet Chander, Ashish Sahani

Abstract:

This paper presents our recent work on real-time human activity detection based on the media pipe pipeline and machine learning algorithms. The proposed system can detect human activities, including running, jumping, squatting, bending to the left or right, and standing still. This is a robust solution for developing a yoga, dance, metaverse, and fitness application that checks for the correction of the pose without having any additional monitor like a personal trainer. MediaPipe solution offers an open-source cross-platform which utilizes a two-step detector-tracker ML pipeline for live detection of key landmarks on our body which can be used for motion data collection. The prediction of real-time poses uses a variety of machine learning techniques and different types of analysis. Without primarily relying on powerful desktop environments for inference, our method achieves real-time performance on the majority of contemporary mobile phones, desktops/laptops, Python, or even the web. Experimental results show that our method outperforms the existing method in terms of accuracy and real-time capability, achieving an accuracy of 99.92% on testing datasets.

Keywords: human activity detection, media pipe, machine learning, metaverse applications

Procedia PDF Downloads 149
8566 Network Analysis and Sex Prediction based on a full Human Brain Connectome

Authors: Oleg Vlasovets, Fabian Schaipp, Christian L. Mueller

Abstract:

we conduct a network analysis and predict the sex of 1000 participants based on ”connectome” - pairwise Pearson’s correlation across 436 brain parcels. We solve the non-smooth convex optimization problem, known under the name of Graphical Lasso, where the solution includes a low-rank component. With this solution and machine learning model for a sex prediction, we explain the brain parcels-sex connectivity patterns.

Keywords: network analysis, neuroscience, machine learning, optimization

Procedia PDF Downloads 117
8565 Structural Reliability Analysis Using Extreme Learning Machine

Authors: Mehul Srivastava, Sharma Tushar Ravikant, Mridul Krishn Mishra

Abstract:

In structural design, the evaluation of safety and probability failure of structure is of significant importance, mainly when the variables are random. On real structures, structural reliability can be evaluated obtaining an implicit limit state function. The structural reliability limit state function is obtained depending upon the statistically independent variables. In the analysis of reliability, we considered the statistically independent random variables to be the load intensity applied and the depth or height of the beam member considered. There are many approaches for structural reliability problems. In this paper Extreme Learning Machine technique and First Order Second Moment Method is used to determine the reliability indices for the same set of variables. The reliability index obtained using ELM is compared with the reliability index obtained using FOSM. Higher the reliability index, more feasible is the method to determine the reliability.

Keywords: reliability, reliability index, statistically independent, extreme learning machine

Procedia PDF Downloads 651
8564 PaSA: A Dataset for Patent Sentiment Analysis to Highlight Patent Paragraphs

Authors: Renukswamy Chikkamath, Vishvapalsinhji Ramsinh Parmar, Christoph Hewel, Markus Endres

Abstract:

Given a patent document, identifying distinct semantic annotations is an interesting research aspect. Text annotation helps the patent practitioners such as examiners and patent attorneys to quickly identify the key arguments of any invention, successively providing a timely marking of a patent text. In the process of manual patent analysis, to attain better readability, recognising the semantic information by marking paragraphs is in practice. This semantic annotation process is laborious and time-consuming. To alleviate such a problem, we proposed a dataset to train machine learning algorithms to automate the highlighting process. The contributions of this work are: i) we developed a multi-class dataset of size 150k samples by traversing USPTO patents over a decade, ii) articulated statistics and distributions of data using imperative exploratory data analysis, iii) baseline Machine Learning models are developed to utilize the dataset to address patent paragraph highlighting task, and iv) future path to extend this work using Deep Learning and domain-specific pre-trained language models to develop a tool to highlight is provided. This work assists patent practitioners in highlighting semantic information automatically and aids in creating a sustainable and efficient patent analysis using the aptitude of machine learning.

Keywords: machine learning, patents, patent sentiment analysis, patent information retrieval

Procedia PDF Downloads 66
8563 Voltage Problem Location Classification Using Performance of Least Squares Support Vector Machine LS-SVM and Learning Vector Quantization LVQ

Authors: M. Khaled Abduesslam, Mohammed Ali, Basher H. Alsdai, Muhammad Nizam Inayati

Abstract:

This paper presents the voltage problem location classification using performance of Least Squares Support Vector Machine (LS-SVM) and Learning Vector Quantization (LVQ) in electrical power system for proper voltage problem location implemented by IEEE 39 bus New-England. The data was collected from the time domain simulation by using Power System Analysis Toolbox (PSAT). Outputs from simulation data such as voltage, phase angle, real power and reactive power were taken as input to estimate voltage stability at particular buses based on Power Transfer Stability Index (PTSI).The simulation data was carried out on the IEEE 39 bus test system by considering load bus increased on the system. To verify of the proposed LS-SVM its performance was compared to Learning Vector Quantization (LVQ). The results showed that LS-SVM is faster and better as compared to LVQ. The results also demonstrated that the LS-SVM was estimated by 0% misclassification whereas LVQ had 7.69% misclassification.

Keywords: IEEE 39 bus, least squares support vector machine, learning vector quantization, voltage collapse

Procedia PDF Downloads 417
8562 Customer Churn Prediction by Using Four Machine Learning Algorithms Integrating Features Selection and Normalization in the Telecom Sector

Authors: Alanoud Moraya Aldalan, Abdulaziz Almaleh

Abstract:

A crucial component of maintaining a customer-oriented business as in the telecom industry is understanding the reasons and factors that lead to customer churn. Competition between telecom companies has greatly increased in recent years. It has become more important to understand customers’ needs in this strong market of telecom industries, especially for those who are looking to turn over their service providers. So, predictive churn is now a mandatory requirement for retaining those customers. Machine learning can be utilized to accomplish this. Churn Prediction has become a very important topic in terms of machine learning classification in the telecommunications industry. Understanding the factors of customer churn and how they behave is very important to building an effective churn prediction model. This paper aims to predict churn and identify factors of customers’ churn based on their past service usage history. Aiming at this objective, the study makes use of feature selection, normalization, and feature engineering. Then, this study compared the performance of four different machine learning algorithms on the Orange dataset: Logistic Regression, Random Forest, Decision Tree, and Gradient Boosting. Evaluation of the performance was conducted by using the F1 score and ROC-AUC. Comparing the results of this study with existing models has proven to produce better results. The results showed the Gradients Boosting with feature selection technique outperformed in this study by achieving a 99% F1-score and 99% AUC, and all other experiments achieved good results as well.

Keywords: machine learning, gradient boosting, logistic regression, churn, random forest, decision tree, ROC, AUC, F1-score

Procedia PDF Downloads 109
8561 Count of Trees in East Africa with Deep Learning

Authors: Nubwimana Rachel, Mugabowindekwe Maurice

Abstract:

Trees play a crucial role in maintaining biodiversity and providing various ecological services. Traditional methods of counting trees are time-consuming, and there is a need for more efficient techniques. However, deep learning makes it feasible to identify the multi-scale elements hidden in aerial imagery. This research focuses on the application of deep learning techniques for tree detection and counting in both forest and non-forest areas through the exploration of the deep learning application for automated tree detection and counting using satellite imagery. The objective is to identify the most effective model for automated tree counting. We used different deep learning models such as YOLOV7, SSD, and UNET, along with Generative Adversarial Networks to generate synthetic samples for training and other augmentation techniques, including Random Resized Crop, AutoAugment, and Linear Contrast Enhancement. These models were trained and fine-tuned using satellite imagery to identify and count trees. The performance of the models was assessed through multiple trials; after training and fine-tuning the models, UNET demonstrated the best performance with a validation loss of 0.1211, validation accuracy of 0.9509, and validation precision of 0.9799. This research showcases the success of deep learning in accurate tree counting through remote sensing, particularly with the UNET model. It represents a significant contribution to the field by offering an efficient and precise alternative to conventional tree-counting methods.

Keywords: remote sensing, deep learning, tree counting, image segmentation, object detection, visualization

Procedia PDF Downloads 31
8560 A Machine Learning Approach for Detecting and Locating Hardware Trojans

Authors: Kaiwen Zheng, Wanting Zhou, Nan Tang, Lei Li, Yuanhang He

Abstract:

The integrated circuit industry has become a cornerstone of the information society, finding widespread application in areas such as industry, communication, medicine, and aerospace. However, with the increasing complexity of integrated circuits, Hardware Trojans (HTs) implanted by attackers have become a significant threat to their security. In this paper, we proposed a hardware trojan detection method for large-scale circuits. As HTs introduce physical characteristic changes such as structure, area, and power consumption as additional redundant circuits, we proposed a machine-learning-based hardware trojan detection method based on the physical characteristics of gate-level netlists. This method transforms the hardware trojan detection problem into a machine-learning binary classification problem based on physical characteristics, greatly improving detection speed. To address the problem of imbalanced data, where the number of pure circuit samples is far less than that of HTs circuit samples, we used the SMOTETomek algorithm to expand the dataset and further improve the performance of the classifier. We used three machine learning algorithms, K-Nearest Neighbors, Random Forest, and Support Vector Machine, to train and validate benchmark circuits on Trust-Hub, and all achieved good results. In our case studies based on AES encryption circuits provided by trust-hub, the test results showed the effectiveness of the proposed method. To further validate the method’s effectiveness for detecting variant HTs, we designed variant HTs using open-source HTs. The proposed method can guarantee robust detection accuracy in the millisecond level detection time for IC, and FPGA design flows and has good detection performance for library variant HTs.

Keywords: hardware trojans, physical properties, machine learning, hardware security

Procedia PDF Downloads 113
8559 Predicting Oil Spills in Real-Time: A Machine Learning and AIS Data-Driven Approach

Authors: Tanmay Bisen, Aastha Shayla, Susham Biswas

Abstract:

Oil spills from tankers can cause significant harm to the environment and local communities, as well as have economic consequences. Early predictions of oil spills can help to minimize these impacts. Our proposed system uses machine learning and neural networks to predict potential oil spills by monitoring data from ship Automatic Identification Systems (AIS). The model analyzes ship movements, speeds, and changes in direction to identify patterns that deviate from the norm and could indicate a potential spill. Our approach not only identifies anomalies but also predicts spills before they occur, providing early detection and mitigation measures. This can prevent or minimize damage to the reputation of the company responsible and the country where the spill takes place. The model's performance on the MV Wakashio oil spill provides insight into its ability to detect and respond to real-world oil spills, highlighting areas for improvement and further research.

Keywords: Anomaly Detection, Oil Spill Prediction, Machine Learning, Image Processing, Graph Neural Network (GNN)

Procedia PDF Downloads 46
8558 Scalable Learning of Tree-Based Models on Sparsely Representable Data

Authors: Fares Hedayatit, Arnauld Joly, Panagiotis Papadimitriou

Abstract:

Many machine learning tasks such as text annotation usually require training over very big datasets, e.g., millions of web documents, that can be represented in a sparse input space. State-of the-art tree-based ensemble algorithms cannot scale to such datasets, since they include operations whose running time is a function of the input space size rather than a function of the non-zero input elements. In this paper, we propose an efficient splitting algorithm to leverage input sparsity within decision tree methods. Our algorithm improves training time over sparse datasets by more than two orders of magnitude and it has been incorporated in the current version of scikit-learn.org, the most popular open source Python machine learning library.

Keywords: big data, sparsely representable data, tree-based models, scalable learning

Procedia PDF Downloads 239
8557 Detecting Music Enjoyment Level Using Electroencephalogram Signals and Machine Learning Techniques

Authors: Raymond Feng, Shadi Ghiasi

Abstract:

An electroencephalogram (EEG) is a non-invasive technique that records electrical activity in the brain using scalp electrodes. Researchers have studied the use of EEG to detect emotions and moods by collecting signals from participants and analyzing how those signals correlate with their activities. In this study, researchers investigated the relationship between EEG signals and music enjoyment. Participants listened to music while data was collected. During the signal-processing phase, power spectral densities (PSDs) were computed from the signals, and dominant brainwave frequencies were extracted from the PSDs to form a comprehensive feature matrix. A machine learning approach was then taken to find correlations between the processed data and the music enjoyment level indicated by the participants. To improve on previous research, multiple machine learning models were employed, including K-Nearest Neighbors Classifier, Support Vector Classifier, and Decision Tree Classifier. Hyperparameters were used to fine-tune each model to further increase its performance. The experiments showed that a strong correlation exists, with the Decision Tree Classifier with hyperparameters yielding 85% accuracy. This study proves that EEG is a reliable means to detect music enjoyment and has future applications, including personalized music recommendation, mood adjustment, and mental health therapy.

Keywords: EEG, electroencephalogram, machine learning, mood, music enjoyment, physiological signals

Procedia PDF Downloads 26
8556 Multi-Label Approach to Facilitate Test Automation Based on Historical Data

Authors: Warda Khan, Remo Lachmann, Adarsh S. Garakahally

Abstract:

The increasing complexity of software and its applicability in a wide range of industries, e.g., automotive, call for enhanced quality assurance techniques. Test automation is one option to tackle the prevailing challenges by supporting test engineers with fast, parallel, and repetitive test executions. A high degree of test automation allows for a shift from mundane (manual) testing tasks to a more analytical assessment of the software under test. However, a high initial investment of test resources is required to establish test automation, which is, in most cases, a limitation to the time constraints provided for quality assurance of complex software systems. Hence, a computer-aided creation of automated test cases is crucial to increase the benefit of test automation. This paper proposes the application of machine learning for the generation of automated test cases. It is based on supervised learning to analyze test specifications and existing test implementations. The analysis facilitates the identification of patterns between test steps and their implementation with test automation components. For the test case generation, this approach exploits historical data of test automation projects. The identified patterns are the foundation to predict the implementation of unknown test case specifications. Based on this support, a test engineer solely has to review and parameterize the test automation components instead of writing them manually, resulting in a significant time reduction for establishing test automation. Compared to other generation approaches, this ML-based solution can handle different writing styles, authors, application domains, and even languages. Furthermore, test automation tools require expert knowledge by means of programming skills, whereas this approach only requires historical data to generate test cases. The proposed solution is evaluated using various multi-label evaluation criteria (EC) and two small-sized real-world systems. The most prominent EC is ‘Subset Accuracy’. The promising results show an accuracy of at least 86% for test cases, where a 1:1 relationship (Multi-Class) between test step specification and test automation component exists. For complex multi-label problems, i.e., one test step can be implemented by several components, the prediction accuracy is still at 60%. It is better than the current state-of-the-art results. It is expected the prediction quality to increase for larger systems with respective historical data. Consequently, this technique facilitates the time reduction for establishing test automation and is thereby independent of the application domain and project. As a work in progress, the next steps are to investigate incremental and active learning as additions to increase the usability of this approach, e.g., in case labelled historical data is scarce.

Keywords: machine learning, multi-class, multi-label, supervised learning, test automation

Procedia PDF Downloads 95
8555 Development of a Decision-Making Method by Using Machine Learning Algorithms in the Early Stage of School Building Design

Authors: Rajaian Hoonejani Mohammad, Eshraghi Pegah, Zomorodian Zahra Sadat, Tahsildoost Mohammad

Abstract:

Over the past decade, energy consumption in educational buildings has steadily increased. The purpose of this research is to provide a method to quickly predict the energy consumption of buildings using separate evaluation of zones and decomposing the building to eliminate the complexity of geometry at the early design stage. To produce this framework, machine learning algorithms such as Support vector regression (SVR) and Artificial neural network (ANN) are used to predict energy consumption and thermal comfort metrics in a school as a case. The database consists of more than 55000 samples in three climates of Iran. Cross-validation evaluation and unseen data have been used for validation. In a specific label, cooling energy, it can be said the accuracy of prediction is at least 84% and 89% in SVR and ANN, respectively. The results show that the SVR performed much better than the ANN.

Keywords: early stage of design, energy, thermal comfort, validation, machine learning

Procedia PDF Downloads 36
8554 Autonomous Kuka Youbot Navigation Based on Machine Learning and Path Planning

Authors: Carlos Gordon, Patricio Encalada, Henry Lema, Diego Leon, Dennis Chicaiza

Abstract:

The following work presents a proposal of autonomous navigation of mobile robots implemented in an omnidirectional robot Kuka Youbot. We have been able to perform the integration of robotic operative system (ROS) and machine learning algorithms. ROS mainly provides two distributions; ROS hydro and ROS Kinect. ROS hydro allows managing the nodes of odometry, kinematics, and path planning with statistical and probabilistic, global and local algorithms based on Adaptive Monte Carlo Localization (AMCL) and Dijkstra. Meanwhile, ROS Kinect is responsible for the detection block of dynamic objects which can be in the points of the planned trajectory obstructing the path of Kuka Youbot. The detection is managed by artificial vision module under a trained neural network based on the single shot multibox detector system (SSD), where the main dynamic objects for detection are human beings and domestic animals among other objects. When the objects are detected, the system modifies the trajectory or wait for the decision of the dynamic obstacle. Finally, the obstacles are skipped from the planned trajectory, and the Kuka Youbot can reach its goal thanks to the machine learning algorithms.

Keywords: autonomous navigation, machine learning, path planning, robotic operative system, open source computer vision library

Procedia PDF Downloads 152
8553 Colored Image Classification Using Quantum Convolutional Neural Networks Approach

Authors: Farina Riaz, Shahab Abdulla, Srinjoy Ganguly, Hajime Suzuki, Ravinesh C. Deo, Susan Hopkins

Abstract:

Recently, quantum machine learning has received significant attention. For various types of data, including text and images, numerous quantum machine learning (QML) models have been created and are being tested. Images are exceedingly complex data components that demand more processing power. Despite being mature, classical machine learning still has difficulties with big data applications. Furthermore, quantum technology has revolutionized how machine learning is thought of, by employing quantum features to address optimization issues. Since quantum hardware is currently extremely noisy, it is not practicable to run machine learning algorithms on it without risking the production of inaccurate results. To discover the advantages of quantum versus classical approaches, this research has concentrated on colored image data. Deep learning classification models are currently being created on Quantum platforms, but they are still in a very early stage. Black and white benchmark image datasets like MNIST and Fashion MINIST have been used in recent research. MNIST and CIFAR-10 were compared for binary classification, but the comparison showed that MNIST performed more accurately than colored CIFAR-10. This research will evaluate the performance of the QML algorithm on the colored benchmark dataset CIFAR-10 to advance QML's real-time applicability. However, deep learning classification models have not been developed to compare colored images like Quantum Convolutional Neural Network (QCNN) to determine how much it is better to classical. Only a few models, such as quantum variational circuits, take colored images. The methodology adopted in this research is a hybrid approach by using penny lane as a simulator. To process the 10 classes of CIFAR-10, the image data has been translated into grey scale and the 28 × 28-pixel image containing 10,000 test and 50,000 training images were used. The objective of this work is to determine how much the quantum approach can outperform a classical approach for a comprehensive dataset of color images. After pre-processing 50,000 images from a classical computer, the QCNN model adopted a hybrid method and encoded the images into a quantum simulator for feature extraction using quantum gate rotations. The measurements were carried out on the classical computer after the rotations were applied. According to the results, we note that the QCNN approach is ~12% more effective than the traditional classical CNN approaches and it is possible that applying data augmentation may increase the accuracy. This study has demonstrated that quantum machine and deep learning models can be relatively superior to the classical machine learning approaches in terms of their processing speed and accuracy when used to perform classification on colored classes.

Keywords: CIFAR-10, quantum convolutional neural networks, quantum deep learning, quantum machine learning

Procedia PDF Downloads 95
8552 Stock Market Prediction Using Convolutional Neural Network That Learns from a Graph

Authors: Mo-Se Lee, Cheol-Hwi Ahn, Kee-Young Kwahk, Hyunchul Ahn

Abstract:

Over the past decade, deep learning has been in spotlight among various machine learning algorithms. In particular, CNN (Convolutional Neural Network), which is known as effective solution for recognizing and classifying images, has been popularly applied to classification and prediction problems in various fields. In this study, we try to apply CNN to stock market prediction, one of the most challenging tasks in the machine learning research. In specific, we propose to apply CNN as the binary classifier that predicts stock market direction (up or down) by using a graph as its input. That is, our proposal is to build a machine learning algorithm that mimics a person who looks at the graph and predicts whether the trend will go up or down. Our proposed model consists of four steps. In the first step, it divides the dataset into 5 days, 10 days, 15 days, and 20 days. And then, it creates graphs for each interval in step 2. In the next step, CNN classifiers are trained using the graphs generated in the previous step. In step 4, it optimizes the hyper parameters of the trained model by using the validation dataset. To validate our model, we will apply it to the prediction of KOSPI200 for 1,986 days in eight years (from 2009 to 2016). The experimental dataset will include 14 technical indicators such as CCI, Momentum, ROC and daily closing price of KOSPI200 of Korean stock market.

Keywords: convolutional neural network, deep learning, Korean stock market, stock market prediction

Procedia PDF Downloads 407