Search results for: machine learning optimization
10720 The Effect of Online Learning During the COVID-19 Pandemic on Student Mental
Authors: Adelia Desi Agnesita
Abstract:
The advent of a new disease, COVID-19, has brought major changes to the world, one of which concerns teaching and learning. Learning that formerly took place offline is now done online, which requires students to adapt to the new process. Because the COVID-19 pandemic has spread almost worldwide, activities that involve many people must be avoided, including classroom teaching. In Indonesia, college learning has been moving to online/long-distance learning since March 2020 to prevent the spread of COVID-19. Online learning presents students with several obstacles: poor signal, a heavy task load, lack of focus, difficulty sleeping, and resulting stress. Keywords: learning, online, covid-19, pandemic
Procedia PDF Downloads 213
10719 Detecting Cyberbullying, Spam and Bot Behavior and Fake News in Social Media Accounts Using Machine Learning
Authors: M. D. D. Chathurangi, M. G. K. Nayanathara, K. M. H. M. M. Gunapala, G. M. R. G. Dayananda, Kavinga Yapa Abeywardena, Deemantha Siriwardana
Abstract:
Due to the growing popularity of social media platforms, there are various concerns, chiefly cyberbullying, spam, bot accounts, and the spread of incorrect information. To develop a risk score calculation system as a thorough method for identifying and exposing unethical social media profiles, this research explores the algorithms that are, to the best of our knowledge, most suitable for detecting these concerns. Multiple models, such as Naïve Bayes, CNN, KNN, Stochastic Gradient Descent, and Gradient Boosting Classifier, were examined, and the best results were carried into the development of the risk score system. For cyberbullying, the Logistic Regression algorithm achieved an accuracy of 84.9%, while the spam-detecting MLP model reached 98.02% accuracy. The Random Forest algorithm for identifying bot accounts obtained 91.06% accuracy, and 84% accuracy was achieved for fake news detection using SVM. Keywords: cyberbullying, spam behavior, bot accounts, fake news, machine learning
Procedia PDF Downloads 36
10718 [Keynote Speech]: Feature Selection and Predictive Modeling of Housing Data Using Random Forest
Authors: Bharatendra Rai
Abstract:
Predictive data analysis and modeling involving machine learning techniques become challenging in the presence of too many explanatory variables or features. Too many features are known not only to slow algorithms down but also to decrease model prediction accuracy. This study involves a housing dataset with 79 quantitative and qualitative features that describe various aspects people consider when buying a new house. The Boruta algorithm, which supports feature selection using a wrapper approach built around random forest, is used in this study. This feature selection process leads to 49 confirmed features, which are then used for developing predictive random forest models. The study also explores five different data partitioning ratios, and their impact on model accuracy is captured using the coefficient of determination (R-squared) and root mean square error (RMSE). Keywords: housing data, feature selection, random forest, Boruta algorithm, root mean square error
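The two accuracy measures named in this abstract can be sketched in a few lines. This is a minimal illustration using the standard textbook definitions, not the author's code; the function names `rmse` and `r_squared` are hypothetical, and no values from the study are reproduced.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: sqrt of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

A lower RMSE and an R-squared closer to 1 both indicate a better fit; comparing them across several train/test partitioning ratios is one way to see how the split affects accuracy.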
Procedia PDF Downloads 323
10717 A Systematic Review Investigating the Use of EEG Measures in Neuromarketing
Authors: A. M. Byrne, E. Bonfiglio, C. Rigby, N. Edelstyn
Abstract:
Introduction: Neuromarketing employs numerous methodologies when investigating products and advertisement effectiveness. Electroencephalography (EEG), a non-invasive measure of electrical activity from the brain, is commonly used in neuromarketing. EEG data can be considered using time-frequency (TF) analysis, where changes in the frequency of brainwaves are calculated to infer participants' mental states, or event-related potential (ERP) analysis, where changes in amplitude are observed in direct response to a stimulus. This presentation discusses the findings of a systematic review of EEG measures in neuromarketing. A systematic review summarises evidence on a research question, using explicit measures to identify, select, and critically appraise relevant research papers. This systematic review identifies which EEG measures are the most robust predictors of customer preference and purchase intention. Methods: Search terms identified 174 papers that used EEG in combination with marketing-related stimuli. Publications were excluded if they were written in a language other than English or were not published as journal articles (e.g., book chapters). The review investigated which TF effect (e.g., theta-band power) and ERP component (e.g., N400) most consistently reflected preference and purchase intention. Machine-learning prediction was also investigated, along with the use of EEG combined with physiological measures such as eye-tracking. Results: Frontal alpha asymmetry was the most reliable TF signal, where an increase in activity over the left side of the frontal lobe indexed a positive response to marketing stimuli, while an increase in activity over the right side indexed a negative response. The late positive potential, a positive amplitude increase around 600 ms after stimulus presentation, was the most reliable ERP component, reflecting the conscious emotional evaluation of marketing stimuli. However, each measure showed mixed results when related to preference and purchase behaviour. Predictive accuracy was greatly improved through machine-learning algorithms such as deep neural networks, especially when combined with eye-tracking or facial expression analyses. Discussion: This systematic review provides a novel catalogue of the most effective use of each EEG measure commonly used in neuromarketing. Exciting findings to emerge are the identification of the frontal alpha asymmetry and late positive potential as markers of preferential responses to marketing stimuli. Machine-learning algorithms achieved predictive accuracies as high as 97%, and future research should therefore focus on machine-learning prediction when using EEG measures in neuromarketing. Keywords: EEG, ERP, neuromarketing, machine-learning, systematic review, time-frequency
Procedia PDF Downloads 111
10716 Climate Changes in Albania and Their Effect on Cereal Yield
Authors: Lule Basha, Eralda Gjika
Abstract:
This study focuses on analyzing climate change in Albania and its potential effects on cereal yields. Initially, monthly temperature and rainfall in Albania were studied for the period 1960-2021. Climatic variables are important when trying to model cereal yield behavior, especially when significant changes in weather conditions are observed. For this purpose, in the second part of the study, linear and nonlinear models explaining cereal yield are constructed for the same period, 1960-2021. Multiple linear regression analysis and the lasso regression method are applied to the data relating cereal yield to each independent variable: average temperature, average rainfall, fertilizer consumption, arable land, land under cereal production, and nitrous oxide emissions. In our regression model, heteroscedasticity is not observed, the data follow a normal distribution, and there is a low correlation between factors, so we do not have the problem of multicollinearity. Machine-learning methods, such as random forest, are used to predict cereal yield responses to climatic and other variables. Random Forest showed high accuracy compared to the other statistical models in the prediction of cereal yield. We found that changes in average temperature negatively affect cereal yield. The coefficients of fertilizer consumption, arable land, and land under cereal production have a positive effect on production. Our results show that the Random Forest method is an effective and versatile machine-learning method for cereal yield prediction compared to the other two methods. Keywords: cereal yield, climate change, machine learning, multiple regression model, random forest
Procedia PDF Downloads 91
10715 An Application of a Machine Monitoring by Using the Internet of Things to Improve a Preventive Maintenance: Case Study of an Automated Plastic Granule-Packing Machine
Authors: Anek Apipatkul, Paphakorn Pitayachaval
Abstract:
Preventive maintenance is a standardized procedure to control and prevent risky problems affecting production in order to increase work efficiency. Machine monitoring also routinely collects data for scheduling maintenance periods. This paper presents the application of machine monitoring using the internet of things (IoT) and a lean technique to manage the complex maintenance tasks of an automated plastic granule packing machine. To organize the preventive maintenance, machine monitoring was applied to several processes, starting with defining a clear scope of the machine, establishing standards in maintenance work, applying a just-in-time (JIT) technique for timely delivery of the maintenance work, solving problems on the floor, and improving the inspection process. The results show that wasted time was reduced and machines were operated as scheduled. Furthermore, the efficiency of the scheduled maintenance period was increased by 95%. Keywords: internet of things, preventive maintenance, machine monitoring, lean technique
Procedia PDF Downloads 102
10714 ACO-TS: an ACO-based Algorithm for Optimizing Cloud Task Scheduling
Authors: Fahad Y. Al-dawish
Abstract:
A large number of organizations and individuals are currently moving to cloud computing, which many consider a significant shift in the field of computing. Cloud computing systems are distributed and parallel systems consisting of a collection of interconnected physical and virtual machines. With the increasing demand for and profitability of cloud computing infrastructure, diverse computing processes can be executed in cloud environments. Many organizations and individuals around the world depend on cloud computing infrastructure to host their applications, platforms, and infrastructure. One of the major issues in this environment is allocating incoming tasks to suitable virtual machines (cloud task scheduling). Cloud task scheduling is classified as an optimization problem, and several meta-heuristic algorithms have been proposed to solve and optimize it. A good task scheduler should adapt its scheduling technique to the changing environment and the types of incoming task sets. In this research project, a cloud task scheduling methodology based on the ant colony optimization (ACO) algorithm, which we call ACO-TS (Ant Colony Optimization for Task Scheduling), is proposed and compared with different scheduling algorithms (Random, First Come First Serve (FCFS), and Fastest Processor to the Largest Task First (FPLTF)). ACO is a random optimization search method that is used for assigning incoming tasks to available virtual machines (VMs). The main role of the proposed algorithm is to minimize the makespan of a given task set and maximize resource utilization by balancing the load among virtual machines. The proposed scheduling algorithm was evaluated using the CloudSim toolkit framework. Finally, after analyzing and evaluating the experimental results, we find that the proposed ACO-TS algorithm performs better than the Random, FCFS, and FPLTF algorithms in both makespan and resource utilization. Keywords: cloud Task scheduling, ant colony optimization (ACO), cloudsim, cloud computing
Procedia PDF Downloads 420
10713 Machine Learning Techniques for Estimating Ground Motion Parameters
Authors: Farid Khosravikia, Patricia Clayton
Abstract:
The main objective of this study is to evaluate the advantages and disadvantages of various machine learning techniques in forecasting ground-motion intensity measures given source characteristics, source-to-site distance, and local site condition. Intensity measures such as peak ground acceleration and velocity (PGA and PGV, respectively), as well as 5% damped elastic pseudospectral accelerations at different periods (PSA), are indicators of the strength of shaking at the ground surface. Estimating these variables for future earthquake events is a key step in seismic hazard assessment and potentially subsequent risk assessment of different types of structures. Typically, linear regression-based models, with pre-defined equations and coefficients, are used in ground motion prediction. However, due to the restrictions of the linear regression methods, such models may not capture more complex nonlinear behaviors that exist in the data. Thus, this study comparatively investigates potential benefits from employing other machine learning techniques as statistical methods in ground motion prediction, such as Artificial Neural Network, Random Forest, and Support Vector Machine. The algorithms are adjusted to quantify event-to-event and site-to-site variability of the ground motions by implementing them as random effects in the proposed models to reduce the aleatory uncertainty. All the algorithms are trained using a selected database of 4,528 ground motions, including 376 seismic events with magnitude 3 to 5.8, recorded over the hypocentral distance range of 4 to 500 km in Oklahoma, Kansas, and Texas since 2005. The choice of this database stems from the recent increase in the seismicity rate of these states attributed to petroleum production and wastewater disposal activities, which necessitates further investigation of the ground motion models developed for these states. Accuracy of the models in predicting intensity measures, generalization capability of the models for future data, and usability of the models are discussed in the evaluation process. The results indicate that the algorithms satisfy some physically sound characteristics, such as magnitude scaling and distance dependency, without requiring pre-defined equations or coefficients. Moreover, it is shown that, when sufficient data is available, all the alternative algorithms tend to provide more accurate estimates compared to the conventional linear regression-based method, and in particular, Random Forest outperforms the other algorithms. However, the conventional method is a better tool when limited data is available. Keywords: artificial neural network, ground-motion models, machine learning, random forest, support vector machine
Procedia PDF Downloads 122
10712 A Survey of Sentiment Analysis Based on Deep Learning
Authors: Pingping Lin, Xudong Luo, Yifan Fan
Abstract:
Sentiment analysis is a very active research topic. Every day, Facebook, Twitter, Weibo, and other social media, as well as major e-commerce websites, generate a massive amount of comments, which can be used to analyse people's opinions or emotions. The existing methods for sentiment analysis are based mainly on sentiment dictionaries, machine learning, and deep learning. The first two kinds of methods rely heavily on sentiment dictionaries or large amounts of labelled data. The third one overcomes these two problems, so in this paper we focus on it. Specifically, we survey various sentiment analysis methods based on convolutional neural network, recurrent neural network, long short-term memory, deep neural network, deep belief network, and memory network. We compare their features, advantages, and disadvantages. We also point out the main problems of these methods, which may be worthy of careful study in the future. Finally, we examine the application of deep learning in multimodal sentiment analysis and aspect-level sentiment analysis. Keywords: document analysis, deep learning, multimodal sentiment analysis, natural language processing
Procedia PDF Downloads 164
10711 Hyper Tuned RBF SVM: Approach for the Prediction of the Breast Cancer
Authors: Surita Maini, Sanjay Dhanka
Abstract:
Machine learning (ML) involves developing algorithms and statistical models that enable computers to learn and make predictions or decisions based on data without being explicitly programmed. Because of its broad capabilities, ML is gaining popularity in medical sectors such as medical imaging, electronic health records, genomic data analysis, wearable devices, disease outbreak prediction, and disease diagnosis. In the last few decades, many researchers have tried to diagnose Breast Cancer (BC) using ML, because early detection of any disease can save millions of lives. Working in this direction, the authors have proposed a hybrid ML technique, RBF SVM, to predict BC at an earlier stage. The proposed method is implemented on the Breast Cancer UCI ML dataset with 569 instances and 32 attributes. The authors recorded the following performance metrics for the proposed model: Accuracy 98.24%, Sensitivity 98.67%, Specificity 97.43%, F1 Score 98.67%, Precision 98.67%, and run time 0.044769 seconds. The proposed method is validated by K-Fold cross-validation. Keywords: breast cancer, support vector classifier, machine learning, hyper parameter tuning
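The performance metrics listed in this abstract all derive from the four counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name `classification_metrics` is hypothetical, and any counts passed to it are made-up illustration values, not the authors' results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives,
    and false negatives. All denominators are assumed to be non-zero.
    """
    sensitivity = tp / (tp + fn)          # recall on the positive class
    specificity = tn / (tn + fp)          # recall on the negative class
    precision = tp / (tp + fp)            # share of predicted positives that are correct
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }
```

Reporting sensitivity and specificity alongside accuracy matters for diagnostic tasks, because a missed positive case and a false alarm carry different costs.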
Procedia PDF Downloads 67
10710 Predictive Analysis of the Stock Price Market Trends with Deep Learning
Authors: Suraj Mehrotra
Abstract:
The stock market is a volatile, bustling marketplace that is a cornerstone of economics. It determines whether companies are successful or in a downward spiral. A thorough understanding of it is important: many companies have whole divisions dedicated to analyzing both their own stock and that of rival companies. Linking the world of finance, especially the stock market, with artificial intelligence (AI) is a relatively recent development. Predicting how stocks will perform given all external factors and previous data has always been a human task. With the help of AI, however, machine learning models can help us make more complete predictions of financial trends. In the stock market specifically, predicting the open, closing, high, and low prices for the next day is very hard. Machine learning makes this task a lot easier. A model that builds upon itself and takes in external factors as weights can predict trends far into the future. When used effectively, this can open new doors in the business and finance world, and companies can make better and more complete decisions. This paper explores the various techniques used in the prediction of stock prices, from traditional statistical methods to deep learning and neural network based approaches, among other methods. It provides a detailed analysis of the techniques and also explores the challenges in predictive analysis. Four different models (linear regression, neural network, decision tree, and naïve Bayes) were evaluated on the stocks of Apple, Google, Tesla, Amazon, United Healthcare, Exxon Mobil, J.P. Morgan & Chase, and Johnson & Johnson. On the testing set, the naïve Bayes and linear regression models had the highest accuracy, followed by the neural network model and then the decision tree model. The training set had similar results, except that the decision tree model achieved perfect accuracy in its predictions. This suggests that the decision tree model likely overfitted the training set, which explains its weaker performance on the testing set. Keywords: machine learning, testing set, artificial intelligence, stock analysis
Procedia PDF Downloads 95
10709 Computational Model of Human Cardiopulmonary System
Authors: Julian Thrash, Douglas Folk, Michael Ciracy, Audrey C. Tseng, Kristen M. Stromsodt, Amber Younggren, Christopher Maciolek
Abstract:
The cardiopulmonary system is comprised of the heart, lungs, and many dynamic feedback mechanisms that control its function based on a multitude of variables. The next generation of cardiopulmonary medical devices will involve adaptive control and smart pacing techniques. However, testing these smart devices on living systems may be unethical and exceedingly expensive. As a solution, a comprehensive computational model of the cardiopulmonary system was implemented in Simulink. The model contains over 240 state variables and over 100 equations previously described in a series of published articles. Simulink was chosen because of its ease of introducing machine learning elements. Initial results indicate that physiologically correct waveforms of pressures and volumes were obtained in the simulation. With the development of a comprehensive computational model, we hope to pioneer the future of predictive medicine by applying our research towards the initial stages of smart devices. After validation, we will introduce and train reinforcement learning agents using the cardiopulmonary model to assist in adaptive control system design. With our cardiopulmonary model, we will accelerate the design and testing of smart and adaptive medical devices to better serve those with cardiovascular disease.Keywords: adaptive control, cardiopulmonary, computational model, machine learning, predictive medicine
Procedia PDF Downloads 179
10708 Multidisciplinary and Multilevel Design Methodology of Unmanned Aerial Vehicles using Enhanced Collaborative Optimization
Authors: Pedro F. Albuquerque, Pedro V. Gamboa, Miguel A. Silvestre
Abstract:
The present work describes the implementation of the Enhanced Collaborative Optimization (ECO) multilevel architecture with a gradient-based optimization algorithm with the aim of performing a multidisciplinary design optimization of a generic unmanned aerial vehicle with morphing technologies. The concepts of weighting coefficient and a dynamic compatibility parameter are presented for the ECO architecture. A routine that calculates the aircraft performance for the user defined mission profile and vehicle’s performance requirements has been implemented using low fidelity models for the aerodynamics, stability, propulsion, weight, balance and flight performance. A benchmarking case study for evaluating the advantage of using a variable span wing within the optimization methodology developed is presented.Keywords: multidisciplinary, multilevel, morphing, enhanced collaborative optimization
Procedia PDF Downloads 929
10707 Application the Queuing Theory in the Warehouse Optimization
Authors: Jaroslav Masek, Juraj Camaj, Eva Nedeliakova
Abstract:
The aim of optimizing store management is not only to design the store itself, including its equipment, technology and operation. Optimization of store management must also consider the synchronization of technological, transport, storage and service operations throughout the whole logistics chain, so that material flows naturally from provider to consumer by the shortest possible route, in the shortest possible time, in the requested quality and quantity, and at minimum cost. The paper deals with the application of queuing theory to the optimization of warehouse processes. The first part gives general information about warehousing and the use of mathematical methods for optimizing logistics chains. The second part prepares a model of a warehouse within queuing theory. The conclusion of the paper includes two examples of using queuing theory in practice. Keywords: queuing theory, logistics system, mathematical methods, warehouse optimization
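To illustrate what a queuing-theory model of a warehouse process can yield, the sketch below computes steady-state measures for the simplest single-server case, an M/M/1 queue. The abstract does not state which queue model the authors use, so M/M/1 is an assumption here, and the function name `mm1_metrics` and its parameters are hypothetical.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state measures of an M/M/1 queue (assumed model, for illustration).

    arrival_rate: mean arrivals per unit time (lambda), e.g. pallets per hour.
    service_rate: mean completions per unit time (mu) of the single server.
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival_rate must be below service_rate")

    rho = arrival_rate / service_rate  # server utilization
    return {
        "utilization": rho,
        "avg_in_system": rho / (1 - rho),                      # L
        "avg_in_queue": rho ** 2 / (1 - rho),                  # Lq
        "avg_time_in_system": 1 / (service_rate - arrival_rate),  # W
        "avg_wait_in_queue": rho / (service_rate - arrival_rate),  # Wq
    }
```

For instance, calling `mm1_metrics(8, 10)` describes a loading dock that receives 8 units per hour and can handle 10; the measures grow sharply as utilization approaches 1, which is why such models are useful for sizing warehouse capacity.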
Procedia PDF Downloads 593
10706 An Efficient Design of Static Synchronous Series Compensator Based Fractional Order PID Controller Using Invasive Weed Optimization Algorithm
Authors: Abdelghani Choucha, Lakhdar Chaib, Salem Arif
Abstract:
This paper treats the problem of power system stability with the aid of a Static Synchronous Series Compensator (SSSC) installed in the transmission line of a single machine infinite bus (SMIB) power system. A fractional order PID (FOPID) controller is applied as a robust controller for optimal SSSC design to control the power system characteristics. Additionally, the SSSC based FOPID parameters are tuned using the Invasive Weed Optimization (IWO) algorithm. To verify the strength of the proposed controller, the SSSC based FOPID controller is validated over a wide range of operating conditions and compared with the conventional SSSC-POD controller scheme. The main purpose of the proposed process is to greatly enhance the dynamic states of the tested system. Simulation results clearly demonstrate the superiority and performance of the proposed controller design. Keywords: SSSC-FOPID, SSSC-POD, SMIB power system, invasive weed optimization algorithm
Procedia PDF Downloads 188
10705 Intelligent Software Architecture and Automatic Re-Architecting Based on Machine Learning
Authors: Gebremeskel Hagos Gebremedhin, Feng Chong, Heyan Huang
Abstract:
A software system is the combination of an architecture and organized components that accomplish a specific function or set of functions. A good software architecture facilitates application system development, promotes achievement of functional requirements, and supports system reconfiguration. We describe three studies demonstrating the utility of our architecture in the subdomain of mobile office robots and identify software engineering principles embodied in the architecture. The main aim of this paper is to analyze and improve architecture design and automatic re-architecting using machine learning. The intelligent software architecture and automatic re-architecting process reorganizes the software's organizational structure into a more suitable one, using the user access dataset to create relationships among the components of the system. A 3-step data mining approach was used to analyze effective recovery, transformation and implementation with the use of a clustering algorithm. Therefore, automatic re-architecting without changing the source code is possible, addressing the software complexity problem and enabling system software reuse. Keywords: intelligence, software architecture, re-architecting, software reuse, High level design
Procedia PDF Downloads 119
10704 Municipal-Level Gender Norms: Measurement and Effects on Women in Politics
Authors: Luisa Carrer, Lorenzo De Masi
Abstract:
In this paper, we exploit the massive amount of information from Facebook to build a measure of gender attitudes in Italy at a previously impossible resolution—the municipal level. We construct our index via a machine learning method to replicate a benchmark region-level measure. Interestingly, we find that most of the variation in our Gender Norms Index (GNI) is across towns within narrowly defined geographical areas rather than across regions or provinces. In a second step, we show how this local variation in norms can be leveraged for identification purposes. In particular, we use our index to investigate whether these differences in norms carry over to the policy activity of politicians elected in the Italian Parliament. We document that females are more likely to sit in parliamentary committees focused on gender-sensitive matters, labor, and social issues, but not if they come from a relatively conservative town. These effects are robust to conditioning the legislative term and electoral district, suggesting the importance of social norms in shaping legislators’ policy activity.Keywords: gender equality, gender norms index, Facebook, machine learning, politics
Procedia PDF Downloads 78
10703 Automatic Classification of the Stand-to-Sit Phase in the TUG Test Using Machine Learning
Authors: Yasmine Abu Adla, Racha Soubra, Milana Kasab, Mohamad O. Diab, Aly Chkeir
Abstract:
Over the past several years, researchers have shown a great interest in assessing the mobility of elderly people to measure their functional status. Usually, such an assessment is done by conducting tests that require the subject to walk a certain distance, turn around, and finally sit back down. Consequently, this study aims to provide an at home monitoring system to assess the patient’s status continuously. Thus, we proposed a technique to automatically detect when a subject sits down while walking at home. In this study, we utilized a Doppler radar system to capture the motion of the subjects. More than 20 features were extracted from the radar signals, out of which 11 were chosen based on their intraclass correlation coefficient (ICC > 0.75). Accordingly, the sequential floating forward selection wrapper was applied to further narrow down the final feature vector. Finally, 5 features were introduced to the linear discriminant analysis classifier, and an accuracy of 93.75% was achieved as well as a precision and recall of 95% and 90%, respectively.Keywords: Doppler radar system, stand-to-sit phase, TUG test, machine learning, classification
Procedia PDF Downloads 161
10702 A Hybrid System of Hidden Markov Models and Recurrent Neural Networks for Learning Deterministic Finite State Automata
Authors: Pavan K. Rallabandi, Kailash C. Patidar
Abstract:
In this paper, we present an optimization technique, or learning algorithm, that uses a hybrid architecture combining the most popular sequence recognition models, Recurrent Neural Networks (RNNs) and Hidden Markov Models (HMMs). To improve sequence or pattern recognition/classification performance through a hybrid neural-symbolic approach, a gradient descent learning algorithm is developed using the Real Time Recurrent Learning of the Recurrent Neural Network to process the knowledge represented in trained Hidden Markov Models. The developed hybrid algorithm is implemented on automata theory as a sample test bed, and its performance is demonstrated and evaluated on learning deterministic finite state automata. Keywords: hybrid systems, hidden markov models, recurrent neural networks, deterministic finite state automata
Procedia PDF Downloads 388
10701 Using Machine Learning to Predict Answers to Big-Five Personality Questions
Authors: Aadityaa Singla
Abstract:
The big five personality traits are as follows: openness, conscientiousness, extraversion, agreeableness, and neuroticism. To gain insight into their personality, many people turn to these categories, each of which has different meanings and characteristics. This information is important not only to individuals but also to career professionals and psychologists, who can use it for candidate assessment or job recruitment. The links between AI and psychology have been well studied in cognitive science, but this is still a rather novel development. It is possible for various AI classification models to accurately predict the answer to a personality question from ten input questions. This contrasts with the hundred questions that people normally have to answer to gain a complete picture of their five personality traits. To approach this problem, various AI classification models were used on a dataset to predict what a user may answer. The model's prediction was then compared to the user's actual response. Normally, there are five answer choices (a 20% chance of a correct guess), and the models exceed that value to different degrees, showing their significance. An MLP classifier, decision tree, linear model, and K-nearest neighbors obtained test accuracies of 86.643%, 54.625%, 47.875%, and 52.125%, respectively. These approaches show that there is potential for more nuanced predictions regarding personality in the future. Keywords: machine learning, personality, big five personality traits, cognitive science
Procedia PDF Downloads 145
10700 A Selection Approach: Discriminative Model for Nominal Attributes-Based Distance Measures
Authors: Fang Gong
Abstract:
Distance measures are an indispensable part of many instance-based learning (IBL) and machine learning (ML) algorithms. The value difference metric (VDM) and the inverted specific-class distance measure (ISCDM) are among the top-performing distance measures that address nominal attributes. VDM performs well in some domains owing to its simplicity and poorly in others that contain missing values and non-class attribute noise. ISCDM, however, typically works better than VDM on such domains. To maximize their advantages and avoid their disadvantages, this paper proposes a selection approach: a discriminative model for nominal attributes-based distance measures. More concretely, VDM and ISCDM are built independently on a training dataset at the training stage, and the more credible one is recorded for each training instance. At the test stage, the nearest neighbor of each test instance is first found by either VDM or ISCDM, and the model recorded as most reliable for that neighbor is then chosen to predict the test instance's class label. The approach is simply denoted as a discriminative distance measure (DDM). Experiments are conducted on 34 University of California at Irvine (UCI) machine learning repository datasets, and the results show that DDM retains the interpretability and simplicity of VDM and ISCDM but significantly outperforms the original VDM and ISCDM and other state-of-the-art competitors in terms of accuracy.
Keywords: distance measure, discriminative model, nominal attributes, nearest neighbor
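To make the VDM half of the abstract above concrete, here is a minimal sketch of the simplified value difference metric, in which the distance between two nominal values is the sum over classes of |P(c|a) - P(c|b)|^q. It is not the authors' DDM and does not cover ISCDM. The function names, the toy data, and the uniform fallback for unseen values are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def fit_vdm(rows, labels):
    """Estimate P(class | attribute value) for each attribute from training data."""
    classes = sorted(set(labels))
    n_attrs = len(rows[0])
    # counts[j][v][c] = number of training rows with value v in attribute j and class c
    counts = [defaultdict(Counter) for _ in range(n_attrs)]
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            counts[j][value][label] += 1
    probs = []
    for j in range(n_attrs):
        table = {}
        for value, class_counts in counts[j].items():
            total = sum(class_counts.values())
            table[value] = {c: class_counts[c] / total for c in classes}
        probs.append(table)
    return classes, probs

def vdm_distance(x, y, classes, probs, q=2):
    """Simplified VDM: sum over attributes and classes of |P(c|a) - P(c|b)|**q."""
    # Assumption: a value never seen in training gets a uniform class distribution.
    uniform = {c: 1.0 / len(classes) for c in classes}
    total = 0.0
    for j, (a, b) in enumerate(zip(x, y)):
        pa = probs[j].get(a, uniform)
        pb = probs[j].get(b, uniform)
        total += sum(abs(pa[c] - pb[c]) ** q for c in classes)
    return total

# Toy data (hypothetical): colour fully determines the class, size carries no signal.
toy_rows = [("red", "small"), ("red", "large"), ("blue", "small"), ("blue", "large")]
toy_labels = ["a", "a", "b", "b"]
toy_classes, toy_probs = fit_vdm(toy_rows, toy_labels)
print(vdm_distance(("red", "small"), ("blue", "small"), toy_classes, toy_probs))
```

In the toy data, two instances that differ only in size have distance zero, because size has the same class distribution for both of its values. This shows how VDM weighs attribute values by what they say about the class rather than by whether the values match.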
Procedia PDF Downloads 114
10699 Green Thumb Engineering - Explainable Artificial Intelligence for Managing IoT Enabled Houseplants
Authors: Antti Nurminen, Avleen Malhi
Abstract:
Despite significant progress, intelligent systems that have machine learning as their core technology, now used across exceedingly wide application domains, are usually opaque, non-intuitive, and complex for human users. We use innovative IoT technology that monitors and analyzes moisture, humidity, luminosity, and temperature levels to assist end users in optimizing the environmental conditions for their houseplants. For plant health monitoring, we construct a system yielding the Normalized Difference Vegetation Index (NDVI), supported by visual validation by users. We run the system for a selected plant, basil, in varying environmental conditions to cater for typical home conditions, and bootstrap our AI with the acquired data. For end users, we implement a web-based user interface that provides both instructions and explanations.
Keywords: explainable artificial intelligence, intelligent agent, IoT, NDVI
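For reference, NDVI is conventionally computed from near-infrared and red reflectance as (NIR - Red) / (NIR + Red). The sketch below shows only that standard formula. The function name and the choice to return 0.0 when both bands are zero are assumptions, and the abstract does not describe how the authors obtain the two bands.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    denominator = nir + red
    if denominator == 0:
        # Assumption: treat a pixel with no signal in either band as neutral.
        return 0.0
    return (nir - red) / denominator

# Healthy vegetation reflects strongly in NIR and weakly in red, so NDVI is high.
print(ndvi(0.5, 0.1))
```

For non-negative reflectance values the index lies between -1 and 1, with higher values generally indicating denser, healthier green vegetation.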
Procedia PDF Downloads 163
10698 A Machine Learning-Based Analysis of Autism Prevalence Rates across US States against Multiple Potential Explanatory Variables
Authors: Ronit Chakraborty, Sugata Banerji
Abstract:
There has been a marked increase in the reported prevalence of Autism Spectrum Disorder (ASD) among children in the US over the past two decades. This research analyzed the growth in state-level ASD prevalence against 45 different potentially explanatory factors, including socio-economic, demographic, healthcare, public policy, and political factors. The goal was to understand whether these factors have adequate predictive power in modeling the differential growth in ASD prevalence across various states and, if they do, which factors are the most influential. The key findings of this study include (1) the confirmation that the chosen feature set has considerable power in predicting the growth in ASD prevalence, (2) the identification of the most influential predictive factors, (3) given the nature of the most influential predictive variables, an indication that a considerable portion of the reported ASD prevalence differentials across states could be attributable to over- and under-diagnosis, and (4) identification of Florida as a key outlier state, pointing to a potential under-diagnosis of ASD there.
Keywords: autism spectrum disorder, clustering, machine learning, predictive modeling
Procedia PDF Downloads 102
10697 Development of Multimedia Learning Application for Mastery Learning Style: A Graduated Difficulty Strategy
Authors: Nur Azlina Mohamed Mokmin, Mona Masood
Abstract:
Guided by the theory of learning style, this study is based on the development of a multimedia learning application for students with a mastery learning style. The learning material was developed by applying a graduated difficulty learning strategy. Algebraic fractions were chosen as the learning topic for this application. The effectiveness of this application in helping students learn is measured by giving a pre- and post-test. The results show that students who learn using material that matches their preferred learning style perform better than students given non-personalized learning material.
Keywords: algebraic fractions, graduated difficulty, mastery learning style, multimedia
Procedia PDF Downloads 513
10696 Classification of Potential Biomarkers in Breast Cancer Using Artificial Intelligence Algorithms and Anthropometric Datasets
Authors: Aref Aasi, Sahar Ebrahimi Bajgani, Erfan Aasi
Abstract:
Breast cancer (BC) continues to be the most frequent cancer in females and causes the highest number of cancer-related deaths in women worldwide. Inspired by recent advances in studying the relationship between patient attributes and features and the disease, in this paper we investigate different classification methods for better diagnosis of BC in the early stages. To this end, datasets from the University Hospital Centre of Coimbra were chosen, and different machine learning (ML)-based and neural network (NN) classifiers were studied. We selected favorable features among the nine attributes provided in the clinical dataset by using a random forest algorithm. This dataset consists of both healthy controls and BC patients, and glucose, BMI, resistin, and age were found to be the most important features, in that order. Moreover, we analyzed these features with various ML-based classifiers, including Decision Tree (DT), K-Nearest Neighbors (KNN), eXtreme Gradient Boosting (XGBoost), Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machine (SVM), along with an NN-based Multi-Layer Perceptron (MLP) classifier. The results revealed that, among the different techniques, the SVM and MLP classifiers achieved the highest accuracy, at 96% and 92%, respectively. These results suggest that the adopted procedure could be used effectively for the classification of cancer cells, and they encourage further experimental investigations with more collected data for other types of cancers.
Keywords: breast cancer, diagnosis, machine learning, biomarker classification, neural network
Procedia PDF Downloads 135
10695 Prediction of Alzheimer's Disease Based on Blood Biomarkers and Machine Learning Algorithms
Authors: Man-Yun Liu, Emily Chia-Yu Su
Abstract:
Alzheimer's disease (AD) is the public health crisis of the 21st century. AD is a degenerative brain disease and the most common cause of dementia, which places a heavy cost on the healthcare system. Unfortunately, the cause of AD is poorly understood; furthermore, the treatments available so far can only alleviate symptoms rather than cure or stop the progress of the disease. Currently, there are several ways to diagnose AD: medical imaging can be used to distinguish between AD, other dementias, and early-onset AD, and cerebrospinal fluid (CSF) can also be examined. Compared with other diagnostic tools, a blood (plasma) test has advantages as an approach to population-based disease screening because it is simpler, less invasive, and cost-effective. In our study, we used the blood biomarkers dataset of the Alzheimer's Disease Neuroimaging Initiative (ADNI), which was funded by the National Institutes of Health (NIH), to do data analysis and develop a prediction model. We used independent analysis of datasets to identify plasma protein biomarkers predicting early-onset AD. First, to compare the basic demographic statistics between the cohorts, we used SAS Enterprise Guide for data preprocessing and statistical analysis. Second, we used logistic regression, neural network, and decision tree models in SAS Enterprise Miner to validate biomarkers. The data, drawn from ADNI, contained 146 blood biomarkers from 566 participants. Participants included cognitively normal (healthy) subjects, subjects with mild cognitive impairment (MCI), and patients with Alzheimer's disease (AD). Participants' samples were separated into two groups: healthy and MCI, and healthy and AD. We used the two groups to compare important biomarkers of AD and MCI. In preprocessing, we used a t-test to filter 41/47 features between the two groups (healthy and AD, healthy and MCI) before using machine learning algorithms. We then built models with 4 machine learning methods; the best AUCs for the two groups were 0.991 and 0.709, respectively.
We want to stress that a simple, less invasive, common blood (plasma) test may also enable early diagnosis of AD. In our opinion, the results provide evidence that blood-based biomarkers might be an alternative diagnostic tool before further examination with CSF and medical imaging. A comprehensive study on the differences in blood-based biomarkers between AD patients and healthy subjects is warranted. Early detection of AD progression will allow physicians the opportunity for early intervention and treatment.
Keywords: Alzheimer's disease, blood-based biomarkers, diagnostics, early detection, machine learning
Procedia PDF Downloads 322
10694 Particle Swarm Optimization and Quantum Particle Swarm Optimization to Multidimensional Function Approximation
Authors: Diogo Silva, Fadul Rodor, Carlos Moraes
Abstract:
This work compares the results of multidimensional function approximation using two algorithms: the classical Particle Swarm Optimization (PSO) and the Quantum Particle Swarm Optimization (QPSO). Both algorithms were tested on three functions with different characteristics (the Rosenbrock, the Rastrigin, and the sphere functions) while increasing their number of dimensions. This study shows that the larger the dimension of the function space, the more evident the advantages of QPSO over PSO become in terms of performance and the number of iterations needed to reach the stop criterion.
Keywords: PSO, QPSO, function approximation, AI, optimization, multidimensional functions
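As an illustration of the classical half of this comparison, here is a minimal sketch of global-best PSO with an inertia weight, together with the three benchmark functions named in the abstract. It does not implement QPSO. The parameter values (w, c1, c2, swarm size, iteration count, bounds) are common textbook choices assumed here, not the settings used by the authors.

```python
import math
import random

def sphere(x):
    return sum(v * v for v in x)

def rastrigin(x):
    return 10 * len(x) + sum(v * v - 10 * math.cos(2 * math.pi * v) for v in x)

def rosenbrock(x):
    return sum(100 * (x[i + 1] - x[i] ** 2) ** 2 + (1 - x[i]) ** 2
               for i in range(len(x) - 1))

def pso(f, dim, bounds=(-5.0, 5.0), n_particles=30, n_iters=200,
        w=0.7298, c1=1.49618, c2=1.49618, seed=0):
    """Classical global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # best position seen by each particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position seen by the swarm
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

best_x, best_val = pso(sphere, dim=2)
print(best_val)
```

Raising `dim` while holding the other settings fixed is one simple way to reproduce the kind of dimension sweep the abstract describes.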
Procedia PDF Downloads 589
10693 Machine Learning Based Anomaly Detection in Hydraulic Units of Governors in Hydroelectric Power Plants
Authors: Mehmet Akif Bütüner, İlhan Koşalay
Abstract:
Hydroelectric power plants (HEPPs) are the renewable energy power plants with the highest installed power in the world. The control systems operating in these power plants ensure that the system operates at the desired operating point, and they are also responsible for stopping the relevant unit safely in case of any malfunction. These control systems are expected not to miss signals that require stopping, but they should also not cause unnecessary stops. In traditional control systems, including modern systems with SCADA infrastructure, alarm conditions that create warnings and trip conditions that automatically put the relevant unit out of service are usually generated with predefined limits, regardless of operating conditions. This approach makes alarm/trip conditions less likely to detect small changes that may lead to serious malfunctions in the near future. With the methods proposed in this research, the routine behavior of the oil circulation of the hydraulic governor of a HEPP will be modeled with machine learning methods using historical data obtained from the SCADA system. Using the created model and recently gathered data from the control system, the oil pressure of the hydraulic accumulators will be estimated. Comparing this estimate with the measurements recorded in real time by the SCADA system will help to foresee a failure before it becomes worse and to determine remaining useful life. By using the model outputs, maintenance work can be better planned so that undesired stops are prevented, and in case of any malfunction, the system will be stopped or alarms will be triggered before the problem grows.
Keywords: hydroelectric, governor, anomaly detection, machine learning, regression
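The estimate-then-compare idea in this abstract can be sketched as a residual check: fit a regression model on historical data, then flag a new measurement when it deviates from the model's estimate by more than a tolerance. The sketch below assumes a one-feature least-squares line and a three-sigma threshold. The abstract names neither the model nor the threshold, and the input feature and data here are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Least-squares line plus the standard deviation of its residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    # n - 2 degrees of freedom because two parameters were estimated.
    sigma = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return slope, intercept, sigma

def is_anomaly(model, x, measured, k=3.0):
    """Flag a measurement that is more than k residual sigmas from the estimate."""
    slope, intercept, sigma = model
    estimate = slope * x + intercept
    return abs(measured - estimate) > k * sigma

# Hypothetical history: an operating-condition feature vs. accumulator oil pressure.
history_x = list(range(10))
history_y = [2 * x + 1 + (0.1 if x % 2 == 0 else -0.1) for x in history_x]
model = fit_line(history_x, history_y)
print(is_anomaly(model, 5, 11.0), is_anomaly(model, 5, 13.0))
```

Because the tolerance is tied to the model's estimate rather than to a fixed limit, the same check adapts to different operating points, which is the limitation of predefined limits that the abstract points out.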
Procedia PDF Downloads 97
10692 Unveiling Comorbidities in Irritable Bowel Syndrome: A UK BioBank Study utilizing Supervised Machine Learning
Authors: Uswah Ahmad Khan, Muhammad Moazam Fraz, Humayoon Shafique Satti, Qasim Aziz
Abstract:
Approximately 10-14% of the global population experiences a functional disorder known as irritable bowel syndrome (IBS). The disorder is defined by persistent abdominal pain and an irregular bowel pattern. IBS significantly impairs work productivity and disrupts patients' daily lives and activities. Although IBS is widespread, there is still an incomplete understanding of its underlying pathophysiology. This study aims to help characterize the phenotype of IBS patients by differentiating the comorbidities found in IBS patients from those in non-IBS patients using machine learning algorithms. In this study, we extracted samples coding for IBS from the UK BioBank cohort and randomly selected patients without a code for IBS to create a total sample size of 18,000. We selected the codes for comorbidities of these cases from 2 years before and after their IBS diagnosis and compared them to the comorbidities in the non-IBS cohort. Machine learning models, including Decision Trees, Gradient Boosting, Support Vector Machine (SVM), AdaBoost, Logistic Regression, and XGBoost, were employed to assess their accuracy in predicting IBS. The most accurate model was then chosen to identify the features associated with IBS. In our case, we used XGBoost feature importance as a feature selection method. We applied different models to the top 10% of features, which numbered 50. Gradient Boosting, Logistic Regression, and XGBoost algorithms yielded a diagnosis of IBS with an optimal accuracy of 71.08%, 71.427%, and 71.53%, respectively. The comorbidities most closely associated with IBS included gut diseases (haemorrhoids, diverticular diseases), atopic conditions (asthma), and psychiatric comorbidities (depressive episodes or disorder, anxiety).
This finding emphasizes the need for a comprehensive approach when evaluating the phenotype of IBS, suggesting the possibility of identifying new subsets of IBS rather than relying solely on the conventional classification based on stool type. Additionally, our study demonstrates the potential of machine learning algorithms in predicting the development of IBS based on comorbidities, which may enhance diagnosis and facilitate better management of modifiable risk factors for IBS. Further research is necessary to confirm our findings and establish cause and effect. Alternative feature selection methods and even larger and more diverse datasets may lead to more accurate classification models. Despite these limitations, our findings highlight the effectiveness of Logistic Regression and XGBoost in predicting IBS diagnosis.
Keywords: comorbidities, disease association, irritable bowel syndrome (IBS), predictive analytics
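The feature selection step described above, keeping the top 10% of features ranked by importance, can be sketched independently of any particular model. The function below takes a plain mapping of feature name to importance score. The function name, the placeholder feature names, and the scores are hypothetical, and obtaining real importances from a trained XGBoost model is outside this sketch.

```python
def select_top_fraction(importances, fraction=0.10):
    """Return the names of the highest-importance features, best first."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    # Keep at least one feature even for very small inputs.
    keep = max(1, round(len(ranked) * fraction))
    return ranked[:keep]

# Hypothetical scores for 500 candidate comorbidity codes.
scores = {f"code_{i}": float(i) for i in range(500)}
top_features = select_top_fraction(scores)
print(len(top_features), top_features[0])
```

With 500 candidate features, a 10% cut keeps 50, which matches the count reported in the abstract.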
Procedia PDF Downloads 118
10691 What the Future Holds for Social Media Data Analysis
Authors: P. Wlodarczak, J. Soar, M. Ally
Abstract:
The dramatic rise in the use of Social Media (SM) platforms such as Facebook and Twitter provides access to an unprecedented amount of user data. Users may post reviews on products and services they bought, write about their interests, share ideas, or give their opinions and views on political issues. Organisations are increasingly interested in analysing SM data to detect new trends, obtain user opinions on their products and services, or find out about their online reputations. A recent research trend in SM analysis is making predictions based on sentiment analysis of SM. Often, indicators of historic SM data are represented as time series and correlated with a variety of real-world phenomena such as the outcome of elections, the development of financial indicators, box office revenue, and disease outbreaks. This paper examines the current state of research in the area of SM mining and predictive analysis and gives an overview of the analysis methods using opinion mining and machine learning techniques.
Keywords: social media, text mining, knowledge discovery, predictive analysis, machine learning
Procedia PDF Downloads 423