Search results for: machine learning algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 9650

Search results for: machine learning algorithms

9290 Supervised Machine Learning Approach for Studying the Effect of Different Joint Sets on Stability of Mine Pit Slopes Under the Presence of Different External Factors

Authors: Sudhir Kumar Singh, Debashish Chakravarty

Abstract:

Slope stability analysis is an important aspect in the field of geotechnical engineering. It is also important from safety, and economic point of view as any slope failure leads to loss of valuable lives and damage to property worth millions. This paper aims at mitigating the risk of slope failure by studying the effect of different joint sets on the stability of mine pit slopes under the influence of various external factors, namely degree of saturation, rainfall intensity, and seismic coefficients. Supervised machine learning approach has been utilized for making accurate and reliable predictions regarding the stability of slopes based on the value of Factor of Safety. Numerous cases have been studied for analyzing the stability of slopes using the popular Finite Element Method, and the data thus obtained has been used as training data for the supervised machine learning models. The input data has been trained on different supervised machine learning models, namely Random Forest, Decision Tree, Support vector Machine, and XGBoost. Distinct test data that is not present in training data has been used for measuring the performance and accuracy of different models. Although all models have performed well on the test dataset but Random Forest stands out from others due to its high accuracy of greater than 95%, thus helping us by providing a valuable tool at our disposition which is neither computationally expensive nor time consuming and in good accordance with the numerical analysis result.

Keywords: finite element method, geotechnical engineering, machine learning, slope stability

Procedia PDF Downloads 96
9289 StockTwits Sentiment Analysis on Stock Price Prediction

Authors: Min Chen, Rubi Gupta

Abstract:

Understanding and predicting stock market movements is a challenging problem. It is believed stock markets are partially driven by public sentiments, which leads to numerous research efforts to predict stock market trend using public sentiments expressed on social media such as Twitter but with limited success. Recently a microblogging website StockTwits is becoming increasingly popular for users to share their discussions and sentiments about stocks and financial market. In this project, we analyze the text content of StockTwits tweets and extract financial sentiment using text featurization and machine learning algorithms. StockTwits tweets are first pre-processed using techniques including stopword removal, special character removal, and case normalization to remove noise. Features are extracted from these preprocessed tweets through text featurization process using bags of words, N-gram models, TF-IDF (term frequency-inverse document frequency), and latent semantic analysis. Machine learning models are then trained to classify the tweets' sentiment as positive (bullish) or negative (bearish). The correlation between the aggregated daily sentiment and daily stock price movement is then investigated using Pearson’s correlation coefficient. Finally, the sentiment information is applied together with time series stock data to predict stock price movement. The experiments on five companies (Apple, Amazon, General Electric, Microsoft, and Target) in a duration of nine months demonstrate the effectiveness of our study in improving the prediction accuracy.

Keywords: machine learning, sentiment analysis, stock price prediction, tweet processing

Procedia PDF Downloads 147
9288 Transforming Data Science Curriculum Through Design Thinking

Authors: Samar Swaid

Abstract:

Today, corporates are moving toward the adoption of Design-Thinking techniques to develop products and services, putting their consumer as the heart of the development process. One of the leading companies in Design-Thinking, IDEO (Innovation, Design, Engineering Organization), defines Design-Thinking as an approach to problem-solving that relies on a set of multi-layered skills, processes, and mindsets that help people generate novel solutions to problems. Design thinking may result in new ideas, narratives, objects or systems. It is about redesigning systems, organizations, infrastructures, processes, and solutions in an innovative fashion based on the users' feedback. Tim Brown, president and CEO of IDEO, sees design thinking as a human-centered approach that draws from the designer's toolkit to integrate people's needs, innovative technologies, and business requirements. The application of design thinking has been witnessed to be the road to developing innovative applications, interactive systems, scientific software, healthcare application, and even to utilizing Design-Thinking to re-think business operations, as in the case of Airbnb. Recently, there has been a movement to apply design thinking to machine learning and artificial intelligence to ensure creating the "wow" effect on consumers. The Association of Computing Machinery task force on Data Science program states that" Data scientists should be able to implement and understand algorithms for data collection and analysis. They should understand the time and space considerations of algorithms. They should follow good design principles developing software, understanding the importance of those principles for testability and maintainability" However, this definition hides the user behind the machine who works on data preparation, algorithm selection and model interpretation. Thus, the Data Science program includes design thinking to ensure meeting the user demands, generating more usable machine learning tools, and developing ways of framing computational thinking. Here, describe the fundamentals of Design-Thinking and teaching modules for data science programs.

Keywords: data science, design thinking, AI, currculum, transformation

Procedia PDF Downloads 74
9287 Sentiment Analysis of Consumers’ Perceptions on Social Media about the Main Mobile Providers in Jamaica

Authors: Sherrene Bogle, Verlia Bogle, Tyrone Anderson

Abstract:

In recent years, organizations have become increasingly interested in the possibility of analyzing social media as a means of gaining meaningful feedback about their products and services. The aspect based sentiment analysis approach is used to predict the sentiment for Twitter datasets for Digicel and Lime, the main mobile companies in Jamaica, using supervised learning classification techniques. The results indicate an average of 82.2 percent accuracy in classifying tweets when comparing three separate classification algorithms against the purported baseline of 70 percent and an average root mean squared error of 0.31. These results indicate that the analysis of sentiment on social media in order to gain customer feedback can be a viable solution for mobile companies looking to improve business performance.

Keywords: machine learning, sentiment analysis, social media, supervised learning

Procedia PDF Downloads 434
9286 Algorithms Minimizing Total Tardiness

Authors: Harun Aydilek, Asiye Aydilek, Ali Allahverdi

Abstract:

The total tardiness is a widely used performance measure in the scheduling literature. This performance measure is particularly important in situations where there is a cost to complete a job beyond its due date. The cost of scheduling increases as the gap between a job's due date and its completion time increases. Such costs may also be penalty costs in contracts, loss of goodwill. This performance measure is important as the fulfillment of due dates of customers has to be taken into account while making scheduling decisions. The problem is addressed in the literature, however, it has been assumed zero setup times. Even though this assumption may be valid for some environments, it is not valid for some other scheduling environments. When setup times are treated as separate from processing times, it is possible to increase machine utilization and to reduce total tardiness. Therefore, non-zero setup times need to be considered as separate. A dominance relation is developed and several algorithms are proposed. The developed dominance relation is utilized in the proposed algorithms. Extensive computational experiments are conducted for the evaluation of the algorithms. The experiments indicated that the developed algorithms perform much better than the existing algorithms in the literature. More specifically, one of the newly proposed algorithms reduces the error of the best existing algorithm in the literature by 40 percent.

Keywords: algorithm, assembly flowshop, dominance relation, total tardiness

Procedia PDF Downloads 347
9285 Movie Genre Preference Prediction Using Machine Learning for Customer-Based Information

Authors: Haifeng Wang, Haili Zhang

Abstract:

Most movie recommendation systems have been developed for customers to find items of interest. This work introduces a predictive model usable by small and medium-sized enterprises (SMEs) who are in need of a data-based and analytical approach to stock proper movies for local audiences and retain more customers. We used classification models to extract features from thousands of customers’ demographic, behavioral and social information to predict their movie genre preference. In the implementation, a Gaussian kernel support vector machine (SVM) classification model and a logistic regression model were established to extract features from sample data and their test error-in-sample were compared. Comparison of error-out-sample was also made under different Vapnik–Chervonenkis (VC) dimensions in the machine learning algorithm to find and prevent overfitting. Gaussian kernel SVM prediction model can correctly predict movie genre preferences in 85% of positive cases. The accuracy of the algorithm increased to 93% with a smaller VC dimension and less overfitting. These findings advance our understanding of how to use machine learning approach to predict customers’ preferences with a small data set and design prediction tools for these enterprises.

Keywords: computational social science, movie preference, machine learning, SVM

Procedia PDF Downloads 254
9284 An Ensemble Learning Method for Applying Particle Swarm Optimization Algorithms to Systems Engineering Problems

Authors: Ken Hampshire, Thomas Mazzuchi, Shahram Sarkani

Abstract:

As a subset of metaheuristics, nature-inspired optimization algorithms such as particle swarm optimization (PSO) have shown promise both in solving intractable problems and in their extensibility to novel problem formulations due to their general approach requiring few assumptions. Unfortunately, single instantiations of algorithms require detailed tuning of parameters and cannot be proven to be best suited to a particular illustrative problem on account of the “no free lunch” (NFL) theorem. Using these algorithms in real-world problems requires exquisite knowledge of the many techniques and is not conducive to reconciling the various approaches to given classes of problems. This research aims to present a unified view of PSO-based approaches from the perspective of relevant systems engineering problems, with the express purpose of then eliciting the best solution for any problem formulation in an ensemble learning bucket of models approach. The central hypothesis of the research is that extending the PSO algorithms found in the literature to real-world optimization problems requires a general ensemble-based method for all problem formulations but a specific implementation and solution for any instance. The main results are a problem-based literature survey and a general method to find more globally optimal solutions for any systems engineering optimization problem.

Keywords: particle swarm optimization, nature-inspired optimization, metaheuristics, systems engineering, ensemble learning

Procedia PDF Downloads 91
9283 Violence Detection and Tracking on Moving Surveillance Video Using Machine Learning Approach

Authors: Abe Degale D., Cheng Jian

Abstract:

When creating automated video surveillance systems, violent action recognition is crucial. In recent years, hand-crafted feature detectors have been the primary method for achieving violence detection, such as the recognition of fighting activity. Researchers have also looked into learning-based representational models. On benchmark datasets created especially for the detection of violent sequences in sports and movies, these methods produced good accuracy results. The Hockey dataset's videos with surveillance camera motion present challenges for these algorithms for learning discriminating features. Image recognition and human activity detection challenges have shown success with deep representation-based methods. For the purpose of detecting violent images and identifying aggressive human behaviours, this research suggested a deep representation-based model using the transfer learning idea. The results show that the suggested approach outperforms state-of-the-art accuracy levels by learning the most discriminating features, attaining 99.34% and 99.98% accuracy levels on the Hockey and Movies datasets, respectively.

Keywords: violence detection, faster RCNN, transfer learning and, surveillance video

Procedia PDF Downloads 95
9282 Improving Machine Learning Translation of Hausa Using Named Entity Recognition

Authors: Aishatu Ibrahim Birma, Aminu Tukur, Abdulkarim Abbass Gora

Abstract:

Machine translation plays a vital role in the Field of Natural Language Processing (NLP), breaking down language barriers and enabling communication across diverse communities. In the context of Hausa, a widely spoken language in West Africa, mainly in Nigeria, effective translation systems are essential for enabling seamless communication and promoting cultural exchange. However, due to the unique linguistic characteristics of Hausa, accurate translation remains a challenging task. The research proposes an approach to improving the machine learning translation of Hausa by integrating Named Entity Recognition (NER) techniques. Named entities, such as person names, locations, organizations, and dates, are critical components of a language's structure and meaning. Incorporating NER into the translation process can enhance the quality and accuracy of translations by preserving the integrity of named entities and also maintaining consistency in translating entities (e.g., proper names), and addressing the cultural references specific to Hausa. The NER will be incorporated into Neural Machine Translation (NMT) for the Hausa to English Translation.

Keywords: machine translation, natural language processing (NLP), named entity recognition (NER), neural machine translation (NMT)

Procedia PDF Downloads 34
9281 Polarity Classification of Social Media Comments in Turkish

Authors: Migena Ceyhan, Zeynep Orhan, Dimitrios Karras

Abstract:

People in modern societies are continuously sharing their experiences, emotions, and thoughts in different areas of life. The information reaches almost everyone in real-time and can have an important impact in shaping people’s way of living. This phenomenon is very well recognized and advantageously used by the market representatives, trying to earn the most from this means. Given the abundance of information, people and organizations are looking for efficient tools that filter the countless data into important information, ready to analyze. This paper is a modest contribution in this field, describing the process of automatically classifying social media comments in the Turkish language into positive or negative. Once data is gathered and preprocessed, feature sets of selected single words or groups of words are build according to the characteristics of language used in the texts. These features are used later to train, and test a system according to different machine learning algorithms (Naïve Bayes, Sequential Minimal Optimization, J48, and Bayesian Linear Regression). The resultant high accuracies can be important feedback for decision-makers to improve the business strategies accordingly.

Keywords: feature selection, machine learning, natural language processing, sentiment analysis, social media reviews

Procedia PDF Downloads 143
9280 An Automated Stock Investment System Using Machine Learning Techniques: An Application in Australia

Authors: Carol Anne Hargreaves

Abstract:

A key issue in stock investment is how to select representative features for stock selection. The objective of this paper is to firstly determine whether an automated stock investment system, using machine learning techniques, may be used to identify a portfolio of growth stocks that are highly likely to provide returns better than the stock market index. The second objective is to identify the technical features that best characterize whether a stock’s price is likely to go up and to identify the most important factors and their contribution to predicting the likelihood of the stock price going up. Unsupervised machine learning techniques, such as cluster analysis, were applied to the stock data to identify a cluster of stocks that was likely to go up in price – portfolio 1. Next, the principal component analysis technique was used to select stocks that were rated high on component one and component two – portfolio 2. Thirdly, a supervised machine learning technique, the logistic regression method, was used to select stocks with a high probability of their price going up – portfolio 3. The predictive models were validated with metrics such as, sensitivity (recall), specificity and overall accuracy for all models. All accuracy measures were above 70%. All portfolios outperformed the market by more than eight times. The top three stocks were selected for each of the three stock portfolios and traded in the market for one month. After one month the return for each stock portfolio was computed and compared with the stock market index returns. The returns for all three stock portfolios was 23.87% for the principal component analysis stock portfolio, 11.65% for the logistic regression portfolio and 8.88% for the K-means cluster portfolio while the stock market performance was 0.38%. This study confirms that an automated stock investment system using machine learning techniques can identify top performing stock portfolios that outperform the stock market.

Keywords: machine learning, stock market trading, logistic regression, cluster analysis, factor analysis, decision trees, neural networks, automated stock investment system

Procedia PDF Downloads 151
9279 A System to Detect Inappropriate Messages in Online Social Networks

Authors: Shivani Singh, Shantanu Nakhare, Kalyani Nair, Rohan Shetty

Abstract:

As social networking is growing at a rapid pace today it is vital that we work on improving its management. Research has shown that the content present in online social networks may have significant influence on impressionable minds. If such platforms are misused, it will lead to negative consequences. Detecting insults or inappropriate messages continues to be one of the most challenging aspects of Online Social Networks (OSNs) today. We address this problem through a Machine Learning Based Soft Text Classifier approach using Support Vector Machine algorithm. The proposed system acts as a screening mechanism the alerts the user about such messages. The messages are classified according to their subject matter and each comment is labeled for the presence of profanity and insults.

Keywords: machine learning, online social networks, soft text classifier, support vector machine

Procedia PDF Downloads 500
9278 Predicting the Frequencies of Tropical Cyclone-Induced Rainfall Events in the US Using a Machine-Learning Model

Authors: Elham Sharifineyestani, Mohammad Farshchin

Abstract:

Tropical cyclones are one of the most expensive and deadliest natural disasters. They cause heavy rainfall and serious flash flooding that result in billions of dollars of damage and considerable mortality each year in the United States. Prediction of the frequency of tropical cyclone-induced rainfall events can be helpful in emergency planning and flood risk management. In this study, we have developed a machine-learning model to predict the exceedance frequencies of tropical cyclone-induced rainfall events in the United States. Model results show a satisfactory agreement with available observations. To examine the effectiveness of our approach, we also have compared the result of our predictions with the exceedance frequencies predicted using a physics-based rainfall model by Feldmann.

Keywords: flash flooding, tropical cyclones, frequencies, machine learning, risk management

Procedia PDF Downloads 241
9277 Effectiveness of Reinforcement Learning (RL) for Autonomous Energy Management Solutions

Authors: Tesfaye Mengistu

Abstract:

This thesis aims to investigate the effectiveness of Reinforcement Learning (RL) for Autonomous Energy Management solutions. The study explores the potential of Model Free RL approaches, such as Monte Carlo RL and Q-learning, to improve energy management by autonomously adjusting energy management strategies to maximize efficiency. The research investigates the implementation of RL algorithms for optimizing energy consumption in a single-agent environment. The focus is on developing a framework for the implementation of RL algorithms, highlighting the importance of RL for enabling autonomous systems to adapt quickly to changing conditions and make decisions based on previous experiences. Moreover, the paper proposes RL as a novel energy management solution to address nations' CO2 emission goals. Reinforcement learning algorithms are well-suited to solving problems with sequential decision-making patterns and can provide accurate and immediate outputs to ease the planning and decision-making process. This research provides insights into the challenges and opportunities of using RL for energy management solutions and recommends further studies to explore its full potential. In conclusion, this study provides valuable insights into how RL can be used to improve the efficiency of energy management systems and supports the use of RL as a promising approach for developing autonomous energy management solutions in residential buildings.

Keywords: artificial intelligence, reinforcement learning, monte carlo, energy management, CO2 emission

Procedia PDF Downloads 77
9276 Comprehensive Study of Data Science

Authors: Asifa Amara, Prachi Singh, Kanishka, Debargho Pathak, Akshat Kumar, Jayakumar Eravelly

Abstract:

Today's generation is totally dependent on technology that uses data as its fuel. The present study is all about innovations and developments in data science and gives an idea about how efficiently to use the data provided. This study will help to understand the core concepts of data science. The concept of artificial intelligence was introduced by Alan Turing in which the main principle was to create an artificial system that can run independently of human-given programs and can function with the help of analyzing data to understand the requirements of the users. Data science comprises business understanding, analyzing data, ethical concerns, understanding programming languages, various fields and sources of data, skills, etc. The usage of data science has evolved over the years. In this review article, we have covered a part of data science, i.e., machine learning. Machine learning uses data science for its work. Machines learn through their experience, which helps them to do any work more efficiently. This article includes a comparative study image between human understanding and machine understanding, advantages, applications, and real-time examples of machine learning. Data science is an important game changer in the life of human beings. Since the advent of data science, we have found its benefits and how it leads to a better understanding of people, and how it cherishes individual needs. It has improved business strategies, services provided by them, forecasting, the ability to attend sustainable developments, etc. This study also focuses on a better understanding of data science which will help us to create a better world.

Keywords: data science, machine learning, data analytics, artificial intelligence

Procedia PDF Downloads 78
9275 Integration of Big Data to Predict Transportation for Smart Cities

Authors: Sun-Young Jang, Sung-Ah Kim, Dongyoun Shin

Abstract:

The Intelligent transportation system is essential to build smarter cities. Machine learning based transportation prediction could be highly promising approach by delivering invisible aspect visible. In this context, this research aims to make a prototype model that predicts transportation network by using big data and machine learning technology. In detail, among urban transportation systems this research chooses bus system.  The research problem that existing headway model cannot response dynamic transportation conditions. Thus, bus delay problem is often occurred. To overcome this problem, a prediction model is presented to fine patterns of bus delay by using a machine learning implementing the following data sets; traffics, weathers, and bus statues. This research presents a flexible headway model to predict bus delay and analyze the result. The prototyping model is composed by real-time data of buses. The data are gathered through public data portals and real time Application Program Interface (API) by the government. These data are fundamental resources to organize interval pattern models of bus operations as traffic environment factors (road speeds, station conditions, weathers, and bus information of operating in real-time). The prototyping model is designed by the machine learning tool (RapidMiner Studio) and conducted tests for bus delays prediction. This research presents experiments to increase prediction accuracy for bus headway by analyzing the urban big data. The big data analysis is important to predict the future and to find correlations by processing huge amount of data. Therefore, based on the analysis method, this research represents an effective use of the machine learning and urban big data to understand urban dynamics.

Keywords: big data, machine learning, smart city, social cost, transportation network

Procedia PDF Downloads 251
9274 Comparative Analysis of Reinforcement Learning Algorithms for Autonomous Driving

Authors: Migena Mana, Ahmed Khalid Syed, Abdul Malik, Nikhil Cherian

Abstract:

In recent years, advancements in deep learning enabled researchers to tackle the problem of self-driving cars. Car companies use huge datasets to train their deep learning models to make autonomous cars a reality. However, this approach has certain drawbacks in that the state space of possible actions for a car is so huge that there cannot be a dataset for every possible road scenario. To overcome this problem, the concept of reinforcement learning (RL) is being investigated in this research. Since the problem of autonomous driving can be modeled in a simulation, it lends itself naturally to the domain of reinforcement learning. The advantage of this approach is that we can model different and complex road scenarios in a simulation without having to deploy in the real world. The autonomous agent can learn to drive by finding the optimal policy. This learned model can then be easily deployed in a real-world setting. In this project, we focus on three RL algorithms: Q-learning, Deep Deterministic Policy Gradient (DDPG), and Proximal Policy Optimization (PPO). To model the environment, we have used TORCS (The Open Racing Car Simulator), which provides us with a strong foundation to test our model. The inputs to the algorithms are the sensor data provided by the simulator such as velocity, distance from side pavement, etc. The outcome of this research project is a comparative analysis of these algorithms. Based on the comparison, the PPO algorithm gives the best results. When using PPO algorithm, the reward is greater, and the acceleration, steering angle and braking are more stable compared to the other algorithms, which means that the agent learns to drive in a better and more efficient way in this case. Additionally, we have come up with a dataset taken from the training of the agent with DDPG and PPO algorithms. It contains all the steps of the agent during one full training in the form: (all input values, acceleration, steering angle, break, loss, reward). This study can serve as a base for further complex road scenarios. Furthermore, it can be enlarged in the field of computer vision, using the images to find the best policy.

Keywords: autonomous driving, DDPG (deep deterministic policy gradient), PPO (proximal policy optimization), reinforcement learning

Procedia PDF Downloads 138
9273 A Comparative Study of Twin Delayed Deep Deterministic Policy Gradient and Soft Actor-Critic Algorithms for Robot Exploration and Navigation in Unseen Environments

Authors: Romisaa Ali

Abstract:

This paper presents a comparison between twin-delayed Deep Deterministic Policy Gradient (TD3) and Soft Actor-Critic (SAC) reinforcement learning algorithms in the context of training robust navigation policies for Jackal robots. By leveraging an open-source framework and custom motion control environments, the study evaluates the performance, robustness, and transferability of the trained policies across a range of scenarios. The primary focus of the experiments is to assess the training process, the adaptability of the algorithms, and the robot’s ability to navigate in previously unseen environments. Moreover, the paper examines the influence of varying environmental complexities on the learning process and the generalization capabilities of the resulting policies. The results of this study aim to inform and guide the development of more efficient and practical reinforcement learning-based navigation policies for Jackal robots in real-world scenarios.

Keywords: Jackal robot environments, reinforcement learning, TD3, SAC, robust navigation, transferability, custom environment

Procedia PDF Downloads 94
9272 Loan Repayment Prediction Using Machine Learning: Model Development, Django Web Integration and Cloud Deployment

Authors: Seun Mayowa Sunday

Abstract:

Loan prediction is one of the most significant and recognised fields of research in the banking, insurance, and the financial security industries. Some prediction systems on the market include the construction of static software. However, due to the fact that static software only operates with strictly regulated rules, they cannot aid customers beyond these limitations. Application of many machine learning (ML) techniques are required for loan prediction. Four separate machine learning models, random forest (RF), decision tree (DT), k-nearest neighbour (KNN), and logistic regression, are used to create the loan prediction model. Using the anaconda navigator and the required machine learning (ML) libraries, models are created and evaluated using the appropriate measuring metrics. From the finding, the random forest performs with the highest accuracy of 80.17% which was later implemented into the Django framework. For real-time testing, the web application is deployed on the Alibabacloud which is among the top 4 biggest cloud computing provider. Hence, to the best of our knowledge, this research will serve as the first academic paper which combines the model development and the Django framework, with the deployment into the Alibaba cloud computing application.

Keywords: k-nearest neighbor, random forest, logistic regression, decision tree, django, cloud computing, alibaba cloud

Procedia PDF Downloads 123
9271 Breast Cancer Diagnosing Based on Online Sequential Extreme Learning Machine Approach

Authors: Musatafa Abbas Abbood Albadr, Masri Ayob, Sabrina Tiun, Fahad Taha Al-Dhief, Mohammad Kamrul Hasan

Abstract:

Breast Cancer (BC) is considered one of the most frequent reasons of cancer death in women between 40 to 55 ages. The BC is diagnosed by using digital images of the FNA (Fine Needle Aspirate) for both benign and malignant tumors of the breast mass. Therefore, this work proposes the Online Sequential Extreme Learning Machine (OSELM) algorithm for diagnosing BC by using the tumor features of the breast mass. The current work has used the Wisconsin Diagnosis Breast Cancer (WDBC) dataset, which contains 569 samples (i.e., 357 samples for benign class and 212 samples for malignant class). Further, numerous measurements of assessment were used in order to evaluate the proposed OSELM algorithm, such as specificity, precision, F-measure, accuracy, G-mean, MCC, and recall. According to the outcomes of the experiment, the highest performance of the proposed OSELM was accomplished with 97.66% accuracy, 98.39% recall, 95.31% precision, 97.25% specificity, 96.83% F-measure, 95.00% MCC, and 96.84% G-Mean. The proposed OSELM algorithm demonstrates promising results in diagnosing BC. Besides, the performance of the proposed OSELM algorithm was superior to all its comparatives with respect to the rate of classification.

Keywords: breast cancer, machine learning, online sequential extreme learning machine, artificial intelligence

Procedia PDF Downloads 103
9270 Modeling Floodplain Vegetation Response to Groundwater Variability Using ArcSWAT Hydrological Model, Moderate Resolution Imaging Spectroradiometer - Normalised Difference Vegetation Index Data, and Machine Learning

Authors: Newton Muhury, Armando A. Apan, Tek Maraseni

Abstract:

This study modelled the relationships between vegetation response and available water below the soil surface using the Terra’s Moderate Resolution Imaging Spectroradiometer (MODIS) generated Normalised Difference Vegetation Index (NDVI) and soil water content (SWC) data. The Soil & Water Assessment Tool (SWAT) interface known as ArcSWAT was used in ArcGIS for the groundwater analysis. The SWAT model was calibrated and validated in SWAT-CUP software using 10 years (2001-2010) of monthly streamflow data. The average Nash-Sutcliffe Efficiency during the calibration and validation was 0.54 and 0.51, respectively, indicating that the model performances were good. Twenty years (2001-2020) of monthly MODIS NDVI data for three different types of vegetation (forest, shrub, and grass) and soil water content for 43 sub-basins were analysed using the WEKA, machine learning tool with a selection of two supervised machine learning algorithms, i.e., support vector machine (SVM) and random forest (RF). The modelling results show that different types of vegetation response and soil water content vary in the dry and wet season. For example, the model generated high positive relationships (r=0.76, 0.73, and 0.81) between the measured and predicted NDVI values of all vegetation in the study area against the groundwater flow (GW), soil water content (SWC), and the combination of these two variables, respectively, during the dry season. However, these relationships were reduced by 36.8% (r=0.48) and 13.6% (r=0.63) against GW and SWC, respectively, in the wet season. On the other hand, the model predicted a moderate positive relationship (r=0.63) between shrub vegetation type and soil water content during the dry season, which was reduced by 31.7% (r=0.43) during the wet season. Our models also predicted that vegetation in the top location (upper part) of the sub-basin is highly responsive to GW and SWC (r=0.78, and 0.70) during the dry season. The results of this study indicate the study region is suitable for seasonal crop production in dry season. Moreover, the results predicted that the growth of vegetation in the top-point location is highly dependent on groundwater flow in both dry and wet seasons, and any instability or long-term drought can negatively affect these floodplain vegetation communities. This study has enriched our knowledge of vegetation responses to groundwater in each season, which will facilitate better floodplain vegetation management.

Keywords: ArcSWAT, machine learning, floodplain vegetation, MODIS NDVI, groundwater

Procedia PDF Downloads 110
9269 Organizational Innovations of the 20th Century as High Tech of the 21st: Evidence from Patent Data

Authors: Valery Yakubovich, Shuping wu

Abstract:

Organization theorists have long claimed that organizational innovations are nontechnological, in part because they are unpatentable. The claim rests on the assumption that organizational innovations are abstract ideas embodied in persons and contexts rather than in context-free practical tools. However, over the last three decades, organizational knowledge has been increasingly embodied in digital tools which, in principle, can be patented. To provide the first empirical evidence regarding the patentability of organizational innovations, we trained two machine learning algorithms to identify a population of 205,434 patent applications for organizational technologies (OrgTech) and, among them, 141,285 applications that use organizational innovations accumulated over the 20th century. Our event history analysis of the probability of patenting an OrgTech invention shows that ideas from organizational innovations decrease the probability of patent allowance unless they describe a practical tool. We conclude that the present-day digital transformation places organizational innovations in the realm of high tech and turns the debate about organizational technologies into the challenge of designing practical organizational tools that embody big ideas about organizing. We outline an agenda for patent-based research on OrgTech as an emerging phenomenon.

Keywords: organizational innovation, organizational technology, high tech, patents, machine learning

Procedia PDF Downloads 116
9268 Human Digital Twin for Personal Conversation Automation Using Supervised Machine Learning Approaches

Authors: Aya Salama

Abstract:

Digital Twin is an emerging research topic that attracted researchers in the last decade. It is used in many fields, such as smart manufacturing and smart healthcare because it saves time and money. It is usually related to other technologies such as Data Mining, Artificial Intelligence, and Machine Learning. However, Human digital twin (HDT), in specific, is still a novel idea that still needs to prove its feasibility. HDT expands the idea of Digital Twin to human beings, which are living beings and different from the inanimate physical entities. The goal of this research was to create a Human digital twin that is responsible for real-time human replies automation by simulating human behavior. For this reason, clustering, supervised classification, topic extraction, and sentiment analysis were studied in this paper. The feasibility of the HDT for personal replies generation on social messaging applications was proved in this work. The overall accuracy of the proposed approach in this paper was 63% which is a very promising result that can open the way for researchers to expand the idea of HDT. This was achieved by using Random Forest for clustering the question data base and matching new questions. K-nearest neighbor was also applied for sentiment analysis.

Keywords: human digital twin, sentiment analysis, topic extraction, supervised machine learning, unsupervised machine learning, classification, clustering

Procedia PDF Downloads 83
9267 Analysis and Detection of Facial Expressions in Autism Spectrum Disorder People Using Machine Learning

Authors: Muhammad Maisam Abbas, Salman Tariq, Usama Riaz, Muhammad Tanveer, Humaira Abdul Ghafoor

Abstract:

Autism Spectrum Disorder (ASD) refers to a developmental disorder that impairs an individual's communication and interaction ability. Individuals feel difficult to read facial expressions while communicating or interacting. Facial Expression Recognition (FER) is a unique method of classifying basic human expressions, i.e., happiness, fear, surprise, sadness, disgust, neutral, and anger through static and dynamic sources. This paper conducts a comprehensive comparison and proposed optimal method for a continued research project—a system that can assist people who have Autism Spectrum Disorder (ASD) in recognizing facial expressions. Comparison has been conducted on three supervised learning algorithms EigenFace, FisherFace, and LBPH. The JAFFE, CK+, and TFEID (I&II) datasets have been used to train and test the algorithms. The results were then evaluated based on variance, standard deviation, and accuracy. The experiments showed that FisherFace has the highest accuracy for all datasets and is considered the best algorithm to be implemented in our system.

Keywords: autism spectrum disorder, ASD, EigenFace, facial expression recognition, FisherFace, local binary pattern histogram, LBPH

Procedia PDF Downloads 167
9266 Embedded Hybrid Intuition: A Deep Learning and Fuzzy Logic Approach to Collective Creation and Computational Assisted Narratives

Authors: Roberto Cabezas H

Abstract:

The current work shows the methodology developed to create narrative lighting spaces for the multimedia performance piece 'cluster: the vanished paradise.' This empirical research is focused on exploring unconventional roles for machines in subjective creative processes, by delving into the semantics of data and machine intelligence algorithms in hybrid technological, creative contexts to expand epistemic domains trough human-machine cooperation. The creative process in scenic and performing arts is guided mostly by intuition; from that idea, we developed an approach to embed collective intuition in computational creative systems, by joining the properties of Generative Adversarial Networks (GAN’s) and Fuzzy Clustering based on a semi-supervised data creation and analysis pipeline. The model makes use of GAN’s to learn from phenomenological data (data generated from experience with lighting scenography) and algorithmic design data (augmented data by procedural design methods), fuzzy logic clustering is then applied to artificially created data from GAN’s to define narrative transitions built on membership index; this process allowed for the creation of simple and complex spaces with expressive capabilities based on position and light intensity as the parameters to guide the narrative. Hybridization comes not only from the human-machine symbiosis but also on the integration of different techniques for the implementation of the aided design system. Machine intelligence tools as proposed in this work are well suited to redefine collaborative creation by learning to express and expand a conglomerate of ideas and a wide range of opinions for the creation of sensory experiences. We found in GAN’s and Fuzzy Logic an ideal tool to develop new computational models based on interaction, learning, emotion and imagination to expand the traditional algorithmic model of computation.

Keywords: fuzzy clustering, generative adversarial networks, human-machine cooperation, hybrid collective data, multimedia performance

Procedia PDF Downloads 136
9265 The Use of Artificial Intelligence in Diagnosis of Mastitis in Cows

Authors: Djeddi Khaled, Houssou Hind, Miloudi Abdellatif, Rabah Siham

Abstract:

In the field of veterinary medicine, there is a growing application of artificial intelligence (AI) for diagnosing bovine mastitis, a prevalent inflammatory disease in dairy cattle. AI technologies, such as automated milking systems, have streamlined the assessment of key metrics crucial for managing cow health during milking and identifying prevalent diseases, including mastitis. These automated milking systems empower farmers to implement automatic mastitis detection by analyzing indicators like milk yield, electrical conductivity, fat, protein, lactose, blood content in the milk, and milk flow rate. Furthermore, reports highlight the integration of somatic cell count (SCC), thermal infrared thermography, and diverse systems utilizing statistical models and machine learning techniques, including artificial neural networks, to enhance the overall efficiency and accuracy of mastitis detection. According to a review of 15 publications, machine learning technology can predict the risk and detect mastitis in cattle with an accuracy ranging from 87.62% to 98.10% and sensitivity and specificity ranging from 84.62% to 99.4% and 81.25% to 98.8%, respectively. Additionally, machine learning algorithms and microarray meta-analysis are utilized to identify mastitis genes in dairy cattle, providing insights into the underlying functional modules of mastitis disease. Moreover, AI applications can assist in developing predictive models that anticipate the likelihood of mastitis outbreaks based on factors such as environmental conditions, herd management practices, and animal health history. This proactive approach supports farmers in implementing preventive measures and optimizing herd health. By harnessing the power of artificial intelligence, the diagnosis of bovine mastitis can be significantly improved, enabling more effective management strategies and ultimately enhancing the health and productivity of dairy cattle. The integration of artificial intelligence presents valuable opportunities for the precise and early detection of mastitis, providing substantial benefits to the dairy industry.

Keywords: artificial insemination, automatic milking system, cattle, machine learning, mastitis

Procedia PDF Downloads 56
9264 A Machine Learning Model for Dynamic Prediction of Chronic Kidney Disease Risk Using Laboratory Data, Non-Laboratory Data, and Metabolic Indices

Authors: Amadou Wurry Jallow, Adama N. S. Bah, Karamo Bah, Shih-Ye Wang, Kuo-Chung Chu, Chien-Yeh Hsu

Abstract:

Chronic kidney disease (CKD) is a major public health challenge with high prevalence, rising incidence, and serious adverse consequences. Developing effective risk prediction models is a cost-effective approach to predicting and preventing complications of chronic kidney disease (CKD). This study aimed to develop an accurate machine learning model that can dynamically identify individuals at risk of CKD using various kinds of diagnostic data, with or without laboratory data, at different follow-up points. Creatinine is a key component used to predict CKD. These models will enable affordable and effective screening for CKD even with incomplete patient data, such as the absence of creatinine testing. This retrospective cohort study included data on 19,429 adults provided by a private research institute and screening laboratory in Taiwan, gathered between 2001 and 2015. Univariate Cox proportional hazard regression analyses were performed to determine the variables with high prognostic values for predicting CKD. We then identified interacting variables and grouped them according to diagnostic data categories. Our models used three types of data gathered at three points in time: non-laboratory, laboratory, and metabolic indices data. Next, we used subgroups of variables within each category to train two machine learning models (Random Forest and XGBoost). Our machine learning models can dynamically discriminate individuals at risk for developing CKD. All the models performed well using all three kinds of data, with or without laboratory data. Using only non-laboratory-based data (such as age, sex, body mass index (BMI), and waist circumference), both models predict chronic kidney disease as accurately as models using laboratory and metabolic indices data. Our machine learning models have demonstrated the use of different categories of diagnostic data for CKD prediction, with or without laboratory data. The machine learning models are simple to use and flexible because they work even with incomplete data and can be applied in any clinical setting, including settings where laboratory data is difficult to obtain.

Keywords: chronic kidney disease, glomerular filtration rate, creatinine, novel metabolic indices, machine learning, risk prediction

Procedia PDF Downloads 100
9263 Machine Learning Techniques in Bank Credit Analysis

Authors: Fernanda M. Assef, Maria Teresinha A. Steiner

Abstract:

The aim of this paper is to compare and discuss better classifier algorithm options for credit risk assessment by applying different Machine Learning techniques. Using records from a Brazilian financial institution, this study uses a database of 5,432 companies that are clients of the bank, where 2,600 clients are classified as non-defaulters, 1,551 are classified as defaulters and 1,281 are temporarily defaulters, meaning that the clients are overdue on their payments for up 180 days. For each case, a total of 15 attributes was considered for a one-against-all assessment using four different techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and finally Support Vector Machines (SVM). For each method, different parameters were analyzed in order to obtain different results when the best of each technique was compared. Initially the data were coded in thermometer code (numerical attributes) or dummy coding (for nominal attributes). The methods were then evaluated for each parameter and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method, in terms of accuracy, was ANN-RBF (79.20% for non-defaulter classification, 97.74% for defaulters and 75.37% for the temporarily defaulter classification). However, the best accuracy does not always represent the best technique. For instance, on the classification of temporarily defaulters, this technique, in terms of false positives, was surpassed by SVM, which had the lowest rate (0.07%) of false positive classifications. All these intrinsic details are discussed considering the results found, and an overview of what was presented is shown in the conclusion of this study.

Keywords: artificial neural networks (ANNs), classifier algorithms, credit risk assessment, logistic regression, machine Learning, support vector machines

Procedia PDF Downloads 100
9262 Performance Analysis of Traffic Classification with Machine Learning

Authors: Htay Htay Yi, Zin May Aye

Abstract:

Network security is role of the ICT environment because malicious users are continually growing that realm of education, business, and then related with ICT. The network security contravention is typically described and examined centrally based on a security event management system. The firewalls, Intrusion Detection System (IDS), and Intrusion Prevention System are becoming essential to monitor or prevent of potential violations, incidents attack, and imminent threats. In this system, the firewall rules are set only for where the system policies are needed. Dataset deployed in this system are derived from the testbed environment. The traffic as in DoS and PortScan traffics are applied in the testbed with firewall and IDS implementation. The network traffics are classified as normal or attacks in the existing testbed environment based on six machine learning classification methods applied in the system. It is required to be tested to get datasets and applied for DoS and PortScan. The dataset is based on CICIDS2017 and some features have been added. This system tested 26 features from the applied dataset. The system is to reduce false positive rates and to improve accuracy in the implemented testbed design. The system also proves good performance by selecting important features and comparing existing a dataset by machine learning classifiers.

Keywords: false negative rate, intrusion detection system, machine learning methods, performance

Procedia PDF Downloads 115
9261 Human-Machine Cooperation in Facial Comparison Based on Likelihood Scores

Authors: Lanchi Xie, Zhihui Li, Zhigang Li, Guiqiang Wang, Lei Xu, Yuwen Yan

Abstract:

Image-based facial features can be classified into category recognition features and individual recognition features. Current automated face recognition systems extract a specific feature vector of different dimensions from a facial image according to their pre-trained neural network. However, to improve the efficiency of parameter calculation, an algorithm generally reduces the image details by pooling. The operation will overlook the details concerned much by forensic experts. In our experiment, we adopted a variety of face recognition algorithms based on deep learning, compared a large number of naturally collected face images with the known data of the same person's frontal ID photos. Downscaling and manual handling were performed on the testing images. The results supported that the facial recognition algorithms based on deep learning detected structural and morphological information and rarely focused on specific markers such as stains and moles. Overall performance, distribution of genuine scores and impostor scores, and likelihood ratios were tested to evaluate the accuracy of biometric systems and forensic experts. Experiments showed that the biometric systems were skilled in distinguishing category features, and forensic experts were better at discovering the individual features of human faces. In the proposed approach, a fusion was performed at the score level. At the specified false accept rate, the framework achieved a lower false reject rate. This paper contributes to improving the interpretability of the objective method of facial comparison and provides a novel method for human-machine collaboration in this field.

Keywords: likelihood ratio, automated facial recognition, facial comparison, biometrics

Procedia PDF Downloads 124