Search results for: machine modelling
3874 Software Component Identification from Its Object-Oriented Code: Graph Metrics Based Approach
Authors: Manel Brichni, Abdelhak-Djamel Seriai
Abstract:
Systems are increasingly complex. To reduce their complexity, an abstract view of the system can simplify its development. To this end, we propose a method to decompose systems into subsystems while reducing their coupling. These subsystems represent components. Starting from an existing object-oriented system, the main idea of our approach is to model all entities of the object-oriented source code as graphs. Such a model is easy to handle, so we can apply restructuring algorithms based on graph metrics. The particularity of our approach is that it integrates, in addition to standard metrics such as coupling and cohesion, some graph metrics that give more precision during component identification. To treat this problem, we relied on the ROMANTIC approach, which proposed a component-based software architecture recovery from an object-oriented system.
Keywords: software reengineering, software component and interfaces, metrics, graphs
Procedia PDF Downloads 501
3873 Machine Learning Invariants to Detect Anomalies in Secure Water Treatment
Authors: Jonathan Heng, Yoong Cheah Huei
Abstract:
A strategic model that detects anomalies in the Secure Water Treatment (SWaT) test bed without triggering any false alarms is presented. This model uses machine learning invariants formulated by streamlining the general form of Auto-Regressive models with eXogenous input. A creative generalized CUSUM algorithm that integrates the invariants and the detection strategy technique is successfully developed and tested in the SWaT Programmable Logic Controllers (PLCs). Three steps to fine-tune the parameters b and τ in the generalized algorithm are stated, and an example demonstrating the tuning process is discussed. This approach can swiftly and effectively detect various scopes of cyber-attacks, such as multiple points single stage and multiple points multiple stages, in SWaT. This technique can also be applied in water treatment plants and other cyber-physical systems such as power and gas plants.
Keywords: machine learning invariants, generalized CUSUM algorithm with invariants and detection strategy, scope of cyber attacks, strategic model, tuning parameters
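For readers unfamiliar with CUSUM, the following is a minimal sketch of the textbook one-sided CUSUM detector, not the authors' generalized algorithm. It assumes that `b` acts as a slack (bias) term and `tau` as the alarm threshold, which is the conventional reading of those symbols; the function name and the residual values are hypothetical.

```python
def cusum_alarms(residuals, b, tau):
    """Return the indices at which the CUSUM statistic exceeds tau.

    residuals: differences between measured and predicted sensor values
    b:         slack subtracted at every step (assumed meaning of b)
    tau:       alarm threshold (assumed meaning of tau)
    """
    s = 0.0
    alarms = []
    for k, r in enumerate(residuals):
        # Accumulate only the part of the deviation that exceeds the slack.
        s = max(0.0, s + abs(r) - b)
        if s > tau:
            alarms.append(k)
            s = 0.0  # restart accumulation after raising an alarm
    return alarms


# Hypothetical residuals: small noise, then a sustained deviation.
normal = [0.1, -0.2, 0.15, -0.1, 0.05]
attacked = normal + [1.0, 1.2, 0.9, 1.1]

print(cusum_alarms(normal, b=0.3, tau=1.5))
print(cusum_alarms(attacked, b=0.3, tau=1.5))
```

A larger `b` makes the detector ignore more noise, and a larger `tau` delays alarms, so tuning the two trades false alarms against detection time.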
Procedia PDF Downloads 181
3872 A Machine Learning Based Method to Detect System Failure in Resource Constrained Environment
Authors: Payel Datta, Abhishek Das, Abhishek Roychoudhury, Dhiman Chattopadhyay, Tanushyam Chattopadhyay
Abstract:
Machine learning (ML) and deep learning (DL) are predominantly used in image/video processing, natural language processing (NLP), and audio and speech recognition, but are not much used in system performance evaluation. In this paper, the authors describe the architecture of an abstraction layer constructed using ML/DL to detect system failure. The proposed system detects system failure by evaluating the performance metrics of an IoT service deployment in a constrained infrastructure environment. The system has been tested on a manually annotated data set containing different system metrics, such as number of threads, throughput, average response time, CPU usage, memory usage, and network input/output, captured in different hardware environments such as edge (Atom-based gateway) and cloud (AWS EC2). The main challenge in developing such a system is that the classification accuracy should be 100%, as errors in the system degrade service performance and consequently affect the reliability and high availability that are mandatory for an IoT system. The proposed ML/DL classifiers work with 100% accuracy on the data set of nearly 4,000 samples captured within the organization.
Keywords: machine learning, system performance, performance metrics, IoT, edge
Procedia PDF Downloads 195
3871 Longevity of Soybean Seeds Submitted to Different Mechanized Harvesting Conditions
Authors: Rute Faria, Digo Moraes, Amanda Santos, Dione Morais, Maria Sartori
Abstract:
Seed vigor is a fundamental component for the good performance of the entire soybean production process. Seeds with mechanical damage at harvest time will be more susceptible to fungal and insect attack during storage, which will invariably reduce their vigor in the field, compromising uniformity and final stand performance. Harvesters, even the most modern ones, can cause irreversible damage to the seeds when not properly regulated or operated, compromising even their commercialization. Therefore, control of an efficient harvest is necessary to guarantee a good-quality final product. In this work, the damage caused by two different harvesters (one rented and one owned), traveling at two speeds (4 and 8 km/h), was evaluated. The design was completely randomized in a 2 x 2 factorial with four replications. To evaluate physiological quality, seed germination and vigor tests were carried out over a period of six months. A multivariate principal component analysis (PCA) and clustering allowed us to verify that the leased machine performed better in terms of immediate damage to the seeds, but after a storage period of 6 months the vigor of these seeds fell more than that of seeds harvested with the owned machine, indicating that the leased machine causes more damage to the seeds.
Keywords: Glycine max (L.), cluster analysis, PCA, vigor
Procedia PDF Downloads 257
3870 A Hybrid Model of Goal, Integer and Constraint Programming for Single Machine Scheduling Problem with Sequence Dependent Setup Times: A Case Study in Aerospace Industry
Authors: Didem Can
Abstract:
Scheduling problems are among the most fundamental issues of production systems. Many different approaches and models have been developed according to the production processes of the parts and the main purpose of the problem. In this study, one of the bottleneck stations of a company serving the aerospace industry is analyzed and treated as a single machine scheduling problem with sequence-dependent setup times. The objective of the problem is to assign a large number of similar parts to the same shift (to reduce chemical waste) while minimizing the number of tardy jobs. The goal programming method will be used to achieve these two objectives simultaneously. The assignment of parts to the shift will be expressed using the integer programming method. Finally, the constraint programming method will be used, as it provides a way to find a result in a short time by avoiding worse feasible solutions with the defined variable set. The model to be established will be tested and evaluated with real data in the application part.
Keywords: constraint programming, goal programming, integer programming, sequence-dependent setup, single machine scheduling
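To make the problem statement concrete, here is a minimal sketch of single machine scheduling with sequence-dependent setup times that minimizes the number of tardy jobs by brute-force enumeration. It is not the goal, integer or constraint programming model of the study; the job data, the setup matrix and the function names are all hypothetical, and enumeration is only practical for a handful of jobs.

```python
from itertools import permutations


def count_tardy(sequence, processing, due, setup):
    """Count jobs that finish after their due date for a given job order."""
    time = 0
    tardy = 0
    previous = None
    for job in sequence:
        if previous is not None:
            # Setup time depends on which job ran immediately before.
            time += setup[previous][job]
        time += processing[job]
        if time > due[job]:
            tardy += 1
        previous = job
    return tardy


def best_sequence(processing, due, setup):
    """Enumerate every job order and return one with the fewest tardy jobs."""
    jobs = range(len(processing))
    return min(
        permutations(jobs),
        key=lambda seq: count_tardy(seq, processing, due, setup),
    )


# Hypothetical three-job instance.
processing = [3, 2, 4]
due = [4, 12, 9]
setup = [
    [0, 1, 2],
    [1, 0, 1],
    [2, 1, 0],
]

order = best_sequence(processing, due, setup)
print(order, count_tardy(order, processing, due, setup))
```

The number of orders grows factorially with the number of jobs, which is why exact or constraint-based solvers are used for realistic instances.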
Procedia PDF Downloads 237
3869 Vibration-Based Data-Driven Model for Road Health Monitoring
Authors: Guru Prakash, Revanth Dugalam
Abstract:
A road’s condition often deteriorates due to harsh loading, such as overloading by trucks, and severe environmental conditions such as heavy rain, snow load, and cyclic loading. In the absence of proper maintenance planning, this results in potholes, wide cracks, bumps, and increased road roughness. In this paper, a data-driven model will be developed to detect this damage using vibration and image signals. The key idea of the proposed methodology is that road anomalies manifest in these signals and can be detected by training a machine learning algorithm. The use of various machine learning techniques, such as the support vector machine and the Random Forest method, will be investigated. The proposed model will first be trained and tested with artificially simulated data, and the model architecture will be finalized by comparing the accuracies of various models. Once a model is fixed, a field study will be performed and data will be collected. The field data will be used to validate the proposed model and to predict the road’s future health condition. The proposed model will help to automate the road condition monitoring process, repair cost estimation, and the maintenance planning process.
Keywords: SVM, data-driven, road health monitoring, pot-hole
Procedia PDF Downloads 86
3868 An Enhanced Support Vector Machine Based Approach for Sentiment Classification of Arabic Tweets of Different Dialects
Authors: Gehad S. Kaseb, Mona F. Ahmed
Abstract:
Arabic Sentiment Analysis (SA) is one of the most common research fields with many open areas. Few studies apply SA to Arabic dialects. This paper proposes different pre-processing steps and a modified methodology to improve the accuracy using normal Support Vector Machine (SVM) classification. The paper works on two datasets, Arabic Sentiment Tweets Dataset (ASTD) and Extended Arabic Tweets Sentiment Dataset (Extended-AATSD), which are publicly available for academic use. The results show that the classification accuracy approaches 86%.
Keywords: Arabic, classification, sentiment analysis, tweets
Procedia PDF Downloads 149
3867 Optimal Placement and Sizing of Distributed Generation in Microgrid for Power Loss Reduction and Voltage Profile Improvement
Authors: Ferinar Moaidi, Mahdi Moaidi
Abstract:
Environmental issues and the ever-increasing demand for electrical energy make it necessary to have distributed generation (DG) resources in the power system. In this research, the allocation and sizing of DGs are used to reduce losses and improve the voltage profile in a microgrid. A Genetic Algorithm (GA), chosen from the array of artificial intelligence methods, is proposed to solve the problem. The algorithm is implemented on the IEEE 33-bus network. The study is presented in two scenarios. First, the location and size of DGs are determined to illustrate their effect on reducing losses and improving the voltage profile. Second, because decisions made under a single load level assumption do not hold for all load levels, load modelling is performed and the results are presented for a multi-level load state.
Keywords: distributed generation, genetic algorithm, microgrid, load modelling, loss reduction, voltage improvement
Procedia PDF Downloads 143
3866 Peril´s Environment of Energetic Infrastructure Complex System, Modelling by the Crisis Situation Algorithms
Authors: Jiří F. Urbánek, Alena Oulehlová, Hana Malachová, Jiří J. Urbánek Jr.
Abstract:
The investigation and modelling of crisis situations are introduced and carried out within the complex system of energetic critical infrastructure operating in peril environments. Every crisis situation and peril originates in the occurrence of an emergency/crisis event and requires an assessment of critical/crisis interfaces. An emergency event may be expected, in which case crisis scenarios can be prepared in advance by the pertinent organizational crisis management authorities to cope with it; or it may be unexpected, with no pre-prepared scenario for the event. Both cases need operational coping by means of crisis management as well. The operation, forms, characteristics, behaviour and utilization of crisis management have various qualities, depending on the real perils of the critical infrastructure organization and on prevention and training processes. The aim is always better security and continuity of the organization, and achieving it requires finding and investigating critical/crisis zones and functions in critical infrastructure organization models operating in the pertinent peril environment. Our DYVELOP (Dynamic Vector Logistics of Processes) method is available for this purpose. Here, it is necessary to derive and create an identification algorithm for critical/crisis interfaces. The locations of critical/crisis interfaces are the flags of a crisis situation in critical infrastructure organization models. The model of a crisis situation will then be displayed for a real organization of the Czech energetic crisis infrastructure in a real peril environment. These efficient measures are necessary for infrastructure protection. They will be derived for peril mitigation, crisis situation coping, and the environmentally friendly survival, continuity and sustainable development of the organization.
Keywords: algorithms, energetic infrastructure complex system, modelling, peril´s environment
Procedia PDF Downloads 402
3865 Evaluation of Machine Learning Algorithms and Ensemble Methods for Prediction of Students’ Graduation
Authors: Soha A. Bahanshal, Vaibhav Verdhan, Bayong Kim
Abstract:
Graduation rates at six-year colleges are becoming a more essential indicator for incoming students and for university rankings. Predicting student graduation is extremely beneficial to schools and has huge potential for targeted intervention. It is important for educational institutions since it enables the development of strategic plans that will assist or improve students' performance in achieving their degrees on time (graduate on time, GOT). Machine learning techniques offer a first step and a helping hand in extracting useful information from these data and gaining insights into the prediction of students' progress and performance. Data analysis and visualization techniques are applied to understand and interpret the data. The data used for the analysis contains science-major students who graduated within 6 years in the academic year 2017-2018. This analysis can be used to predict the graduation of students in the next academic year. Different predictive models, such as logistic regression, decision trees, support vector machines, Random Forest, Naïve Bayes, and KNeighborsClassifier, are applied to predict whether a student will graduate. These classifiers were evaluated with 5-fold cross-validation and compared on accuracy. The results indicated that the Ensemble Classifier achieves the best accuracy, about 91.12%. This GOT prediction model would hopefully be useful to university administration and academics in developing measures for assisting and boosting students' academic performance and ensuring they graduate on time.
Keywords: prediction, decision trees, machine learning, support vector machine, ensemble model, student graduation, GOT graduate on time
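The 5-fold evaluation mentioned above can be sketched in plain Python as follows. This is an illustration of k-fold cross-validation in general, not the study's pipeline: the folds are contiguous and unshuffled, the majority-class "classifier" is a stand-in for the real models, and the labels are hypothetical.

```python
from collections import Counter


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_val_accuracy(fit, predict, X, y, k=5):
    """Average the per-fold accuracy of a model given fit/predict callables."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k


# Stand-in classifier: always predicts the most common training label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, x):
    return model


# Hypothetical data: 1 = graduated on time, 0 = did not.
X = list(range(10))
y = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]

mean_accuracy = cross_val_accuracy(fit_majority, predict_majority, X, y, k=5)
print(mean_accuracy)
```

In practice the samples would be shuffled or stratified before splitting, so that each fold contains a representative share of both classes.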
Procedia PDF Downloads 72
3864 Storms Dynamics in the Black Sea in the Context of the Climate Changes
Authors: Eugen Rusu
Abstract:
The objective of the proposed work is to analyze the wave conditions in the Black Sea basin. The analysis focuses especially on the spatial and temporal occurrences and on the dynamics of the most extreme storms in the context of climate change. A numerical modelling system, based on the spectral phase-averaged wave model SWAN, has been implemented and validated against both in situ measurements and remotely sensed data all along the sea. Moreover, a successive correction method for the assimilation of the satellite data has been associated with the wave modelling system. This is based on the optimal interpolation of the satellite data. Previous studies show that data assimilation considerably improves the reliability of the results provided by the modelling system. This especially concerns the cases most sensitive in terms of the accuracy of the wave predictions, such as extreme storm situations. The results provided by the wave modelling system described above are in general in line with those provided by similar wave prediction systems implemented in enclosed or semi-enclosed sea basins. Simulations with this wave modelling system with data assimilation have been performed for the 30-year period 1987-2016. Using this database, the next step was to analyze the intensity and the dynamics of the stronger storms encountered in this period. According to the data resulting from the model simulations, the western side of the sea is considerably more energetic than the rest of the basin. In this western region, regular strong storms usually produce significant wave heights greater than 8 m. This may lead to maximum wave heights even greater than 15 m. Such regular strong storms may occur several times in one year, usually in wintertime or in late autumn, and their frequency has become higher in the last decade.
As regards the most extreme storms, significant wave heights greater than 10 m and maximum wave heights close to 20 m (and even greater) may occur. Such extreme storms, which in the past were noticed only once in four or five years, have more recently been encountered almost every year in the Black Sea, and this seems to be a consequence of climate change. The analysis also included the dynamics of the monthly and annual significant wave height maxima as well as the identification of the most probable spatial and temporal occurrences of the extreme storm events. Finally, it can be concluded that the present work provides valuable information on the characteristics of the storm conditions and on their dynamics in the Black Sea. This environment is currently subject to high navigation traffic and intense offshore and nearshore activities, and the strong storms that systematically occur may produce accidents with very serious consequences.
Keywords: Black Sea, extreme storms, SWAN simulations, waves
Procedia PDF Downloads 248
3863 Automatic Teller Machine System Security by Using Mobile SMS Code
Authors: Husnain Mushtaq, Mary Anjum, Muhammad Aleem
Abstract:
The main objective of this paper is to develop a high level of security for Automatic Teller Machines (ATMs). In this system, bankers collect mobile numbers from customers and then send a code to each customer's mobile number. In most countries, existing ATMs use a magnetic card reader. The customer is identified by inserting an ATM card with a magnetic strip that holds unique information such as the card number and some security limitations. By entering a personal identification number (PIN), the customer is first authenticated and then accesses the bank account to make a cash withdrawal or use other services provided by the bank. Card fraud is a further problem: once the user’s bank card is lost and the password is stolen, or a criminal simply steals a customer’s card and PIN, the criminal can withdraw all the cash in a very short time, causing great financial loss to the customer. This type of fraud has increased worldwide. To resolve this problem, we provide a solution that uses a “Mobile SMS code” together with the ATM “PIN code” to improve the verification and security of customers using the ATM system and to raise confidence in the banking sector.
Keywords: PIN, inquiry, biometric, magnetic strip, iris recognition, face recognition
Procedia PDF Downloads 365
3862 Optimizing the Scanning Time with Radiation Prediction Using a Machine Learning Technique
Authors: Saeed Eskandari, Seyed Rasoul Mehdikhani
Abstract:
Radiation sources are used in many industries, for example gamma sources in medical imaging. This radiation has destructive effects on humans and the environment. Detecting and locating its sources is very important because they cannot be seen by the eye. A portable robot has been designed and built to reveal radiation sources. It can scan a place from 5 to 20 meters away and shows the location of the sources, according to the intensity of the radiation, on a two-dimensional digital image. The robot operates by measuring the pixels separately. Increasing the measurement resolution of the image gives a more accurate scan of the environment, and more points are detected, but it also makes scanning take much longer. In this paper, to overcome this challenge, we designed a method that can optimize this time. In this method, a small number of important points of the environment are measured. The remaining pixels are then predicted and estimated by machine learning regression algorithms. The research method is based on comparing the actual values of all pixels. These steps have been repeated with several other radiation sources. The results of the study show that the values estimated by the regression method are very close to the real values.
Keywords: regression, machine learning, scan radiation, robot
Procedia PDF Downloads 79
3861 Early Prediction of Disposable Addresses in Ethereum Blockchain
Authors: Ahmad Saleem
Abstract:
Ethereum is the second-largest cryptocurrency in the blockchain ecosystem. Along with standard transactions, it supports smart contracts and NFTs. Current research trends focus on analyzing the overall structure of the network and its growth and behavior. Ethereum addresses are anonymous and can be created on the fly. The nature of the Ethereum network and its addresses makes their behavior hard to predict. The activity period of an Ethereum address has not been analyzed much. Using machine learning, we can make early predictions about the disposability of an address. In this paper, we analyzed the lifetime of addresses. We also identified and predicted disposable addresses using machine learning models and compared the results.
Keywords: blockchain, Ethereum, cryptocurrency, prediction
Procedia PDF Downloads 97
3860 LTE Modelling of a DC Arc Ignition on Cold Electrodes
Authors: O. Ojeda Mena, Y. Cressault, P. Teulet, J. P. Gonnet, D. F. N. Santos, MD. Cunha, M. S. Benilov
Abstract:
The assumption of plasma in local thermal equilibrium (LTE) is commonly used to perform electric arc simulations for industrial applications. This assumption allows the arc to be modelled using a set of magnetohydrodynamic equations that can be solved with a computational fluid dynamics code. However, the LTE description is only valid in the arc column, whereas in the regions close to the electrodes the plasma deviates from the LTE state. These near-electrode regions are important since they define the energy and current transfer between the arc and the electrodes. Therefore, any accurate modelling of the arc must include a good description of the arc-electrode phenomena. Due to the modelling complexity and computational cost of solving the near-electrode layers, a simplified description of the arc-electrode interaction was developed in a previous work to study a steady high-pressure arc discharge, where the near-electrode regions are introduced at the interface between arc and electrode as boundary conditions. The present work proposes a similar approach to simulate the arc ignition in a free-burning arc configuration following an LTE description of the plasma. To obtain the transient evolution of the arc characteristics, appropriate boundary conditions for both the near-cathode and the near-anode regions are used, based on recent publications. The arc-cathode interaction is modelled using a non-linear surface heating approach that considers secondary electron emission. The interaction between the arc and the anode, on the other hand, is taken into account by means of the heating voltage approach. From the numerical modelling, three main stages can be identified during the arc ignition. Initially, a glow discharge is observed, where the cold non-thermionic cathode is uniformly heated at its surface and the near-cathode voltage drop is of the order of a few hundred volts.
Next, a high-temperature spot forms at the cathode tip, followed by a sudden decrease of the near-cathode voltage drop, marking the glow-to-arc discharge transition. During this stage, the LTE plasma also shows a significant temperature increase in the region adjacent to the hot spot. Finally, the near-cathode voltage drop stabilizes at a few volts and both the electrode and plasma temperatures reach the steady solution. The results after some seconds are similar to those presented for thermionic cathodes.
Keywords: arc-electrode interaction, thermal plasmas, electric arc simulation, cold electrodes
Procedia PDF Downloads 122
3859 Building Information Modelling Based Value for Money Assessment in Public-Private Partnership
Authors: Guoqian Ren, Haijiang Li, Jisong Zhang
Abstract:
Over the past 40 years, urban development has undergone large-scale, high-speed expansion, beyond what was previously considered normal and in a manner not proportionally related to population growth or physical considerations. With more scientific and refined decision-making in the urban construction process, new urbanization approaches, aligned with public-private partnerships (PPPs) which evolved in the early 1990s, have become acceptable and, in some situations, even better solutions to outstanding urban municipal construction projects, especially in developing countries. However, as the main driving force to deal with urban public services, PPPs are still problematic regarding value for money (VFM) process in most large-scale construction projects. This paper therefore reviews recent PPP articles in popular project management journals and relevant toolkits, published in the last 10 years, to identify the indicators that influence VFM within PPPs across regions. With increasing concerns about profitability and environmental and social impacts, the current PPP structure requires a more integrated platform to manage multi-performance project life cycles. Building information modelling (BIM), a popular approach to the procurement process in AEC sectors, provides the potential to ensure VFM while also working in tandem with the semantic approach to holistically measure life cycle costs (LCC) and achieve better sustainability. This paper suggests that BIM applied to the entire PPP life cycle could support holistic decision-making regarding VFM processes and thus meet service targets.
Keywords: public-private partnership, value for money, building information modelling, semantic approach
Procedia PDF Downloads 209
3858 IoT and Advanced Analytics Integration in Biogas Modelling
Authors: Rakesh Choudhary, Ajay Kumar, Deepak Sharma
Abstract:
The main goal of this paper is to investigate the challenges and benefits of IoT integration in biogas production. This overview explains how the inclusion of IoT can enhance biogas production efficiency. The data collected in this way can be explored by advanced analytics, including Artificial Intelligence (AI) and Machine Learning (ML) algorithms, consequently improving bio-energy processes. To boost biogas generation efficiency, this report examines the use of IoT devices for real-time data collection on key parameters, e.g., pH, temperature, gas composition, and microbial growth. Real-time monitoring through big data has made it possible to detect diverse, complex trends in the process of producing biogas. Insights from advanced analytics can also help to improve bio-energy production and optimize operational conditions. Moreover, IoT allows remote observation, control and management, which decreases the manual intervention needed while increasing process effectiveness. Such a paradigm shift in the incorporation of IoT technologies into biogas production systems helps to achieve higher productivity levels as well as better-quality biomethane through proactive decision-making based on real-time monitoring, thus driving continuous performance improvement.
Keywords: internet of things, biogas, renewable energy, sustainability, anaerobic digestion, real-time monitoring, optimization
Procedia PDF Downloads 20
3857 The Role of Synthetic Data in Aerial Object Detection
Authors: Ava Dodd, Jonathan Adams
Abstract:
The purpose of this study is to explore the characteristics of developing a machine learning application using synthetic data. The study is structured to develop the application for the purpose of deploying the computer vision model. The findings discuss the realities of attempting to develop a computer vision model for a practical purpose, and detail the processes, tools, and techniques that were used to meet accuracy requirements. The research reveals that synthetic data represents another variable that can be adjusted to improve the performance of a computer vision model. Further, a suite of tools and tuning recommendations is provided.
Keywords: computer vision, machine learning, synthetic data, YOLOv4
Procedia PDF Downloads 225
3856 Modelling of Atomic Force Microscopic Nano Robot's Friction Force on Rough Surfaces
Authors: M. Kharazmi, M. Zakeri, M. Packirisamy, J. Faraji
Abstract:
Micro/nanorobotics, or the manipulation of nanoparticles by Atomic Force Microscopy (AFM), is one of the most important solutions for controlling the movement of atoms, particles and micro/nanometric components and assembling them to design micro/nanometre tools. Accurate modelling of manipulation requires identification of forces and mechanical knowledge at the nanoscale, which differ from those of the macro world. Due to the importance of adhesion forces and the interaction of surfaces at the nanoscale, several friction models have been presented. In this research, the friction and normal forces applied to the AFM are obtained using the dynamic bending-torsion model of the AFM, based on the Hurtado-Kim (HK) friction model, the Johnson-Kendall-Roberts (JKR) contact model and the Greenwood-Williamson (GW) roughness model. Finally, the effect of the standard deviation of asperity heights on the normal load, friction force and friction coefficient is studied.
Keywords: atomic force microscopy, contact model, friction coefficient, Greenwood-Williamson model
Procedia PDF Downloads 199
3855 Talent-to-Vec: Using Network Graphs to Validate Models with Data Sparsity
Authors: Shaan Khosla, Jon Krohn
Abstract:
In a recruiting context, machine learning models are valuable for recommendations: to predict the best candidates for a vacancy, to match the best vacancies for a candidate, and to compile a set of similar candidates for any given candidate. While these models are useful to create, validating their accuracy in a recommendation context is difficult due to a sparsity of data. In this report, we use network graph data to generate useful representations for candidates and vacancies. We use candidates and vacancies as network nodes and designate a bi-directional link between them based on the candidate interviewing for the vacancy. After applying node2vec, the embeddings are used to construct a validation dataset with a ranked order, which will help validate new recommender systems.
Keywords: AI, machine learning, NLP, recruiting
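The graph construction described above can be sketched as follows. This is a simplified illustration, not the authors' code: it builds the bi-directional candidate-vacancy graph and samples uniform random walks, which is what node2vec reduces to when its return and in-out parameters are both 1. The skip-gram step that turns walks into embeddings is omitted, and the node names and interview pairs are hypothetical.

```python
import random
from collections import defaultdict


def build_graph(interviews):
    """Build an undirected graph from (candidate, vacancy) interview pairs."""
    graph = defaultdict(set)
    for candidate, vacancy in interviews:
        graph[candidate].add(vacancy)
        graph[vacancy].add(candidate)
    return graph


def random_walks(graph, walk_length=5, walks_per_node=2, seed=0):
    """Sample uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                # Sort neighbours so results are reproducible for a given seed.
                neighbours = sorted(graph[walk[-1]])
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Hypothetical interview records.
interviews = [
    ("cand:ana", "vac:data-engineer"),
    ("cand:ana", "vac:ml-engineer"),
    ("cand:ben", "vac:ml-engineer"),
    ("cand:cho", "vac:data-engineer"),
]

graph = build_graph(interviews)
walks = random_walks(graph)
print(len(walks), walks[0])
```

Because the graph is bipartite, every walk alternates between candidates and vacancies, so two candidates end up in similar contexts when they interviewed for overlapping vacancies.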
Procedia PDF Downloads 84
3854 The Convergence of IoT and Machine Learning: A Survey of Real-time Stress Detection System
Authors: Shreyas Gambhirrao, Aditya Vichare, Aniket Tembhurne, Shahuraj Bhosale
Abstract:
In today's rapidly evolving environment, stress has emerged as a significant health concern across different age groups. Uncontrolled stress, whether it comes from job responsibilities, health issues, or the never-ending news cycle, can have a negative effect on our well-being. The problem is further aggravated by the constant connection to technology. In this high-tech age, identifying and controlling stress is vital. To address this health issue, the study focuses on three key metrics for stress detection: body temperature, heart rate, and galvanic skin response (GSR). These parameters, along with a Support Vector Machine classifier, allow the system to categorize stress into three groups: 1) Stressed, 2) Not stressed, and 3) Moderate stress. In the proposed model, a NodeMCU combined with specific sensors collects data in real time and rapidly categorizes individuals based on their stress levels. Real-time stress detection is made possible by this creative combination of hardware and software.
Keywords: real time stress detection, NodeMCU, sensors, heart-rate, body temperature, galvanic skin response (GSR), support vector machine
Procedia PDF Downloads 72
3853 Indian Premier League (IPL) Score Prediction: Comparative Analysis of Machine Learning Models
Authors: Rohini Hariharan, Yazhini R, Bhamidipati Naga Shrikarti
Abstract:
In cricket, and particularly in the Indian Premier League (IPL), the ability to predict team scores accurately matters to cricket enthusiasts and stakeholders alike. This paper presents a study on IPL score prediction using several machine learning algorithms: Support Vector Machines (SVM), XGBoost, Multiple Regression, Linear Regression, K-nearest neighbors (KNN), and Random Forest. Through data preprocessing, feature engineering, and model selection, we aimed to develop a robust predictive framework capable of forecasting team scores with high precision. Our experimentation involved the analysis of historical IPL match data encompassing diverse match and player statistics. Using this data, we trained each model and evaluated its performance. Multiple Regression emerged as the top-performing algorithm, achieving an accuracy of 77.19% and a precision of 54.05% (within a threshold of +/- 10 runs). This research contributes to sports analytics by demonstrating the efficacy of machine learning in predicting IPL team scores. The findings underscore the potential of predictive modeling techniques to provide valuable insights for cricket enthusiasts, team management, and betting agencies. Additionally, this study serves as a benchmark for future research aimed at enhancing the accuracy and interpretability of IPL score prediction models.
Keywords: indian premier league (IPL), cricket, score prediction, machine learning, support vector machines (SVM), xgboost, multiple regression, linear regression, k-nearest neighbors (KNN), random forest, sports analytics
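The abstract reports its metrics "within a threshold of +/- 10 runs" without defining how they were computed. One plausible reading, shown here purely as an assumption, is that a predicted score counts as correct when it falls within 10 runs of the actual score. The function name and the scores below are made up for illustration and are not data from the study.

```python
def within_tolerance_accuracy(actual, predicted, tolerance=10):
    """Fraction of predictions within `tolerance` runs of the actual score."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(abs(a - p) <= tolerance for a, p in zip(actual, predicted))
    return hits / len(actual)

# Made-up innings totals, not data from the study.
actual_scores = [165, 142, 188, 201]
predicted_scores = [158, 155, 190, 185]
accuracy = within_tolerance_accuracy(actual_scores, predicted_scores)
```

A tolerance-based metric of this kind turns a regression output into a hit-or-miss outcome, which is one way a regression model could be reported with an "accuracy" figure.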
Procedia PDF Downloads 53
3852 Development of an Automatic Calibration Framework for Hydrologic Modelling Using Approximate Bayesian Computation
Authors: A. Chowdhury, P. Egodawatta, J. M. McGree, A. Goonetilleke
Abstract:
Hydrologic models are increasingly used as tools to predict stormwater quantity and quality from urban catchments. However, due to a range of practical issues, most models produce gross errors in simulating complex hydraulic and hydrologic systems. Difficulty in finding a robust approach for model calibration is one of the main issues. Though automatic calibration techniques are available, they are rarely used in common commercial hydraulic and hydrologic modelling software, e.g. MIKE URBAN. This is partly due to the need for a large number of parameters and large datasets in the calibration process. To overcome this practical issue, a framework for automatic calibration of a hydrologic model was developed on the R platform and is presented in this paper. The model was developed based on the time-area conceptualization. Four calibration parameters, namely initial loss, reduction factor, time of concentration, and time-lag, were considered as the primary set of parameters. Using these parameters, automatic calibration was performed using Approximate Bayesian Computation (ABC). ABC is a simulation-based technique for performing Bayesian inference when the likelihood is intractable or computationally expensive to compute. To test its performance and usefulness, the technique was used to simulate three small catchments in the Gold Coast. For comparison, simulation outcomes for the same three catchments from the commercial modelling software MIKE URBAN were used. The graphical comparison shows strong agreement: the MIKE URBAN results fall within the upper and lower 95% credible intervals of the posterior predictions obtained via ABC. Statistical validation of the posterior predictions of runoff using the coefficient of determination (CD), root mean square error (RMSE), and maximum error (ME) was found to be reasonable for the three study catchments.
The main benefit of using ABC over MIKE URBAN is that ABC provides a posterior distribution for runoff flow prediction, and therefore the associated uncertainty in predictions can be obtained. In contrast, MIKE URBAN provides only a point estimate. Based on the results of the analysis, it appears that the developed ABC framework performs well for automatic calibration.
Keywords: automatic calibration framework, approximate bayesian computation, hydrologic and hydraulic modelling, MIKE URBAN software, R platform
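The ABC idea summarized in this abstract can be illustrated with its simplest variant, rejection sampling: draw parameters from a prior, simulate, and keep the draws whose simulated output is close to the observation. The sketch below is a toy and not the authors' R framework: the runoff model (rainfall minus an initial loss, scaled by a reduction factor), the uniform priors, the RMSE distance, and the tolerance are all assumptions chosen for illustration.

```python
import random

def simulate_runoff(rainfall, initial_loss, reduction_factor):
    """Toy model: excess rainfall after an initial loss, scaled by a factor."""
    return [reduction_factor * max(r - initial_loss, 0.0) for r in rainfall]

def rmse(a, b):
    """Root mean square error between two equal-length series."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

def abc_rejection(observed, rainfall, n_draws, tolerance, rng):
    """Keep prior draws whose simulated runoff lies within `tolerance` (RMSE)."""
    accepted = []
    for _ in range(n_draws):
        initial_loss = rng.uniform(0.0, 5.0)       # assumed prior, mm
        reduction_factor = rng.uniform(0.0, 1.0)   # assumed prior
        simulated = simulate_runoff(rainfall, initial_loss, reduction_factor)
        if rmse(simulated, observed) <= tolerance:
            accepted.append((initial_loss, reduction_factor))
    return accepted

rng = random.Random(42)
rainfall = [0.0, 4.0, 12.0, 20.0, 8.0, 2.0]      # made-up rainfall depths, mm
observed = simulate_runoff(rainfall, 2.0, 0.6)   # synthetic "observation"
posterior = abc_rejection(observed, rainfall, 20000, 0.5, rng)
```

The accepted pairs approximate a posterior sample, so their spread gives the kind of prediction uncertainty that a single point estimate cannot.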
Procedia PDF Downloads 309
3851 MSIpred: A Python 2 Package for the Classification of Tumor Microsatellite Instability from Tumor Mutation Annotation Data Using a Support Vector Machine
Authors: Chen Wang, Chun Liang
Abstract:
Microsatellite instability (MSI) is characterized by a high degree of polymorphism in microsatellite (MS) length due to a deficiency in the mismatch repair (MMR) system. MSI is associated with several tumor types, and its status can be considered an important indicator for tumor prognosis. Conventional clinical diagnosis of MSI examines PCR products of a panel of MS markers using electrophoresis (MSI-PCR), which is laborious, time consuming, and less reliable. This study releases MSIpred, a Python 2 package for automatic classification of MSI. It computes important somatic mutation features from files in mutation annotation format (MAF) generated from paired tumor-normal exome sequencing data, and then uses these features to predict tumor MSI status with a support vector machine (SVM) classifier trained on MAF files of 1074 tumors belonging to four types. Evaluation of MSIpred on an independent 358-tumor test set achieved an overall accuracy of over 98% and an area under the receiver operating characteristic (ROC) curve of 0.967. These results indicate that MSIpred is a robust pan-cancer MSI classification tool and can serve as a complementary diagnostic to MSI-PCR in MSI diagnosis.
Keywords: microsatellite instability, pan-cancer classification, somatic mutation, support vector machine
Procedia PDF Downloads 173
3850 Crude Oil Electrostatic Mathematical Modelling on an Existing Industrial Plant
Authors: Fatemeh Yazdanmehr, Iulian Nistor
Abstract:
The scope of the current study is the prediction of water separation in a two-stage industrial crude oil desalting plant. The study focused on developing the desalting operation in an existing production unit of an Iranian heavy oil field with 75 MBPD capacity. Because of some operational issues, such as oil dehydration at high temperatures, optimization of the desalter operational parameters was essential. The desalting process is modeled mathematically using the population balance method. The existing operational data are used for tuning the model and validating its accuracy. The oil temperature at the desalter inlet was decreased from 110°C to 80°C, and the desalter electrical field was increased from 0.75 kV to 2.5 kV. The proposed desalter conditions also meet the water-oil specification. Under these conditions, oil recovery increases by 574 BBL/D, and gas flaring decreases by 2.8 MMSCF/D. Depending on the oil price, the additional oil production can increase annual income by about $15 MM and reduce the greenhouse gas production caused by gas flaring.
Keywords: desalter, demulsification, modelling, water-oil separation, crude oil emulsion
Procedia PDF Downloads 76
3849 Customer Data Analysis Model Using Business Intelligence Tools in Telecommunication Companies
Authors: Monica Lia
Abstract:
This article presents a customer data analysis model that uses business intelligence tools for data modelling, data transformation, data visualization, and dynamic report building. The analysis of an economic organization's customers is based on information from the organization's transactional systems. The paper shows how to develop the data model starting from the data that companies hold in their own operational systems. This data can be transformed into useful information about customers using a business intelligence tool. In a mature market, knowing the information inside the data and making forecasts for strategic decisions become more important. Business intelligence tools are used in business organizations as support for decision-making.
Keywords: customer analysis, business intelligence, data warehouse, data mining, decisions, self-service reports, interactive visual analysis, dynamic dashboards, use case diagram, process modelling, logical data model, data mart, ETL, star schema, OLAP, data universes
Procedia PDF Downloads 430
3848 Text Analysis to Support Structuring and Modelling a Public Policy Problem-Outline of an Algorithm to Extract Inferences from Textual Data
Authors: Claudia Ehrentraut, Osama Ibrahim, Hercules Dalianis
Abstract:
Policy-making situations are real-world problems that exhibit complexity in that they are composed of many interrelated problems and issues. To be effective, policies must holistically address the complexity of the situation rather than propose solutions to single problems. Formulating and understanding the situation and its complex dynamics is therefore key to finding holistic solutions. Analysis of text-based information on the policy problem, using Natural Language Processing (NLP) and text analysis techniques, can support modelling of public policy problem situations in a more objective way, based on domain experts' knowledge and scientific evidence. The objective of this study is to support modelling of public policy problem situations using text analysis of verbal descriptions of the problem. We propose a formal methodology for analysing qualitative data from multiple information sources on a policy problem to construct a causal diagram of the problem. The analysis process aims at identifying key variables, linking them by cause-effect relationships, and mapping that structure into a graphical representation that is adequate for designing action alternatives, i.e., policy options. This study outlines an algorithm that automates the initial step of a larger methodological approach, a step that has so far been done manually. In this initial step, inferences about key variables and their interrelationships are extracted from textual data to support better problem structuring. A small prototype for this step is also presented.
Keywords: public policy, problem structuring, qualitative analysis, natural language processing, algorithm, inference extraction
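The abstract does not describe the extraction algorithm itself. A simple baseline for pulling cause-effect statements out of text is to match causal cue phrases with a regular expression, and the sketch below illustrates that baseline only, not the authors' method. The cue list, the function name, and the example sentences are assumptions.

```python
import re

# Assumed cue phrases for sentences of the form "<cause> CUE <effect>".
CAUSAL_CUES = ["leads to", "results in", "causes", "increases", "reduces"]
PATTERN = re.compile(
    r"^(?P<cause>.+?)\s+(?P<cue>"
    + "|".join(re.escape(cue) for cue in CAUSAL_CUES)
    + r")\s+(?P<effect>.+?)\.?$",
    re.IGNORECASE,
)

def extract_causal_links(sentences):
    """Return (cause, cue, effect) triples for sentences containing a cue."""
    links = []
    for sentence in sentences:
        match = PATTERN.match(sentence.strip())
        if match:
            links.append((match["cause"], match["cue"].lower(), match["effect"]))
    return links

# Made-up sentences for illustration.
sentences = [
    "Traffic congestion leads to higher emissions.",
    "Public transport investment reduces congestion.",
    "The committee met on Tuesday.",
]
links = extract_causal_links(sentences)
```

Each extracted triple can serve as a directed edge in a causal diagram, with the cue phrase hinting at the sign of the relationship. Real policy text would need proper parsing to handle negation, passive voice, and multi-clause sentences.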
Procedia PDF Downloads 589
3847 Predictive Maintenance: Machine Condition Real-Time Monitoring and Failure Prediction
Authors: Yan Zhang
Abstract:
Predictive maintenance is a technique to predict when an in-service machine will fail so that maintenance can be planned in advance. Analytics-driven predictive maintenance is gaining increasing attention in many industries, such as manufacturing, utilities, and aerospace, along with the emerging demand for Internet of Things (IoT) applications and the maturity of technologies that support Big Data storage and processing. This study aims to build an end-to-end analytics solution that includes both real-time machine condition monitoring and machine learning based predictive analytics capabilities. The goal is to showcase a general predictive maintenance solution architecture, which suggests how the data generated from field machines can be collected, transmitted, stored, and analyzed. We use a publicly available aircraft engine run-to-failure dataset to illustrate the streaming analytics component and the batch failure prediction component. We outline the contributions of this study from four aspects. First, we compare predictive maintenance problems from the view of the traditional reliability centered maintenance field and from the view of IoT applications. In evolving to the IoT era, predictive maintenance has shifted its focus from ensuring reliable machine operations to improving production/maintenance efficiency via any maintenance related tasks. It covers a variety of topics, including but not limited to failure prediction, fault forecasting, failure detection and diagnosis, and recommendation of maintenance actions after failure. Second, we review the state-of-the-art technologies that enable a machine/device to transmit data all the way to the Cloud for storage and advanced analytics. These technologies vary drastically, mainly based on the power source and functionality of the devices.
For example, a consumer machine such as an elevator uses completely different data transmission protocols compared to the sensor units in an environmental sensor network. The former may transfer data into the Cloud via WiFi directly. The latter usually uses radio communication inherent to the network, and the data is stored in a staging data node before it can be transmitted into the Cloud when necessary. Third, we illustrate how to formulate a machine learning problem to predict machine faults/failures. By showing a step-by-step process of data labeling, feature engineering, model construction, and evaluation, we share the following experiences: (1) what the specific data quality issues are that have a crucial impact on predictive maintenance use cases; (2) how to train and evaluate a model when the training data contains inter-dependent records. Fourth, we review the tools available to build a data pipeline that digests the data and produces insights. We show the tools we use, including those for data ingestion, streaming data processing, and machine learning model training, and the tool that coordinates/schedules different jobs. In addition, we show the visualization tool that creates rich data visualizations for both real-time insights and prediction results. To conclude, there are two key takeaways from this study. (1) It summarizes the landscape and challenges of predictive maintenance applications. (2) It takes an example in aerospace with publicly available data to illustrate each component in the proposed data pipeline and showcases how the solution can be deployed as a live demo.
Keywords: Internet of Things, machine learning, predictive maintenance, streaming data
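Two points in this abstract lend themselves to a small sketch: labelling run-to-failure records and evaluating a model when records are inter-dependent. A common approach, assumed here because the abstract does not give the authors' procedure, is to label a cycle as positive when the unit fails within a fixed horizon, and to split training and test sets by unit so that cycles from the same engine never appear on both sides. The record layout, horizon, split fraction, and function names are hypothetical.

```python
import random

def label_records(records, horizon):
    """Add label 1 when the unit fails within `horizon` cycles, else 0.

    `records` is a list of (unit_id, cycle) pairs from run-to-failure data,
    so the last observed cycle of each unit is taken as its failure cycle.
    """
    last_cycle = {}
    for unit, cycle in records:
        last_cycle[unit] = max(cycle, last_cycle.get(unit, cycle))
    return [(unit, cycle, int(last_cycle[unit] - cycle <= horizon))
            for unit, cycle in records]

def split_by_unit(labelled, test_fraction, rng):
    """Hold out whole units so no engine contributes to both sets."""
    units = sorted({unit for unit, _, _ in labelled})
    rng.shuffle(units)
    n_test = max(1, round(len(units) * test_fraction))
    test_units = set(units[:n_test])
    train = [row for row in labelled if row[0] not in test_units]
    test = [row for row in labelled if row[0] in test_units]
    return train, test

# Hypothetical data: 4 engines, engine u runs cycles 1..(5 + u) before failing.
records = [(unit, cycle) for unit in range(1, 5) for cycle in range(1, 6 + unit)]
labelled = label_records(records, horizon=2)
train, test = split_by_unit(labelled, test_fraction=0.25, rng=random.Random(1))
```

Splitting by row instead of by unit would place neighbouring cycles of the same engine in both sets, which tends to inflate evaluation scores because those records are strongly correlated.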
Procedia PDF Downloads 386
3846 The Effect on Rolling Mill of Waviness in Hot Rolled Steel
Authors: Sunthorn Sittisakuljaroen
Abstract:
Edge waviness is a common defect in hot rolled steel. Variables that affect this defect include the raw material and the machine, and both need to be considered. This research studied the edge waviness defect in the manufacture of SS 400 metal sheet. The defective metal sheets were divided into two groups. The specimens were examined for chemical composition and mechanical properties to find any differences. The results showed no significant difference from the standard. Therefore, the rolling machine used for the samples needs adjustable rollers to press on the metal sheet, preferably adjustable at both ends.
Keywords: edge waviness, hot rolling steel, metal sheet defect, SS 400, roll leveller
Procedia PDF Downloads 420
3845 Advancements in Predicting Diabetes Biomarkers: A Machine Learning Epigenetic Approach
Authors: James Ladzekpo
Abstract:
Background: The urgent need to identify new pharmacological targets for diabetes treatment and prevention has been amplified by the disease's extensive impact on individuals and healthcare systems. A deeper insight into the biological underpinnings of diabetes is crucial for the creation of therapeutic strategies aimed at these biological processes. Current predictive models based on genetic variations fall short of accurately forecasting diabetes. Objectives: Our study aims to pinpoint key epigenetic factors that predispose individuals to diabetes. These factors will inform the development of an advanced predictive model that estimates diabetes risk from genetic profiles, utilizing state-of-the-art statistical and data mining methods. Methodology: We have implemented a recursive feature elimination with cross-validation using the support vector machine (SVM) approach for refined feature selection. Building on this, we developed six machine learning models, including logistic regression, k-Nearest Neighbors (k-NN), Naive Bayes, Random Forest, Gradient Boosting, and Multilayer Perceptron Neural Network, to evaluate their performance. Findings: The Gradient Boosting Classifier excelled, achieving a median recall of 92.17% and outstanding metrics such as area under the receiver operating characteristics curve (AUC) with a median of 68%, alongside median accuracy and precision scores of 76%. Through our machine learning analysis, we identified 31 genes significantly associated with diabetes traits, highlighting their potential as biomarkers and targets for diabetes management strategies. Conclusion: Particularly noteworthy were the Gradient Boosting Classifier and Multilayer Perceptron Neural Network, which demonstrated potential in diabetes outcome prediction. 
We recommend future investigations to incorporate larger cohorts and a wider array of predictive variables to enhance the models' predictive capabilities.
Keywords: diabetes, machine learning, prediction, biomarkers
Procedia PDF Downloads 55