Search results for: prediction model accuracy
19506 Learning to Recommend with Negative Ratings Based on Factorization Machine
Authors: Caihong Sun, Xizi Zhang
Abstract:
Rating prediction is an important problem for recommender systems. The task is to predict the rating that a user would give to an item. Most existing algorithms for this task ignore the effect of the negative ratings that users give to items, yet negative ratings have a significant impact on users' purchasing decisions in practice. In this paper, we present a rating prediction algorithm based on factorization machines that considers the effect of negative ratings, inspired by Loss Aversion theory. We develop a concave and a convex negative disgust function to evaluate the negative ratings, respectively. Experiments are conducted on the MovieLens dataset. The experimental results demonstrate the effectiveness of the proposed methods in comparison with four other state-of-the-art approaches. Negative ratings proved highly important to the accuracy of rating prediction.
Keywords: factorization machines, feature engineering, negative ratings, recommendation systems
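To make the building blocks concrete, here is a minimal sketch of a second-order factorization machine score plus a hypothetical convex "disgust" penalty for low ratings; the paper's actual disgust functions are not given in the abstract, so the penalty form is an assumption:

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order factorization machine: w0 + <w, x> + pairwise latent interactions,
    computed with the O(k*n) identity 0.5 * sum_f [ (V^T x)_f^2 - ((V**2)^T x**2)_f ]."""
    linear = w0 + w @ x
    interactions = 0.5 * np.sum((V.T @ x) ** 2 - (V ** 2).T @ (x ** 2))
    return linear + interactions

def convex_disgust(r, r_neutral=3.0, alpha=2.0):
    """Hypothetical convex penalty for ratings below a neutral point, loosely in the
    spirit of loss aversion; the paper's actual disgust functions are not given here."""
    return alpha * max(0.0, r_neutral - r) ** 2

# toy usage with random parameters
rng = np.random.default_rng(0)
n_features, k = 8, 4
x = rng.random(n_features)
print(fm_predict(x, 0.1, rng.normal(size=n_features), rng.normal(size=(n_features, k))))
print(convex_disgust(1.0))  # strong penalty for a 1-star rating
```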
Procedia PDF Downloads 245
19505 Enhancing a Recidivism Prediction Tool with Machine Learning: Effectiveness and Algorithmic Fairness
Authors: Marzieh Karimihaghighi, Carlos Castillo
Abstract:
This work studies how Machine Learning (ML) may be used to increase the effectiveness of a criminal recidivism risk assessment tool, RisCanvi. The two key dimensions of this analysis are predictive accuracy and algorithmic fairness. The ML-based prediction models obtained in this study are more accurate at predicting criminal recidivism than the manually created formula used in RisCanvi, achieving AUCs of 0.76 and 0.73 in predicting violent and general recidivism, respectively. However, the improvements are small, and algorithmic discrimination can easily be introduced between groups such as nationals vs. foreigners, or young vs. old. It is described how effectiveness and algorithmic fairness objectives can be balanced by applying a method in which a single error disparity, in terms of generalized false positive rate, is minimized while calibration is maintained across groups. The results show that this bias mitigation procedure can substantially reduce generalized false positive rate disparities across multiple groups. Based on these results, it is proposed that ML-based criminal recidivism risk prediction should not be introduced without applying algorithmic bias mitigation procedures.
Keywords: algorithmic fairness, criminal risk assessment, equalized odds, recidivism
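For concreteness, a minimal sketch of the generalized false positive rate (taken here as the mean predicted risk among true negatives, a standard definition the abstract does not spell out) and its disparity across groups:

```python
import numpy as np

def generalized_fpr(scores, labels):
    """Generalized FPR: mean predicted risk score among true negatives (label == 0)."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    return scores[labels == 0].mean()

def gfpr_disparity(scores, labels, groups):
    """Largest gap in generalized FPR across the groups present in `groups`."""
    scores, labels, groups = map(np.asarray, (scores, labels, groups))
    rates = [generalized_fpr(scores[groups == g], labels[groups == g])
             for g in np.unique(groups)]
    return max(rates) - min(rates)
```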
Procedia PDF Downloads 156
19504 Statistical Scientific Investigation of Popular Cultural Heritage in the Relationship between Astronomy and Weather Conditions in the State of Kuwait
Authors: Ahmed M. AlHasem
Abstract:
Kuwaiti society has long been aware of climatic changes and their annual timing and has tried to link them to astronomy in an attempt to forecast future weather conditions. The reason for this concern is that many of the economic, social and living activities of the society depend deeply, directly and indirectly, on the nature of the weather conditions. In other words, Kuwaiti society, like many human societies, has in the past tried to predict climatic conditions by linking them to astronomy or to popular statements that indicate the timing of climate changes. Accordingly, this study was devoted to a scientific investigation, based on the statistical analysis of climatic data, of the accuracy and compatibility of some of the most important elements of this cultural heritage in relation to climate change, relating them scientifically to precise climatic measurements spanning decades. The research is divided into 10 topics, each focused on one legacy, whether a link between climate changes and the appearance or disappearance of a star or a popular statement inherited through generations; each topic explains the nature and timing of the legacy and then applies statistical analysis to quantify its accuracy against official climatic data recorded since 1962. The study concludes that the relationship between the popular heritage and the actual climatic data is weak and, in some cases, non-existent. Therefore, the popular heritage does not provide a dependable or scientifically reliable basis for forecasting weather conditions.
Keywords: astronomy, cultural heritage, statistical analysis, weather prediction
Procedia PDF Downloads 125
19503 On Consolidated Predictive Model of the Natural History of Breast Cancer Considering Primary Tumor and Primary Distant Metastases Growth
Authors: Ella Tyuryumina, Alexey Neznanov
Abstract:
Finding algorithms to predict the growth of tumors has piqued the interest of researchers since the early days of cancer research. A number of studies have attempted to obtain reliable data on the natural history of breast cancer growth, and mathematical modeling can play a very important role in the prognosis of the breast cancer tumor process. However, existing mathematical models describe primary tumor growth and metastases growth separately. Consequently, we propose a mathematical growth model for the primary tumor and primary metastases which may help to improve the predictive accuracy for breast cancer progression, using an original model referred to as CoM-IV and corresponding software. We are interested in: 1) modelling the whole natural history of the primary tumor and primary metastases; 2) developing an adequate and precise CoM-IV that reflects the relations between PT and MTS; 3) analyzing the scope of application of CoM-IV; 4) implementing the model as a software tool. CoM-IV is based on an exponential tumor growth model, consists of a system of deterministic nonlinear and linear equations, and corresponds to the TNM classification. It allows the calculation of different growth periods of the primary tumor and primary metastases: 1) the 'non-visible period' for the primary tumor; 2) the 'non-visible period' for primary metastases; 3) the 'visible period' for primary metastases. The new predictive tool: 1) is a solid foundation for future studies of breast cancer models; 2) does not require any expensive diagnostic tests; 3) is the first predictor that makes its forecast using only current patient data, while the others rely on additional statistical data. Thus, the CoM-IV model and predictive software: a) detect the different growth periods of the primary tumor and primary metastases; b) forecast the period in which primary metastases appear; c) achieve higher average prediction accuracy than the other tools; d) can improve forecasts of breast cancer survival and facilitate the optimization of diagnostic tests. CoM-IV calculates the number of doublings for the 'non-visible' and 'visible' growth periods of primary metastases, and the tumor volume doubling time (in days) for both periods. CoM-IV enables, for the first time, prediction of the whole natural history of the primary tumor and primary metastases at each stage (pT1, pT2, pT3, pT4) relying only on primary tumor sizes. In summary: a) CoM-IV correctly describes primary tumor and primary distant metastases growth of stage IV (T1-4N0-3M1) disease, with (N1-3) or without (N0) regional lymph node metastases; b) it facilitates understanding of the period of appearance and manifestation of primary metastases.
Keywords: breast cancer, exponential growth model, mathematical modelling, primary metastases, primary tumor, survival
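Under the exponential-growth assumption that CoM-IV builds on, the number of volume doublings and the volume doubling time follow directly from two measured volumes; a minimal sketch using the standard exponential-growth formulas (the numbers are illustrative, not the paper's data):

```python
import math

def doublings(v_start, v_end):
    """Number of volume doublings between two tumor volumes under exponential growth."""
    return math.log2(v_end / v_start)

def doubling_time(v_start, v_end, days):
    """Tumor volume doubling time in days, assuming V(t) = V0 * 2**(t / DT)."""
    return days / doublings(v_start, v_end)

# illustrative example: volume grows from 0.5 cm^3 to 4.0 cm^3 over 270 days
print(doublings(0.5, 4.0))           # 3.0 doublings
print(doubling_time(0.5, 4.0, 270))  # 90 days per doubling
```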
Procedia PDF Downloads 337
19502 Equity Risk Premiums and Risk Free Rates in Modelling and Prediction of Financial Markets
Authors: Mohammad Ghavami, Reza S. Dilmaghani
Abstract:
This paper presents an adaptive framework for modelling financial markets using equity risk premiums, risk-free rates and volatilities. The recorded economic factors are initially used to train four adaptive filters for a certain limited period of time in the past. Once the systems are trained, the adjusted coefficients are used for modelling and prediction of an important financial market index. Two different approaches based on least mean squares (LMS) and recursive least squares (RLS) algorithms are investigated. Performance analysis of each method in terms of the mean squared error (MSE) is presented and the results are discussed. Computer simulations carried out using recorded data show MSEs of 4% and 3.4% for next-month prediction using the LMS and RLS adaptive algorithms, respectively. For twelve-month prediction, the RLS method shows better trend estimation than the LMS algorithm.
Keywords: adaptive methods, LSE, MSE, prediction of financial markets
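A minimal sketch of the LMS half of such a predictor, in its standard textbook form; the filter order and step size are illustrative assumptions, and the RLS variant differs only in its gain computation:

```python
import numpy as np

def lms_predict(x, order=4, mu=0.01):
    """One-step-ahead LMS prediction of a series x from its last `order` samples."""
    x = np.asarray(x, dtype=float)
    w = np.zeros(order)
    preds = np.zeros(len(x))
    for n in range(order, len(x)):
        u = x[n - order:n][::-1]  # most recent samples first
        preds[n] = w @ u          # predict x[n]
        e = x[n] - preds[n]       # prediction error
        w += mu * e * u           # LMS weight update
    return preds
```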
Procedia PDF Downloads 342
19501 Influence of Temperature and Immersion on the Behavior of a Polymer Composite
Authors: Quentin C.P. Bourgogne, Vanessa Bouchart, Pierre Chevrier, Emmanuel Dattoli
Abstract:
This study presents experimental and theoretical work conducted on a PolyPhenylene Sulfide reinforced with 40 wt% short glass fibers (PPS GF40) and on its matrix. Thermoplastics are widely used in the automotive industry to lighten automotive parts. The replacement of metallic parts by thermoplastics is reaching under-the-hood parts, near the engine. In this area, the parts are subjected to high temperatures and are immersed in cooling liquid. This liquid is composed of water and glycol and can affect the mechanical properties of the composite. The aim of this work was thus to quantify the evolution of the mechanical properties of the thermoplastic composite as a function of temperature and liquid aging effects, in order to develop a reliable design of parts. An experimental campaign in the tensile mode was carried out at different temperatures and for various glycol proportions in the cooling liquid, for monotonic and cyclic loadings, on a neat and a reinforced PPS. The results of these tests highlighted some of the main physical phenomena occurring during such loadings under tough hydrothermal conditions. Indeed, the tests showed that temperature and cooling-liquid aging can affect the mechanical behavior of the material in several ways. The more water the cooling liquid contains, the more the mechanical behavior is affected; PPS showed a higher sensitivity to absorption than to the chemical aggressiveness of the cooling liquid, explaining this water-dominated sensitivity. Two kinds of behavior were noted: an elasto-plastic type below the glass transition temperature and a visco-pseudo-plastic one above it. It was also shown that viscosity is the leading phenomenon above the glass transition temperature for the PPS and can also be important below this temperature, mostly under cyclic conditions and when the stress rate is low. Finally, it was observed that loading this composite at high temperatures reduces the benefit of the fibers. A new phenomenological model was then built to take these experimental observations into account. This new model allows the prediction of the evolution of mechanical properties as a function of the loading environment, with a reduced number of parameters compared to previous studies. It was also shown that the presented approach enables the description and prediction of the mechanical response with very good accuracy (2% average error at worst) over a wide range of hydrothermal conditions. A temperature-humidity equivalence principle was established for the PPS, allowing aging effects to be considered within the proposed model. Then, a limit on the achievable accuracy was determined for all models using this dataset by applying an artificial intelligence-based model, allowing a comparison between artificial intelligence-based and phenomenological models.
Keywords: aging, analytical modeling, mechanical testing, polymer matrix composites, sequential model, thermomechanical
Procedia PDF Downloads 120
19500 Hybrid Model: An Integration of Machine Learning with Traditional Scorecards
Authors: Golnush Masghati-Amoli, Paul Chin
Abstract:
Over recent years, with the rapid increase in data availability and computing power, Machine Learning (ML) techniques have been called on in a range of industries for their strong predictive capability. However, the use of Machine Learning in commercial banking has been limited by a special challenge imposed by numerous regulations that require lenders to be able to explain their analytic models, not only to regulators but often to consumers. In other words, although Machine Learning techniques enable better prediction with a higher level of accuracy, they are adopted less frequently in commercial banking than in other industries, especially for scoring purposes. This is because Machine Learning techniques are often considered a black box and fail to provide information on why a certain risk score is given to a customer. In order to bridge this gap between the explainability and performance of Machine Learning techniques, a Hybrid Model has been developed at Dun and Bradstreet that focuses on blending Machine Learning algorithms with traditional approaches such as scorecards. The Hybrid Model maximizes the efficiency of traditional scorecards by merging their practical benefits, such as explainability and the ability to input domain knowledge, with the deep insights of Machine Learning techniques, which can uncover patterns scorecard approaches cannot. First, through the development of Machine Learning models, engineered features, latent variables, and feature interactions that demonstrate high information value in the prediction of customer risk are identified. Then, these features are employed to introduce the observed non-linear relationships between the explanatory and dependent variables into traditional scorecards. Moreover, instead of directly computing the Weight of Evidence (WoE) from good and bad data points, the Hybrid Model tries to match the score distribution generated by a Machine Learning algorithm, which ends up providing an estimate of the WoE for each bin. This capability helps to build powerful scorecards with sparse cases that cannot be achieved with traditional approaches. The proposed Hybrid Model is tested on different portfolios where a significant gap is observed between the performance of traditional scorecards and Machine Learning models. The analysis shows that the Hybrid Model can improve the performance of traditional scorecards by introducing non-linear relationships between explanatory and target variables from Machine Learning models into traditional scorecards. It is also observed that in some scenarios the Hybrid Model can be almost as predictive as the Machine Learning techniques while being as transparent as traditional scorecards. Therefore, it is concluded that, with the use of the Hybrid Model, Machine Learning algorithms can be used in the commercial banking industry without concern about the difficulty of explaining the models for regulatory purposes.
Keywords: machine learning algorithms, scorecard, commercial banking, consumer risk, feature engineering
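For reference, the classic Weight of Evidence per score bin is the log ratio of the bin's share of goods to its share of bads; a minimal sketch of that standard definition (the Hybrid Model above instead estimates WoE by matching an ML score distribution):

```python
import numpy as np

def woe_per_bin(bin_ids, is_bad):
    """Classic WoE: ln( (goods_in_bin / total_goods) / (bads_in_bin / total_bads) )."""
    bin_ids, is_bad = np.asarray(bin_ids), np.asarray(is_bad).astype(bool)
    woe = {}
    for b in np.unique(bin_ids):
        in_bin = bin_ids == b
        pct_good = (in_bin & ~is_bad).sum() / max((~is_bad).sum(), 1)
        pct_bad = (in_bin & is_bad).sum() / max(is_bad.sum(), 1)
        woe[b] = np.log(pct_good / pct_bad) if pct_good and pct_bad else float("nan")
    return woe
```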
Procedia PDF Downloads 140
19499 ANOVA-Based Feature Selection and Machine Learning System for IoT Anomaly Detection
Authors: Muhammad Ali
Abstract:
Cyber-attacks and anomaly detection on Internet of Things (IoT) infrastructure is an emerging concern in the domain of data-driven intrusion. The rapidly increasing IoT risk is now making headlines around the world. Denial of service, malicious control, data type probing, malicious operation, DDoS, scan, spying, and wrong setup are attacks and anomalies that can cause an IoT system to fail. Everyone talks about cyber security, connectivity, smart devices, and real-time data extraction. IoT devices expose a wide variety of new cyber security attack vectors in network traffic. For further IoT development, and mainly for smart IoT applications, there is a need for intelligent processing and analysis of data; our approach aims to provide this security. We train and compare several machine learning models to accurately predict attacks and anomalies on IoT systems, considering IoT applications, using ANOVA-based feature selection to obtain fewer prediction inputs for evaluating network traffic and helping to protect IoT devices. The machine learning (ML) algorithms used here are KNN, SVM, NB, DT, and RF, assessed for the most satisfactory test accuracy with fast detection. The evaluated ML metrics include precision, recall, F1 score, FPR, NPV, GM, MCC, and AUC-ROC. The Random Forest algorithm achieved the best results, with the shortest prediction time and an accuracy of 99.98%.
Keywords: machine learning, analysis of variance, Internet of Things, network security, intrusion detection
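A minimal scikit-learn sketch of the described pipeline, ANOVA F-test feature selection followed by a random forest classifier; the number of selected features and forest size are illustrative assumptions:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

def train_iot_detector(X, y, k=10):
    """ANOVA F-score feature selection followed by a random forest classifier."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y,
                                              random_state=42)
    model = make_pipeline(SelectKBest(f_classif, k=k),
                          RandomForestClassifier(n_estimators=100, random_state=42))
    model.fit(X_tr, y_tr)
    return model, accuracy_score(y_te, model.predict(X_te))
```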
Procedia PDF Downloads 129
19498 Predicting Trapezoidal Weir Discharge Coefficient Using Evolutionary Algorithm
Authors: K. Roushanger, A. Soleymanzadeh
Abstract:
Weirs are structures often used in irrigation techniques, sewer networks and flood protection. However, the hydraulic behavior of this type of weir is complex and difficult to predict accurately. An accurate flow prediction over a weir mainly depends on the proper estimation of its discharge coefficient. In this study, the Genetic Expression Programming (GEP) approach was used for predicting the discharge coefficient of trapezoidal and rectangular sharp-crested side weirs. Three different performance indexes were used as comparison criteria for the evaluation of the model's performance. The obtained results confirmed the capability of GEP in the prediction of trapezoidal and rectangular side weir discharge coefficients. The results also revealed the influence of the downstream Froude number for the trapezoidal weir and the upstream Froude number for the rectangular weir in the prediction of the discharge coefficient for both side weirs.
Keywords: discharge coefficient, genetic expression programming, trapezoidal weir
Procedia PDF Downloads 391
19497 Combination of Artificial Neural Network Model and Geographic Information System for Prediction Water Quality
Authors: Sirilak Areerachakul
Abstract:
Water quality has initiated serious management efforts in many countries. Artificial Neural Network (ANN) models have been developed as forecasting tools for predicting water quality trends based on historical data. This study endeavors to automatically classify water quality. The water quality classes are evaluated using six factor indices: pH value (pH), Dissolved Oxygen (DO), Biochemical Oxygen Demand (BOD), Nitrate Nitrogen (NO3N), Ammonia Nitrogen (NH3N) and Total Coliform (T-Coliform). The methodology involves applying data mining techniques using multilayer perceptron (MLP) neural network models. The data cover 11 sites along the Saen Saep canal in Bangkok, Thailand, obtained from the Department of Drainage and Sewerage, Bangkok Metropolitan Administration, during 2007-2011. The multilayer perceptron neural network achieved a high classification accuracy of 94.23% for the water quality of the Saen Saep canal. Subsequently, this encouraging result could be combined with GIS data to improve the classification accuracy significantly.
Keywords: artificial neural network, geographic information system, water quality, computer science
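A minimal scikit-learn sketch of such an MLP classifier over the six indices named above; network size and preprocessing are illustrative assumptions:

```python
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = ["pH", "DO", "BOD", "NO3N", "NH3N", "T-Coliform"]

def train_water_quality_mlp(X, y):
    """X: samples x 6 indices listed above; y: water-quality class labels."""
    model = make_pipeline(StandardScaler(),
                          MLPClassifier(hidden_layer_sizes=(16,),
                                        max_iter=2000, random_state=0))
    return model.fit(X, y)
```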
Procedia PDF Downloads 346
19496 Mathematical Modeling of the Fouling Phenomenon in Ultrafiltration of Latex Effluent
Authors: Amira Abdelrasoul, Huu Doan, Ali Lohi
Abstract:
An efficient and well-planned ultrafiltration process is becoming a necessity for monetary returns in industrial settings. The aim of the present study was to develop a mathematical model for accurate prediction of ultrafiltration membrane fouling by latex effluent, applied to homogeneous and heterogeneous membranes with uniform and non-uniform pore sizes, respectively. Models were also developed for accurate prediction of power consumption suitable for large-scale applications. The model incorporated the fouling attachments as well as chemical and physical factors in membrane fouling for accurate prediction and scale-up application. Both polycarbonate and polysulfone flat membranes, with pore sizes of 0.05 µm and a molecular weight cut-off of 60,000, respectively, were used under a constant feed flow rate and a cross-flow mode in ultrafiltration of the simulated paint effluent. Furthermore, hydrophilic Ultrafilic and hydrophobic PVDF membranes with a MWCO of 100,000 were used to test the reliability of the models. Monodisperse particles of 50 nm and 100 nm in diameter, and a latex effluent with a wide range of particle size distributions, were utilized to validate the models. The aggregation and the sphericity of the particles showed a significant effect on membrane fouling.
Keywords: membrane fouling, mathematical modeling, power consumption, attachments, ultrafiltration
Procedia PDF Downloads 475
19495 Fast Adjustable Threshold for Uniform Neural Network Quantization
Authors: Alexander Goncharenko, Andrey Denisov, Sergey Alyamkin, Evgeny Terentev
Abstract:
Neural network quantization is a highly desirable procedure to perform before running neural networks on mobile devices. Quantization without fine-tuning leads to an accuracy drop of the model, whereas the commonly used training with quantization is done on the full set of labeled data and is therefore both time- and resource-consuming. Real-life applications require simplification and acceleration of the quantization procedure that will maintain the accuracy of the full-precision neural network, especially for modern mobile neural network architectures like MobileNet-v1, MobileNet-v2 and MNAS. Here we present a method to significantly optimize training with quantization by introducing trained scale factors for the discretization thresholds that are separate for each filter. Using the proposed technique, we quantize the modern mobile architectures of neural networks with a training set of only ∼10% of the total ImageNet 2012 sample. Such a reduction of the training dataset size and the small number of trainable parameters allow the network to be fine-tuned in several hours while maintaining the high accuracy of the quantized model (the accuracy drop was less than 0.5%). Ready-to-use models and code are available in the GitHub repository.
Keywords: distillation, machine learning, neural networks, quantization
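A minimal PyTorch sketch of the core idea, per-filter quantization thresholds trained by gradient descent, written here in the style of learned-step-size quantization with a straight-through estimator; the paper's exact parameterization is not given in the abstract, so this is an assumed form:

```python
import torch

class FakeQuant(torch.nn.Module):
    """Uniform fake-quantization with one trainable threshold per output filter.
    Assumed LSQ-style parameterization with a straight-through estimator; the
    paper's exact formulation is not given in the abstract."""
    def __init__(self, num_filters, bits=8, init_t=1.0):
        super().__init__()
        self.t = torch.nn.Parameter(torch.full((num_filters, 1, 1, 1), init_t))
        self.levels = 2 ** (bits - 1) - 1  # symmetric signed grid

    def forward(self, w):                  # w: (out, in, kh, kw) conv weights
        scale = self.t / self.levels
        v = torch.clamp(w / scale, -self.levels, self.levels)
        v = v + (v.round() - v).detach()   # straight-through estimator for rounding
        return v * scale                   # gradients reach both w and t
```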
Procedia PDF Downloads 331
19494 Improved 3D Structure Prediction of Beta-Barrel Membrane Proteins by Using Evolutionary Coupling Constraints, Reduced State Space and an Empirical Potential Function
Authors: Wei Tian, Jie Liang, Hammad Naveed
Abstract:
Beta-barrel membrane proteins are found in the outer membrane of gram-negative bacteria, mitochondria, and chloroplasts. They carry out diverse biological functions, including pore formation, membrane anchoring, enzyme activity, and bacterial virulence. In addition, beta-barrel membrane proteins increasingly serve as scaffolds for bacterial surface display and nanopore-based DNA sequencing. Due to difficulties in experimental structure determination, they are sparsely represented in the protein structure databank, and computational methods can help to understand their biophysical principles. We have developed a novel computational method to predict the 3D structure of beta-barrel membrane proteins using evolutionary coupling (EC) constraints and a reduced state space. Combined with an empirical potential function, we can successfully predict strand register at > 80% accuracy for a set of 49 non-homologous proteins with known structures. This is a significant improvement from previous results using EC alone (44%) and using the empirical potential function alone (73%). Our method is general and can be applied to genome-wide structural prediction.
Keywords: beta-barrel membrane proteins, structure prediction, evolutionary constraints, reduced state space
Procedia PDF Downloads 621
19493 A Study of ZY3 Satellite Digital Elevation Model Verification and Refinement with Shuttle Radar Topography Mission
Authors: Bo Wang
Abstract:
As the first high-resolution civil optical satellite, the ZY-3 satellite is able to obtain high-resolution multi-view images with three linear array sensors. The images can be used to generate Digital Elevation Models (DEM) through dense matching of stereo images. However, due to clouds, forest, water and buildings in the images, there are some problems in the dense matching results, such as outliers and areas that fail to be matched (matching holes). This paper introduces an algorithm to verify the accuracy of a DEM generated by the ZY-3 satellite against the Shuttle Radar Topography Mission (SRTM). Since the accuracy of SRTM (internal accuracy: 5 m; external accuracy: 15 m) is relatively uniform worldwide, it may be used to improve the accuracy of the ZY-3 DEM. Based on the analysis of mass DEM and SRTM data, the processing can be divided into two aspects. First, the ZY-3 DEM and SRTM are registered using the conjugate line features and area features matched between the two datasets. Then the ZY-3 DEM is refined by eliminating the matching outliers and filling the matching holes. The matching outliers are eliminated based on statistics from Local Vector Binning (LVB). The matching holes are filled with elevations interpolated from SRTM. Accuracy statistics of the ZY-3 DEM are also reported.
Keywords: ZY-3 satellite imagery, DEM, SRTM, refinement
Procedia PDF Downloads 347
19492 Fatigue Life Prediction under Variable Loading Based a Non-Linear Energy Model
Authors: Aid Abdelkrim
Abstract:
A method of fatigue damage accumulation based upon the application of energy parameters of the fatigue process is proposed in the paper. The model is simple to use and has no parameters to be determined; it requires only knowledge of the W–N curve (W: strain energy density, N: number of cycles to failure) determined from the experimental Wöhler curve. To examine the performance of the proposed nonlinear models in the estimation of fatigue damage and fatigue life of components under random loading, a batch of specimens made of 6082-T6 aluminium alloy was studied, and some of the results are reported in the present paper. The paper describes an algorithm and suggests a fatigue cumulative damage model, especially for the case of random loading. This work contains the results of uniaxial random-load fatigue tests with different mean and amplitude values performed on 6082-T6 aluminium alloy specimens. The proposed model takes into account the damage evolution at different load levels and allows the effect of the loading sequence to be included by means of a recurrence formula derived for multilevel loading, considering complex load sequences. It is concluded that the 'damaged stress interaction damage rule' proposed here allows better fatigue damage prediction than the widely used Palmgren–Miner rule, and that a formula derived for random fatigue can be used to predict fatigue damage and fatigue lifetime very easily. The results obtained by the model are compared with the experimental results and with those calculated by the most widely used fatigue damage model (Miner's model). The comparison shows that the proposed model provides a good estimate of the experimental results, with a smaller error than Miner's model.
Keywords: damage accumulation, energy model, damage indicator, variable loading, random loading
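For comparison, the Palmgren–Miner rule referenced above accumulates damage linearly as the sum of cycle ratios, with failure predicted at D = 1; a minimal sketch, where the power-law W–N curve is an assumed illustrative form (the paper's W–N curve is experimental):

```python
def cycles_to_failure(W, A=1000.0, m=2.0):
    """Assumed power-law W-N curve, N = (A / W) ** m, with illustrative constants;
    in the paper, N is read from the experimental W-N curve instead."""
    return (A / W) ** m

def miner_damage(blocks):
    """Palmgren-Miner linear damage: D = sum(n_i / N_i); failure predicted at D >= 1.
    `blocks` is a list of (applied_cycles, strain_energy_density) pairs."""
    return sum(n / cycles_to_failure(W) for n, W in blocks)

# toy two-level loading sequence: (cycles, W)
print(miner_damage([(5000, 10.0), (2000, 20.0)]))  # 0.5 + 0.8 = 1.3 -> failure
```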
Procedia PDF Downloads 397
19491 Graph Neural Network-Based Classification for Disease Prediction in Health Care Heterogeneous Data Structures of Electronic Health Record
Authors: Raghavi C. Janaswamy
Abstract:
In the healthcare sector, heterogeneous data elements such as patients, diagnoses, symptoms, conditions, observation text from physician notes, and prescriptions form the essentials of the Electronic Health Record (EHR). The data, in the form of clear text and images, are stored or processed in a relational format in most systems. However, the intrinsic structure restrictions and complex joins of relational databases limit their widespread utility. In this regard, the design and development of realistic mappings and deep connections as real-time objects offer unparalleled advantages. Herein, a graph neural network-based classification of EHR data has been developed. Patient conditions are predicted as a node classification task using graph-based open-source EHR data, the Synthea database, stored in TigerGraph. The Synthea dataset is leveraged because of its volume and its close resemblance to real-world data. The graph model is built from the heterogeneous EHR data using Python modules: pyTigerGraph to get nodes and edges from the TigerGraph database, PyTorch to tensorize the nodes and edges, and PyTorch Geometric (PyG) to train the Graph Neural Network (GNN), adopting self-supervised learning techniques with autoencoders to generate the node embeddings and eventually performing the node classification using those embeddings. The model predicts patient conditions ranging from common to rare. The outcome is deemed to open up opportunities for data querying toward better predictions and accuracy.
Keywords: electronic health record, graph neural network, heterogeneous data, prediction
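A minimal PyTorch Geometric sketch of a node-classification GNN of the kind described, here a two-layer GCN; the paper's exact architecture, embedding sizes, and autoencoder pretraining are not specified in the abstract, so all sizes here are placeholders:

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class ConditionGCN(torch.nn.Module):
    """Two-layer GCN for patient-condition node classification on an EHR graph."""
    def __init__(self, in_dim, hidden_dim, num_classes):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, num_classes)

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))
        h = F.dropout(h, p=0.5, training=self.training)
        return self.conv2(h, edge_index)

# one training step, given a PyG `data` object with x, edge_index, y, train_mask:
# model = ConditionGCN(data.num_features, 64, num_classes)
# out = model(data.x, data.edge_index)
# loss = F.cross_entropy(out[data.train_mask], data.y[data.train_mask])
```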
Procedia PDF Downloads 90
19490 Predicting Match Outcomes in Team Sport via Machine Learning: Evidence from National Basketball Association
Authors: Jacky Liu
Abstract:
This paper develops a team sports outcome prediction system with potential for wide-ranging applications across various disciplines. Despite significant advancements in predictive analytics, existing studies in sports outcome prediction possess considerable limitations, including insufficient feature engineering and underutilization of advanced machine learning techniques, among others. To address these issues, we extend the Sports Cross Industry Standard Process for Data Mining (SRP-CRISP-DM) framework and propose a unique, comprehensive predictive system, using National Basketball Association (NBA) data as an example to test this extended framework. Our approach follows a holistic methodology in feature engineering, employing both time series and non-time series data, as well as conducting exploratory data analysis and feature selection. Furthermore, we contribute to the discourse on target variable choice in team sports outcome prediction, asserting that point spread prediction yields higher profits than game-winner prediction. Using machine learning algorithms, particularly XGBoost, results in a significant improvement in the predictive accuracy of team sports outcomes. Applied to point spread betting strategies, it offers an astounding annual return of approximately 900% on an initial investment of $100. Our findings not only contribute to the academic literature but have critical practical implications for sports betting. Our study advances the understanding of team sports outcome prediction, a burgeoning area in complex system prediction, and paves the way for potential profitability and more informed decision-making in sports betting markets.
Keywords: machine learning, team sports, game outcome prediction, sports betting, profits simulation
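A minimal sketch of the point-spread regression step with XGBoost; the features, hyperparameters, and train/test split are illustrative assumptions, not the paper's tuned setup:

```python
import xgboost as xgb
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

def train_spread_model(X, y):
    """Regress the point spread (home score minus away score) with XGBoost."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=7)
    model = xgb.XGBRegressor(n_estimators=500, max_depth=4, learning_rate=0.05)
    model.fit(X_tr, y_tr)
    return model, mean_absolute_error(y_te, model.predict(X_te))
```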
Procedia PDF Downloads 109
19489 Analysis and Prediction of COVID-19 by Using Recurrent LSTM Neural Network Model in Machine Learning
Authors: Grienggrai Rajchakit
Abstract:
As we all know, the coronavirus was declared a pandemic by the WHO. It spread all over the world within a few days. To control this spread, maintaining social distance and self-preventive measures by every citizen are the best strategies. As of now, many researchers and scientists are continuing their research to find an effective vaccine. Machine learning models find that the coronavirus disease behaves in an exponential manner. To mitigate the consequences of this pandemic, efficient steps should be taken to analyze this disease. In this paper, a recurrent neural network model is chosen to predict the number of active cases in a particular state. To make this prediction of active cases, we need a database. The COVID-19 database is downloaded from the Kaggle website and analyzed by applying a recurrent LSTM neural network with univariate features to predict the number of active cases of patients suffering from the coronavirus. The downloaded database is divided into sets for training and testing the chosen neural network model. The model is trained with the training dataset and tested with the testing dataset to predict the number of active cases in a particular state; here, we have concentrated on the state of Andhra Pradesh.
Keywords: COVID-19, coronavirus, Kaggle, LSTM neural network, machine learning
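A minimal Keras sketch of a univariate LSTM predictor of the kind described, windowing the active-case series and regressing the next day's count; the lookback and layer sizes are illustrative assumptions:

```python
import numpy as np
from tensorflow import keras

def make_windows(series, lookback=7):
    """Turn a 1-D active-case series into (lookback, 1) windows and next-day targets."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + lookback] for i in range(len(series) - lookback)])
    return X[..., None], series[lookback:]

def build_lstm(lookback=7):
    """Univariate LSTM regressor for the next day's active-case count."""
    model = keras.Sequential([
        keras.layers.Input(shape=(lookback, 1)),
        keras.layers.LSTM(32),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```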
Procedia PDF Downloads 164
19488 Reliability Modeling on Drivers’ Decision during Yellow Phase
Authors: Sabyasachi Biswas, Indrajit Ghosh
Abstract:
The random and heterogeneous behavior of vehicles in India poses a great challenge for researchers. Stop-and-go modeling at signalized intersections under heterogeneous traffic conditions has remained one of the most sought-after fields. Vehicles are often caught in the dilemma zone and are unable to decide quickly whether to stop or cross the intersection. This hampers traffic movement and may lead to accidents. The purpose of this work is to develop a stop-and-go prediction model that depicts the drivers' decision during the yellow time at signalized intersections. To accomplish this, certain traffic parameters were taken into account to develop a surrogate model. This research investigated the stop-and-go behavior of drivers by collecting data from four signalized intersections located in two major Indian cities. A model was developed to predict the drivers' decision-making during the yellow phase of the traffic signal. The parameters used for modeling included distance to the stop line, time to the stop line, speed, and length of the vehicle. A Kriging-based surrogate model was developed to investigate the drivers' decision-making behavior in the amber phase. It is observed that the proposed approach yields a highly accurate result (97.4 percent) with the Gaussian correlation function. The accuracy for the crossing probability was 95.45, 90.9 and 86.36 percent, respectively, as predicted by the Kriging models with Gaussian, exponential and linear functions.
Keywords: decision-making, dilemma zone, surrogate model, Kriging
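Kriging is equivalent to Gaussian-process modeling, so as a loose stand-in for the stop/go surrogate described above, one can fit a scikit-learn Gaussian-process classifier with an RBF ("Gaussian") kernel on the four named predictors; everything beyond the feature list is an illustrative assumption:

```python
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

FEATURES = ["distance_to_stop_line", "time_to_stop_line", "speed", "vehicle_length"]

def fit_stop_go_surrogate(X, y):
    """X: samples x 4 features listed above; y: 1 = cross (go), 0 = stop."""
    return GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0)).fit(X, y)
```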
Procedia PDF Downloads 311
19487 Modeling and Shape Prediction for Elastic Kinematic Chains
Authors: Jiun Jeon, Byung-Ju Yi
Abstract:
This paper investigates modeling and shape prediction of elastic kinematic chains such as colonoscopy. 2D and 3D models of elastic kinematic chains are suggested and their behaviors are demonstrated through simulation. To corroborate the effectiveness of those models, experimental work is performed using a magnetic sensor system.
Keywords: elastic kinematic chain, shape prediction, colonoscopy, modeling
Procedia PDF Downloads 610
19486 Artificial Neural Network-Based Prediction of Effluent Quality of Wastewater Treatment Plant Employing Data Preprocessing Approaches
Authors: Vahid Nourani, Atefeh Ashrafi
Abstract:
Prediction of treated wastewater quality is a matter of growing importance in water treatment practice. Artificial neural networks (ANNs), as robust data-driven approaches, have been widely used for forecasting the effluent quality of wastewater treatment. However, developing an ANN model based on appropriate input variables is a major concern due to the numerous parameters collected from the treatment process, whose number is increasing with the development of electronic sensors. Various studies have been conducted, using different clustering methods, in order to classify the most related and effective input variables. This issue has often been overlooked when selecting dominant input variables among wastewater treatment parameters, even though it can effectively lead to more accurate prediction of water quality. In the presented study, two ANN models were developed with the aim of forecasting the effluent quality of the Tabriz city wastewater treatment plant. Biochemical oxygen demand (BOD) was utilized as the target water quality parameter. Model A used Principal Component Analysis (PCA), a linear variance-based clustering method, for input selection. Model B used the variables identified by the mutual information (MI) measure. With the optimal ANN structure, model A showed up to a 15% increase in the Determination Coefficient (DC) compared with model B. Thus, this study highlights the advantage of the PCA method in selecting dominant input variables for ANN modeling of wastewater treatment plant performance.
Keywords: artificial neural networks, biochemical oxygen demand, principal component analysis, mutual information, Tabriz wastewater treatment plant, wastewater treatment plant
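A minimal scikit-learn sketch of the two input-selection routes compared above, PCA components for model A versus the top mutual-information sensors for model B; the component and feature counts are illustrative assumptions:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import mutual_info_regression

def pca_inputs(X, n_components=5):
    """Model A-style inputs: leading principal components of the sensor matrix."""
    return PCA(n_components=n_components).fit_transform(X)

def mi_inputs(X, y, k=5):
    """Model B-style inputs: the k sensors with highest mutual information with BOD."""
    mi = mutual_info_regression(X, y)
    return X[:, np.argsort(mi)[::-1][:k]]
```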
Procedia PDF Downloads 133
19485 Correlation and Prediction of Biodiesel Density
Authors: Nieves M. C. Talavera-Prieto, Abel G. M. Ferreira, António T. G. Portugal, Rui J. Moreira, Jaime B. Santos
Abstract:
The knowledge of biodiesel density over large ranges of temperature and pressure is important for predicting the behavior of fuel injection and combustion systems in diesel engines, and for the optimization of such systems. In this study, cottonseed oil was transesterified into biodiesel and its density was measured at temperatures between 288 K and 358 K and pressures between 0.1 MPa and 30 MPa, with an expanded uncertainty estimated as ±1.6 kg·m⁻³. Experimental pressure-volume-temperature (pVT) cottonseed data were used along with literature data on 18 other biodiesels to build a database used to test the correlation of density with temperature and pressure using the Goharshadi–Morsali–Abbaspour equation of state (GMA EoS). To our knowledge, this is the first time that density measurements are presented for cottonseed biodiesel under such high pressures, and the first use of the GMA EoS to model biodiesel density. The newly tested EoS allowed correlations within 0.2 kg·m⁻³, corresponding to average relative deviations within 0.02%. The database was then used to develop and test a new fully predictive model derived from the observed linear relation between density and degree of unsaturation (DU), which depends on the biodiesel FAME profile. The average density deviation of this method was only about 3 kg·m⁻³ within the temperature and pressure limits of application. These results represent appreciable improvements in the context of density prediction at high pressure when compared with other equations of state.
Keywords: biodiesel density, correlation, equation of state, prediction
Procedia PDF Downloads 619
19484 Enhanced CNN for Rice Leaf Disease Classification in Mobile Applications
Authors: Kayne Uriel K. Rodrigo, Jerriane Hillary Heart S. Marcial, Samuel C. Brillo
Abstract:
Rice leaf diseases significantly impact yield production in rice-dependent countries, affecting their agricultural sectors. As part of precision agriculture, early and accurate detection of these diseases is crucial for effective mitigation practices and minimizing crop losses. Hence, this study proposes an enhancement to the Convolutional Neural Network (CNN), a widely used method for rice leaf disease image classification, by incorporating MobileViTV2, a recently advanced architecture that combines CNN and Vision Transformer models while maintaining fewer parameters, making it suitable for broader deployment on edge devices. Our methodology utilizes a publicly available rice disease image dataset from Kaggle, which was validated by a university structural biologist following the guidelines provided by the Philippine Rice Institute (PhilRice). Modifications to the dataset include renaming certain disease categories and augmenting the rice leaf image data through rotation, scaling, and flipping. The enhanced dataset was then used to train the MobileViTV2 model using the timm library. The results of our approach are as follows: the model achieved notable performance, with 98% accuracy in both training and validation, 6% training and validation loss, and an area under the Receiver Operating Characteristic (ROC) curve ranging from 95% to 100% for each label. Additionally, the F1 score was 97%. These metrics demonstrate a significant improvement over a conventional CNN-based approach, which, in a previous 2022 study, achieved only 78% accuracy using 5 convolutional layers and 2 dense layers. Thus, it can be concluded that MobileViTV2, with its fewer parameters, outperforms traditional CNN models, particularly when applied to rice leaf disease identification. For future work, we recommend extending this model to datasets validated by international rice experts and broadening the scope to accommodate biotic factors such as rice pest classification, as well as abiotic stressors such as climate, soil quality, and geographic information, which could improve the accuracy of disease prediction.
Keywords: convolutional neural network, MobileViTV2, rice leaf disease, precision agriculture, image classification, vision transformer
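A minimal sketch of creating and fine-tuning a MobileViTV2 backbone via the timm library, as the abstract describes; the specific variant name, input size, and optimizer settings are assumptions:

```python
import timm
import torch

def build_rice_model(num_classes, model_name="mobilevitv2_100"):
    """Create a MobileViTV2 backbone from timm with a fresh classification head."""
    return timm.create_model(model_name, pretrained=True, num_classes=num_classes)

# one illustrative fine-tuning step:
# model = build_rice_model(num_classes=4)
# opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
# logits = model(images)  # images: (B, 3, 256, 256) tensor batch
# loss = torch.nn.functional.cross_entropy(logits, labels)
# loss.backward(); opt.step(); opt.zero_grad()
```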
Procedia PDF Downloads 33
19483 Automated Driving Deep Neural Networks Model Accuracy and Performance Assessment in a Simulated Environment
Authors: David Tena-Gago, Jose M. Alcaraz Calero, Qi Wang
Abstract:
The evolution and integration of automated vehicles have become more and more tangible in recent years. State-of-the-art technological advances in the field of camera-based Artificial Intelligence (AI) and computer vision greatly favor the performance and reliability of Advanced Driver Assistance Systems (ADAS), leading to greater knowledge of vehicular operation and closer resemblance to human behavior. However, the exclusive use of this technology still seems insufficient to control vehicular operation fully. To reveal the degree of accuracy of current camera-based automated driving AI modules, this paper studies the structure and behavior of one of the main solutions in a controlled testing environment. The results obtained clearly outline the lack of reliability when exclusively using the AI model in the perception stage, thereby entailing the use of additional complementary sensors to improve safety and performance.
Keywords: accuracy assessment, AI-driven mobility, artificial intelligence, automated vehicles
Procedia PDF Downloads 117
19482 Data-Driven Surrogate Models for Damage Prediction of Steel Liquid Storage Tanks under Seismic Hazard
Authors: Laura Micheli, Majd Hijazi, Mahmoud Faytarouni
Abstract:
The damage reported by oil and gas industrial facilities has revealed the extreme vulnerability of steel liquid storage tanks to seismic events. The failure of steel storage tanks may yield devastating and long-lasting consequences for built and natural environments, including the release of hazardous substances, uncontrolled fires, and soil contamination with hazardous materials. It is, therefore, fundamental to reliably predict the damage that steel liquid storage tanks will likely experience under future seismic hazard events. The seismic performance of steel liquid storage tanks is usually assessed using vulnerability curves obtained from the numerical simulation of a tank under different hazard scenarios. However, the computational demand of high-fidelity numerical simulation models, such as finite element models, makes the vulnerability assessment of liquid storage tanks time-consuming and often impractical. As a solution, this paper presents a surrogate model-based strategy for predicting seismic-induced damage in steel liquid storage tanks. In the proposed strategy, the surrogate model is leveraged to reduce the computational demand of time-consuming numerical simulations. To create the dataset for training the surrogate model, field damage data from past earthquake reconnaissance surveys and reports are collected. Features representative of steel liquid storage tank characteristics (e.g., diameter, height, liquid level, yield stress) and seismic excitation parameters (e.g., peak ground acceleration, magnitude) are extracted from the field damage data. The collected data are then utilized to train a data-driven surrogate model that maps the relationship between tank characteristics, seismic hazard parameters, and seismic-induced damage. Different types of surrogate algorithms, including naïve Bayes, k-nearest neighbors, decision tree, and random forest, are investigated, and results in terms of accuracy are reported. The model that yields the most accurate predictions is employed to predict future damage as a function of tank characteristics and seismic hazard intensity level. Results show that the proposed approach can be used to estimate the extent of damage in steel liquid storage tanks, and that data-driven surrogates represent a viable alternative to computationally expensive numerical simulation models.
Keywords: damage prediction, data-driven model, seismic performance, steel liquid storage tanks, surrogate model
Procedia PDF Downloads 147
19481 Artificial Intelligence-Generated Previews of Hyaluronic Acid-Based Treatments
Authors: Ciro Cursio, Giulia Cursio, Pio Luigi Cursio, Luigi Cursio
Abstract:
Communication between practitioner and patient is of the utmost importance in aesthetic medicine: as of today, images of previous treatments are the most common tool used by doctors to describe and anticipate future results for their patients. However, using photos of other people often reduces the engagement of the prospective patient and is further limited by the number and quality of pictures available to the practitioner. Pre-existing work addresses this issue in two ways: 3D scanning of the area with manual editing of the 3D model by the doctor, or automatic prediction of the treatment by warping the image with hand-written parameters. The first approach requires the manual intervention of the doctor, while the second generates results that do not always look realistic. Thus, in one case there is significant manual work required of the doctor, and in the other the prediction looks artificial. We propose an AI-based algorithm that autonomously generates a realistic prediction of treatment results. For the purpose of this study, we focus on hyaluronic acid treatments in the facial area. Our approach takes into account the individual characteristics of each face, and furthermore, the prediction system allows the patient to decide which area of the face she wants to modify. We show that the predictions generated by our system are realistic: first, the quality of the generated images is on par with real images; second, the prediction matches the actual results obtained after the treatment is completed. In conclusion, the proposed approach provides a valid tool for doctors to show patients what they will look like before deciding on the treatment.
Keywords: prediction, hyaluronic acid, treatment, artificial intelligence
Procedia PDF Downloads 120
19480 Personalized Infectious Disease Risk Prediction System: A Knowledge Model
Authors: Retno A. Vinarti, Lucy M. Hederman
Abstract:
This research describes a knowledge model for a system which gives personalized alerts to users about infectious disease risks in the context of weather, location and time. The knowledge model is based on established epidemiological concepts augmented by information gleaned from infection-related data repositories. Existing disease risk prediction research focuses more on utilizing raw historical data to yield seasonal patterns of infectious disease risk emergence. This research incorporates both data and epidemiological concepts gathered from the Atlas of Human Infectious Disease (AHID) and the Centers for Disease Control (CDC) as the basis for reasoning about infectious disease risk. Using the CommonKADS methodology, the disease risk prediction task is treated as an assignment synthesis task, proceeding from knowledge identification through specification and refinement to implementation. First, knowledge is gathered from AHID, primarily from the epidemiology and risk group chapters for each infectious disease. The result of this stage is five major elements (Person, Infectious Disease, Weather, Location and Time) and their properties. At the knowledge specification stage, the initial tree model of each element and detailed relationships are produced. This research also includes a validation step as part of knowledge refinement: on the basis that the best model is formed using the most common features, Frequency-based Selection (FBS) is applied. The portion of the infectious disease risk model relating to Person comes out strongest, with Location next, and Weather weaker. For the Person attribute, Age is the strongest, Activity and Habits are moderate, and Blood type is weakest. For the Location attribute, the General category (e.g., continents, region, country, and island) comes out much stronger than the Specific category (i.e., terrain feature). For the Weather attribute, the Less Precise category (i.e., season) comes out stronger than the Precise category (i.e., exact temperature or humidity interval). However, given that some infectious diseases are significantly more serious than others, a frequency-based metric may not be appropriate. Future work will incorporate epidemiological measurements of disease seriousness (e.g., odds ratio, hazard ratio and fatality rate) into the validation metrics. This research is limited to modelling existing knowledge about epidemiology and chain-of-infection concepts. The next step, verification in the knowledge refinement stage, might cause some minor changes to the shape of the tree.
Keywords: epidemiology, knowledge modelling, infectious disease, prediction, risk
Procedia PDF Downloads 244
19479 Using Machine Learning to Predict Answers to Big-Five Personality Questions
Authors: Aadityaa Singla
Abstract:
The big five personality traits are as follows: openness, conscientiousness, extraversion, agreeableness, and neuroticism. In order to gain insight into their personality, many people flock to these categories, each of which has different meanings and characteristics. This information is important not only to individuals but also to career professionals and psychologists, who can use it for candidate assessment or job recruitment. The links between AI and psychology have been well studied in cognitive science, but this is still a rather novel development. Various AI classification models can accurately predict the answer to a personality question from ten input questions. This contrasts with the roughly one hundred questions that people normally have to answer to gain a complete picture of their five personality traits. To approach this problem, various AI classification models were used on a dataset to predict what a user would answer, and each model's prediction was compared to the actual response. Normally, there are five answer choices (a 20% chance of a correct guess), and the models exceed that value to different degrees, proving their significance. Using an MLP classifier, decision tree, linear model, and k-nearest neighbors, test accuracies of 86.643, 54.625, 47.875, and 52.125 percent, respectively, were obtained. These approaches show that there is potential for more nuanced personality predictions in the future.
Keywords: machine learning, personality, big five personality traits, cognitive science
Procedia PDF Downloads 149
19478 Online Prediction of Nonlinear Signal Processing Problems Based Kernel Adaptive Filtering
Authors: Hamza Nejib, Okba Taouali
Abstract:
This paper presents two of the best-known kernel adaptive filtering (KAF) approaches, kernel least mean squares (KLMS) and kernel recursive least squares (KRLS), used to predict new outputs of nonlinear signal processing problems. Both of these methods implement a nonlinear transfer function using kernel methods in a particular space named a reproducing kernel Hilbert space (RKHS), where the model is a linear combination of kernel functions applied to transform the observed data from the input space to a high-dimensional feature space of vectors, an idea known as the kernel trick. KAF thus develops adaptive filters in the RKHS. We use two nonlinear signal processing problems, Mackey-Glass chaotic time series prediction and nonlinear channel equalization, to assess the performance of the presented approaches and, finally, to determine which of them is the better suited.
Keywords: online prediction, KAF, signal processing, RKHS, kernel methods, KRLS, KLMS
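A minimal sketch of the basic KLMS recursion with a Gaussian kernel, the filter being a growing kernel expansion updated by the instantaneous error; the step size and kernel width are illustrative assumptions:

```python
import numpy as np

def klms_predict(X, d, eta=0.5, sigma=1.0):
    """Basic KLMS with a Gaussian kernel: the filter is a growing expansion
    f(x) = sum_i a_i * k(c_i, x), adding each input as a new center with
    coefficient eta * error (no sparsification)."""
    kernel = lambda a, b: np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))
    centers, coeffs, preds = [], [], np.zeros(len(d))
    for n in range(len(d)):
        preds[n] = sum(a * kernel(c, X[n]) for a, c in zip(coeffs, centers))
        err = d[n] - preds[n]
        centers.append(X[n])
        coeffs.append(eta * err)
    return preds
```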
Procedia PDF Downloads 404
19477 Prediction of Wind Speed by Artificial Neural Networks for Energy Application
Authors: S. Adjiri-Bailiche, S. M. Boudia, H. Daaou, S. Hadouche, A. Benzaoui
Abstract:
In this work, the variation of wind speed with altitude is computed and described by a neural network model. Measured data (wind speed and direction, temperature, and humidity at 10 m) are used as inputs, with the wind speed at 50 m above sea level as the target. A comparison between the predicted wind speeds and those extrapolated to 50 m above sea level is performed. The results show that prediction by the artificial neural network method is very accurate.
Keywords: MATLAB, neural network, power law, vertical extrapolation, wind energy, wind speed
Procedia PDF Downloads 697