Search results for: regression model
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 18335

Search results for: regression model

18125 Predicting Bridge Pier Scour Depth with SVM

Authors: Arun Goel

Abstract:

Prediction of maximum local scour is necessary for the safety and economical design of the bridges. A number of equations have been developed over the years to predict local scour depth using laboratory data and a few pier equations have also been proposed using field data. Most of these equations are empirical in nature as indicated by the past publications. In this paper, attempts have been made to compute local depth of scour around bridge pier in dimensional and non-dimensional form by using linear regression, simple regression and SVM (Poly and Rbf) techniques along with few conventional empirical equations. The outcome of this study suggests that the SVM (Poly and Rbf) based modeling can be employed as an alternate to linear regression, simple regression and the conventional empirical equations in predicting scour depth of bridge piers. The results of present study on the basis of non-dimensional form of bridge pier scour indicates the improvement in the performance of SVM (Poly and Rbf) in comparison to dimensional form of scour.

Keywords: modeling, pier scour, regression, prediction, SVM (Poly and Rbf kernels)

Procedia PDF Downloads 434
18124 Comprehensive Machine Learning-Based Glucose Sensing from Near-Infrared Spectra

Authors: Bitewulign Mekonnen

Abstract:

Context: This scientific paper focuses on the use of near-infrared (NIR) spectroscopy to determine glucose concentration in aqueous solutions accurately and rapidly. The study compares six different machine learning methods for predicting glucose concentration and also explores the development of a deep learning model for classifying NIR spectra. The objective is to optimize the detection model and improve the accuracy of glucose prediction. This research is important because it provides a comprehensive analysis of various machine-learning techniques for estimating aqueous glucose concentrations. Research Aim: The aim of this study is to compare and evaluate different machine-learning methods for predicting glucose concentration from NIR spectra. Additionally, the study aims to develop and assess a deep-learning model for classifying NIR spectra. Methodology: The research methodology involves the use of machine learning and deep learning techniques. Six machine learning regression models, including support vector machine regression, partial least squares regression, extra tree regression, random forest regression, extreme gradient boosting, and principal component analysis-neural network, are employed to predict glucose concentration. The NIR spectra data is randomly divided into train and test sets, and the process is repeated ten times to increase generalization ability. In addition, a convolutional neural network is developed for classifying NIR spectra. Findings: The study reveals that the SVMR, ETR, and PCA-NN models exhibit excellent performance in predicting glucose concentration, with correlation coefficients (R) > 0.99 and determination coefficients (R²)> 0.985. The deep learning model achieves high macro-averaging scores for precision, recall, and F1-measure. These findings demonstrate the effectiveness of machine learning and deep learning methods in optimizing the detection model and improving glucose prediction accuracy. Theoretical Importance: This research contributes to the field by providing a comprehensive analysis of various machine-learning techniques for estimating glucose concentrations from NIR spectra. It also explores the use of deep learning for the classification of indistinguishable NIR spectra. The findings highlight the potential of machine learning and deep learning in enhancing the prediction accuracy of glucose-relevant features. Data Collection and Analysis Procedures: The NIR spectra and corresponding references for glucose concentration are measured in increments of 20 mg/dl. The data is randomly divided into train and test sets, and the models are evaluated using regression analysis and classification metrics. The performance of each model is assessed based on correlation coefficients, determination coefficients, precision, recall, and F1-measure. Question Addressed: The study addresses the question of whether machine learning and deep learning methods can optimize the detection model and improve the accuracy of glucose prediction from NIR spectra. Conclusion: The research demonstrates that machine learning and deep learning methods can effectively predict glucose concentration from NIR spectra. The SVMR, ETR, and PCA-NN models exhibit superior performance, while the deep learning model achieves high classification scores. These findings suggest that machine learning and deep learning techniques can be used to improve the prediction accuracy of glucose-relevant features. Further research is needed to explore their clinical utility in analyzing complex matrices, such as blood glucose levels.

Keywords: machine learning, signal processing, near-infrared spectroscopy, support vector machine, neural network

Procedia PDF Downloads 70
18123 Development of Computational Approach for Calculation of Hydrogen Solubility in Hydrocarbons for Treatment of Petroleum

Authors: Abdulrahman Sumayli, Saad M. AlShahrani

Abstract:

For the hydrogenation process, knowing the solubility of hydrogen (H2) in hydrocarbons is critical to improve the efficiency of the process. We investigated the H2 solubility computation in four heavy crude oil feedstocks using machine learning techniques. Temperature, pressure, and feedstock type were considered as the inputs to the models, while the hydrogen solubility was the sole response. Specifically, we employed three different models: Support Vector Regression (SVR), Gaussian process regression (GPR), and Bayesian ridge regression (BRR). To achieve the best performance, the hyper-parameters of these models are optimized using the whale optimization algorithm (WOA). We evaluated the models using a dataset of solubility measurements in various feedstocks, and we compared their performance based on several metrics. Our results show that the WOA-SVR model tuned with WOA achieves the best performance overall, with an RMSE of 1.38 × 10− 2 and an R-squared of 0.991. These findings suggest that machine learning techniques can provide accurate predictions of hydrogen solubility in different feedstocks, which could be useful in the development of hydrogen-related technologies. Besides, the solubility of hydrogen in the four heavy oil fractions is estimated in different ranges of temperatures and pressures of 150 ◦C–350 ◦C and 1.2 MPa–10.8 MPa, respectively

Keywords: temperature, pressure variations, machine learning, oil treatment

Procedia PDF Downloads 53
18122 Reminiscence Therapy for Alzheimer’s Disease Restrained on Logistic Regression Based Linear Bootstrap Aggregating

Authors: P. S. Jagadeesh Kumar, Mingmin Pan, Xianpei Li, Yanmin Yuan, Tracy Lin Huan

Abstract:

Researchers are doing enchanting research into the inherited features of Alzheimer’s disease and probable consistent therapies. In Alzheimer’s, memories are extinct in reverse order; memories formed lately are more transitory than those from formerly. Reminiscence therapy includes the conversation of past actions, trials and knowledges with another individual or set of people, frequently with the help of perceptible reminders such as photos, household and other acquainted matters from the past, music and collection of tapes. In this manuscript, the competence of reminiscence therapy for Alzheimer’s disease is measured using logistic regression based linear bootstrap aggregating. Logistic regression is used to envisage the experiential features of the patient’s memory through various therapies. Linear bootstrap aggregating shows better stability and accuracy of reminiscence therapy used in statistical classification and regression of memories related to validation therapy, supportive psychotherapy, sensory integration and simulated presence therapy.

Keywords: Alzheimer’s disease, linear bootstrap aggregating, logistic regression, reminiscence therapy

Procedia PDF Downloads 288
18121 Loan Repayment Prediction Using Machine Learning: Model Development, Django Web Integration and Cloud Deployment

Authors: Seun Mayowa Sunday

Abstract:

Loan prediction is one of the most significant and recognised fields of research in the banking, insurance, and the financial security industries. Some prediction systems on the market include the construction of static software. However, due to the fact that static software only operates with strictly regulated rules, they cannot aid customers beyond these limitations. Application of many machine learning (ML) techniques are required for loan prediction. Four separate machine learning models, random forest (RF), decision tree (DT), k-nearest neighbour (KNN), and logistic regression, are used to create the loan prediction model. Using the anaconda navigator and the required machine learning (ML) libraries, models are created and evaluated using the appropriate measuring metrics. From the finding, the random forest performs with the highest accuracy of 80.17% which was later implemented into the Django framework. For real-time testing, the web application is deployed on the Alibabacloud which is among the top 4 biggest cloud computing provider. Hence, to the best of our knowledge, this research will serve as the first academic paper which combines the model development and the Django framework, with the deployment into the Alibaba cloud computing application.

Keywords: k-nearest neighbor, random forest, logistic regression, decision tree, django, cloud computing, alibaba cloud

Procedia PDF Downloads 112
18120 Prediction of Marijuana Use among Iranian Early Youth: an Application of Integrative Model of Behavioral Prediction

Authors: Mehdi Mirzaei Alavijeh, Farzad Jalilian

Abstract:

Background: Marijuana is the most widely used illicit drug worldwide, especially among adolescents and young adults, which can cause numerous complications. The aim of this study was to determine the pattern, motivation use, and factors related to marijuana use among Iranian youths based on the integrative model of behavioral prediction Methods: A cross-sectional study was conducted among 174 youths marijuana user in Kermanshah County and Isfahan County, during summer 2014 which was selected with the convenience sampling for participation in this study. A self-reporting questionnaire was applied for collecting data. Data were analyzed by SPSS version 21 using bivariate correlations and linear regression statistical tests. Results: The mean marijuana use of respondents was 4.60 times at during week [95% CI: 4.06, 5.15]. Linear regression statistical showed, the structures of integrative model of behavioral prediction accounted for 36% of the variation in the outcome measure of the marijuana use at during week (R2 = 36% & P < 0.001); and among them attitude, marijuana refuse, and subjective norms were a stronger predictors. Conclusion: Comprehensive health education and prevention programs need to emphasize on cognitive factors that predict youth’s health-related behaviors. Based on our findings it seems, designing educational and behavioral intervention for reducing positive belief about marijuana, marijuana self-efficacy refuse promotion and reduce subjective norms encourage marijuana use has an effective potential to protect youths marijuana use.

Keywords: marijuana, youth, integrative model of behavioral prediction, Iran

Procedia PDF Downloads 542
18119 Count Regression Modelling on Number of Migrants in Households

Authors: Tsedeke Lambore Gemecho, Ayele Taye Goshu

Abstract:

The main objective of this study is to identify the determinants of the number of international migrants in a household and to compare regression models for count response. This study is done by collecting data from total of 2288 household heads of 16 randomly sampled districts in Hadiya and Kembata-Tembaro zones of Southern Ethiopia. The Poisson mixed models, as special cases of the generalized linear mixed model, is explored to determine effects of the predictors: age of household head, farm land size, and household size. Two ethnicities Hadiya and Kembata are included in the final model as dummy variables. Stepwise variable selection has indentified four predictors: age of head, farm land size, family size and dummy variable ethnic2 (0=other, 1=Kembata). These predictors are significant at 5% significance level with count response number of migrant. The Poisson mixed model consisting of the four predictors with random effects districts. Area specific random effects are significant with the variance of about 0.5105 and standard deviation of 0.7145. The results show that the number of migrant increases with heads age, family size, and farm land size. In conclusion, there is a significantly high number of international migration per household in the area. Age of household head, family size, and farm land size are determinants that increase the number of international migrant in households. Community-based intervention is needed so as to monitor and regulate the international migration for the benefits of the society.

Keywords: Poisson regression, GLM, number of migrant, Hadiya and Kembata Tembaro zones

Procedia PDF Downloads 269
18118 Multilayer Perceptron Neural Network for Rainfall-Water Level Modeling

Authors: Thohidul Islam, Md. Hamidul Haque, Robin Kumar Biswas

Abstract:

Floods are one of the deadliest natural disasters which are very complex to model; however, machine learning is opening the door for more reliable and accurate flood prediction. In this research, a multilayer perceptron neural network (MLP) is developed to model the rainfall-water level relation, in a subtropical monsoon climatic region of the Bangladesh-India border. Our experiments show promising empirical results to forecast the water level for 1 day lead time. Our best performing MLP model achieves 98.7% coefficient of determination with lower model complexity which surpasses previously reported results on similar forecasting problems.

Keywords: flood forecasting, machine learning, multilayer perceptron network, regression

Procedia PDF Downloads 157
18117 Forecast of the Small Wind Turbines Sales with Replacement Purchases and with or without Account of Price Changes

Authors: V. Churkin, M. Lopatin

Abstract:

The purpose of the paper is to estimate the US small wind turbines market potential and forecast the small wind turbines sales in the US. The forecasting method is based on the application of the Bass model and the generalized Bass model of innovations diffusion under replacement purchases. In the work an exponential distribution is used for modeling of replacement purchases. Only one parameter of such distribution is determined by average lifetime of small wind turbines. The identification of the model parameters is based on nonlinear regression analysis on the basis of the annual sales statistics which has been published by the American Wind Energy Association (AWEA) since 2001 up to 2012. The estimation of the US average market potential of small wind turbines (for adoption purchases) without account of price changes is 57080 (confidence interval from 49294 to 64866 at P = 0.95) under average lifetime of wind turbines 15 years, and 62402 (confidence interval from 54154 to 70648 at P = 0.95) under average lifetime of wind turbines 20 years. In the first case the explained variance is 90,7%, while in the second - 91,8%. The effect of the wind turbines price changes on their sales was estimated using generalized Bass model. This required a price forecast. To do this, the polynomial regression function, which is based on the Berkeley Lab statistics, was used. The estimation of the US average market potential of small wind turbines (for adoption purchases) in that case is 42542 (confidence interval from 32863 to 52221 at P = 0.95) under average lifetime of wind turbines 15 years, and 47426 (confidence interval from 36092 to 58760 at P = 0.95) under average lifetime of wind turbines 20 years. In the first case the explained variance is 95,3%, while in the second –95,3%.

Keywords: bass model, generalized bass model, replacement purchases, sales forecasting of innovations, statistics of sales of small wind turbines in the United States

Procedia PDF Downloads 334
18116 Predictors of School Drop out among High School Students

Authors: Osman Zorbaz, Selen Demirtas-Zorbaz, Ozlem Ulas

Abstract:

The factors that cause adolescents to drop out school were several. One of the frameworks about school dropout focuses on the contextual factors around the adolescents whereas the other one focuses on individual factors. It can be said that both factors are important equally. In this study, both adolescent’s individual factors (anti-social behaviors, academic success) and contextual factors (parent academic involvement, parent academic support, number of siblings, living with parent) were examined in the term of school dropout. The study sample consisted of 346 high school students in the public schools in Ankara who continued their education in 2015-2016 academic year. One hundred eighty-five the students (53.5%) were girls and 161 (46.5%) were boys. In addition to this 118 of them were in ninth grade, 122 of them in tenth grade and 106 of them were in eleventh grade. Multiple regression and one-way ANOVA statistical methods were used. First, it was examined if the data meet the assumptions and conditions that are required for regression analysis. After controlling the assumptions, regression analysis was conducted. Parent academic involvement, parent academic support, number of siblings, anti-social behaviors, academic success variables were taken into the regression model and it was seen that parent academic involvement (t=-3.023, p < .01), anti-social behaviors (t=7.038, p < .001), and academic success (t=-3.718, p < .001) predicted school dropout whereas parent academic support (t=-1.403, p > .05) and number of siblings (t=-1.908, p > .05) didn’t. The model explained 30% of the variance (R=.557, R2=.300, F5,345=30.626, p < .001). In addition to this the variance, results showed there was no significant difference on high school students school dropout levels according to living with parents or not (F2;345=1.183, p > .05). Results discussed in the light of the literature and suggestion were made. As a result, academic involvement, academic success and anti-social behaviors will be considered as an important factors for preventing school drop-out.

Keywords: adolescents, anti-social behavior, parent academic involvement, parent academic support, school dropout

Procedia PDF Downloads 255
18115 Machine Vision System for Measuring the Quality of Bulk Sun-dried Organic Raisins

Authors: Navab Karimi, Tohid Alizadeh

Abstract:

An intelligent vision-based system was designed to measure the quality and purity of raisins. A machine vision setup was utilized to capture the images of bulk raisins in ranges of 5-50% mixed pure-impure berries. The textural features of bulk raisins were extracted using Grey-level Histograms, Co-occurrence Matrix, and Local Binary Pattern (a total of 108 features). Genetic Algorithm and neural network regression were used for selecting and ranking the best features (21 features). As a result, the GLCM features set was found to have the highest accuracy (92.4%) among the other sets. Followingly, multiple feature combinations of the previous stage were fed into the second regression (linear regression) to increase accuracy, wherein a combination of 16 features was found to be the optimum. Finally, a Support Vector Machine (SVM) classifier was used to differentiate the mixtures, producing the best efficiency and accuracy of 96.2% and 97.35%, respectively.

Keywords: sun-dried organic raisin, genetic algorithm, feature extraction, ann regression, linear regression, support vector machine, south azerbaijan.

Procedia PDF Downloads 57
18114 Application of the Least Squares Method in the Adjustment of Chlorodifluoromethane (HCFC-142b) Regression Models

Authors: L. J. de Bessa Neto, V. S. Filho, J. V. Ferreira Nunes, G. C. Bergamo

Abstract:

There are many situations in which human activities have significant effects on the environment. Damage to the ozone layer is one of them. The objective of this work is to use the Least Squares Method, considering the linear, exponential, logarithmic, power and polynomial models of the second degree, to analyze through the coefficient of determination (R²), which model best fits the behavior of the chlorodifluoromethane (HCFC-142b) in parts per trillion between 1992 and 2018, as well as estimates of future concentrations between 5 and 10 periods, i.e. the concentration of this pollutant in the years 2023 and 2028 in each of the adjustments. A total of 809 observations of the concentration of HCFC-142b in one of the monitoring stations of gases precursors of the deterioration of the ozone layer during the period of time studied were selected and, using these data, the statistical software Excel was used for make the scatter plots of each of the adjustment models. With the development of the present study, it was observed that the logarithmic fit was the model that best fit the data set, since besides having a significant R² its adjusted curve was compatible with the natural trend curve of the phenomenon.

Keywords: chlorodifluoromethane (HCFC-142b), ozone, least squares method, regression models

Procedia PDF Downloads 107
18113 Development of a Regression Based Model to Predict Subjective Perception of Squeak and Rattle Noise

Authors: Ramkumar R., Gaurav Shinde, Pratik Shroff, Sachin Kumar Jain, Nagesh Walke

Abstract:

Advancements in electric vehicles have significantly reduced the powertrain noise and moving components of vehicles. As a result, in-cab noises have become more noticeable to passengers inside the car. To ensure a comfortable ride for drivers and other passengers, it has become crucial to eliminate undesirable component noises during the development phase. Standard practices are followed to identify the severity of noises based on subjective ratings, but it can be a tedious process to identify the severity of each development sample and make changes to reduce it. Additionally, the severity rating can vary from jury to jury, making it challenging to arrive at a definitive conclusion. To address this, an automotive component was identified to evaluate squeak and rattle noise issue. Physical tests were carried out for random and sine excitation profiles. Aim was to subjectively assess the noise using jury rating method and objectively evaluate the same by measuring the noise. Suitable jury evaluation method was selected for the said activity, and recorded sounds were replayed for jury rating. Objective data sound quality metrics viz., loudness, sharpness, roughness, fluctuation strength and overall Sound Pressure Level (SPL) were measured. Based on this, correlation co-efficients was established to identify the most relevant sound quality metrics that are contributing to particular identified noise issue. Regression analysis was then performed to establish the correlation between subjective and objective data. Mathematical model was prepared using artificial intelligence and machine learning algorithm. The developed model was able to predict the subjective rating with good accuracy.

Keywords: BSR, noise, correlation, regression

Procedia PDF Downloads 63
18112 Estimation of Coefficient of Discharge of Side Trapezoidal Labyrinth Weir Using Group Method of Data Handling Technique

Authors: M. A. Ansari, A. Hussain, A. Uddin

Abstract:

A side weir is a flow diversion structure provided in the side wall of a channel to divert water from the main channel to a branch channel. The trapezoidal labyrinth weir is a special type of weir in which crest length of the weir is increased to pass higher discharge. Experimental and numerical studies related to the coefficient of discharge of trapezoidal labyrinth weir in an open channel have been presented in the present study. Group Method of Data Handling (GMDH) with the transfer function of quadratic polynomial has been used to predict the coefficient of discharge for the side trapezoidal labyrinth weir. A new model is developed for coefficient of discharge of labyrinth weir by regression method. Generalized models for predicting the coefficient of discharge for labyrinth weir using Group Method of Data Handling (GMDH) network have also been developed. The prediction based on GMDH model is more satisfactory than those given by traditional regression equations.

Keywords: discharge coefficient, group method of data handling, open channel, side labyrinth weir

Procedia PDF Downloads 146
18111 A Study on Characteristics of Hedonic Price Models in Korea Based on Meta-Regression Analysis

Authors: Minseo Jo

Abstract:

The purpose of this paper is to examine the factors in the hedonic price models, that has significance impact in determining the price of apartments. There are many variables employed in the hedonic price models and their effectiveness vary differently according to the researchers and the regions they are analysing. In order to consider various conditions, the meta-regression analysis has been selected for the study. In this paper, four meta-independent variables, from the 65 hedonic price models to analysis. The factors that influence the prices of apartments, as well as including factors that influence the prices of apartments, regions, which are divided into two of the research performed, years of research performed, the coefficients of the functions employed. The covariance between the four meta-variables and p-value of the coefficients and the four meta-variables and number of data used in the 65 hedonic price models have been analyzed in this study. The six factors that are most important in deciding the prices of apartments are positioning of apartments, the noise of the apartments, points of the compass and views from the apartments, proximity to the public transportations, companies that have constructed the apartments, social environments (such as schools etc.).

Keywords: hedonic price model, housing price, meta-regression analysis, characteristics

Procedia PDF Downloads 382
18110 Support Vector Regression with Weighted Least Absolute Deviations

Authors: Kang-Mo Jung

Abstract:

Least squares support vector machine (LS-SVM) is a penalized regression which considers both fitting and generalization ability of a model. However, the squared loss function is very sensitive to even single outlier. We proposed a weighted absolute deviation loss function for the robustness of the estimates in least absolute deviation support vector machine. The proposed estimates can be obtained by a quadratic programming algorithm. Numerical experiments on simulated datasets show that the proposed algorithm is competitive in view of robustness to outliers.

Keywords: least absolute deviation, quadratic programming, robustness, support vector machine, weight

Procedia PDF Downloads 507
18109 Statistical Analysis and Impact Forecasting of Connected and Autonomous Vehicles on the Environment: Case Study in the State of Maryland

Authors: Alireza Ansariyar, Safieh Laaly

Abstract:

Over the last decades, the vehicle industry has shown increased interest in integrating autonomous, connected, and electrical technologies in vehicle design with the primary hope of improving mobility and road safety while reducing transportation’s environmental impact. Using the State of Maryland (M.D.) in the United States as a pilot study, this research investigates CAVs’ fuel consumption and air pollutants (C.O., PM, and NOx) and utilizes meaningful linear regression models to predict CAV’s environmental effects. Maryland transportation network was simulated in VISUM software, and data on a set of variables were collected through a comprehensive survey. The number of pollutants and fuel consumption were obtained for the time interval 2010 to 2021 from the macro simulation. Eventually, four linear regression models were proposed to predict the amount of C.O., NOx, PM pollutants, and fuel consumption in the future. The results highlighted that CAVs’ pollutants and fuel consumption have a significant correlation with the income, age, and race of the CAV customers. Furthermore, the reliability of four statistical models was compared with the reliability of macro simulation model outputs in the year 2030. The error of three pollutants and fuel consumption was obtained at less than 9% by statistical models in SPSS. This study is expected to assist researchers and policymakers with planning decisions to reduce CAV environmental impacts in M.D.

Keywords: connected and autonomous vehicles, statistical model, environmental effects, pollutants and fuel consumption, VISUM, linear regression models

Procedia PDF Downloads 426
18108 Application of the Tripartite Model to the Link between Non-Suicidal Self-Injury and Suicidal Risk

Authors: Ashley Wei-Ting Wang, Wen-Yau Hsu

Abstract:

Objectives: The current study applies and expands the Tripartite Model to elaborate the link between non-suicidal self-injury (NSSI) and suicidal behavior. We propose a structural model of NSSI and suicidal risk, in which negative affect (NA) predicts both anxiety and depression, positive affect (PA) predicts depression only, anxiety is linked to NSSI, and depression is linked to suicidal risk. Method: Four hundreds and eighty seven undergraduates participated. Data were collected by administering self-report questionnaires. We performed hierarchical regression and structural equation modeling to test the proposed structural model. Results: The results largely support the proposed structural model, with one exception: anxiety was strongly associated with NSSI and to a lesser extent with suicidal risk. Conclusions: We conclude that the co-occurrence of NSSI and suicidal risk is due to NA and anxiety, and suicidal risk can be differentiated by depression. Further theoretical and practical implications are discussed.

Keywords: non-suicidal self-injury, suicidal risk, anxiety, depression, the tripartite model, hierarchical relationship

Procedia PDF Downloads 451
18107 Switched System Diagnosis Based on Intelligent State Filtering with Unknown Models

Authors: Nada Slimane, Foued Theljani, Faouzi Bouani

Abstract:

The paper addresses the problem of fault diagnosis for systems operating in several modes (normal or faulty) based on states assessment. We use, for this purpose, a methodology consisting of three main processes: 1) sequential data clustering, 2) linear model regression and 3) state filtering. Typically, Kalman Filter (KF) is an algorithm that provides estimation of unknown states using a sequence of I/O measurements. Inevitably, although it is an efficient technique for state estimation, it presents two main weaknesses. First, it merely predicts states without being able to isolate/classify them according to their different operating modes, whether normal or faulty modes. To deal with this dilemma, the KF is endowed with an extra clustering step based fully on sequential version of the k-means algorithm. Second, to provide state estimation, KF requires state space models, which can be unknown. A linear regularized regression is used to identify the required models. To prove its effectiveness, the proposed approach is assessed on a simulated benchmark.

Keywords: clustering, diagnosis, Kalman Filtering, k-means, regularized regression

Procedia PDF Downloads 165
18106 Determining the Causality Variables in Female Genital Mutilation: A Factor Screening Approach

Authors: Ekele Alih, Enejo Jalija

Abstract:

Female Genital Mutilation (FGM) is made up of three types namely: Clitoridectomy, Excision and Infibulation. In this study, we examine the factors responsible for FGM in order to identify the causality variables in a logistic regression approach. From the result of the survey conducted by the Public Health Division, Nigeria Institute of Medical Research, Yaba, Lagos State, the tau statistic, τ was used to screen 9 factors that causes FGM in order to select few of the predictors before multiple regression equation is obtained. The need for this may be that the sample size may not be able to sustain having a regression with all the predictors or to avoid multi-collinearity. A total of 300 respondents, comprising 150 adult males and 150 adult females were selected for the household survey based on the multi-stage sampling procedure. The tau statistic,

Keywords: female genital mutilation, logistic regression, tau statistic, African society

Procedia PDF Downloads 236
18105 An Information Matrix Goodness-of-Fit Test of the Conditional Logistic Model for Matched Case-Control Studies

Authors: Li-Ching Chen

Abstract:

The case-control design has been widely applied in clinical and epidemiological studies to investigate the association between risk factors and a given disease. The retrospective design can be easily implemented and is more economical over prospective studies. To adjust effects for confounding factors, methods such as stratification at the design stage and may be adopted. When some major confounding factors are difficult to be quantified, a matching design provides an opportunity for researchers to control the confounding effects. The matching effects can be parameterized by the intercepts of logistic models and the conditional logistic regression analysis is then adopted. This study demonstrates an information-matrix-based goodness-of-fit statistic to test the validity of the logistic regression model for matched case-control data. The asymptotic null distribution of this proposed test statistic is inferred. It needs neither to employ a simulation to evaluate its critical values nor to partition covariate space. The asymptotic power of this test statistic is also derived. The performance of the proposed method is assessed through simulation studies. An example of the real data set is applied to illustrate the implementation of the proposed method as well.

Keywords: conditional logistic model, goodness-of-fit, information matrix, matched case-control studies

Procedia PDF Downloads 273
18104 Factors Affecting Slot Machine Performance in an Electronic Gaming Machine Facility

Authors: Etienne Provencal, David L. St-Pierre

Abstract:

A facility exploiting only electronic gambling machines (EGMs) opened in 2007 in Quebec City, Canada under the name of Salons de Jeux du Québec (SdjQ). This facility is one of the first worldwide to rely on that business model. This paper models the performance of such EGMs. The interest from a managerial point of view is to identify the variables that can be controlled or influenced so that a comprehensive model can help improve the overall performance of the business. The EGM individual performance model contains eight different variables under study (Game Title, Progressive jackpot, Bonus Round, Minimum Coin-in, Maximum Coin-in, Denomination, Slant Top and Position). Using data from Quebec City’s SdjQ, a linear regression analysis explains 90.80% of the EGM performance. Moreover, results show a behavior slightly different than that of a casino. The addition of GameTitle as a factor to predict the EGM performance is one of the main contributions of this paper. The choice of the game (GameTitle) is very important. Games having better position do not have significantly better performance than games located elsewhere on the gaming floor. Progressive jackpots have a positive and significant effect on the individual performance of EGMs. The impact of BonusRound on the dependent variable is significant but negative. The effect of Denomination is significant but weakly negative. As expected, the Language of an EGMS does not impact its individual performance. This paper highlights some possible improvements by indicating which features are performing well. Recommendations are given to increase the performance of the EGMs performance.

Keywords: EGM, linear regression, model prediction, slot operations

Procedia PDF Downloads 241
18103 Robust Variable Selection Based on Schwarz Information Criterion for Linear Regression Models

Authors: Shokrya Saleh A. Alshqaq, Abdullah Ali H. Ahmadini

Abstract:

The Schwarz information criterion (SIC) is a popular tool for selecting the best variables in regression datasets. However, SIC is defined using an unbounded estimator, namely, the least-squares (LS), which is highly sensitive to outlying observations, especially bad leverage points. A method for robust variable selection based on SIC for linear regression models is thus needed. This study investigates the robustness properties of SIC by deriving its influence function and proposes a robust SIC based on the MM-estimation scale. The aim of this study is to produce a criterion that can effectively select accurate models in the presence of vertical outliers and high leverage points. The advantages of the proposed robust SIC is demonstrated through a simulation study and an analysis of a real dataset.

Keywords: influence function, robust variable selection, robust regression, Schwarz information criterion

Procedia PDF Downloads 126
18102 Regression Analysis in Estimating Stream-Flow and the Effect of Hierarchical Clustering Analysis: A Case Study in Euphrates-Tigris Basin

Authors: Goksel Ezgi Guzey, Bihrat Onoz

Abstract:

The scarcity of streamflow gauging stations and the increasing effects of global warming cause designing water management systems to be very difficult. This study is a significant contribution to assessing regional regression models for estimating streamflow. In this study, simulated meteorological data was related to the observed streamflow data from 1971 to 2020 for 33 stream gauging stations of the Euphrates-Tigris Basin. Ordinary least squares regression was used to predict flow for 2020-2100 with the simulated meteorological data. CORDEX- EURO and CORDEX-MENA domains were used with 0.11 and 0.22 grids, respectively, to estimate climate conditions under certain climate scenarios. Twelve meteorological variables simulated by two regional climate models, RCA4 and RegCM4, were used as independent variables in the ordinary least squares regression, where the observed streamflow was the dependent variable. The variability of streamflow was then calculated with 5-6 meteorological variables and watershed characteristics such as area and height prior to the application. Of the regression analysis of 31 stream gauging stations' data, the stations were subjected to a clustering analysis, which grouped the stations in two clusters in terms of their hydrometeorological properties. Two streamflow equations were found for the two clusters of stream gauging stations for every domain and every regional climate model, which increased the efficiency of streamflow estimation by a range of 10-15% for all the models. This study underlines the importance of homogeneity of a region in estimating streamflow not only in terms of the geographical location but also in terms of the meteorological characteristics of that region.

Keywords: hydrology, streamflow estimation, climate change, hydrologic modeling, HBV, hydropower

Procedia PDF Downloads 106
18101 A Comparison of Neural Network and DOE-Regression Analysis for Predicting Resource Consumption of Manufacturing Processes

Authors: Frank Kuebler, Rolf Steinhilper

Abstract:

Artificial neural networks (ANN) as well as Design of Experiments (DOE) based regression analysis (RA) are mainly used for modeling of complex systems. Both methodologies are commonly applied in process and quality control of manufacturing processes. Due to the fact that resource efficiency has become a critical concern for manufacturing companies, these models needs to be extended to predict resource-consumption of manufacturing processes. This paper describes an approach to use neural networks as well as DOE based regression analysis for predicting resource consumption of manufacturing processes and gives a comparison of the achievable results based on an industrial case study of a turning process.

Keywords: artificial neural network, design of experiments, regression analysis, resource efficiency, manufacturing process

Procedia PDF Downloads 507
18100 Hybrid Rocket Motor Performance Parameters: Theoretical and Experimental Evaluation

Authors: A. El-S. Makled, M. K. Al-Tamimi

Abstract:

A mathematical model to predict the performance parameters (thrusts, chamber pressures, fuel mass flow rates, mixture ratios, and regression rates during firing time) of hybrid rocket motor (HRM) is evaluated. The internal ballistic (IB) hybrid combustion model assumes that the solid fuel surface regression rate is controlled only by heat transfer (convective and radiative) from flame zone to solid fuel burning surface. A laboratory HRM is designed, manufactured, and tested for low thrust profile space missions (10-15 N) and for validating the mathematical model (computer program). The polymer material and gaseous oxidizer which are selected for this experimental work are polymethyle-methacrylate (PMMA) and polyethylene (PE) as solid fuel grain and gaseous oxygen (GO2) as oxidizer. The variation of various operational parameters with time is determined systematically and experimentally in firing of up to 20 seconds, and an average combustion efficiency of 95% of theory is achieved, which was the goal of these experiments. The comparison between recording fire data and predicting analytical parameters shows good agreement with the error that does not exceed 4.5% during all firing time. The current mathematical (computer) code can be used as a powerful tool for HRM analytical design parameters.

Keywords: hybrid combustion, internal ballistics, hybrid rocket motor, performance parameters

Procedia PDF Downloads 292
18099 Deriving an Index of Adoption Rate and Assessing Factors Affecting Adoption of an Agroforestry-Based Farming System in Dhanusha District, Nepal

Authors: Arun Dhakal, Geoff Cockfield, Tek Narayan Maraseni

Abstract:

This paper attempts to fulfil the gap in measuring adoption in agroforestry studies. It explains the derivation of an index of adoption rate in a Nepalese context and examines the factors affecting adoption of agroforestry-based land management practice (AFLMP) in the Dhanusha District of Nepal. Data about the different farm practices and the factors (bio-physical, socio-economic) influencing adoption were collected during focus group discussion and from the randomly selected households using a household survey questionnaire, respectively. A multivariate regression model was used to determine the factors. The factors (variables) found to significantly affect adoption of AFLMP were: farm size, availability of irrigation water, education of household heads, agricultural labour force, frequency of visits by extension workers, expenditure on farm inputs purchase, household’s experience in agroforestry, and distance from home to government forest. The regression model explained about 75% of variation in adoption decision. The model rejected ‘erosion hazard’, ‘flood hazard’ and ‘gender’ as determinants of adoption, which in case of single agroforestry practice were major variables and played positive role. Out of eight variables, farm size played the most powerful role in explaining the variation in adoption, followed by availability of irrigation water and education of household heads. The results of this study suggest that policies to promote the provision of irrigation water, extension services and motivation to obtaining higher education would probably provide the incentive to adopt agroforestry elsewhere in the terai of Nepal.

Keywords: agroforestry, adoption index, determinants of adoption, step-wise linear regression, Nepal

Procedia PDF Downloads 482
18098 Factors Affecting Green Consumption Behaviors of the Urban Residents in Hanoi, Vietnam

Authors: Phan Thi Song Thuong

Abstract:

This paper uses data from a survey on the green consumption behavior of Hanoi residents in October 2022. Data was gathered from a survey conducted in ten districts in the center of Hanoi, with 393 respondents. The hypothesis focuses on understanding the factors that may affect green consumption behavior, such as demographic characteristics, concerns about the environment and health, people living around, self-efficiency, and mass media. A number of methods, such as the T-test, exploratory factor analysis, and a linear regression model, are used to prove the hypotheses. Accordingly, the results show that gender, age, and education level have separate effects on the green consumption behavior of respondents.

Keywords: green consumption, urban residents, environment, sustainable, linear regression

Procedia PDF Downloads 100
18097 A Machine Learning Approach for Earthquake Prediction in Various Zones Based on Solar Activity

Authors: Viacheslav Shkuratskyy, Aminu Bello Usman, Michael O’Dea, Saifur Rahman Sabuj

Abstract:

This paper examines relationships between solar activity and earthquakes; it applied machine learning techniques: K-nearest neighbour, support vector regression, random forest regression, and long short-term memory network. Data from the SILSO World Data Center, the NOAA National Center, the GOES satellite, NASA OMNIWeb, and the United States Geological Survey were used for the experiment. The 23rd and 24th solar cycles, daily sunspot number, solar wind velocity, proton density, and proton temperature were all included in the dataset. The study also examined sunspots, solar wind, and solar flares, which all reflect solar activity and earthquake frequency distribution by magnitude and depth. The findings showed that the long short-term memory network model predicts earthquakes more correctly than the other models applied in the study, and solar activity is more likely to affect earthquakes of lower magnitude and shallow depth than earthquakes of magnitude 5.5 or larger with intermediate depth and deep depth.

Keywords: k-nearest neighbour, support vector regression, random forest regression, long short-term memory network, earthquakes, solar activity, sunspot number, solar wind, solar flares

Procedia PDF Downloads 54
18096 Spatial Pattern and Predictors of Malaria in Ethiopia: Application of Auto Logistics Spatial Regression

Authors: Melkamu A. Zeru, Yamral M. Warkaw, Aweke A. Mitku, Muluwerk Ayele

Abstract:

Introduction: Malaria is a severe health threat in the World, mainly in Africa. It is the major cause of health problems in which the risk of morbidity and mortality associated with malaria cases are characterized by spatial variations across the county. This study aimed to investigate the spatial patterns and predictors of malaria distribution in Ethiopia. Methods: A weighted sample of 15,239 individuals with rapid diagnosis tests was obtained from the Central Statistical Agency and Ethiopia malaria indicator survey of 2015. Global Moran's I and Moran scatter plots were used in determining the distribution of malaria cases, whereas the local Moran's I statistic was used in identifying exposed areas. In data manipulation, machine learning was used for variable reduction and statistical software R, Stata, and Python were used for data management and analysis. The auto logistics spatial binary regression model was used to investigate the predictors of malaria. Results: The final auto logistics regression model reported that male clients had a positive significant effect on malaria cases as compared to female clients [AOR=2.401, 95 % CI: (2.125 - 2.713)]. The distribution of malaria across the regions was different. The highest incidence of malaria was found in Gambela [AOR=52.55, 95%CI: (40.54-68.12)] followed by Beneshangul [AOR=34.95, 95%CI: (27.159 - 44.963)]. Similarly, individuals in Amhara [AOR=0.243, 95% CI:(0.1950.303],Oromiya[AOR=0.197,95%CI:(0.1580.244)],DireDawa[AOR=0.064,95%CI(0.049-0.082)],AddisAbaba[AOR=0.057,95%CI:(0.044-0.075)], Somali[AOR=0.077,95%CI:(0.059-0.097)], SNNPR[OR=0.329, 95%CI: (0.261- 0.413)] and Harari [AOR=0.256, 95%CI:(0.201 - 0.325)] were less likely to had low incidence of malaria as compared with Tigray. Furthermore, for a one-meter increase in altitude, the odds of a positive rapid diagnostic test (RDT) decrease by 1.6% [AOR = 0.984, 95% CI :( 0.984 - 0.984)]. The use of a shared toilet facility was found as a protective factor for malaria in Ethiopia [AOR=1.671, 95% CI: (1.504 - 1.854)]. The spatial autocorrelation variable changes the constant from AOR = 0.471 for logistic regression to AOR = 0.164 for auto logistics regression. Conclusions: This study found that the incidence of malaria in Ethiopia had a spatial pattern that is associated with socio-economic, demographic, and geographic risk factors. Spatial clustering of malaria cases had occurred in all regions, and the risk of clustering was different across the regions. The risk of malaria was found to be higher for those who live in soil floor-type houses as compared to those who live in cement or ceramics floor type. Similarly, households with thatched, metal and thin, and other roof-type houses have a higher risk of malaria than ceramic tiles roof houses. Moreover, using a protected anti-mosquito net reduced the risk of malaria incidence.

Keywords: malaria, Ethiopia, auto logistics, spatial model, spatial clustering

Procedia PDF Downloads 12