Search results for: general linear regression model
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 23483

Search results for: general linear regression model

23453 Apricot Insurance Portfolio Risk

Authors: Kasirga Yildirak, Ismail Gur

Abstract:

We propose a model to measure hail risk of an Agricultural Insurance portfolio. Hail is one of the major catastrophic event that causes big amount of loss to an insurer. Moreover, it is very hard to predict due to its strange atmospheric characteristics. We make use of parcel based claims data on apricot damage collected by the Turkish Agricultural Insurance Pool (TARSIM). As our ultimate aim is to compute the loadings assigned to specific parcels, we build a portfolio risk model that makes use of PD and the severity of the exposures. PD is computed by Spherical-Linear and Circular –Linear regression models as the data carries coordinate information and seasonality. Severity is mapped into integer brackets so that Probability Generation Function could be employed. Individual regressions are run on each clusters estimated on different criteria. Loss distribution is constructed by Panjer Recursion technique. We also show that one risk-one crop model can easily be extended to the multi risk–multi crop model by assuming conditional independency.

Keywords: hail insurance, spherical regression, circular regression, spherical clustering

Procedia PDF Downloads 230
23452 Stock Market Prediction by Regression Model with Social Moods

Authors: Masahiro Ohmura, Koh Kakusho, Takeshi Okadome

Abstract:

This paper presents a regression model with autocorrelated errors in which the inputs are social moods obtained by analyzing the adjectives in Twitter posts using a document topic model. The regression model predicts Dow Jones Industrial Average (DJIA) more precisely than autoregressive moving-average models.

Keywords: stock market prediction, social moods, regression model, DJIA

Procedia PDF Downloads 521
23451 Non-Methane Hydrocarbons Emission during the Photocopying Process

Authors: Kiurski S. Jelena, Aksentijević M. Snežana, Kecić S. Vesna, Oros B. Ivana

Abstract:

The prosperity of electronic equipment in photocopying environment not only has improved work efficiency, but also has changed indoor air quality. Considering the number of photocopying employed, indoor air quality might be worse than in general office environments. Determining the contribution from any type of equipment to indoor air pollution is a complex matter. Non-methane hydrocarbons are known to have an important role of air quality due to their high reactivity. The presence of hazardous pollutants in indoor air has been detected in one photocopying shop in Novi Sad, Serbia. Air samples were collected and analyzed for five days, during 8-hr working time in three-time intervals, whereas three different sampling points were determined. Using multiple linear regression model and software package STATISTICA 10 the concentrations of occupational hazards and micro-climates parameters were mutually correlated. Based on the obtained multiple coefficients of determination (0.3751, 0.2389, and 0.1975), a weak positive correlation between the observed variables was determined. Small values of parameter F indicated that there was no statistically significant difference between the concentration levels of non-methane hydrocarbons and micro-climates parameters. The results showed that variable could be presented by the general regression model: y = b0 + b1xi1+ b2xi2. Obtained regression equations allow to measure the quantitative agreement between the variation of variables and thus obtain more accurate knowledge of their mutual relations.

Keywords: non-methane hydrocarbons, photocopying process, multiple regression analysis, indoor air quality, pollutant emission

Procedia PDF Downloads 351
23450 Generic Model for Timetabling Problems by Integer Linear Programmimg Approach

Authors: Nur Aidya Hanum Aizam, Vikneswary Uvaraja

Abstract:

The agenda of showing the scheduled time for performing certain tasks is known as timetabling. It widely used in many departments such as transportation, education, and production. Some difficulties arise to ensure all tasks happen in the time and place allocated. Therefore, many researchers invented various programming model to solve the scheduling problems from several fields. However, the studies in developing the general integer programming model for many timetabling problems are still questionable. Meanwhile, this thesis describe about creating a general model which solve different types of timetabling problems by considering the basic constraints. Initially, the common basic constraints from five different fields are selected and analyzed. A general basic integer programming model was created and then verified by using the medium set of data obtained randomly which is much similar to realistic data. The mathematical software, AIMMS with CPLEX as a solver has been used to solve the model. The model obtained is significant in solving many timetabling problems easily since it is modifiable to all types of scheduling problems which have same basic constraints.

Keywords: AIMMS mathematical software, integer linear programming, scheduling problems, timetabling

Procedia PDF Downloads 407
23449 Survival Analysis Based Delivery Time Estimates for Display FAB

Authors: Paul Han, Jun-Geol Baek

Abstract:

In the flat panel display industry, the scheduler and dispatching system to meet production target quantities and the deadline of production are the major production management system which controls each facility production order and distribution of WIP (Work in Process). In dispatching system, delivery time is a key factor for the time when a lot can be supplied to the facility. In this paper, we use survival analysis methods to identify main factors and a forecasting model of delivery time. Of survival analysis techniques to select important explanatory variables, the cox proportional hazard model is used to. To make a prediction model, the Accelerated Failure Time (AFT) model was used. Performance comparisons were conducted with two other models, which are the technical statistics model based on transfer history and the linear regression model using same explanatory variables with AFT model. As a result, the Mean Square Error (MSE) criteria, the AFT model decreased by 33.8% compared to the existing prediction model, decreased by 5.3% compared to the linear regression model. This survival analysis approach is applicable to implementing a delivery time estimator in display manufacturing. And it can contribute to improve the productivity and reliability of production management system.

Keywords: delivery time, survival analysis, Cox PH model, accelerated failure time model

Procedia PDF Downloads 505
23448 Reminiscence Therapy for Alzheimer’s Disease Restrained on Logistic Regression Based Linear Bootstrap Aggregating

Authors: P. S. Jagadeesh Kumar, Mingmin Pan, Xianpei Li, Yanmin Yuan, Tracy Lin Huan

Abstract:

Researchers are doing enchanting research into the inherited features of Alzheimer’s disease and probable consistent therapies. In Alzheimer’s, memories are extinct in reverse order; memories formed lately are more transitory than those from formerly. Reminiscence therapy includes the conversation of past actions, trials and knowledges with another individual or set of people, frequently with the help of perceptible reminders such as photos, household and other acquainted matters from the past, music and collection of tapes. In this manuscript, the competence of reminiscence therapy for Alzheimer’s disease is measured using logistic regression based linear bootstrap aggregating. Logistic regression is used to envisage the experiential features of the patient’s memory through various therapies. Linear bootstrap aggregating shows better stability and accuracy of reminiscence therapy used in statistical classification and regression of memories related to validation therapy, supportive psychotherapy, sensory integration and simulated presence therapy.

Keywords: Alzheimer’s disease, linear bootstrap aggregating, logistic regression, reminiscence therapy

Procedia PDF Downloads 273
23447 Logistic Regression Model versus Additive Model for Recurrent Event Data

Authors: Entisar A. Elgmati

Abstract:

Recurrent infant diarrhea is studied using daily data collected in Salvador, Brazil over one year and three months. A logistic regression model is fitted instead of Aalen's additive model using the same covariates that were used in the analysis with the additive model. The model gives reasonably similar results to that using additive regression model. In addition, the problem with the estimated conditional probabilities not being constrained between zero and one in additive model is solved here. Also martingale residuals that have been used to judge the goodness of fit for the additive model are shown to be useful for judging the goodness of fit of the logistic model.

Keywords: additive model, cumulative probabilities, infant diarrhoea, recurrent event

Procedia PDF Downloads 608
23446 The Predictors of Student Engagement: Instructional Support vs Emotional Support

Authors: Tahani Salman Alangari

Abstract:

Student success can be impacted by internal factors such as their emotional well-being and external factors such as organizational support and instructional support in the classroom. This study is to identify at least one factor that forecasts student engagement. It is a cross-sectional, conducted on 6206 teachers and encompassed three years of data collection and observations of math instruction in approximately 50 schools and 300 classrooms. A multiple linear regression revealed that a model predicting student engagement from emotional support, classroom organization, and instructional support was significant. Four linear regression models were tested using hierarchical regression to examine the effects of independent variables: emotional support was the highest predictor of student engagement while instructional support was the lowest.

Keywords: student engagement, emotional support, organizational support, instructional support, well-being

Procedia PDF Downloads 53
23445 A Research on Tourism Market Forecast and Its Evaluation

Authors: Min Wei

Abstract:

The traditional prediction methods of the forecast for tourism market are paid more attention to the accuracy of the forecasts, ignoring the results of the feasibility of forecasting and predicting operability, which had made it difficult to predict the results of scientific testing. With the application of Linear Regression Model, this paper attempts to construct a scientific evaluation system for predictive value, both to ensure the accuracy, stability of the predicted value, and to ensure the feasibility of forecasting and predicting the results of operation. The findings show is that a scientific evaluation system can implement the scientific concept of development, the harmonious development of man and nature co-ordinate.

Keywords: linear regression model, tourism market, forecast, tourism economics

Procedia PDF Downloads 302
23444 Formulating a Flexible-Spread Fuzzy Regression Model Based on Dissemblance Index

Authors: Shih-Pin Chen, Shih-Syuan You

Abstract:

This study proposes a regression model with flexible spreads for fuzzy input-output data to cope with the situation that the existing measures cannot reflect the actual estimation error. The main idea is that a dissemblance index (DI) is carefully identified and defined for precisely measuring the actual estimation error. Moreover, the graded mean integration (GMI) representation is adopted for determining more representative numeric regression coefficients. Notably, to comprehensively compare the performance of the proposed model with other ones, three different criteria are adopted. The results from commonly used test numerical examples and an application to Taiwan's business monitoring indicator illustrate that the proposed dissemblance index method not only produces valid fuzzy regression models for fuzzy input-output data, but also has satisfactory and stable performance in terms of the total estimation error based on these three criteria.

Keywords: dissemblance index, forecasting, fuzzy sets, linear regression

Procedia PDF Downloads 331
23443 Model-Based Software Regression Test Suite Reduction

Authors: Shiwei Deng, Yang Bao

Abstract:

In this paper, we present a model-based regression test suite reducing approach that uses EFSM model dependence analysis and probability-driven greedy algorithm to reduce software regression test suites. The approach automatically identifies the difference between the original model and the modified model as a set of elementary model modifications. The EFSM dependence analysis is performed for each elementary modification to reduce the regression test suite, and then the probability-driven greedy algorithm is adopted to select the minimum set of test cases from the reduced regression test suite that cover all interaction patterns. Our initial experience shows that the approach may significantly reduce the size of regression test suites.

Keywords: dependence analysis, EFSM model, greedy algorithm, regression test

Procedia PDF Downloads 399
23442 Statistical Analysis of the Impact of Maritime Transport Gross Domestic Product (GDP) on Nigeria’s Economy

Authors: Kehinde Peter Oyeduntan, Kayode Oshinubi

Abstract:

Nigeria is referred as the ‘Giant of Africa’ due to high population, land mass and large economy. However, it still trails far behind many smaller economies in the continent in terms of maritime operations. As we have seen that the maritime industry is the spark plug for national growth, because it houses the most crucial infrastructure that generates wealth for a nation, it is worrisome that a nation with six seaports lag in maritime activities. In this research, we have studied how the Gross Domestic Product (GDP) of the maritime transport influences the Nigerian economy. To do this, we applied Simple Linear Regression (SLR), Support Vector Machine (SVM), Polynomial Regression Model (PRM), Generalized Additive Model (GAM) and Generalized Linear Mixed Model (GLMM) to model the relationship between the nation’s Total GDP (TGDP) and the Maritime Transport GDP (MGDP) using a time series data of 20 years. The result showed that the MGDP is statistically significant to the Nigerian economy. Amongst the statistical tool applied, the PRM of order 4 describes the relationship better when compared to other methods. The recommendations presented in this study will guide policy makers and help improve the economy of Nigeria in terms of its GDP.

Keywords: maritime transport, economy, GDP, regression, port

Procedia PDF Downloads 119
23441 Licensing in a Hotelling Model with Quadratic Transportation Costs

Authors: Fehmi Bouguezzi

Abstract:

This paper studies optimal licensing regimes in a linear Hotelling model where firms are located at the end points of the city and where the transportation cost is not linear but quadratic. We study for that a more general cost function and we try to compare the findings with the results of the linear cost. We find the same optimal licensing regimes. A per unit royalty is optimal when innovation is not drastic and no licensing is better when innovation is drastic. We also find that no licensing is always better than fixed fee licensing.

Keywords: Hotelling model, technology transfer, patent licensing, quadratic transportation cost

Procedia PDF Downloads 324
23440 Segmentation of Piecewise Polynomial Regression Model by Using Reversible Jump MCMC Algorithm

Authors: Suparman

Abstract:

Piecewise polynomial regression model is very flexible model for modeling the data. If the piecewise polynomial regression model is matched against the data, its parameters are not generally known. This paper studies the parameter estimation problem of piecewise polynomial regression model. The method which is used to estimate the parameters of the piecewise polynomial regression model is Bayesian method. Unfortunately, the Bayes estimator cannot be found analytically. Reversible jump MCMC algorithm is proposed to solve this problem. Reversible jump MCMC algorithm generates the Markov chain that converges to the limit distribution of the posterior distribution of piecewise polynomial regression model parameter. The resulting Markov chain is used to calculate the Bayes estimator for the parameters of piecewise polynomial regression model.

Keywords: piecewise regression, bayesian, reversible jump MCMC, segmentation

Procedia PDF Downloads 342
23439 Electricity Load Modeling: An Application to Italian Market

Authors: Giovanni Masala, Stefania Marica

Abstract:

Forecasting electricity load plays a crucial role regards decision making and planning for economical purposes. Besides, in the light of the recent privatization and deregulation of the power industry, the forecasting of future electricity load turned out to be a very challenging problem. Empirical data about electricity load highlights a clear seasonal behavior (higher load during the winter season), which is partly due to climatic effects. We also emphasize the presence of load periodicity at a weekly basis (electricity load is usually lower on weekends or holidays) and at daily basis (electricity load is clearly influenced by the hour). Finally, a long-term trend may depend on the general economic situation (for example, industrial production affects electricity load). All these features must be captured by the model. The purpose of this paper is then to build an hourly electricity load model. The deterministic component of the model requires non-linear regression and Fourier series while we will investigate the stochastic component through econometrical tools. The calibration of the parameters’ model will be performed by using data coming from the Italian market in a 6 year period (2007- 2012). Then, we will perform a Monte Carlo simulation in order to compare the simulated data respect to the real data (both in-sample and out-of-sample inspection). The reliability of the model will be deduced thanks to standard tests which highlight a good fitting of the simulated values.

Keywords: ARMA-GARCH process, electricity load, fitting tests, Fourier series, Monte Carlo simulation, non-linear regression

Procedia PDF Downloads 376
23438 A Fuzzy Nonlinear Regression Model for Interval Type-2 Fuzzy Sets

Authors: O. Poleshchuk, E. Komarov

Abstract:

This paper presents a regression model for interval type-2 fuzzy sets based on the least squares estimation technique. Unknown coefficients are assumed to be triangular fuzzy numbers. The basic idea is to determine aggregation intervals for type-1 fuzzy sets, membership functions of whose are low membership function and upper membership function of interval type-2 fuzzy set. These aggregation intervals were called weighted intervals. Low and upper membership functions of input and output interval type-2 fuzzy sets for developed regression models are considered as piecewise linear functions.

Keywords: interval type-2 fuzzy sets, fuzzy regression, weighted interval

Procedia PDF Downloads 336
23437 Competition between Regression Technique and Statistical Learning Models for Predicting Credit Risk Management

Authors: Chokri Slim

Abstract:

The objective of this research is attempting to respond to this question: Is there a significant difference between the regression model and statistical learning models in predicting credit risk management? A Multiple Linear Regression (MLR) model was compared with neural networks including Multi-Layer Perceptron (MLP), and a Support vector regression (SVR). The population of this study includes 50 listed Banks in Tunis Stock Exchange (TSE) market from 2000 to 2016. Firstly, we show the factors that have significant effect on the quality of loan portfolios of banks in Tunisia. Secondly, it attempts to establish that the systematic use of objective techniques and methods designed to apprehend and assess risk when considering applications for granting credit, has a positive effect on the quality of loan portfolios of banks and their future collectability. Finally, we will try to show that the bank governance has an impact on the choice of methods and techniques for analyzing and measuring the risks inherent in the banking business, including the risk of non-repayment. The results of empirical tests confirm our claims.

Keywords: credit risk management, multiple linear regression, principal components analysis, artificial neural networks, support vector machines

Procedia PDF Downloads 122
23436 Regression Model Evaluation on Depth Camera Data for Gaze Estimation

Authors: James Purnama, Riri Fitri Sari

Abstract:

We investigate the machine learning algorithm selection problem in the term of a depth image based eye gaze estimation, with respect to its essential difficulty in reducing the number of required training samples and duration time of training. Statistics based prediction accuracy are increasingly used to assess and evaluate prediction or estimation in gaze estimation. This article evaluates Root Mean Squared Error (RMSE) and R-Squared statistical analysis to assess machine learning methods on depth camera data for gaze estimation. There are 4 machines learning methods have been evaluated: Random Forest Regression, Regression Tree, Support Vector Machine (SVM), and Linear Regression. The experiment results show that the Random Forest Regression has the lowest RMSE and the highest R-Squared, which means that it is the best among other methods.

Keywords: gaze estimation, gaze tracking, eye tracking, kinect, regression model, orange python

Procedia PDF Downloads 508
23435 Time Series Regression with Meta-Clusters

Authors: Monika Chuchro

Abstract:

This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed as a method to obtain a subgroups of time series data with normal distribution from inflow into waste water treatment plant data which Composed of several groups differing by mean value. Two simple algorithms: K-mean and EM were chosen as a clustering method. The rand index was used to measure the similarity. After simple meta-clustering, regression model was performed for each subgroups. The final model was a sum of subgroups models. The quality of obtained model was compared with the regression model made using the same explanatory variables but with no clustering of data. Results were compared by determination coefficient (R2), measure of prediction accuracy mean absolute percentage error (MAPE) and comparison on linear chart. Preliminary results allows to foresee the potential of the presented technique.

Keywords: clustering, data analysis, data mining, predictive models

Procedia PDF Downloads 437
23434 The Extended Skew Gaussian Process for Regression

Authors: M. T. Alodat

Abstract:

In this paper, we propose a generalization to the Gaussian process regression(GPR) model called the extended skew Gaussian process for regression(ESGPr) model. The ESGPR model works better than the GPR model when the errors are skewed. We derive the predictive distribution for the ESGPR model at a new input. Also we apply the ESGPR model to FOREX data and we find that it fits the Forex data better than the GPR model.

Keywords: extended skew normal distribution, Gaussian process for regression, predictive distribution, ESGPr model

Procedia PDF Downloads 523
23433 Predicting Options Prices Using Machine Learning

Authors: Krishang Surapaneni

Abstract:

The goal of this project is to determine how to predict important aspects of options, including the ask price. We want to compare different machine learning models to learn the best model and the best hyperparameters for that model for this purpose and data set. Option pricing is a relatively new field, and it can be very complicated and intimidating, especially to inexperienced people, so we want to create a machine learning model that can predict important aspects of an option stock, which can aid in future research. We tested multiple different models and experimented with hyperparameter tuning, trying to find some of the best parameters for a machine-learning model. We tested three different models: a Random Forest Regressor, a linear regressor, and an MLP (multi-layer perceptron) regressor. The most important feature in this experiment is the ask price; this is what we were trying to predict. In the field of stock pricing prediction, there is a large potential for error, so we are unable to determine the accuracy of the models based on if they predict the pricing perfectly. Due to this factor, we determined the accuracy of the model by finding the average percentage difference between the predicted and actual values. We tested the accuracy of the machine learning models by comparing the actual results in the testing data and the predictions made by the models. The linear regression model performed worst, with an average percentage error of 17.46%. The MLP regressor had an average percentage error of 11.45%, and the random forest regressor had an average percentage error of 7.42%

Keywords: finance, linear regression model, machine learning model, neural network, stock price

Procedia PDF Downloads 54
23432 Application Difference between Cox and Logistic Regression Models

Authors: Idrissa Kayijuka

Abstract:

The logistic regression and Cox regression models (proportional hazard model) at present are being employed in the analysis of prospective epidemiologic research looking into risk factors in their application on chronic diseases. However, a theoretical relationship between the two models has been studied. By definition, Cox regression model also called Cox proportional hazard model is a procedure that is used in modeling data regarding time leading up to an event where censored cases exist. Whereas the Logistic regression model is mostly applicable in cases where the independent variables consist of numerical as well as nominal values while the resultant variable is binary (dichotomous). Arguments and findings of many researchers focused on the overview of Cox and Logistic regression models and their different applications in different areas. In this work, the analysis is done on secondary data whose source is SPSS exercise data on BREAST CANCER with a sample size of 1121 women where the main objective is to show the application difference between Cox regression model and logistic regression model based on factors that cause women to die due to breast cancer. Thus we did some analysis manually i.e. on lymph nodes status, and SPSS software helped to analyze the mentioned data. This study found out that there is an application difference between Cox and Logistic regression models which is Cox regression model is used if one wishes to analyze data which also include the follow-up time whereas Logistic regression model analyzes data without follow-up-time. Also, they have measurements of association which is different: hazard ratio and odds ratio for Cox and logistic regression models respectively. A similarity between the two models is that they are both applicable in the prediction of the upshot of a categorical variable i.e. a variable that can accommodate only a restricted number of categories. In conclusion, Cox regression model differs from logistic regression by assessing a rate instead of proportion. The two models can be applied in many other researches since they are suitable methods for analyzing data but the more recommended is the Cox, regression model.

Keywords: logistic regression model, Cox regression model, survival analysis, hazard ratio

Procedia PDF Downloads 424
23431 Exploration and Evaluation of the Effect of Multiple Countermeasures on Road Safety

Authors: Atheer Al-Nuaimi, Harry Evdorides

Abstract:

Every day many people die or get disabled or injured on roads around the world, which necessitates more specific treatments for transportation safety issues. International road assessment program (iRAP) model is one of the comprehensive road safety models which accounting for many factors that affect road safety in a cost-effective way in low and middle income countries. In iRAP model road safety has been divided into five star ratings from 1 star (the lowest level) to 5 star (the highest level). These star ratings are based on star rating score which is calculated by iRAP methodology depending on road attributes, traffic volumes and operating speeds. The outcome of iRAP methodology are the treatments that can be used to improve road safety and reduce fatalities and serious injuries (FSI) numbers. These countermeasures can be used separately as a single countermeasure or mix as multiple countermeasures for a location. There is general agreement that the adequacy of a countermeasure is liable to consistent losses when it is utilized as a part of mix with different countermeasures. That is, accident diminishment appraisals of individual countermeasures cannot be easily added together. The iRAP model philosophy makes utilization of a multiple countermeasure adjustment factors to predict diminishments in the effectiveness of road safety countermeasures when more than one countermeasure is chosen. A multiple countermeasure correction factors are figured for every 100-meter segment and for every accident type. However, restrictions of this methodology incorporate a presumable over-estimation in the predicted crash reduction. This study aims to adjust this correction factor by developing new models to calculate the effect of using multiple countermeasures on the number of fatalities for a location or an entire road. Regression models have been used to establish relationships between crash frequencies and the factors that affect their rates. Multiple linear regression, negative binomial regression, and Poisson regression techniques were used to develop models that can address the effectiveness of using multiple countermeasures. Analyses are conducted using The R Project for Statistical Computing showed that a model developed by negative binomial regression technique could give more reliable results of the predicted number of fatalities after the implementation of road safety multiple countermeasures than the results from iRAP model. The results also showed that the negative binomial regression approach gives more precise results in comparison with multiple linear and Poisson regression techniques because of the overdispersion and standard error issues.

Keywords: international road assessment program, negative binomial, road multiple countermeasures, road safety

Procedia PDF Downloads 212
23430 Measurement Errors and Misclassifications in Covariates in Logistic Regression: Bayesian Adjustment of Main and Interaction Effects and the Sample Size Implications

Authors: Shahadut Hossain

Abstract:

Measurement errors in continuous covariates and/or misclassifications in categorical covariates are common in epidemiological studies. Regression analysis ignoring such mismeasurements seriously biases the estimated main and interaction effects of covariates on the outcome of interest. Thus, adjustments for such mismeasurements are necessary. In this research, we propose a Bayesian parametric framework for eliminating deleterious impacts of covariate mismeasurements in logistic regression. The proposed adjustment method is unified and thus can be applied to any generalized linear and non-linear regression models. Furthermore, adjustment for covariate mismeasurements requires validation data usually in the form of either gold standard measurements or replicates of the mismeasured covariates on a subset of the study population. Initial investigation shows that adequacy of such adjustment depends on the sizes of main and validation samples, especially when prevalences of the categorical covariates are low. Thus, we investigate the impact of main and validation sample sizes on the adjusted estimates, and provide a general guideline about these sample sizes based on simulation studies.

Keywords: measurement errors, misclassification, mismeasurement, validation sample, Bayesian adjustment

Procedia PDF Downloads 386
23429 Identifying Factors Contributing to the Spread of Lyme Disease: A Regression Analysis of Virginia’s Data

Authors: Fatemeh Valizadeh Gamchi, Edward L. Boone

Abstract:

This research focuses on Lyme disease, a widespread infectious condition in the United States caused by the bacterium Borrelia burgdorferi sensu stricto. It is critical to identify environmental and economic elements that are contributing to the spread of the disease. This study examined data from Virginia to identify a subset of explanatory variables significant for Lyme disease case numbers. To identify relevant variables and avoid overfitting, linear poisson, and regularization regression methods such as a ridge, lasso, and elastic net penalty were employed. Cross-validation was performed to acquire tuning parameters. The methods proposed can automatically identify relevant disease count covariates. The efficacy of the techniques was assessed using four criteria on three simulated datasets. Finally, using the Virginia Department of Health’s Lyme disease data set, the study successfully identified key factors, and the results were consistent with previous studies.

Keywords: lyme disease, Poisson generalized linear model, ridge regression, lasso regression, elastic net regression

Procedia PDF Downloads 98
23428 In and Out-Of-Sample Performance of Non Simmetric Models in International Price Differential Forecasting in a Commodity Country Framework

Authors: Nicola Rubino

Abstract:

This paper presents an analysis of a group of commodity exporting countries' nominal exchange rate movements in relationship to the US dollar. Using a series of Unrestricted Self-exciting Threshold Autoregressive models (SETAR), we model and evaluate sixteen national CPI price differentials relative to the US dollar CPI. Out-of-sample forecast accuracy is evaluated through calculation of mean absolute error measures on the basis of two-hundred and fifty-three months rolling window forecasts and extended to three additional models, namely a logistic smooth transition regression (LSTAR), an additive non linear autoregressive model (AAR) and a simple linear Neural Network model (NNET). Our preliminary results confirm presence of some form of TAR non linearity in the majority of the countries analyzed, with a relatively higher goodness of fit, with respect to the linear AR(1) benchmark, in five countries out of sixteen considered. Although no model appears to statistically prevail over the other, our final out-of-sample forecast exercise shows that SETAR models tend to have quite poor relative forecasting performance, especially when compared to alternative non-linear specifications. Finally, by analyzing the implied half-lives of the > coefficients, our results confirms the presence, in the spirit of arbitrage band adjustment, of band convergence with an inner unit root behaviour in five of the sixteen countries analyzed.

Keywords: transition regression model, real exchange rate, nonlinearities, price differentials, PPP, commodity points

Procedia PDF Downloads 254
23427 A General Approach to Define Adjoint of Linear and Non-linear Operators

Authors: Mehdi Jafari Matehkolaee

Abstract:

In this paper, we have obtained the adjoint of an arbitrary operator (linear and nonlinear) in Hilbert space by introducing an n-dimensional Riemannian manifold. This general formalism covers every linear operator (non – differential) in Hilbert space. In fact, our approach shows that instead of using the adjoint definition of an operator directly, it can be obtained directly by relying on a suitable generalized space according to the action of the operator in question. For the case of nonlinear operators, we have to change the definition of the linear operator adjoint. But here, we have obtained an adjoint of these operators with respect to the definition of the derivative of the operator. As a matter of fact, we have shown one of the straight applications of the ''Frechet derivative'' in the algebra of the operators.

Keywords: adjoint operator, non-linear operator, differentiable operator, manifold

Procedia PDF Downloads 92
23426 Modeling Aeration of Sharp Crested Weirs by Using Support Vector Machines

Authors: Arun Goel

Abstract:

The present paper attempts to investigate the prediction of air entrainment rate and aeration efficiency of a free over-fall jets issuing from a triangular sharp crested weir by using regression based modelling. The empirical equations, support vector machine (polynomial and radial basis function) models and the linear regression techniques were applied on the triangular sharp crested weirs relating the air entrainment rate and the aeration efficiency to the input parameters namely drop height, discharge, and vertex angle. It was observed that there exists a good agreement between the measured values and the values obtained using empirical equations, support vector machine (Polynomial and rbf) models, and the linear regression techniques. The test results demonstrated that the SVM based (Poly & rbf) model also provided acceptable prediction of the measured values with reasonable accuracy along with empirical equations and linear regression techniques in modelling the air entrainment rate and the aeration efficiency of a free over-fall jets issuing from triangular sharp crested weir. Further sensitivity analysis has also been performed to study the impact of input parameter on the output in terms of air entrainment rate and aeration efficiency.

Keywords: air entrainment rate, dissolved oxygen, weir, SVM, regression

Procedia PDF Downloads 405
23425 Optimization of Machine Learning Regression Results: An Application on Health Expenditures

Authors: Songul Cinaroglu

Abstract:

Machine learning regression methods are recommended as an alternative to classical regression methods in the existence of variables which are difficult to model. Data for health expenditure is typically non-normal and have a heavily skewed distribution. This study aims to compare machine learning regression methods by hyperparameter tuning to predict health expenditure per capita. A multiple regression model was conducted and performance results of Lasso Regression, Random Forest Regression and Support Vector Machine Regression recorded when different hyperparameters are assigned. Lambda (λ) value for Lasso Regression, number of trees for Random Forest Regression, epsilon (ε) value for Support Vector Regression was determined as hyperparameters. Study results performed by using 'k' fold cross validation changed from 5 to 50, indicate the difference between machine learning regression results in terms of R², RMSE and MAE values that are statistically significant (p < 0.001). Study results reveal that Random Forest Regression (R² ˃ 0.7500, RMSE ≤ 0.6000 ve MAE ≤ 0.4000) outperforms other machine learning regression methods. It is highly advisable to use machine learning regression methods for modelling health expenditures.

Keywords: machine learning, lasso regression, random forest regression, support vector regression, hyperparameter tuning, health expenditure

Procedia PDF Downloads 190
23424 Electrical Load Estimation Using Estimated Fuzzy Linear Parameters

Authors: Bader Alkandari, Jamal Y. Madouh, Ahmad M. Alkandari, Anwar A. Alnaqi

Abstract:

A new formulation of fuzzy linear estimation problem is presented. It is formulated as a linear programming problem. The objective is to minimize the spread of the data points, taking into consideration the type of the membership function of the fuzzy parameters to satisfy the constraints on each measurement point and to insure that the original membership is included in the estimated membership. Different models are developed for a fuzzy triangular membership. The proposed models are applied to different examples from the area of fuzzy linear regression and finally to different examples for estimating the electrical load on a busbar. It had been found that the proposed technique is more suited for electrical load estimation, since the nature of the load is characterized by the uncertainty and vagueness.

Keywords: fuzzy regression, load estimation, fuzzy linear parameters, electrical load estimation

Procedia PDF Downloads 510