Search results for: nonparametric regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3144

Search results for: nonparametric regression

3054 Automatic API Regression Analyzer and Executor

Authors: Praveena Sridhar, Nihar Devathi, Parikshit Chakraborty

Abstract:

As the software product changes versions across releases, there are changes to the API’s and features and the upgrades become necessary. Hence, it becomes imperative to get the impact of upgrading the dependent components. This tool finds out API changes across two versions and their impact on other API’s followed by execution of the automated regression suites relevant to updates and their impacted areas. This tool has 4 layer architecture, each layer with its own unique pre-assigned capability which it does and sends the required information to next layer. This are the 4 layers. 1) Comparator: Compares the two versions of API. 2) Analyzer: Analyses the API doc and gives the modified class and its dependencies along with implemented interface details. 3) Impact Filter: Find the impact of the modified class on the other API methods. 4) Auto Executer: Based on the output given by Impact Filter, Executor will run the API regression Suite. Tool reads the java doc and extracts the required information of classes, interfaces and enumerations. The extracted information is saved into a data structure which shows the class details and its dependencies along with interfaces and enumerations that are listed in the java doc.

Keywords: automation impact regression, java doc, executor, analyzer, layers

Procedia PDF Downloads 459
3053 Multiobjective Optimization of a Pharmaceutical Formulation Using Regression Method

Authors: J. Satya Eswari, Ch. Venkateswarlu

Abstract:

The formulation of a commercial pharmaceutical product involves several composition factors and response characteristics. When the formulation requires to satisfy multiple response characteristics which are conflicting, an optimal solution requires the need for an efficient multiobjective optimization technique. In this work, a regression is combined with a non-dominated sorting differential evolution (NSDE) involving Naïve & Slow and ε constraint techniques to derive different multiobjective optimization strategies, which are then evaluated by means of a trapidil pharmaceutical formulation. The analysis of the results show the effectiveness of the strategy that combines the regression model and NSDE with the integration of both Naïve & Slow and ε constraint techniques for Pareto optimization of trapidil formulation. With this strategy, the optimal formulation at pH=6.8 is obtained with the decision variables of micro crystalline cellulose, hydroxypropyl methylcellulose and compression pressure. The corresponding response characteristics of rate constant and release order are also noted down. The comparison of these results with the experimental data and with those of other multiple regression model based multiobjective evolutionary optimization strategies signify the better performance for optimal trapidil formulation.

Keywords: pharmaceutical formulation, multiple regression model, response surface method, radial basis function network, differential evolution, multiobjective optimization

Procedia PDF Downloads 381
3052 Multi-Linear Regression Based Prediction of Mass Transfer by Multiple Plunging Jets

Authors: S. Deswal, M. Pal

Abstract:

The paper aims to compare the performance of vertical and inclined multiple plunging jets and to model and predict their mass transfer capacity by multi-linear regression based approach. The multiple vertical plunging jets have jet impact angle of θ = 90O; whereas, multiple inclined plunging jets have jet impact angle of θ = 600. The results of the study suggests that mass transfer is higher for multiple jets, and inclined multiple plunging jets have up to 1.6 times higher mass transfer than vertical multiple plunging jets under similar conditions. The derived relationship, based on multi-linear regression approach, has successfully predicted the volumetric mass transfer coefficient (KLa) from operational parameters of multiple plunging jets with a correlation coefficient of 0.973, root mean square error of 0.002 and coefficient of determination of 0.946. The results suggests that predicted overall mass transfer coefficient is in good agreement with actual experimental values; thereby suggesting the utility of derived relationship based on multi-linear regression based approach and can be successfully employed in modelling mass transfer by multiple plunging jets.

Keywords: mass transfer, multiple plunging jets, multi-linear regression, earth sciences

Procedia PDF Downloads 426
3051 Competition between Regression Technique and Statistical Learning Models for Predicting Credit Risk Management

Authors: Chokri Slim

Abstract:

The objective of this research is attempting to respond to this question: Is there a significant difference between the regression model and statistical learning models in predicting credit risk management? A Multiple Linear Regression (MLR) model was compared with neural networks including Multi-Layer Perceptron (MLP), and a Support vector regression (SVR). The population of this study includes 50 listed Banks in Tunis Stock Exchange (TSE) market from 2000 to 2016. Firstly, we show the factors that have significant effect on the quality of loan portfolios of banks in Tunisia. Secondly, it attempts to establish that the systematic use of objective techniques and methods designed to apprehend and assess risk when considering applications for granting credit, has a positive effect on the quality of loan portfolios of banks and their future collectability. Finally, we will try to show that the bank governance has an impact on the choice of methods and techniques for analyzing and measuring the risks inherent in the banking business, including the risk of non-repayment. The results of empirical tests confirm our claims.

Keywords: credit risk management, multiple linear regression, principal components analysis, artificial neural networks, support vector machines

Procedia PDF Downloads 120
3050 Credit Risk Prediction Based on Bayesian Estimation of Logistic Regression Model with Random Effects

Authors: Sami Mestiri, Abdeljelil Farhat

Abstract:

The aim of this current paper is to predict the credit risk of banks in Tunisia, over the period (2000-2005). For this purpose, two methods for the estimation of the logistic regression model with random effects: Penalized Quasi Likelihood (PQL) method and Gibbs Sampler algorithm are applied. By using the information on a sample of 528 Tunisian firms and 26 financial ratios, we show that Bayesian approach improves the quality of model predictions in terms of good classification as well as by the ROC curve result.

Keywords: forecasting, credit risk, Penalized Quasi Likelihood, Gibbs Sampler, logistic regression with random effects, curve ROC

Procedia PDF Downloads 509
3049 Bayesian Variable Selection in Quantile Regression with Application to the Health and Retirement Study

Authors: Priya Kedia, Kiranmoy Das

Abstract:

There is a rich literature on variable selection in regression setting. However, most of these methods assume normality for the response variable under consideration for implementing the methodology and establishing the statistical properties of the estimates. In many real applications, the distribution for the response variable may be non-Gaussian, and one might be interested in finding the best subset of covariates at some predetermined quantile level. We develop dynamic Bayesian approach for variable selection in quantile regression framework. We use a zero-inflated mixture prior for the regression coefficients, and consider the asymmetric Laplace distribution for the response variable for modeling different quantiles of its distribution. An efficient Gibbs sampler is developed for our computation. Our proposed approach is assessed through extensive simulation studies, and real application of the proposed approach is also illustrated. We consider the data from health and retirement study conducted by the University of Michigan, and select the important predictors when the outcome of interest is out-of-pocket medical cost, which is considered as an important measure for financial risk. Our analysis finds important predictors at different quantiles of the outcome, and thus enhance our understanding on the effects of different predictors on the out-of-pocket medical cost.

Keywords: variable selection, quantile regression, Gibbs sampler, asymmetric Laplace distribution

Procedia PDF Downloads 123
3048 Ordinal Regression with Fenton-Wilkinson Order Statistics: A Case Study of an Orienteering Race

Authors: Joonas Pääkkönen

Abstract:

In sports, individuals and teams are typically interested in final rankings. Final results, such as times or distances, dictate these rankings, also known as places. Places can be further associated with ordered random variables, commonly referred to as order statistics. In this work, we introduce a simple, yet accurate order statistical ordinal regression function that predicts relay race places with changeover-times. We call this function the Fenton-Wilkinson Order Statistics model. This model is built on the following educated assumption: individual leg-times follow log-normal distributions. Moreover, our key idea is to utilize Fenton-Wilkinson approximations of changeover-times alongside an estimator for the total number of teams as in the notorious German tank problem. This original place regression function is sigmoidal and thus correctly predicts the existence of a small number of elite teams that significantly outperform the rest of the teams. Our model also describes how place increases linearly with changeover-time at the inflection point of the log-normal distribution function. With real-world data from Jukola 2019, a massive orienteering relay race, the model is shown to be highly accurate even when the size of the training set is only 5% of the whole data set. Numerical results also show that our model exhibits smaller place prediction root-mean-square-errors than linear regression, mord regression and Gaussian process regression.

Keywords: Fenton-Wilkinson approximation, German tank problem, log-normal distribution, order statistics, ordinal regression, orienteering, sports analytics, sports modeling

Procedia PDF Downloads 98
3047 The Predictors of Student Engagement: Instructional Support vs Emotional Support

Authors: Tahani Salman Alangari

Abstract:

Student success can be impacted by internal factors such as their emotional well-being and external factors such as organizational support and instructional support in the classroom. This study is to identify at least one factor that forecasts student engagement. It is a cross-sectional, conducted on 6206 teachers and encompassed three years of data collection and observations of math instruction in approximately 50 schools and 300 classrooms. A multiple linear regression revealed that a model predicting student engagement from emotional support, classroom organization, and instructional support was significant. Four linear regression models were tested using hierarchical regression to examine the effects of independent variables: emotional support was the highest predictor of student engagement while instructional support was the lowest.

Keywords: student engagement, emotional support, organizational support, instructional support, well-being

Procedia PDF Downloads 50
3046 Modeling Standpipe Pressure Using Multivariable Regression Analysis by Combining Drilling Parameters and a Herschel-Bulkley Model

Authors: Seydou Sinde

Abstract:

The aims of this paper are to formulate mathematical expressions that can be used to estimate the standpipe pressure (SPP). The developed formulas take into account the main factors that, directly or indirectly, affect the behavior of SPP values. Fluid rheology and well hydraulics are some of these essential factors. Mud Plastic viscosity, yield point, flow power, consistency index, flow rate, drillstring, and annular geometries are represented by the frictional pressure (Pf), which is one of the input independent parameters and is calculated, in this paper, using Herschel-Bulkley rheological model. Other input independent parameters include the rate of penetration (ROP), applied load or weight on the bit (WOB), bit revolutions per minute (RPM), bit torque (TRQ), and hole inclination and direction coupled in the hole curvature or dogleg (DL). The technique of repeating parameters and Buckingham PI theorem are used to reduce the number of the input independent parameters into the dimensionless revolutions per minute (RPMd), the dimensionless torque (TRQd), and the dogleg, which is already in the dimensionless form of radians. Multivariable linear and polynomial regression technique using PTC Mathcad Prime 4.0 is used to analyze and determine the exact relationships between the dependent parameter, which is SPP, and the remaining three dimensionless groups. Three models proved sufficiently satisfactory to estimate the standpipe pressure: multivariable linear regression model 1 containing three regression coefficients for vertical wells; multivariable linear regression model 2 containing four regression coefficients for deviated wells; and multivariable polynomial quadratic regression model containing six regression coefficients for both vertical and deviated wells. Although that the linear regression model 2 (with four coefficients) is relatively more complex and contains an additional term over the linear regression model 1 (with three coefficients), the former did not really add significant improvements to the later except for some minor values. Thus, the effect of the hole curvature or dogleg is insignificant and can be omitted from the input independent parameters without significant losses of accuracy. The polynomial quadratic regression model is considered the most accurate model due to its relatively higher accuracy for most of the cases. Data of nine wells from the Middle East were used to run the developed models with satisfactory results provided by all of them, even if the multivariable polynomial quadratic regression model gave the best and most accurate results. Development of these models is useful not only to monitor and predict, with accuracy, the values of SPP but also to early control and check for the integrity of the well hydraulics as well as to take the corrective actions should any unexpected problems appear, such as pipe washouts, jet plugging, excessive mud losses, fluid gains, kicks, etc.

Keywords: standpipe, pressure, hydraulics, nondimensionalization, parameters, regression

Procedia PDF Downloads 56
3045 Estimation of Functional Response Model by Supervised Functional Principal Component Analysis

Authors: Hyon I. Paek, Sang Rim Kim, Hyon A. Ryu

Abstract:

In functional linear regression, one typical problem is to reduce dimension. Compared with multivariate linear regression, functional linear regression is regarded as an infinite-dimensional case, and the main task is to reduce dimensions of functional response and functional predictors. One common approach is to adapt functional principal component analysis (FPCA) on functional predictors and then use a few leading functional principal components (FPC) to predict the functional model. The leading FPCs estimated by the typical FPCA explain a major variation of the functional predictor, but these leading FPCs may not be mostly correlated with the functional response, so they may not be significant in the prediction for response. In this paper, we propose a supervised functional principal component analysis method for a functional response model with FPCs obtained by considering the correlation of the functional response. Our method would have a better prediction accuracy than the typical FPCA method.

Keywords: supervised, functional principal component analysis, functional response, functional linear regression

Procedia PDF Downloads 38
3044 Analyzing the Influence of Hydrometeorlogical Extremes, Geological Setting, and Social Demographic on Public Health

Authors: Irfan Ahmad Afip

Abstract:

This main research objective is to accurately identify the possibility for a Leptospirosis outbreak severity of a certain area based on its input features into a multivariate regression model. The research question is the possibility of an outbreak in a specific area being influenced by this feature, such as social demographics and hydrometeorological extremes. If the occurrence of an outbreak is being subjected to these features, then the epidemic severity for an area will be different depending on its environmental setting because the features will influence the possibility and severity of an outbreak. Specifically, this research objective was three-fold, namely: (a) to identify the relevant multivariate features and visualize the patterns data, (b) to develop a multivariate regression model based from the selected features and determine the possibility for Leptospirosis outbreak in an area, and (c) to compare the predictive ability of multivariate regression model and machine learning algorithms. Several secondary data features were collected locations in the state of Negeri Sembilan, Malaysia, based on the possibility it would be relevant to determine the outbreak severity in the area. The relevant features then will become an input in a multivariate regression model; a linear regression model is a simple and quick solution for creating prognostic capabilities. A multivariate regression model has proven more precise prognostic capabilities than univariate models. The expected outcome from this research is to establish a correlation between the features of social demographic and hydrometeorological with Leptospirosis bacteria; it will also become a contributor for understanding the underlying relationship between the pathogen and the ecosystem. The relationship established can be beneficial for the health department or urban planner to inspect and prepare for future outcomes in event detection and system health monitoring.

Keywords: geographical information system, hydrometeorological, leptospirosis, multivariate regression

Procedia PDF Downloads 80
3043 On Estimating the Headcount Index by Using the Logistic Regression Estimator

Authors: Encarnación Álvarez, Rosa M. García-Fernández, Juan F. Muñoz, Francisco J. Blanco-Encomienda

Abstract:

The problem of estimating a proportion has important applications in the field of economics, and in general, in many areas such as social sciences. A common application in economics is the estimation of the headcount index. In this paper, we define the general headcount index as a proportion. Furthermore, we introduce a new quantitative method for estimating the headcount index. In particular, we suggest to use the logistic regression estimator for the problem of estimating the headcount index. Assuming a real data set, results derived from Monte Carlo simulation studies indicate that the logistic regression estimator can be more accurate than the traditional estimator of the headcount index.

Keywords: poverty line, poor, risk of poverty, Monte Carlo simulations, sample

Procedia PDF Downloads 395
3042 A Comparative Study on Sampling Techniques of Polynomial Regression Model Based Stochastic Free Vibration of Composite Plates

Authors: S. Dey, T. Mukhopadhyay, S. Adhikari

Abstract:

This paper presents an exhaustive comparative investigation on sampling techniques of polynomial regression model based stochastic natural frequency of composite plates. Both individual and combined variations of input parameters are considered to map the computational time and accuracy of each modelling techniques. The finite element formulation of composites is capable to deal with both correlated and uncorrelated random input variables such as fibre parameters and material properties. The results obtained by Polynomial regression (PR) using different sampling techniques are compared. Depending on the suitability of sampling techniques such as 2k Factorial designs, Central composite design, A-Optimal design, I-Optimal, D-Optimal, Taguchi’s orthogonal array design, Box-Behnken design, Latin hypercube sampling, sobol sequence are illustrated. Statistical analysis of the first three natural frequencies is presented to compare the results and its performance.

Keywords: composite plate, natural frequency, polynomial regression model, sampling technique, uncertainty quantification

Procedia PDF Downloads 480
3041 Heart Attack Prediction Using Several Machine Learning Methods

Authors: Suzan Anwar, Utkarsh Goyal

Abstract:

Heart rate (HR) is a predictor of cardiovascular, cerebrovascular, and all-cause mortality in the general population, as well as in patients with cardio and cerebrovascular diseases. Machine learning (ML) significantly improves the accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment while avoiding unnecessary treatment of others. This research examines relationship between the individual's various heart health inputs like age, sex, cp, trestbps, thalach, oldpeaketc, and the likelihood of developing heart disease. Machine learning techniques like logistic regression and decision tree, and Python are used. The results of testing and evaluating the model using the Heart Failure Prediction Dataset show the chance of a person having a heart disease with variable accuracy. Logistic regression has yielded an accuracy of 80.48% without data handling. With data handling (normalization, standardscaler), the logistic regression resulted in improved accuracy of 87.80%, decision tree 100%, random forest 100%, and SVM 100%.

Keywords: heart rate, machine learning, SVM, decision tree, logistic regression, random forest

Procedia PDF Downloads 112
3040 Efficient Model Selection in Linear and Non-Linear Quantile Regression by Cross-Validation

Authors: Yoonsuh Jung, Steven N. MacEachern

Abstract:

Check loss function is used to define quantile regression. In the prospect of cross validation, it is also employed as a validation function when underlying truth is unknown. However, our empirical study indicates that the validation with check loss often leads to choosing an over estimated fits. In this work, we suggest a modified or L2-adjusted check loss which rounds the sharp corner in the middle of check loss. It has a large effect of guarding against over fitted model in some extent. Through various simulation settings of linear and non-linear regressions, the improvement of check loss by L2 adjustment is empirically examined. This adjustment is devised to shrink to zero as sample size grows.

Keywords: cross-validation, model selection, quantile regression, tuning parameter selection

Procedia PDF Downloads 406
3039 Instability Index Method and Logistic Regression to Assess Landslide Susceptibility in County Route 89, Taiwan

Authors: Y. H. Wu, Ji-Yuan Lin, Yu-Ming Liou

Abstract:

This study aims to set up the landslide susceptibility map of County Route 89 at Ren-Ai Township in Nantou County using the Instability Index Method and Logistic regression. Seven susceptibility factors including Slope Angle, Aspect, Elevation, Distance to fold, Distance to River, Distance to Road and Accumulated Rainfall were obtained by GIS based on the Typhoon Toraji landslide area identified by Industrial Technology Research Institute in 2001. To calculate the landslide percentage of each factor and acquire the weight and grade the grid by means of Instability Index Method. In this study, landslide susceptibility can be classified into four grades: high, medium high, medium low and low, in order to determine the advantages and disadvantages of the two models. The precision of this model is verified by classification error matrix and SRC curve. These results suggest that the logistic regression model is a preferred method than instability index in the assessment of landslide susceptibility. It is suitable for the landslide prediction and precaution in this area in the future.

Keywords: instability index method, logistic regression, landslide susceptibility, SRC curve

Procedia PDF Downloads 257
3038 Parametric Inference of Elliptical and Archimedean Family of Copulas

Authors: Alam Ali, Ashok Kumar Pathak

Abstract:

Nowadays, copulas have attracted significant attention for modeling multivariate observations, and the foremost feature of copula functions is that they give us the liberty to study the univariate marginal distributions and their joint behavior separately. The copula parameter apprehends the intrinsic dependence among the marginal variables, and it can be estimated using parametric, semiparametric, or nonparametric techniques. This work aims to compare the coverage rates between an Elliptical and an Archimedean family of copulas via a fully parametric estimation technique.

Keywords: elliptical copula, archimedean copula, estimation, coverage rate

Procedia PDF Downloads 36
3037 Regret-Regression for Multi-Armed Bandit Problem

Authors: Deyadeen Ali Alshibani

Abstract:

In the literature, the multi-armed bandit problem as a statistical decision model of an agent trying to optimize his decisions while improving his information at the same time. There are several different algorithms models and their applications on this problem. In this paper, we evaluate the Regret-regression through comparing with Q-learning method. A simulation on determination of optimal treatment regime is presented in detail.

Keywords: optimal, bandit problem, optimization, dynamic programming

Procedia PDF Downloads 421
3036 The Strengths and Limitations of the Statistical Modeling of Complex Social Phenomenon: Focusing on SEM, Path Analysis, or Multiple Regression Models

Authors: Jihye Jeon

Abstract:

This paper analyzes the conceptual framework of three statistical methods, multiple regression, path analysis, and structural equation models. When establishing research model of the statistical modeling of complex social phenomenon, it is important to know the strengths and limitations of three statistical models. This study explored the character, strength, and limitation of each modeling and suggested some strategies for accurate explaining or predicting the causal relationships among variables. Especially, on the studying of depression or mental health, the common mistakes of research modeling were discussed.

Keywords: multiple regression, path analysis, structural equation models, statistical modeling, social and psychological phenomenon

Procedia PDF Downloads 599
3035 QSRR Analysis of 17-Picolyl and 17-Picolinylidene Androstane Derivatives Based on Partial Least Squares and Principal Component Regression

Authors: Sanja Podunavac-Kuzmanović, Strahinja Kovačević, Lidija Jevrić, Evgenija Djurendić, Jovana Ajduković

Abstract:

There are several methods for determination of the lipophilicity of biologically active compounds, however chromatography has been shown as a very suitable method for this purpose. Chromatographic (C18-RP-HPLC) analysis of a series of 24 17-picolyl and 17-picolinylidene androstane derivatives was carried out. The obtained retention indices (logk, methanol (90%) / water (10%)) were correlated with calculated physicochemical and lipophilicity descriptors. The QSRR analysis was carried out applying principal component regression (PCR) and partial least squares regression (PLS). The PCR and PLS model were selected on the basis of the highest variance and the lowest root mean square error of cross-validation. The obtained PCR and PLS model successfully correlate the calculated molecular descriptors with logk parameter indicating the significance of the lipophilicity of compounds in chromatographic process. On the basis of the obtained results it can be concluded that the obtained logk parameters of the analyzed androstane derivatives can be considered as their chromatographic lipophilicity. These results are the part of the project No. 114-451-347/2015-02, financially supported by the Provincial Secretariat for Science and Technological Development of Vojvodina and CMST COST Action CM1105.

Keywords: androstane derivatives, chromatography, molecular structure, principal component regression, partial least squares regression

Procedia PDF Downloads 237
3034 Detecting Earnings Management via Statistical and Neural Networks Techniques

Authors: Mohammad Namazi, Mohammad Sadeghzadeh Maharluie

Abstract:

Predicting earnings management is vital for the capital market participants, financial analysts and managers. The aim of this research is attempting to respond to this query: Is there a significant difference between the regression model and neural networks’ models in predicting earnings management, and which one leads to a superior prediction of it? In approaching this question, a Linear Regression (LR) model was compared with two neural networks including Multi-Layer Perceptron (MLP), and Generalized Regression Neural Network (GRNN). The population of this study includes 94 listed companies in Tehran Stock Exchange (TSE) market from 2003 to 2011. After the results of all models were acquired, ANOVA was exerted to test the hypotheses. In general, the summary of statistical results showed that the precision of GRNN did not exhibit a significant difference in comparison with MLP. In addition, the mean square error of the MLP and GRNN showed a significant difference with the multi variable LR model. These findings support the notion of nonlinear behavior of the earnings management. Therefore, it is more appropriate for capital market participants to analyze earnings management based upon neural networks techniques, and not to adopt linear regression models.

Keywords: earnings management, generalized linear regression, neural networks multi-layer perceptron, Tehran stock exchange

Procedia PDF Downloads 393
3033 Minimizing the Impact of Covariate Detection Limit in Logistic Regression

Authors: Shahadut Hossain, Jacek Wesolowski, Zahirul Hoque

Abstract:

In many epidemiological and environmental studies covariate measurements are subject to the detection limit. In most applications, covariate measurements are usually truncated from below which is known as left-truncation. Because the measuring device, which we use to measure the covariate, fails to detect values falling below the certain threshold. In regression analyses, it causes inflated bias and inaccurate mean squared error (MSE) to the estimators. This paper suggests a response-based regression calibration method to correct the deleterious impact introduced by the covariate detection limit in the estimators of the parameters of simple logistic regression model. Compared to the maximum likelihood method, the proposed method is computationally simpler, and hence easier to implement. It is robust to the violation of distributional assumption about the covariate of interest. In producing correct inference, the performance of the proposed method compared to the other competing methods has been investigated through extensive simulations. A real-life application of the method is also shown using data from a population-based case-control study of non-Hodgkin lymphoma.

Keywords: environmental exposure, detection limit, left truncation, bias, ad-hoc substitution

Procedia PDF Downloads 210
3032 Comparative Study od Three Artificial Intelligence Techniques for Rain Domain in Precipitation Forecast

Authors: Nabilah Filzah Mohd Radzuan, Andi Putra, Zalinda Othman, Azuraliza Abu Bakar, Abdul Razak Hamdan

Abstract:

Precipitation forecast is important to avoid natural disaster incident which can cause losses in the involved area. This paper reviews three techniques logistic regression, decision tree, and random forest which are used in making precipitation forecast. These combination techniques through the vector auto-regression (VAR) model help in finding the advantages and strengths of each technique in the forecast process. The data-set contains variables of the rain’s domain. Adaptation of artificial intelligence techniques involved in rain domain enables the forecast process to be easier and systematic for precipitation forecast.

Keywords: logistic regression, decisions tree, random forest, VAR model

Procedia PDF Downloads 414
3031 A Research on Inference from Multiple Distance Variables in Hedonic Regression Focus on Three Variables

Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro

Abstract:

In urban context, urban nodes such as amenity or hazard will certainly affect house price, while classic hedonic analysis will employ distance variables measured from each urban nodes. However, effects from distances to facilities on house prices generally do not represent the true price of the property. Distance variables measured on the same surface are suffering a problem called multicollinearity, which is usually presented as magnitude variance and mean value in regression, errors caused by instability. In this paper, we provided a theoretical framework to identify and gather the data with less bias, and also provided specific sampling method on locating the sample region to avoid the spatial multicollinerity problem in three distance variable’s case.

Keywords: hedonic regression, urban node, distance variables, multicollinerity, collinearity

Procedia PDF Downloads 439
3030 Urban Energy Demand Modelling: Spatial Analysis Approach

Authors: Hung-Chu Chen, Han Qi, Bauke de Vries

Abstract:

Energy consumption in the urban environment has attracted numerous researches in recent decades. However, it is comparatively rare to find literary works which investigated 3D spatial analysis of urban energy demand modelling. In order to analyze the spatial correlation between urban morphology and energy demand comprehensively, this paper investigates their relation by using the spatial regression tool. In addition, the spatial regression tool which is applied in this paper is ordinary least squares regression (OLS) and geographically weighted regression (GWR) model. Normalized Difference Built-up Index (NDBI), Normalized Difference Vegetation Index (NDVI), and building volume are explainers of urban morphology, which act as independent variables of Energy-land use (E-L) model. NDBI and NDVI are used as the index to describe five types of land use: urban area (U), open space (O), artificial green area (G), natural green area (V), and water body (W). Accordingly, annual electricity, gas demand and energy demand are dependent variables of the E-L model. Based on the analytical result of E-L model relation, it revealed that energy demand and urban morphology are closely connected and the possible causes and practical use are discussed. Besides, the spatial analysis methods of OLS and GWR are compared.

Keywords: energy demand model, geographically weighted regression, normalized difference built-up index, normalized difference vegetation index, spatial statistics

Procedia PDF Downloads 117
3029 Modeling Aeration of Sharp Crested Weirs by Using Support Vector Machines

Authors: Arun Goel

Abstract:

The present paper attempts to investigate the prediction of air entrainment rate and aeration efficiency of a free over-fall jets issuing from a triangular sharp crested weir by using regression based modelling. The empirical equations, support vector machine (polynomial and radial basis function) models and the linear regression techniques were applied on the triangular sharp crested weirs relating the air entrainment rate and the aeration efficiency to the input parameters namely drop height, discharge, and vertex angle. It was observed that there exists a good agreement between the measured values and the values obtained using empirical equations, support vector machine (Polynomial and rbf) models, and the linear regression techniques. The test results demonstrated that the SVM based (Poly & rbf) model also provided acceptable prediction of the measured values with reasonable accuracy along with empirical equations and linear regression techniques in modelling the air entrainment rate and the aeration efficiency of a free over-fall jets issuing from triangular sharp crested weir. Further sensitivity analysis has also been performed to study the impact of input parameter on the output in terms of air entrainment rate and aeration efficiency.

Keywords: air entrainment rate, dissolved oxygen, weir, SVM, regression

Procedia PDF Downloads 403
3028 Use of Regression Analysis in Determining the Length of Plastic Hinge in Reinforced Concrete Columns

Authors: Mehmet Alpaslan Köroğlu, Musa Hakan Arslan, Muslu Kazım Körez

Abstract:

Basic objective of this study is to create a regression analysis method that can estimate the length of a plastic hinge which is an important design parameter, by making use of the outcomes of (lateral load-lateral displacement hysteretic curves) the experimental studies conducted for the reinforced square concrete columns. For this aim, 170 different square reinforced concrete column tests results have been collected from the existing literature. The parameters which are thought affecting the plastic hinge length such as cross-section properties, features of material used, axial loading level, confinement of the column, longitudinal reinforcement bars in the columns etc. have been obtained from these 170 different square reinforced concrete column tests. In the study, when determining the length of plastic hinge, using the experimental test results, a regression analysis have been separately tested and compared with each other. In addition, the outcome of mentioned methods on determination of plastic hinge length of the reinforced concrete columns has been compared to other methods available in the literature.

Keywords: columns, plastic hinge length, regression analysis, reinforced concrete

Procedia PDF Downloads 446
3027 Measurement Errors and Misclassifications in Covariates in Logistic Regression: Bayesian Adjustment of Main and Interaction Effects and the Sample Size Implications

Authors: Shahadut Hossain

Abstract:

Measurement errors in continuous covariates and/or misclassifications in categorical covariates are common in epidemiological studies. Regression analysis ignoring such mismeasurements seriously biases the estimated main and interaction effects of covariates on the outcome of interest. Thus, adjustments for such mismeasurements are necessary. In this research, we propose a Bayesian parametric framework for eliminating deleterious impacts of covariate mismeasurements in logistic regression. The proposed adjustment method is unified and thus can be applied to any generalized linear and non-linear regression models. Furthermore, adjustment for covariate mismeasurements requires validation data usually in the form of either gold standard measurements or replicates of the mismeasured covariates on a subset of the study population. Initial investigation shows that adequacy of such adjustment depends on the sizes of main and validation samples, especially when prevalences of the categorical covariates are low. Thus, we investigate the impact of main and validation sample sizes on the adjusted estimates, and provide a general guideline about these sample sizes based on simulation studies.

Keywords: measurement errors, misclassification, mismeasurement, validation sample, Bayesian adjustment

Procedia PDF Downloads 382
3026 Quantitative Structure-Activity Relationship Study of Some Quinoline Derivatives as Antimalarial Agents

Authors: M. Ouassaf, S. Belaid

Abstract:

A series of quinoline derivatives with antimalarial activity were subjected to two-dimensional quantitative structure-activity relationship (2D-QSAR) studies. Three models were implemented using multiple regression linear MLR, a regression partial least squares (PLS), nonlinear regression (MNLR), to see which descriptors are closely related to the activity biologic. We relied on a principal component analysis (PCA). Based on our results, a comparison of the quality of, MLR, PLS, and MNLR models shows that the MNLR (R = 0.914 and R² = 0.835, RCV= 0.853) models have substantially better predictive capability because the MNLR approach gives better results than MLR (R = 0.835 and R² = 0,752, RCV=0.601)), PLS (R = 0.742 and R² = 0.552, RCV=0.550) The model of MNLR gave statistically significant results and showed good stability to data variation in leave-one-out cross-validation. The obtained results suggested that our proposed model MNLR may be useful to predict the biological activity of derivatives of quinoline.

Keywords: antimalarial, quinoline, QSAR, PCA, MLR , MNLR, MLR

Procedia PDF Downloads 121
3025 Agile Software Effort Estimation Using Regression Techniques

Authors: Mikiyas Adugna

Abstract:

Effort estimation is among the activities carried out in software development processes. An accurate model of estimation leads to project success. The method of agile effort estimation is a complex task because of the dynamic nature of software development. Researchers are still conducting studies on agile effort estimation to enhance prediction accuracy. Due to these reasons, we investigated and proposed a model on LASSO and Elastic Net regression to enhance estimation accuracy. The proposed model has major components: preprocessing, train-test split, training with default parameters, and cross-validation. During the preprocessing phase, the entire dataset is normalized. After normalization, a train-test split is performed on the dataset, setting training at 80% and testing set to 20%. We chose two different phases for training the two algorithms (Elastic Net and LASSO) regression following the train-test-split. In the first phase, the two algorithms are trained using their default parameters and evaluated on the testing data. In the second phase, the grid search technique (the grid is used to search for tuning and select optimum parameters) and 5-fold cross-validation to get the final trained model. Finally, the final trained model is evaluated using the testing set. The experimental work is applied to the agile story point dataset of 21 software projects collected from six firms. The results show that both Elastic Net and LASSO regression outperformed the compared ones. Compared to the proposed algorithms, LASSO regression achieved better predictive performance and has acquired PRED (8%) and PRED (25%) results of 100.0, MMRE of 0.0491, MMER of 0.0551, MdMRE of 0.0593, MdMER of 0.063, and MSE of 0.0007. The result implies LASSO regression algorithm trained model is the most acceptable, and higher estimation performance exists in the literature.

Keywords: agile software development, effort estimation, elastic net regression, LASSO

Procedia PDF Downloads 19