Search results for: forced regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3660

3630 The Theory behind Logistic Regression

Authors: Jan Henrik Wosnitza

Abstract:

Logistic regression has developed into a standard approach for estimating conditional probabilities in a wide range of applications, including credit risk prediction. This article contributes to the current literature on logistic regression in four ways. First, it is demonstrated that binary logistic regression automatically meets its model assumptions under very general conditions. This result explains, at least in part, the popularity of logistic regression. Second, the requirement of homoscedasticity in the context of binary logistic regression is theoretically substantiated: the variances among the groups of defaulted and non-defaulted obligors have to be the same across the level of the aggregated default indicators in order to achieve linear logits. Third, this article sheds some light on the question of why nonlinear logits might be superior to linear logits when only a small amount of data is available. Fourth, an innovative methodology for estimating correlations between obligor-specific log-odds is proposed. In order to crystallize the key ideas, this paper focuses on the example of credit risk prediction. However, the results presented in this paper can easily be transferred to any other field of application.
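The link between equal group variances and linear logits can be illustrated with a small sketch. It assumes, purely for illustration, that the score of each group (defaulted, non-defaulted) is normally distributed; the function names, the prior default rate of 0.1, and all parameter values are hypothetical and are not taken from the paper. A zero second difference of the log-odds indicates a logit that is linear in the score.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log-density of a normal distribution (assumed group distribution)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def log_odds(x, mean_default, mean_nondefault, std_default, std_nondefault,
             prior_default=0.1):
    """Log-odds of default given score x, obtained from Bayes' theorem."""
    return (math.log(prior_default / (1 - prior_default))
            + gaussian_log_pdf(x, mean_default, std_default)
            - gaussian_log_pdf(x, mean_nondefault, std_nondefault))


def second_difference(f, x, h=1.0):
    """Discrete curvature: exactly zero for a linear function."""
    return f(x + h) - 2 * f(x) + f(x - h)


# Equal variances in both groups: the quadratic terms cancel, the logit is linear.
equal = lambda x: log_odds(x, 1.0, -1.0, 2.0, 2.0)
# Unequal variances: a quadratic term survives, the logit is nonlinear.
unequal = lambda x: log_odds(x, 1.0, -1.0, 2.0, 1.0)

print(second_difference(equal, 0.3))    # approximately 0
print(second_difference(unequal, 0.3))  # approximately 0.75
```

Under this Gaussian assumption the curvature is 2 · (1/(2·1²) − 1/(2·2²)) = 0.75 in the unequal case, which is why the second print is non-zero.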

Keywords: correlation, credit risk estimation, default correlation, homoscedasticity, logistic regression, nonlinear logistic regression

Procedia PDF Downloads 394
3629 Experimental on Free and Forced Heat Transfer and Pressure Drop of Copper Oxide-Heat Transfer Oil Nanofluid in Horizontal and Inclined Microfin Tube

Authors: F. Hekmatipour, M. A. Akhavan-Behabadi, B. Sajadi

Abstract:

In this paper, the combined free and forced convection heat transfer of the Copper Oxide-Heat Transfer Oil (CuO-HTO) nanofluid flow in horizontal and inclined microfin tubes is studied experimentally. The flow regime is laminar, and the pipe surface temperature is constant. The effect of the nanoparticles and the microfin tube on the heat transfer rate is investigated for Richardson numbers between 0.1 and 0.7. The results show that increasing the nanoparticle concentration from 0% to 1.5% enhances the combined free and forced convection heat transfer rate. According to the results, five correlations are proposed for estimating the free and forced heat transfer rate as the Richardson number increases from 0.1 to 0.7. The maximum deviation of these correlations is less than 16%. Moreover, four correlations are suggested to assess the Nusselt number based on the Rayleigh number in inclined tubes for Rayleigh numbers from 1,800,000 to 7,000,000. The maximum deviation of these correlations is almost 16%. The Darcy friction factor of the nanofluid flow has been investigated. Furthermore, CuO-HTO nanofluid flows in inclined microfin tubes.

Keywords: nanofluid, heat transfer oil, mixed convection, inclined tube, laminar flow

Procedia PDF Downloads 231
3628 Model Averaging for Poisson Regression

Authors: Zhou Jianhong

Abstract:

Model averaging is a desirable approach to deal with model uncertainty, which, however, has rarely been explored for Poisson regression. In this paper, we propose a model averaging procedure based on an unbiased estimator of the expected Kullback-Leibler distance for Poisson regression. A simulation study shows that the proposed model average estimator outperforms some other commonly used model selection and model average estimators in some situations. The proposed method is further applied to a real data example, and its advantage is demonstrated again.

Keywords: model averaging, Poisson regression, Kullback-Leibler distance, statistics

Procedia PDF Downloads 485
3627 Unsteady Forced Convection Flow and Heat Transfer Past a Blunt Headed Semi-Circular Cylinder at Low Reynolds Numbers

Authors: Y. El Khchine, M. Sriti

Abstract:

In the present work, the forced convection heat transfer and fluid flow past an unconfined semi-circular cylinder are investigated. A two-dimensional simulation is employed for Reynolds numbers in the range 10 ≤ Re ≤ 200, with air (Pr = 0.71) as the operating fluid, treated as a Newtonian fluid with constant physical properties. The continuity, momentum, and energy equations with appropriate boundary conditions are solved using the Computational Fluid Dynamics (CFD) solver Ansys Fluent. Various flow parameters such as the lift, drag, pressure, and skin friction coefficients, the Nusselt number, the Strouhal number, and the vortex strength are calculated. The transition from steady to time-periodic flow occurs between Re = 60 and Re = 80. The effect of the Reynolds number on heat transfer is discussed. Finally, a developed correlation of Nusselt and Strouhal numbers is presented.

Keywords: forced convection, semi-circular cylinder, Nusselt number, Prandtl number

Procedia PDF Downloads 85
3626 Establishment of the Regression Uncertainty of the Critical Heat Flux Power Correlation for an Advanced Fuel Bundle

Authors: L. Q. Yuan, J. Yang, A. Siddiqui

Abstract:

A new regression uncertainty analysis methodology was applied to determine the uncertainties of the critical heat flux (CHF) power correlation for an advanced 43-element bundle design, which was developed by Canadian Nuclear Laboratories (CNL) to achieve improved economics, resource utilization and energy sustainability. The new methodology is considered more appropriate than the traditional methodology in the assessment of the experimental uncertainty associated with regressions. The methodology was first assessed using both the Monte Carlo Method (MCM) and the Taylor Series Method (TSM) for a simple linear regression model, and then extended successfully to a non-linear CHF power regression model (CHF power as a function of inlet temperature, outlet pressure and mass flow rate). The regression uncertainty assessed by MCM agrees well with that by TSM. An equation to evaluate the CHF power regression uncertainty was developed and expressed as a function of independent variables that determine the CHF power.

Keywords: CHF experiment, CHF correlation, regression uncertainty, Monte Carlo Method, Taylor Series Method

Procedia PDF Downloads 388
3625 Non-Parametric Regression over Its Parametric Counterparts with Large Sample Size

Authors: Jude Opara, Esemokumo Perewarebo Akpos

Abstract:

This paper compares non-parametric linear regression with its parametric counterpart for a large sample size. A data set of anthropometric measurements of primary school pupils was used for the analysis; 50 randomly selected pupils were included. The data were subjected to a normality test using the Anderson-Darling technique, and the residuals from the commonly used least squares method for fitting an equation to a set of (x,y) data points were found not to be normally distributed (i.e. they do not follow a Gaussian distribution). The algorithms for the nonparametric Theil’s regression and for its parametric OLS counterpart are stated in this paper. The analysis was carried out using the software known as “R Development”. The results showed a significant relationship between the response and the explanatory variable for both the parametric and non-parametric regression. To compare the efficiency of the two methods, the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) were used, and the nonparametric regression was found to perform better than its parametric counterpart, as shown by its lower AIC and BIC values. The study recommends that future researchers carry out similar work that examines the data set for outliers, removes any that are detected, and re-analyses the data to compare results.
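As a rough illustration of the Theil’s regression mentioned in this abstract, the sketch below computes the slope as the median of all pairwise slopes. The intercept rule used here (median of y − slope·x) is one common variant and is an assumption, as are the function name and the toy data; the paper's own R implementation is not reproduced.

```python
from itertools import combinations
from statistics import median


def theil_sen(xs, ys):
    """Theil's (Theil-Sen) line: median pairwise slope, median-based intercept."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1  # skip pairs with identical x (undefined slope)
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


# Hypothetical data: y = 1 + 2x with one gross outlier in the last point.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 100]
print(theil_sen(xs, ys))  # (2.0, 1.0): the outlier does not move the fit
```

An ordinary least squares fit on the same toy data would be pulled strongly towards the outlier, which is the kind of robustness difference the abstract's comparison is concerned with.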

Keywords: Theil’s regression, Bayesian information criterion, Akaike information criterion, OLS

Procedia PDF Downloads 276
3624 Use of Multistage Transition Regression Models for Credit Card Income Prediction

Authors: Denys Osipenko, Jonathan Crook

Abstract:

Because of the variety of card holders’ behaviour types and income sources, each consumer account can move between a variety of states. Each consumer account can be inactive, transactor, revolver, delinquent, or defaulted, and each state requires an individual model for income prediction. The estimation of transition probabilities between statuses at the account level helps to avoid the memorylessness of the Markov chain approach. This paper investigates approaches to estimating transition probabilities for credit card income prediction at the account level. The key question of the empirical research is which approach gives more accurate results: multinomial logistic regression or multistage conditional logistic regression with a binary target. Both models have shown moderate predictive power. Prediction accuracy for the conditional logistic regression depends on the order of the stages of the conditional binary logistic regressions. On the other hand, multinomial logistic regression is easier to use and gives integrated estimates for all states without priorities. Further investigations can therefore concentrate on alternative modeling approaches such as discrete choice models.

Keywords: multinomial regression, conditional logistic regression, credit account state, transition probability

Procedia PDF Downloads 460
3623 Internet Purchases in European Union Countries: Multiple Linear Regression Approach

Authors: Ksenija Dumičić, Anita Čeh Časni, Irena Palić

Abstract:

This paper examines the influence of economic and Information and Communication Technology (ICT) development on the recent increase in Internet purchases by individuals in European Union member states. After a growing trend in Internet purchases in the EU27 was noticed, an all-possible-regressions analysis was applied using nine independent variables for 2011. Finally, two linear regression models were studied in detail. The simple linear regression analysis confirmed the research hypothesis that Internet purchases in the analysed EU countries are positively correlated with the statistically significant variable Gross Domestic Product per capita (GDPpc). The multiple linear regression model with four regressors reflecting the level of ICT development also indicates that ICT development is crucial for explaining Internet purchases by individuals, confirming the research hypothesis.

Keywords: European Union, Internet purchases, multiple linear regression model, outlier

Procedia PDF Downloads 277
3622 Failing to Protect Bare Life During the COVID-19 Pandemic: Forced Migrants as Carriers of the Virus

Authors: Claudia Donoso

Abstract:

This study compares the restriction of mobility of migrants and asylum seekers during the COVID-19 pandemic in the United States and Ecuador. Based on the discourse analysis of anti-migrant rhetoric in press articles, migrant stories in the press, reports, and border control practices, the study examines the Ecuadorian government’s response to the migration flow of Venezuelans and the United States enforcement practices against Latin American asylum seekers. By exploring Giorgio Agamben’s concept of bare life, the article argues that this failure to protect mobility rights is due to the United States and Ecuador’s views of forced migrants as bare life and carriers of the virus, justifying xenophobia, resistance to humanitarian international law, and exceptionalism. By drawing on a feminist intersectional approach, the study adds to recent research on the securitization of forced migration and challenges the race/ethnicity, immigration status, class, and nationality-based discrimination of the measures undertaken during the pandemic. The article illustrates how the treatment of forced migrants as bare life was aggravated by their intersectional inequalities. It concludes by providing recommendations that could be enforced by the US and Ecuadorian governments to protect the right to freedom of mobility.

Keywords: bare life, intersectionality, mobility rights, COVID-19, Ecuador, United States

Procedia PDF Downloads 49
3621 Prediction of Unsteady Heat Transfer over Square Cylinder in the Presence of Nanofluid by Using ANN

Authors: Ajoy Kumar Das, Prasenjit Dey

Abstract:

Heat transfer due to forced convection of a copper-water nanofluid has been predicted by an Artificial Neural Network (ANN). The nanofluid is formed by mixing copper nanoparticles in water; the volume fractions considered here range from 0% to 15%, and the Reynolds number is kept constant at 100. The back propagation algorithm is used to train the network. The ANN is trained on input and output data obtained from numerical simulations performed in the finite volume based commercial Computational Fluid Dynamics (CFD) software Ansys Fluent. The numerical simulation results are compared with the back propagation based ANN results. It is found that the forced convection heat transfer of a water based nanofluid can be predicted correctly by the ANN. It is also observed that the back propagation ANN can predict the heat transfer characteristics of the nanofluid very quickly compared to the standard CFD method.

Keywords: forced convection, square cylinder, nanofluid, neural network

Procedia PDF Downloads 297
3620 Optimization of Slider Crank Mechanism Using Design of Experiments and Multi-Linear Regression

Authors: Galal Elkobrosy, Amr M. Abdelrazek, Bassuny M. Elsouhily, Mohamed E. Khidr

Abstract:

Crank shaft length, connecting rod length, crank angle, engine rpm, cylinder bore, mass of piston and compression ratio are the inputs that control the performance of the slider crank mechanism and hence its efficiency. Several combinations of these seven inputs are used and compared. The throughput engine torque predicted by the simulation is analyzed through two different regression models, with and without interaction terms, developed according to multi-linear regression using LU decomposition to solve the system of algebraic equations. These models are validated. A regression model in seven inputs including their interaction terms lowered the polynomial degree from 3rd degree to 1st degree and suggested valid predictions and stable explanations.
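The combination of multi-linear regression and LU decomposition described in this abstract might look roughly like the following sketch. It assumes the coefficients are obtained from the normal equations (XᵀX)b = Xᵀy, factorised with a Doolittle LU scheme without pivoting; the function names and the two-input toy data are hypothetical, and the paper's actual seven-input model is not reproduced.

```python
def lu_decompose(a):
    """Doolittle LU factorisation without pivoting: a = L U, unit diagonal in L."""
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):  # row i of U
            upper[i][j] = a[i][j] - sum(lower[i][k] * upper[k][j] for k in range(i))
        for j in range(i + 1, n):  # column i of L
            lower[j][i] = (a[j][i] - sum(lower[j][k] * upper[k][i]
                                         for k in range(i))) / upper[i][i]
    return lower, upper


def lu_solve(lower, upper, b):
    """Solve L U x = b by forward and then back substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = b[i] - sum(lower[i][k] * y[k] for k in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(upper[i][k] * x[k] for k in range(i + 1, n))) / upper[i][i]
    return x


def fit_linear(rows, targets):
    """Least squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + list(row) for row in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(p)]
    lower, upper = lu_decompose(xtx)
    return lu_solve(lower, upper, xty)


# Hypothetical data generated from y = 1 + 2*x1 - 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
targets = [1, 3, -2, 0, 2]
print(fit_linear(rows, targets))  # close to [1.0, 2.0, -3.0]
```

An interaction term could be added in the same framework by appending the product of two inputs as an extra column of each row before calling `fit_linear`.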

Keywords: design of experiments, regression analysis, SI engine, statistical modeling

Procedia PDF Downloads 154
3619 An Epsilon Hierarchical Fuzzy Twin Support Vector Regression

Authors: Arindam Chaudhuri

Abstract:

The research presents epsilon-hierarchical fuzzy twin support vector regression (epsilon-HFTSVR) based on epsilon-fuzzy twin support vector regression (epsilon-FTSVR) and epsilon-twin support vector regression (epsilon-TSVR). Epsilon-FTSVR is obtained by incorporating trapezoidal fuzzy numbers into epsilon-TSVR, which takes care of the uncertainty existing in forecasting problems. Epsilon-FTSVR determines a pair of epsilon-insensitive proximal functions by solving two related quadratic programming problems. The structural risk minimization principle is implemented by introducing a regularization term in the primal problems of epsilon-FTSVR. This yields stable positive definite dual problems, which improves regression performance. Epsilon-FTSVR is then reformulated as epsilon-HFTSVR, consisting of a set of hierarchical layers each containing epsilon-FTSVR. Experimental results on both synthetic and real datasets reveal that epsilon-HFTSVR has remarkable generalization performance with minimum training time.

Keywords: regression, epsilon-TSVR, epsilon-FTSVR, epsilon-HFTSVR

Procedia PDF Downloads 332
3618 Nonparametric Truncated Spline Regression Model on the Data of Human Development Index in Indonesia

Authors: Kornelius Ronald Demu, Dewi Retno Sari Saputro, Purnami Widyaningsih

Abstract:

The Human Development Index (HDI) is a standard measurement of a country's human development. Several factors may influence it, such as life expectancy, gross domestic product (GDP) based on the province's annual expenditure, the number of poor people, and the percentage of illiterate people. The scatter plots of HDI against these factors do not follow a specific pattern or form. Therefore, a nonparametric regression model can be applied to the HDI data in Indonesia. The estimation of the regression curve in a nonparametric regression model is flexible because it follows the shape of the data pattern. One nonparametric regression method is the truncated spline, a modification of segmented polynomial functions. The estimator of a truncated spline regression model is affected by the selection of the optimal knot points. A knot point is a focus point of the truncated spline function. The optimal knot points were determined by the minimum value of generalized cross validation (GCV). In this article, a truncated spline nonparametric regression model was applied to the HDI data. The best truncated spline regression model for the HDI data in Indonesia was obtained with the combination of optimal knot points 5-5-5-4. Life expectancy and the percentage of illiterate people were the significant factors affecting the HDI in Indonesia. The coefficient of determination is 94.54%, which means the regression model is good enough to be applied to the HDI data in Indonesia.
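To make the truncated spline and the GCV criterion from this abstract more concrete, the sketch below builds the truncated power basis for a single predictor and evaluates the standard GCV score from a residual sum of squares and the trace of the hat matrix. The function names, the linear degree, and the example knots and coefficients are assumptions for illustration; they are not the authors' fitted model.

```python
def truncated_power_basis(x, knots, degree=1):
    """Basis [1, x, ..., x^degree, (x - k1)_+^degree, ...] for degree >= 1."""
    polynomial = [x ** p for p in range(degree + 1)]
    truncated = [max(x - knot, 0.0) ** degree for knot in knots]
    return polynomial + truncated


def spline_value(x, coefficients, knots, degree=1):
    """Evaluate a truncated spline with the given basis coefficients."""
    basis = truncated_power_basis(x, knots, degree)
    return sum(c * b for c, b in zip(coefficients, basis))


def gcv_score(rss, n, trace_hat):
    """GCV = (RSS / n) / (1 - trace(H) / n)^2 for a linear smoother with hat matrix H."""
    return (rss / n) / (1.0 - trace_hat / n) ** 2


# Hypothetical linear spline with knots at 2 and 5.
print(truncated_power_basis(3.0, [2.0, 5.0]))                  # [1.0, 3.0, 1.0, 0.0]
print(spline_value(3.0, [1.0, 2.0, -4.0, 7.0], [2.0, 5.0]))    # 3.0
```

In a knot search, each candidate set of knots would be fitted by least squares on this basis and the set with the smallest `gcv_score` retained.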

Keywords: generalized cross validation (GCV), Human Development Index (HDI), knots point, nonparametric regression, truncated spline

Procedia PDF Downloads 305
3617 Regression Model Evaluation on Depth Camera Data for Gaze Estimation

Authors: James Purnama, Riri Fitri Sari

Abstract:

We investigate the machine learning algorithm selection problem in the context of depth image based eye gaze estimation, with respect to its essential difficulty in reducing the number of required training samples and the duration of training. Statistics based measures of prediction accuracy are increasingly used to assess and evaluate prediction or estimation in gaze estimation. This article uses Root Mean Squared Error (RMSE) and R-Squared statistical analysis to assess machine learning methods on depth camera data for gaze estimation. Four machine learning methods were evaluated: Random Forest Regression, Regression Tree, Support Vector Machine (SVM), and Linear Regression. The experimental results show that Random Forest Regression has the lowest RMSE and the highest R-Squared, which means that it is the best of the methods compared.
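The two evaluation metrics named in this abstract can be written down in a few lines. The sketch below uses the usual definitions of RMSE and R-Squared; the function names and the toy values are hypothetical and unrelated to the gaze data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical targets and predictions from some regression model.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
print(rmse(actual, predicted))       # 1.0
print(r_squared(actual, predicted))  # about 0.2
```

A lower RMSE and a higher R-Squared both indicate a closer fit, which is the sense in which the abstract ranks the four methods.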

Keywords: gaze estimation, gaze tracking, eye tracking, kinect, regression model, orange python

Procedia PDF Downloads 508
3616 Assessment of the Impact of Social Compliance Certification on Abolition of Forced Labour and Discrimination in the Garment Manufacturing Units in Bengaluru: A Perspective of Women Sewing Operators

Authors: Jonalee Das Bajpai, Sandeep Shastri

Abstract:

The Indian Textile and Garment Industry is one of the major contributors to the country’s economy. This industry is also one of the largest labour intensive industries after agriculture and livestock. The Indian garment industry caters to both the domestic and international markets. Although this industry comes under the purview of Indian Labour Laws and other voluntary workplace standards, it is often criticized for the undue exploitation of workers. This paper explored the status of forced labour and discrimination at the workplace in the garment manufacturing units in Bengaluru. The study is conducted from the perspective of women sewing operators, as the majority of operators in Bengaluru are women. The research also studied the impact of social compliance certification in abolishing forced labour and discrimination at the workplace. Objectives of the Research: 1. To study the impact of 'Social Compliance Certification' on abolition of forced labour among the women workforce. 2. To study the impact of 'Social Compliance Certification' on abolition of discrimination at workplace among the women workforce. Sample Size and Data Collection Techniques: The primary data, which form the backbone of the study, were collected through a structured questionnaire. The questionnaire attempted to explore the extent of prevalence of forced labour and discrimination against women workers from the perspective of the women workers themselves. The sample comprised 600 (n) women sewing operators from the garment industry with a minimum of one year of work experience. Three hundred samples were selected from units with Social Compliance Certification like SA8000, WRAP, BSCI, ETI and so on. The other three hundred samples were selected from units without Social Compliance Certification.
Out of these three hundred samples, one hundred and fifty were selected from units with a Buyer’s Code of Conduct and another one hundred and fifty from domestic units that do not come under the purview of any such certification. The responses to the survey were further authenticated through on-site visits and personal interactions. Comparative analysis of the workplace environment between units with Social Compliance certification, units with a Buyer’s Code of Conduct and domestic units that do not come under the purview of any such voluntary workplace standard made it possible to analyze the impact of Social Compliance certification on abolition of workplace environment and discrimination at workplace. Correlation analysis was conducted to measure the relationship between the impact of forced labour and discrimination at the workplace and the level of job satisfaction. The results showed that abolition of forced labour and abolition of discrimination at the workplace are associated with a higher level of job satisfaction among the women workers.

Keywords: discrimination, garment industry, forced labour, social compliance certification

Procedia PDF Downloads 171
3615 Generalized Extreme Value Regression with Binary Dependent Variable: An Application for Predicting Meteorological Drought Probabilities

Authors: Retius Chifurira

Abstract:

The logistic regression model is the most widely used regression model for predicting meteorological drought probabilities. When the dependent variable is extreme, the logistic model fails to adequately capture drought probabilities. In order to adequately predict drought probabilities, we use the generalized linear model (GLM) with the quantile function of the generalized extreme value distribution (GEVD) as the link function. Maximum likelihood estimation is used to estimate the parameters of the generalized extreme value (GEV) regression model. We compare the performance of the logistic and the GEV regression models in predicting drought probabilities for Zimbabwe. The performance of the regression models is assessed using two goodness-of-fit measures, namely the relative root mean square error (RRMSE) and the relative mean absolute error (RMAE). Results show that the GEV regression model performs better than the logistic model, thereby providing a good alternative candidate for predicting drought probabilities. This paper provides the first application of a GLM derived from extreme value theory to predict drought probabilities for a drought-prone country such as Zimbabwe.
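The GEV link described in this abstract can be sketched as a pair of functions: the link maps a probability to the linear predictor through the GEV quantile function, and the inverse link maps the linear predictor back to a probability through the GEV distribution function. The sketch assumes the standardised GEV (location 0, scale 1); the function names and the shape value used in the example are hypothetical, and no parameter estimates from the paper are implied.

```python
import math


def gev_inverse_link(eta, shape):
    """Probability from linear predictor: standard GEV distribution function."""
    if abs(shape) < 1e-12:  # Gumbel limit as shape -> 0
        return math.exp(-math.exp(-eta))
    base = 1.0 + shape * eta
    if base <= 0.0:  # outside the support of the distribution
        return 0.0 if shape > 0 else 1.0
    return math.exp(-base ** (-1.0 / shape))


def gev_link(p, shape):
    """Linear predictor from probability: standard GEV quantile function."""
    if abs(shape) < 1e-12:
        return -math.log(-math.log(p))
    return ((-math.log(p)) ** (-shape) - 1.0) / shape


# Round trip for a hypothetical shape parameter.
eta = gev_link(0.3, shape=-0.25)
print(gev_inverse_link(eta, shape=-0.25))  # approximately 0.3
```

Unlike the logit, this link is asymmetric around p = 0.5, which is the property that makes it a candidate for rare or extreme binary outcomes.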

Keywords: generalized extreme value distribution, general linear model, mean annual rainfall, meteorological drought probabilities

Procedia PDF Downloads 159
3614 Numerical Simulation of a Solar Photovoltaic Panel Cooled by a Forced Air System

Authors: Djamila Nebbali, Rezki Nebbali, Ahmed Ouibrahim

Abstract:

This study focuses on the cooling of a photovoltaic (PV) panel. Cooling improves the conversion capacity of the panel and, under extreme air temperatures, keeps the panel temperature at an acceptable level, which avoids degradation. To do this, a fan provides forced circulation of air. Because the fan is supplied by the panel, it is necessary to determine the optimum operating point that balances the efficiency of the PV panel against the consumption of the fan. For this purpose, numerical simulations are performed at varying mass flow rates of air, under two extreme air temperatures (50°C, 25°C) and a fixed solar radiation (1000 W/m²) in the case of no wind.

Keywords: energy conversion, efficiency, balance energy, solar cell

Procedia PDF Downloads 388
3613 The Effect of Sexual Assault on Sport Participation Trajectories from Adolescence through Young Adulthood

Authors: Chung Gun Lee

Abstract:

Objectives: Certain life change events have been shown to have strong effects on physical activity-related behavior, but more research is needed to investigate the longer-term effects of different life change events on physical activity-related behaviors. The purpose of this study is to examine the effect of experiencing physically or non-physically forced sexual activity on sports participation from adolescence to young adulthood. Methods: This study used the National Longitudinal Study of Adolescent Health (Add Health) data. Group-based trajectory modeling was utilized to examine the effect of experiencing sexual assault on trajectories of sports participation from adolescence to young adulthood. Results: Male participants were divided into three trajectory groups (i.e., Low-stable, High-decreasing, and High-stable) and female participants were divided into two trajectory groups (i.e., Low-stable and High-decreasing). The main finding of this study is that experiencing non-physically forced sexual activity significantly decreased sports participation throughout the trajectory among women in the ‘High-decreasing’ group. The effect of non-physically forced sexual activity on women’s sports participation was considerably weakened and became insignificant after including psychological depression in the model as a potential mediator. Discussion: Special attention should be paid to sport participation among women victims of non-physically forced sexual activity. Further studies are needed to examine other potential mediators in addition to psychological depression when examining the effect of non-physically forced sexual activity on sport participation in women.

Keywords: adolescent, group-based trajectory modeling, sexual assault, young adult

Procedia PDF Downloads 124
3612 Influence of Mass Flow Rate on Forced Convective Heat Transfer through a Nanofluid Filled Direct Absorption Solar Collector

Authors: Salma Parvin, M. A. Alim

Abstract:

The convective and radiative heat transfer performance and entropy generation of forced convection through a direct absorption solar collector (DASC) are investigated numerically. Four different fluids, namely Cu-water nanofluid, Al2O3-water nanofluid, TiO2-water nanofluid, and pure water, are used as the working fluid. Entropy production has been taken into account in addition to the collector efficiency and heat transfer enhancement. The penalty finite element method with Galerkin’s weighted residual technique is used to solve the governing non-linear partial differential equations. Numerical simulations are performed for varying mass flow rates. The outcomes are presented in the form of isotherms, average output temperature, average Nusselt number, collector efficiency, average entropy generation, and Bejan number. The results show that the rate of heat transfer and the collector efficiency increase significantly as the mass flow rate m is raised, up to a certain range.

Keywords: DASC, forced convection, mass flow rate, nanofluid

Procedia PDF Downloads 262
3611 An Assessment of Involuntary Migration in India: Understanding Issues and Challenges

Authors: Rajni Singh, Rakesh Mishra, Mukunda Upadhyay

Abstract:

India is among the nations born out of a partition that led to one of the greatest forced migrations of the past century. The Indian subcontinent was partitioned into two nation-states, namely India and Pakistan. This led to an unprecedented mass displacement of about 20 million people in the subcontinent as a whole. This exemplifies the socio-political version of displacement, but there are other identified reasons for human displacement, viz. natural calamities, development projects, and people-trafficking and smuggling. Although forced migrations are rare in incidence, they are mostly region-specific, and a very small percentage of the population appears to be affected by them. However, when this percentage is translated into volume, the real impact created by such migration can be realized. Forced migration is thus an issue related to the lives of many people and needs to be addressed with proper intervention. Forced or involuntary migration decimates people's assets, takes from them their most basic resources, and makes them migrate without planning or intention. In most cases this proves to be a burden on the resources of the destination. Thus, questions about security arise with regard to the protection and safeguards for these migrants, who need help at the place of destination. This brings the human security dimension of forced migration into the picture. The present study is an analysis of a sample of 1501 persons surveyed by the NSSO (National Sample Survey Organisation) in India, which identifies three reasons for forced migration: natural disaster, social/political problem, and displacement by development projects. It was observed that internally displaced persons made up about four-fifths of the total forced migrants.
However, there was also a huge inflow of such migrants to the country from across the borders, the major contributing countries being Bangladesh, Pakistan, Sri Lanka, the Gulf countries and Nepal. Among the three reasons for involuntary migration, social and political problems are the most prominent in displacing huge masses of population; this is also the reason for which the share of international migrants relative to internally displaced persons is higher than for the other two reasons. Second to political and social problems, natural calamities displaced a high proportion of the involuntary migrants. The present paper examines the factors which increase people's vulnerability to forced migration. On perusing the background characteristics of the migrants, it was seen that those who were economically weak and socially fragile are more susceptible to migration. Therefore, insight into this fragile group of society is required so that government policies can benefit it in the most efficient and targeted manner.

Keywords: involuntary migration, displacement, natural disaster, social and political problem

Procedia PDF Downloads 329
3610 The Extended Skew Gaussian Process for Regression

Authors: M. T. Alodat

Abstract:

In this paper, we propose a generalization of the Gaussian process regression (GPR) model called the extended skew Gaussian process for regression (ESGPr) model. The ESGPr model works better than the GPR model when the errors are skewed. We derive the predictive distribution for the ESGPr model at a new input. We also apply the ESGPr model to FOREX data and find that it fits the FOREX data better than the GPR model.

Keywords: extended skew normal distribution, Gaussian process for regression, predictive distribution, ESGPr model

Procedia PDF Downloads 520
3609 Integrated Nested Laplace Approximations For Quantile Regression

Authors: Kajingulu Malandala, Ranganai Edmore

Abstract:

The asymmetric Laplace distribution (ALD) is commonly used as the likelihood function in Bayesian quantile regression, and it offers different families of likelihood methods for quantile regression. Notwithstanding its popularity and practicality, the ALD is not smooth, which makes it difficult to maximize its likelihood. Furthermore, Bayesian inference is time consuming, and the selection of the likelihood may mislead the inference, as Bayes' theorem does not automatically establish the posterior inference. In addition, the ALD does not account for greater skewness and kurtosis. This paper develops a new quantile regression approach for count data based on the inverse of the cumulative distribution function of the Poisson, binomial and Delaporte distributions using integrated nested Laplace approximations. Our results validate the benefit of using integrated nested Laplace approximations and support the approach for count data.
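The connection between the asymmetric Laplace likelihood and quantile regression runs through the check (pinball) loss: maximising an asymmetric Laplace likelihood in its location parameter is equivalent to minimising this loss, and the kink of the loss at zero is the non-smoothness the abstract refers to. The sketch below illustrates only that background fact for an intercept-only model; the function names and data are hypothetical, and the paper's INLA-based approach for count data is not implemented here.

```python
def check_loss(residual, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0]); kinked at u = 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_check_loss(location, data, tau):
    """Sum of check losses of the data around a candidate location."""
    return sum(check_loss(y - location, tau) for y in data)


def sample_quantile_by_check_loss(data, tau):
    """Data point that minimises the total check loss (intercept-only model)."""
    return min(sorted(data), key=lambda c: total_check_loss(c, data, tau))


# Hypothetical count-like data.
data = [1, 2, 3, 4, 10]
print(sample_quantile_by_check_loss(data, 0.5))  # 3, the median
print(sample_quantile_by_check_loss(data, 0.9))  # 10, an upper quantile
```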

Keywords: quantile regression, Delaporte distribution, count data, integrated nested Laplace approximation

Procedia PDF Downloads 134
3608 The Use of Geographically Weighted Regression for Deforestation Analysis: Case Study in Brazilian Cerrado

Authors: Ana Paula Camelo, Keila Sanches

Abstract:

Geographically Weighted Regression (GWR) was proposed in the geography literature to allow the relationships in a regression model to vary over space. In Brazil, the agricultural exploitation of the Cerrado Biome is the main cause of deforestation. In this study, we propose a methodology using geostatistical methods to characterize the spatial dependence of deforestation in the Cerrado based on agricultural production indicators. To this end, a set of exploratory spatial data analysis (ESDA) tools was used, followed by confirmatory analysis using GWR. A non-spatial model was calibrated, the nature of the regression curve was evaluated, the variables were selected by a stepwise process, and multicollinearity was analyzed. After the non-spatial model was evaluated, the spatial regression model was fitted, the intercept was evaluated statistically, and its effect on calibration was verified. Spearman's correlation between deforestation and livestock was +0.783, and between deforestation and soybeans +0.405. The model presented R²=0.936 and showed a strong spatial dependence of the agricultural activity of soybeans associated with maize and cotton crops. GWR is a very effective tool, presenting results closer to the reality of deforestation in the Cerrado than other analyses.
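The core idea of GWR, fitting a separate weighted least-squares regression at each location with weights that decay with distance, can be sketched as follows. This is a minimal illustration assuming a Gaussian kernel with a fixed bandwidth; the function name, the kernel choice and the synthetic data are assumptions, not details taken from the study.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Return one coefficient vector (intercept first) per observation site."""
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with an intercept column.
    X1 = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    betas = np.empty((len(y), X1.shape[1]))
    for i, c in enumerate(coords):
        # Gaussian kernel weights based on squared distance to site i.
        d2 = np.sum((coords - c) ** 2, axis=1)
        w = np.exp(-0.5 * d2 / bandwidth ** 2)
        # Weighted normal equations: (X' W X) beta = X' W y.
        XtW = X1.T * w
        betas[i] = np.linalg.solve(XtW @ X1, XtW @ y)
    return betas

# Synthetic check (hypothetical data): a spatially constant relationship
# y = 2 + 3x should be recovered at every site.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 1.0, size=(50, 2))
x = rng.normal(size=50)
y = 2.0 + 3.0 * x
betas = gwr_coefficients(coords, x, y, bandwidth=0.5)
```

When the true relationship varies over space, the rows of `betas` differ from site to site, which is what a non-spatial model cannot represent.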

Keywords: deforestation, geographically weighted regression, land use, spatial analysis

Procedia PDF Downloads 329
3607 Weighted Rank Regression with Adaptive Penalty Function

Authors: Kang-Mo Jung

Abstract:

The use of regularization for statistical methods has become popular. The least absolute shrinkage and selection operator (LASSO) framework has become the standard tool for sparse regression. However, it is well known that the LASSO is sensitive to outliers or leverage points. We consider a new robust estimator composed of a weighted loss function of the pairwise differences of residuals and an adaptive penalty function that regulates the tuning parameter for each variable. Rank regression is resistant to regression outliers, but not to leverage points. By adopting a weighted loss function, the proposed method is robust to leverage points of the predictor variables. Furthermore, the adaptive penalty function provides good statistical properties in variable selection, such as the oracle property and consistency. We develop an efficient algorithm to compute the proposed estimator using basic functions in R. We use an optimal tuning parameter based on the Bayesian information criterion (BIC). Numerical simulation shows that the proposed estimator is effective for analyzing real data sets and contaminated data.
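One plausible form of such an objective is sketched below in Python (chosen for illustration; the paper itself uses R). The product weights `w_i * w_j` on residual pairs and the adaptive-LASSO-style penalty `|beta_k| / |beta_init_k|` are assumptions made here, since the abstract does not give the exact weights or penalty; all names are hypothetical.

```python
import numpy as np

def weighted_rank_objective(beta, X, y, weights, lam, beta_init):
    """Weighted pairwise-residual loss plus an adaptive L1 penalty (sketch)."""
    beta = np.asarray(beta, dtype=float)
    weights = np.asarray(weights, dtype=float)
    e = np.asarray(y, dtype=float) - np.asarray(X, dtype=float) @ beta
    # |e_i - e_j| for every pair, counted once (i < j).
    diff = np.abs(e[:, None] - e[None, :])
    pair_w = weights[:, None] * weights[None, :]
    iu = np.triu_indices(len(e), k=1)
    loss = np.sum(pair_w[iu] * diff[iu])
    # Adaptive penalty: coefficients with a small initial estimate are
    # penalized more heavily.
    penalty = lam * np.sum(np.abs(beta) / np.abs(np.asarray(beta_init, dtype=float)))
    return float(loss + penalty)

# Tiny example with unit weights (hypothetical data).
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.ones(3)
at_truth = weighted_rank_objective([1.0], X, y, w, lam=0.5, beta_init=[1.0])
at_zero = weighted_rank_objective([0.0], X, y, w, lam=0.5, beta_init=[1.0])
```

Down-weighting observations with outlying predictor values (small `weights`) is what limits the influence of leverage points in this kind of loss.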

Keywords: adaptive penalty function, robust penalized regression, variable selection, weighted rank regression

Procedia PDF Downloads 431
3606 Large Time Asymptotic Behavior to Solutions of a Forced Burgers Equation

Authors: Satyanarayana Engu, Ahmed Mohd, V. Murugan

Abstract:

We study the large time asymptotics of solutions to the Cauchy problem for a forced Burgers equation (FBE) with initial data that is continuous and summable on R. To this end, we first derive explicit solutions of the FBE in terms of Hermite polynomials, assuming a different class of initial data. Later, by relaxing this assumption, we prove the existence of a solution to the considered Cauchy problem. Finally, we give an asymptotic approximate solution and establish that the error is of order O(t^(-1/2)) with respect to the L^p-norm, where 1≤p≤∞, for large time.

Keywords: Burgers equation, Cole-Hopf transformation, Hermite polynomials, large time asymptotics

Procedia PDF Downloads 295
3605 MapReduce Logistic Regression Algorithms with RHadoop

Authors: Byung Ho Jung, Dong Hoon Lim

Abstract:

Logistic regression is a statistical method for analyzing a dataset in which there are one or more independent variables that determine an outcome. Logistic regression is used extensively in numerous disciplines, including the medical and social science fields. In this paper, we address the problem of estimating parameters in logistic regression based on the MapReduce framework with RHadoop, which integrates the R and Hadoop environments and is applicable to large-scale data. There exist three learning algorithms for logistic regression, namely the gradient descent method, the cost minimization method and the Newton-Raphson method. The Newton-Raphson method does not require a learning rate, while the gradient descent and cost minimization methods require one to be picked manually. The experimental results demonstrated that our learning algorithms using RHadoop can scale well and efficiently process large data sets on commodity hardware. We also compared the performance of our Newton-Raphson method with the gradient descent and cost minimization methods. The results showed that our Newton-Raphson method appeared to be the most robust across all data tested.
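The difference between the two update rules can be seen in a small in-memory sketch. This is plain NumPy on synthetic data, not the MapReduce/RHadoop implementation described above; the function names, the learning rate and the iteration counts are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_newton(X, y, n_iter=25):
    """Newton-Raphson: the step is scaled by the inverse Hessian, no learning rate."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (y - p)                        # gradient of the log-likelihood
        hess = (X * (p * (1.0 - p))[:, None]).T @ X  # negative Hessian, X' W X
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def fit_gradient(X, y, lr=0.5, n_iter=2000):
    """Gradient ascent on the mean log-likelihood: lr must be chosen by hand."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        beta = beta + lr * X.T @ (y - p) / len(y)
    return beta

# Synthetic data (hypothetical): intercept -0.5, slope 1.0.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = (rng.uniform(size=200) < sigmoid(X @ np.array([-0.5, 1.0]))).astype(float)

beta_newton = fit_newton(X, y)
beta_gradient = fit_gradient(X, y)
```

Both routines converge to the same maximum-likelihood estimate here; Newton-Raphson needs far fewer iterations but solves a linear system at each step.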

Keywords: big data, logistic regression, MapReduce, RHadoop

Procedia PDF Downloads 245
3604 A Generalized Weighted Loss for Support Vector Classification and Multilayer Perceptron

Authors: Filippo Portera

Abstract:

Standard algorithms usually employ a loss in which, for a regression task, each error is the mere absolute difference between the true value and the prediction. In the present work, we present several error weighting schemes that generalize this consolidated routine. We study both a binary classification model for Support Vector Classification and a regression net for the Multilayer Perceptron. The results show that the error is never worse than with the standard procedure, and in several cases it is better.

Keywords: loss, binary-classification, MLP, weights, regression

Procedia PDF Downloads 63
3603 Interference among Lambsquarters and Oil Rapeseed Cultivars

Authors: Reza Siyami, Bahram Mirshekari

Abstract:

Seed and oil yield of rapeseed is considerably affected by weed interference, including mustard (Sinapis arvensis L.), lambsquarters (Chenopodium album L.) and redroot pigweed (Amaranthus retroflexus L.), throughout the East Azerbaijan province in Iran. To formulate the relationship between the four independent growth variables measured in our experiment and a dependent variable, multiple regression analysis was carried out with the weed leaf number per plant (X1), green cover percentage (X2), LAI (X3) and leaf area per plant (X4) as independent variables and rapeseed oil yield as the dependent variable. The multiple regression equation is as follows: Seed essential oil yield (kg/ha) = 0.156 + 0.0325 (X1) + 0.0489 (X2) + 0.0415 (X3) + 0.133 (X4). Furthermore, stepwise regression analysis was carried out on the data obtained to test the significance of the independent variables affecting the oil yield as the dependent variable. The resulting stepwise regression equation is as follows: Oil yield = 4.42 + 0.0841 (X2) + 0.0801 (X3); R² = 81.5%. The stepwise regression analysis verified that the green cover percentage and LAI of the weed had a marked increasing effect on the oil yield of rapeseed.
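For readers who want to evaluate the reported stepwise equation directly, a minimal sketch follows. The coefficients are copied from the abstract; the function name and the input values are hypothetical, and the units of the result are whatever units the authors used for oil yield, which the abstract does not state for this equation.

```python
def oil_yield_stepwise(green_cover_pct, lai):
    """Evaluate: oil yield = 4.42 + 0.0841 * X2 + 0.0801 * X3."""
    return 4.42 + 0.0841 * green_cover_pct + 0.0801 * lai

# Hypothetical inputs: X2 = 10 (green cover percentage), X3 = 2 (LAI).
example = oil_yield_stepwise(10.0, 2.0)
```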

Keywords: green cover percentage, independent variable, interference, regression

Procedia PDF Downloads 389
3602 Copula-Based Estimation of Direct and Indirect Effects in Path Analysis Model

Authors: Alam Ali, Ashok Kumar Pathak

Abstract:

Path analysis is a statistical technique used to evaluate the strength of the direct and indirect effects of variables. One or more structural regression equations are used to estimate a series of parameters in order to find the best fit to the data. Sometimes, exogenous variables do not show significant direct and indirect effects when the assumptions of classical regression (ordinary least squares (OLS)) are violated by the nature of the data. The main aim of this article is to investigate the efficacy of the copula-based regression approach over the classical regression approach and to calculate the direct and indirect effects of variables when the data violate the OLS assumptions and the variables are linked through an elliptical copula. We perform this study using a well-organized numerical scheme. Finally, a real data application is presented to demonstrate the superiority of the copula approach.

Keywords: path analysis, copula-based regression models, direct and indirect effects, k-fold cross validation technique

Procedia PDF Downloads 46
3601 Performance Analysis of Proprietary and Non-Proprietary Tools for Regression Testing Using Genetic Algorithm

Authors: K. Hema Shankari, R. Thirumalaiselvi, N. V. Balasubramanian

Abstract:

The present paper addresses research in the area of regression testing, with emphasis on automated tools as well as the prioritization of test cases. The uniqueness of regression testing and its cyclic nature are pointed out. The difference in approach between industry, with the business model as its basis, and academia, with its focus on data mining, is highlighted. Test metrics are discussed as a prelude to our formula for prioritization; a case study is further discussed to illustrate this methodology. An industrial case study is also described in the paper, in which the number of test cases is so large that they have to be grouped into test suites. In such situations, a genetic algorithm proposed by us can be used to reconfigure these test suites in each cycle of regression testing. A comparison is made between a proprietary tool and an open source tool using the above-mentioned metrics. Our approach is clarified through several tables.
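The keywords below mention the APFD (Average Percentage of Faults Detected) metric. A minimal sketch of its standard definition is given here for orientation; it is not the authors' own prioritization formula, which the abstract does not state, and the function name and example ordering are hypothetical.

```python
def apfd(first_detect_positions, n_tests):
    """APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2 * n).

    first_detect_positions: for each of the m faults, the 1-based position
    in the prioritized order of the first test case that reveals it.
    n_tests: total number n of test cases in the ordering.
    """
    m = len(first_detect_positions)
    return 1.0 - sum(first_detect_positions) / (n_tests * m) + 1.0 / (2 * n_tests)

# Hypothetical ordering of 5 tests that reveals 4 faults at positions 1, 1, 2, 3.
score = apfd([1, 1, 2, 3], n_tests=5)
```

Orderings that expose faults earlier give smaller positions and therefore a higher APFD, which is why the metric is used to compare test case prioritizations.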

Keywords: APFD metric, genetic algorithm, regression testing, RFT tool, test case prioritization, selenium tool

Procedia PDF Downloads 401