Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 238

Search results for: Regression

238 Orthogonal Regression for Nonparametric Estimation of Errors-in-Variables Models

Authors: Anastasiia Yu. Timofeeva

Abstract:

Two new algorithms for nonparametric estimation of errors-in-variables models are proposed. The first algorithm is based on penalized regression spline. The spline is represented as a piecewise-linear function and for each linear portion orthogonal regression is estimated. This algorithm is iterative. The second algorithm involves locally weighted regression estimation. When the independent variable is measured with error such estimation is a complex nonlinear optimization problem. The simulation results have shown the advantage of the second algorithm under the assumption that true smoothing parameters values are known. Nevertheless the use of some indexes of fit to smoothing parameters selection gives the similar results and has an oversmoothing effect.

Keywords: Grade point average, orthogonal regression, penalized regression spline, locally weighted regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
237 Relationship between Sums of Squares in Linear Regression and Semi-parametric Regression

Authors: Dursun Aydın, Bilgin Senel

Abstract:

In this paper, the sum of squares in linear regression is reduced to sum of squares in semi-parametric regression. We indicated that different sums of squares in the linear regression are similar to various deviance statements in semi-parametric regression. In addition to, coefficient of the determination derived in linear regression model is easily generalized to coefficient of the determination of the semi-parametric regression model. Then, it is made an application in order to support the theory of the linear regression and semi-parametric regression. In this way, study is supported with a simulated data example.

Keywords: Semi-parametric regression, Penalized LeastSquares, Residuals, Deviance, Smoothing Spline.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
236 Estimating Regression Parameters in Linear Regression Model with a Censored Response Variable

Authors: Jesus Orbe, Vicente Nunez-Anton

Abstract:

In this work we study the effect of several covariates X on a censored response variable T with unknown probability distribution. In this context, most of the studies in the literature can be located in two possible general classes of regression models: models that study the effect the covariates have on the hazard function; and models that study the effect the covariates have on the censored response variable. Proposals in this paper are in the second class of models and, more specifically, on least squares based model approach. Thus, using the bootstrap estimate of the bias, we try to improve the estimation of the regression parameters by reducing their bias, for small sample sizes. Simulation results presented in the paper show that, for reasonable sample sizes and censoring levels, the bias is always smaller for the new proposals.

Keywords: Censored response variable, regression, bias.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
235 Speaker Independent Quranic Recognizer Basedon Maximum Likelihood Linear Regression

Authors: Ehab Mourtaga, Ahmad Sharieh, Mousa Abdallah

Abstract:

An automatic speech recognition system for the formal Arabic language is needed. The Quran is the most formal spoken book in Arabic, it is spoken all over the world. In this research, an automatic speech recognizer for Quranic based speakerindependent was developed and tested. The system was developed based on the tri-phone Hidden Markov Model and Maximum Likelihood Linear Regression (MLLR). The MLLR computes a set of transformations which reduces the mismatch between an initial model set and the adaptation data. It uses the regression class tree, as well as, estimates a set of linear transformations for the mean and variance parameters of a Gaussian mixture HMM system. The 30th Chapter of the Quran, with five of the most famous readers of the Quran, was used for the training and testing of the data. The chapter includes about 2000 distinct words. The advantages of using the Quranic verses as the database in this developed recognizer are the uniqueness of the words and the high level of orderliness between verses. The level of accuracy from the tested data ranged 68 to 85%.

Keywords: Hidden Markov Model (HMM), MaximumLikelihood Linear Regression (MLLR), Quran, Regression ClassTree, Speech Recognition, Speaker-independent.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
234 Landslide Susceptibility Mapping: A Comparison between Logistic Regression and Multivariate Adaptive Regression Spline Models in the Municipality of Oudka, Northern of Morocco

Authors: S. Benchelha, H. C. Aoudjehane, M. Hakdaoui, R. El Hamdouni, H. Mansouri, T. Benchelha, M. Layelmam, M. Alaoui

Abstract:

The logistic regression (LR) and multivariate adaptive regression spline (MarSpline) are applied and verified for analysis of landslide susceptibility map in Oudka, Morocco, using geographical information system. From spatial database containing data such as landslide mapping, topography, soil, hydrology and lithology, the eight factors related to landslides such as elevation, slope, aspect, distance to streams, distance to road, distance to faults, lithology map and Normalized Difference Vegetation Index (NDVI) were calculated or extracted. Using these factors, landslide susceptibility indexes were calculated by the two mentioned methods. Before the calculation, this database was divided into two parts, the first for the formation of the model and the second for the validation. The results of the landslide susceptibility analysis were verified using success and prediction rates to evaluate the quality of these probabilistic models. The result of this verification was that the MarSpline model is the best model with a success rate (AUC = 0.963) and a prediction rate (AUC = 0.951) higher than the LR model (success rate AUC = 0.918, rate prediction AUC = 0.901).

Keywords: Landslide susceptibility mapping, regression logistic, multivariate adaptive regression spline, Oudka, Taounate, Morocco.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
233 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second

Authors: P. V. Pramila, V. Mahesh

Abstract:

Pulmonary Function Tests are important non-invasive diagnostic tests to assess respiratory impairments and provides quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, markedly related to the age of patients resulting in incomplete data sets. This paper presents, a nonlinear model built using Multivariate adaptive regression splines and Random forest regression model to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N= 198 subjects. The ranked order of feature importance index calculated by the random forests model shows that the spirometric features FVC, FEF25, PEF, FEF25-75, FEF50 and the demographic parameter height are the important descriptors. A comparison of performance assessment of both models prove that, the prediction ability of MARS with the `top two ranked features namely the FVC and FEF25 is higher, yielding a model fit of R2= 0.96 and R2= 0.99 for normal and abnormal subjects. The Root Mean Square Error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error value of 0.0191 (normal subjects) and 0.0106 (abnormal subjects) with the aforementioned input features. It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, as well as yielding better prediction performance. This analysis can assist clinicians with a intelligence support system in the medical diagnosis and improvement of clinical care.

Keywords: FEV1, Multivariate Adaptive Regression Splines Pulmonary Function Test, Random Forest.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
232 A Comparison of Some Thresholding Selection Methods for Wavelet Regression

Authors: Alsaidi M. Altaher, Mohd T. Ismail

Abstract:

In wavelet regression, choosing threshold value is a crucial issue. A too large value cuts too many coefficients resulting in over smoothing. Conversely, a too small threshold value allows many coefficients to be included in reconstruction, giving a wiggly estimate which result in under smoothing. However, the proper choice of threshold can be considered as a careful balance of these principles. This paper gives a very brief introduction to some thresholding selection methods. These methods include: Universal, Sure, Ebays, Two fold cross validation and level dependent cross validation. A simulation study on a variety of sample sizes, test functions, signal-to-noise ratios is conducted to compare their numerical performances using three different noise structures. For Gaussian noise, EBayes outperforms in all cases for all used functions while Two fold cross validation provides the best results in the case of long tail noise. For large values of signal-to-noise ratios, level dependent cross validation works well under correlated noises case. As expected, increasing both sample size and level of signal to noise ratio, increases estimation efficiency.

Keywords: wavelet regression, simulation, Threshold.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
231 A Robust LS-SVM Regression

Authors: József Valyon, Gábor Horváth

Abstract:

In comparison to the original SVM, which involves a quadratic programming task; LS–SVM simplifies the required computation, but unfortunately the sparseness of standard SVM is lost. Another problem is that LS-SVM is only optimal if the training samples are corrupted by Gaussian noise. In Least Squares SVM (LS–SVM), the nonlinear solution is obtained, by first mapping the input vector to a high dimensional kernel space in a nonlinear fashion, where the solution is calculated from a linear equation set. In this paper a geometric view of the kernel space is introduced, which enables us to develop a new formulation to achieve a sparse and robust estimate.

Keywords: Support Vector Machines, Least Squares SupportVector Machines, Regression, Sparse approximation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
230 Multivariate School Travel Demand Regression Based on Trip Attraction

Authors: Ben-Edigbe J, RahmanR

Abstract:

Since primary school trips usually start from home, attention by many scholars have been focused on the home end for data gathering. Thereafter category analysis has often been relied upon when predicting school travel demands. In this paper, school end was relied on for data gathering and multivariate regression for future travel demand prediction. 9859 pupils were surveyed by way of questionnaires at 21 primary schools. The town was divided into 5 zones. The study was carried out in Skudai Town, Malaysia. Based on the hypothesis that the number of primary school trip ends are expected to be the same because school trips are fixed, the choice of trip end would have inconsequential effect on the outcome. The study compared empirical data for home and school trip end productions and attractions. Variance from both data results was insignificant, although some claims from home based family survey were found to be grossly exaggerated. Data from the school trip ends was relied on for travel demand prediction because of its completeness. Accessibility, trip attraction and trip production were then related to school trip rates under daylight and dry weather conditions. The paper concluded that, accessibility is an important parameter when predicting demand for future school trip rates.

Keywords: Trip generation, regression analysis, multiple linearregressions

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
229 Ensembling Adaptively Constructed Polynomial Regression Models

Authors: Gints Jekabsons

Abstract:

The approach of subset selection in polynomial regression model building assumes that the chosen fixed full set of predefined basis functions contains a subset that is sufficient to describe the target relation sufficiently well. However, in most cases the necessary set of basis functions is not known and needs to be guessed – a potentially non-trivial (and long) trial and error process. In our research we consider a potentially more efficient approach – Adaptive Basis Function Construction (ABFC). It lets the model building method itself construct the basis functions necessary for creating a model of arbitrary complexity with adequate predictive performance. However, there are two issues that to some extent plague the methods of both the subset selection and the ABFC, especially when working with relatively small data samples: the selection bias and the selection instability. We try to correct these issues by model post-evaluation using Cross-Validation and model ensembling. To evaluate the proposed method, we empirically compare it to ABFC methods without ensembling, to a widely used method of subset selection, as well as to some other well-known regression modeling methods, using publicly available data sets.

Keywords: Basis function construction, heuristic search, modelensembles, polynomial regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
228 A Comparison of the Nonparametric Regression Models using Smoothing Spline and Kernel Regression

Authors: Dursun Aydin

Abstract:

This paper study about using of nonparametric models for Gross National Product data in Turkey and Stanford heart transplant data. It is discussed two nonparametric techniques called smoothing spline and kernel regression. The main goal is to compare the techniques used for prediction of the nonparametric regression models. According to the results of numerical studies, it is concluded that smoothing spline regression estimators are better than those of the kernel regression.

Keywords: Kernel regression, Nonparametric models, Prediction, Smoothing spline.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
227 A General Regression Test Selection Technique

Authors: Walid S. Abd El-hamid, Sherif S. El-etriby, Mohiy M. Hadhoud

Abstract:

This paper presents a new methodology to select test cases from regression test suites. The selection strategy is based on analyzing the dynamic behavior of the applications that written in any programming language. Methods based on dynamic analysis are more safe and efficient. We design a technique that combine the code based technique and model based technique, to allow comparing the object oriented of an application that written in any programming language. We have developed a prototype tool that detect changes and select test cases from test suite.

Keywords: Regression testing, Model based testing, Dynamicbehavior.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
226 Estimation of Time -Varying Linear Regression with Unknown Time -Volatility via Continuous Generalization of the Akaike Information Criterion

Authors: Elena Ezhova, Vadim Mottl, Olga Krasotkina

Abstract:

The problem of estimating time-varying regression is inevitably concerned with the necessity to choose the appropriate level of model volatility - ranging from the full stationarity of instant regression models to their absolute independence of each other. In the stationary case the number of regression coefficients to be estimated equals that of regressors, whereas the absence of any smoothness assumptions augments the dimension of the unknown vector by the factor of the time-series length. The Akaike Information Criterion is a commonly adopted means of adjusting a model to the given data set within a succession of nested parametric model classes, but its crucial restriction is that the classes are rigidly defined by the growing integer-valued dimension of the unknown vector. To make the Kullback information maximization principle underlying the classical AIC applicable to the problem of time-varying regression estimation, we extend it onto a wider class of data models in which the dimension of the parameter is fixed, but the freedom of its values is softly constrained by a family of continuously nested a priori probability distributions.

Keywords: Time varying regression, time-volatility of regression coefficients, Akaike Information Criterion (AIC), Kullback information maximization principle.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
225 Using Combination of Optimized Recurrent Neural Network with Design of Experiments and Regression for Control Chart Forecasting

Authors: R. Behmanesh, I. Rahimi

Abstract:

recurrent neural network (RNN) is an efficient tool for modeling production control process as well as modeling services. In this paper one RNN was combined with regression model and were employed in order to be checked whether the obtained data by the model in comparison with actual data, are valid for variable process control chart. Therefore, one maintenance process in workshop of Esfahan Oil Refining Co. (EORC) was taken for illustration of models. First, the regression was made for predicting the response time of process based upon determined factors, and then the error between actual and predicted response time as output and also the same factors as input were used in RNN. Finally, according to predicted data from combined model, it is scrutinized for test values in statistical process control whether forecasting efficiency is acceptable. Meanwhile, in training process of RNN, design of experiments was set so as to optimize the RNN.

Keywords: RNN, DOE, regression, control chart.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
224 Multiple Regression based Graphical Modeling for Images

Authors: Pavan S., Sridhar G., Sridhar V.

Abstract:

Super resolution is one of the commonly referred inference problems in computer vision. In the case of images, this problem is generally addressed using a graphical model framework wherein each node represents a portion of the image and the edges between the nodes represent the statistical dependencies. However, the large dimensionality of images along with the large number of possible states for a node makes the inference problem computationally intractable. In this paper, we propose a representation wherein each node can be represented as acombination of multiple regression functions. The proposed approach achieves a tradeoff between the computational complexity and inference accuracy by varying the number of regression functions for a node.

Keywords: Belief propagation, Graphical model, Regression, Super resolution.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
223 On the outlier Detection in Nonlinear Regression

Authors: Hossein Riazoshams, Midi Habshah, Jr., Mohamad Bakri Adam

Abstract:

The detection of outliers is very essential because of their responsibility for producing huge interpretative problem in linear as well as in nonlinear regression analysis. Much work has been accomplished on the identification of outlier in linear regression, but not in nonlinear regression. In this article we propose several outlier detection techniques for nonlinear regression. The main idea is to use the linear approximation of a nonlinear model and consider the gradient as the design matrix. Subsequently, the detection techniques are formulated. Six detection measures are developed that combined with three estimation techniques such as the Least-Squares, M and MM-estimators. The study shows that among the six measures, only the studentized residual and Cook Distance which combined with the MM estimator, consistently capable of identifying the correct outliers.

Keywords: Nonlinear Regression, outliers, Gradient, LeastSquare, M-estimate, MM-estimate.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
222 Model-Based Software Regression Test Suite Reduction

Authors: Shiwei Deng, Yang Bao

Abstract:

In this paper, we present a model-based regression test suite reducing approach that uses EFSM model dependence analysis and probability-driven greedy algorithm to reduce software regression test suites. The approach automatically identifies the difference between the original model and the modified model as a set of elementary model modifications. The EFSM dependence analysis is performed for each elementary modification to reduce the regression test suite, and then the probability-driven greedy algorithm is adopted to select the minimum set of test cases from the reduced regression test suite that cover all interaction patterns. Our initial experience shows that the approach may significantly reduce the size of regression test suites.

Keywords: Dependence analysis, EFSM model, greedy algorithm, regression test.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
221 Defect Cause Modeling with Decision Tree and Regression Analysis

Authors: B. Bakır, İ. Batmaz, F. A. Güntürkün, İ. A. İpekçi, G. Köksal, N. E. Özdemirel

Abstract:

The main aim of this study is to identify the most influential variables that cause defects on the items produced by a casting company located in Turkey. To this end, one of the items produced by the company with high defective percentage rates is selected. Two approaches-the regression analysis and decision treesare used to model the relationship between process parameters and defect types. Although logistic regression models failed, decision tree model gives meaningful results. Based on these results, it can be claimed that the decision tree approach is a promising technique for determining the most important process variables.

Keywords: Casting industry, decision tree algorithm C5.0, logistic regression, quality improvement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
220 Predictive Clustering Hybrid Regression(pCHR) Approach and Its Application to Sucrose-Based Biohydrogen Production

Authors: Nikhil, Ari Visa, Chin-Chao Chen, Chiu-Yue Lin, Jaakko A. Puhakka, Olli Yli-Harja

Abstract:

A predictive clustering hybrid regression (pCHR) approach was developed and evaluated using dataset from H2- producing sucrose-based bioreactor operated for 15 months. The aim was to model and predict the H2-production rate using information available about envirome and metabolome of the bioprocess. Selforganizing maps (SOM) and Sammon map were used to visualize the dataset and to identify main metabolic patterns and clusters in bioprocess data. Three metabolic clusters: acetate coupled with other metabolites, butyrate only, and transition phases were detected. The developed pCHR model combines principles of k-means clustering, kNN classification and regression techniques. The model performed well in modeling and predicting the H2-production rate with mean square error values of 0.0014 and 0.0032, respectively.

Keywords: Biohydrogen, bioprocess modeling, clusteringhybrid regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
219 Optimization of Slider Crank Mechanism Using Design of Experiments and Multi-Linear Regression

Authors: Galal Elkobrosy, Amr M. Abdelrazek, Bassuny M. Elsouhily, Mohamed E. Khidr

Abstract:

Crank shaft length, connecting rod length, crank angle, engine rpm, cylinder bore, mass of piston and compression ratio are the inputs that can control the performance of the slider crank mechanism and then its efficiency. Several combinations of these seven inputs are used and compared. The throughput engine torque predicted by the simulation is analyzed through two different regression models, with and without interaction terms, developed according to multi-linear regression using LU decomposition to solve system of algebraic equations. These models are validated. A regression model in seven inputs including their interaction terms lowered the polynomial degree from 3rd degree to 1st degree and suggested valid predictions and stable explanations.

Keywords: Design of experiments, regression analysis, SI Engine, statistical modeling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
218 Nuclear Fuel Safety Threshold Determined by Logistic Regression Plus Uncertainty

Authors: D. S. Gomes, A. T. Silva

Abstract:

Analysis of the uncertainty quantification related to nuclear safety margins applied to the nuclear reactor is an important concept to prevent future radioactive accidents. The nuclear fuel performance code may involve the tolerance level determined by traditional deterministic models producing acceptable results at burn cycles under 62 GWd/MTU. The behavior of nuclear fuel can simulate applying a series of material properties under irradiation and physics models to calculate the safety limits. In this study, theoretical predictions of nuclear fuel failure under transient conditions investigate extended radiation cycles at 75 GWd/MTU, considering the behavior of fuel rods in light-water reactors under reactivity accident conditions. The fuel pellet can melt due to the quick increase of reactivity during a transient. Large power excursions in the reactor are the subject of interest bringing to a treatment that is known as the Fuchs-Hansen model. The point kinetic neutron equations show similar characteristics of non-linear differential equations. In this investigation, the multivariate logistic regression is employed to a probabilistic forecast of fuel failure. A comparison of computational simulation and experimental results was acceptable. The experiments carried out use the pre-irradiated fuels rods subjected to a rapid energy pulse which exhibits the same behavior during a nuclear accident. The propagation of uncertainty utilizes the Wilk's formulation. The variables chosen as essential to failure prediction were the fuel burnup, the applied peak power, the pulse width, the oxidation layer thickness, and the cladding type.

Keywords: Logistic regression, reactivity-initiated accident, safety margins, uncertainty propagation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
217 Two New Relative Efficiencies of Linear Weighted Regression

Authors: Shuimiao Wan, Chao Yuan, Baoguang Tian

Abstract:

In statistics parameter theory, usually the parameter estimations have two kinds, one is the least-square estimation (LSE), and the other is the best linear unbiased estimation (BLUE). Due to the determining theorem of minimum variance unbiased estimator (MVUE), the parameter estimation of BLUE in linear model is most ideal. But since the calculations are complicated or the covariance is not given, people are hardly to get the solution. Therefore, people prefer to use LSE rather than BLUE. And this substitution will take some losses. To quantize the losses, many scholars have presented many kinds of different relative efficiencies in different views. For the linear weighted regression model, this paper discusses the relative efficiencies of LSE of β to BLUE of β. It also defines two new relative efficiencies and gives their lower bounds.

Keywords: Linear weighted regression, Relative efficiency, Lower bound, Parameter estimation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
216 A Study on Inference from Distance Variables in Hedonic Regression

Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro

Abstract:

In urban area, several landmarks may affect housing price and rents, and hedonic analysis should employ distance variables corresponding to each landmarks. Unfortunately, the effects of distances to landmarks on housing prices are generally not consistent with the true price. These distance variables may cause magnitude error in regression, pointing a problem of spatial multicollinearity. In this paper, we provided some approaches for getting the samples with less bias and method on locating the specific sampling area to avoid the multicollinerity problem in two specific landmarks case.

Keywords: Landmarks, hedonic regression, distance variables, collinearity, multicollinerity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
215 Dichotomous Logistic Regression with Leave-One-Out Validation

Authors: Sin Yin Teh, Abdul Rahman Othman, Michael Boon Chong Khoo

Abstract:

In this paper, the concepts of dichotomous logistic regression (DLR) with leave-one-out (L-O-O) were discussed. To illustrate this, the L-O-O was run to determine the importance of the simulation conditions for robust test of spread procedures with good Type I error rates. The resultant model was then evaluated. The discussions included 1) assessment of the accuracy of the model, and 2) parameter estimates. These were presented and illustrated by modeling the relationship between the dichotomous dependent variable (Type I error rates) with a set of independent variables (the simulation conditions). The base SAS software containing PROC LOGISTIC and DATA step functions can be making used to do the DLR analysis.

Keywords: Dichotomous logistic regression, leave-one-out, testof spread.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
214 A Comparison of the Sum of Squares in Linear and Partial Linear Regression Models

Authors: Dursun Aydın

Abstract:

In this paper, estimation of the linear regression model is made by ordinary least squares method and the partially linear regression model is estimated by penalized least squares method using smoothing spline. Then, it is investigated that differences and similarity in the sum of squares related for linear regression and partial linear regression models (semi-parametric regression models). It is denoted that the sum of squares in linear regression is reduced to sum of squares in partial linear regression models. Furthermore, we indicated that various sums of squares in the linear regression are similar to different deviance statements in partial linear regression. In addition to, coefficient of the determination derived in linear regression model is easily generalized to coefficient of the determination of the partial linear regression model. For this aim, it is made two different applications. A simulated and a real data set are considered to prove the claim mentioned here. In this way, this study is supported with a simulation and a real data example.

Keywords: Partial Linear Regression Model, Linear RegressionModel, Residuals, Deviance, Smoothing Spline.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
213 Development of Regression Equation for Surface Finish and Analysis of Surface Integrity in EDM

Authors: Md. Ashikur Rahman Khan, M. M. Rahman

Abstract:

Electrical discharge machining (EDM) is a relatively modern machining process having distinct advantages over other machining processes and can machine Ti-alloys effectively. The present study emphasizes the features of the development of regression equation based on response surface methodology (RSM) for correlating the interactive and higher-order influences of machining parameters on surface finish of Titanium alloy Ti-6Al-4V. The process parameters selected in this study are discharge current, pulse on time, pulse off time and servo voltage. Machining has been accomplished using negative polarity of Graphite electrode. Analysis of variance is employed to ascertain the adequacy of the developed regression model. Experiments based on central composite of response surface method are carried out. Scanning electron microscopy (SEM) analysis was performed to investigate the surface topography of the EDMed job. The results evidence that the proposed regression equation can predict the surface roughness effectively. The lower ampere and short pulse on time yield better surface finish.

Keywords: Graphite electrode, regression model, response surface methodology, surface roughness.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
212 Regression Test Selection Technique for Multi-Programming Language

Authors: Walid S. Abd El-hamid, Sherif S. El-Etriby, Mohiy M. Hadhoud

Abstract:

Regression testing is a maintenance activity applied to modified software to provide confidence that the changed parts are correct and that the unchanged parts have not been adversely affected by the modifications. Regression test selection techniques reduce the cost of regression testing, by selecting a subset of an existing test suite to use in retesting modified programs. This paper presents the first general regression-test-selection technique, which based on code and allows selecting test cases for any programs written in any programming language. Then it handles incomplete program. We also describe RTSDiff, a regression-test-selection system that implements the proposed technique. The results of the empirical studied that performed in four programming languages java, C#, Cµ and Visual basic show that the efficiency and effective in reducing the size of test suit.

Keywords: Regression testing, testing, test selection, softwareevolution, software maintenance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
211 Institutional Efficiency of Commonhold Industrial Parks Using a Polynomial Regression Model

Authors: Jeng-Wen Lin, Simon Chien-Yuan Chen

Abstract:

Based on assumptions of neo-classical economics and rational choice / public choice theory, this paper investigates the regulation of industrial land use in Taiwan by homeowners associations (HOAs) as opposed to traditional government administration. The comparison, which applies the transaction cost theory and a polynomial regression analysis, manifested that HOAs are superior to conventional government administration in terms of transaction costs and overall efficiency. A case study that compares Taiwan-s commonhold industrial park, NangKang Software Park, to traditional government counterparts using limited data on the costs and returns was analyzed. This empirical study on the relative efficiency of governmental and private institutions justified the important theoretical proposition. Numerical results prove the efficiency of the established model.

Keywords: Homeowners Associations, Institutional Efficiency, Polynomial Regression, Transaction Cost.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
210 A Comparative Study of Additive and Nonparametric Regression Estimators and Variable Selection Procedures

Authors: Adriano Z. Zambom, Preethi Ravikumar

Abstract:

One of the biggest challenges in nonparametric regression is the curse of dimensionality. Additive models are known to overcome this problem by estimating only the individual additive effects of each covariate. However, if the model is misspecified, the accuracy of the estimator compared to the fully nonparametric one is unknown. In this work the efficiency of completely nonparametric regression estimators such as the Loess is compared to the estimators that assume additivity in several situations, including additive and non-additive regression scenarios. The comparison is done by computing the oracle mean square error of the estimators with regards to the true nonparametric regression function. Then, a backward elimination selection procedure based on the Akaike Information Criteria is proposed, which is computed from either the additive or the nonparametric model. Simulations show that if the additive model is misspecified, the percentage of time it fails to select important variables can be higher than that of the fully nonparametric approach. A dimension reduction step is included when nonparametric estimator cannot be computed due to the curse of dimensionality. Finally, the Boston housing dataset is analyzed using the proposed backward elimination procedure and the selected variables are identified.

Keywords: Additive models, local polynomial regression, residuals, mean square error, variable selection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF
209 Computational Aspects of Regression Analysis of Interval Data

Authors: Michal Cerny

Abstract:

We consider linear regression models where both input data (the values of independent variables) and output data (the observations of the dependent variable) are interval-censored. We introduce a possibilistic generalization of the least squares estimator, so called OLS-set for the interval model. This set captures the impact of the loss of information on the OLS estimator caused by interval censoring and provides a tool for quantification of this effect. We study complexity-theoretic properties of the OLS-set. We also deal with restricted versions of the general interval linear regression model, in particular the crisp input – interval output model. We give an argument that natural descriptions of the OLS-set in the crisp input – interval output cannot be computed in polynomial time. Then we derive easily computable approximations for the OLS-set which can be used instead of the exact description. We illustrate the approach by an example.

Keywords: Linear regression, interval-censored data, computational complexity.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF