Search results for: Multiple regression modeling
4006 The Strengths and Limitations of the Statistical Modeling of Complex Social Phenomenon: Focusing on SEM, Path Analysis, or Multiple Regression Models
Authors: Jihye Jeon
Abstract:
This paper analyzes the conceptual framework of three statistical methods: multiple regression, path analysis, and structural equation modeling. When establishing a research model for the statistical modeling of complex social phenomena, it is important to know the strengths and limitations of these three methods. This study explores the characteristics, strengths, and limitations of each method and suggests strategies for accurately explaining or predicting causal relationships among variables. In particular, common mistakes in research modeling are discussed for studies of depression and mental health.
Keywords: Multiple regression, path analysis, structural equation models, statistical modeling, social and psychological phenomenon.
Downloads: 9249
4005 Comparison of Polynomial and Radial Basis Kernel Functions based SVR and MLR in Modeling Mass Transfer by Vertical and Inclined Multiple Plunging Jets
Abstract:
At present, various computational techniques are used in modeling and analyzing environmental engineering data. In the present study, an intra-comparison of polynomial and radial basis kernel function based Support Vector Regression and, in turn, an inter-comparison with Multi Linear Regression have been attempted in modeling the mass transfer capacity of vertical (θ = 90°) and inclined multiple plunging jets (varying in number from 1 to 16). The data set used in this study consists of four input parameters with a total of eighty-eight cases, forty-four each for vertical and inclined multiple plunging jets. For testing, tenfold cross validation was used. Correlation coefficient values of 0.971 and 0.981, with corresponding root mean square error values of 0.0025 and 0.0020, were achieved using polynomial and radial basis kernel function based Support Vector Regression, respectively. The intra-comparison suggests improved performance by the radial basis function in comparison to the polynomial kernel based Support Vector Regression. Further, the inter-comparison with Multi Linear Regression (correlation coefficient = 0.973 and root mean square error = 0.0024) reveals that radial basis kernel function based Support Vector Regression performs better in modeling and estimating mass transfer by multiple plunging jets.
Keywords: Mass transfer, multiple plunging jets, polynomial and radial basis kernel functions, Support Vector Regression.
Downloads: 1432
4004 Modeling Oxygen-transfer by Multiple Plunging Jets using Support Vector Machines and Gaussian Process Regression Techniques
Authors: Surinder Deswal
Abstract:
The paper investigates the potential of support vector machine and Gaussian process based regression approaches to model the oxygen-transfer capacity from experimental data of multiple plunging jet oxygenation systems. The results suggest the utility of both modeling techniques in predicting the overall volumetric oxygen transfer coefficient (KLa) from the operational parameters of a multiple plunging jet oxygenation system. Correlation coefficient, root mean square error, and coefficient of determination values of 0.971, 0.002 and 0.945, respectively, were achieved by the support vector machine, in comparison to values of 0.960, 0.002 and 0.920, respectively, achieved by Gaussian process regression. Further, the performance of both regression approaches in predicting the overall volumetric oxygen transfer coefficient was compared with the empirical relationship for multiple plunging jets. The comparison suggests that the support vector machine approach works well relative to both the empirical relationship and the Gaussian process approach, and could successfully be employed in modeling oxygen transfer.
Keywords: Oxygen-transfer, multiple plunging jets, support vector machines, Gaussian process.
Downloads: 1637
4003 Multi-Linear Regression Based Prediction of Mass Transfer by Multiple Plunging Jets
Abstract:
The paper aims to compare the performance of vertical and inclined multiple plunging jets and to model and predict their mass transfer capacity using a multi-linear regression based approach. The multiple vertical plunging jets have a jet impact angle of θ = 90°, whereas the multiple inclined plunging jets have a jet impact angle of θ = 60°. The results of the study suggest that mass transfer is higher for multiple jets, and that inclined multiple plunging jets have up to 1.6 times higher mass transfer than vertical multiple plunging jets under similar conditions. The derived relationship, based on the multi-linear regression approach, successfully predicted the volumetric mass transfer coefficient (KLa) from the operational parameters of multiple plunging jets with a correlation coefficient of 0.973, a root mean square error of 0.002 and a coefficient of determination of 0.946. The results suggest that the predicted overall mass transfer coefficient is in good agreement with the actual experimental values, indicating that the derived multi-linear regression relationship can be successfully employed in modeling mass transfer by multiple plunging jets.
Keywords: Mass transfer, multiple plunging jets, multi-linear regression.
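The three fit statistics quoted in the abstract above (correlation coefficient, root mean square error, coefficient of determination) can be illustrated with a minimal sketch. The data below are synthetic, and the sample size, predictor count, and coefficients are assumptions chosen for illustration only; none of it is the paper's data or derived relationship.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    """Evaluate the fitted linear relationship on new inputs."""
    return beta[0] + X @ beta[1:]

def fit_metrics(y, y_hat):
    """Correlation coefficient, root mean square error, coefficient of determination."""
    r = np.corrcoef(y, y_hat)[0, 1]
    rmse = np.sqrt(np.mean((y - y_hat) ** 2))
    r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
    return r, rmse, r2

# Synthetic stand-in data (hypothetical): 50 cases, 3 operational parameters.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 3))
y = 0.01 + X @ np.array([0.02, -0.01, 0.015]) + rng.normal(0.0, 0.002, size=50)

beta = fit_mlr(X, y)
r, rmse, r2 = fit_metrics(y, predict(beta, X))
print(f"r={r:.3f}  RMSE={rmse:.4f}  R^2={r2:.3f}")
```

For an in-sample least squares fit with an intercept, the squared correlation coefficient equals the coefficient of determination, which is consistent with the reported pair 0.973 and 0.946.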
Downloads: 2200
4002 A Hybrid Model of ARIMA and Multiple Polynomial Regression for Uncertainties Modeling of a Serial Production Line
Authors: Amir Azizi, Amir Yazid b. Ali, Loh Wei Ping, Mohsen Mohammadzadeh
Abstract:
Uncertainties in a serial production line affect production throughput. These uncertainties cannot be prevented in a real production line; however, uncertain conditions can be controlled with a robust prediction model. Thus, a hybrid model combining autoregressive integrated moving average (ARIMA) and multiple polynomial regression is proposed to model the nonlinear relationship between production uncertainties and throughput. The uncertainties considered in this study are demand, break time, scrap, and lead time. The nonlinear relationship between production uncertainties and throughput is examined in the form of quadratic and cubic regression models, for which the adjusted R-squared was 98.3% and 98.2%, respectively. We optimized the multiple quadratic regression (MQR) by considering the time series trend of the uncertainties using an ARIMA model. Finally, the hybrid model of ARIMA and MQR is formulated, with a better adjusted R-squared of 98.9%.
Keywords: ARIMA, multiple polynomial regression, production throughput, uncertainties
Downloads: 2199
4001 Empirical Statistical Modeling of Rainfall Prediction over Myanmar
Authors: Wint Thida Zaw, Thinn Thu Naing
Abstract:
One of the essential sectors of the Myanmar economy is agriculture, which is sensitive to climate variation. The most important climatic element affecting the agriculture sector is rainfall. Rainfall prediction is therefore an important issue in an agricultural country. Multi-variable polynomial regression (MPR) provides an effective way to describe complex nonlinear input-output relationships so that an outcome variable can be predicted from one or more other variables. In this paper, the modeling of monthly rainfall prediction over Myanmar is described in detail using the polynomial regression equation. The results of the proposed model are compared to those produced by a multiple linear regression (MLR) model. Experiments indicate that the prediction model based on MPR has higher accuracy than the one using MLR.
Keywords: Polynomial Regression, Rainfall Forecasting, Statistical forecasting.
Downloads: 2633
4000 Multiple Regression based Graphical Modeling for Images
Authors: Pavan S., Sridhar G., Sridhar V.
Abstract:
Super resolution is one of the commonly studied inference problems in computer vision. In the case of images, this problem is generally addressed using a graphical model framework wherein each node represents a portion of the image and the edges between the nodes represent the statistical dependencies. However, the large dimensionality of images, along with the large number of possible states for a node, makes the inference problem computationally intractable. In this paper, we propose a representation wherein each node can be represented as a combination of multiple regression functions. The proposed approach achieves a tradeoff between computational complexity and inference accuracy by varying the number of regression functions for a node.
Keywords: Belief propagation, Graphical model, Regression, Super resolution.
Downloads: 1546
3999 Research on the Problems of Housing Prices in Qingdao from a Macro Perspective
Authors: Liu Zhiyuan, Sun Zongdi
Abstract:
Qingdao is a seaside city. Taking into account the characteristics of Qingdao, this article establishes a multiple linear regression model to analyze the impact of macroeconomic factors on housing prices. We used the stepwise regression method to carry out the multiple linear regression analysis, and statistically analyzed the F test and t test values. The model was optimized iteratively according to the analysis results. Finally, this article obtained the multiple linear regression equation and the influencing factors, and the reliability of the model was verified by the F test and the t test.
Keywords: Housing prices, multiple linear regression model, macroeconomic factors, Qingdao City.
Downloads: 1179
3998 Internet Purchases in European Union Countries: Multiple Linear Regression Approach
Authors: Ksenija Dumičić, Anita Čeh Časni, Irena Palić
Abstract:
This paper examines the influence of economic and Information and Communication Technology (ICT) development on the recently increasing Internet purchases by individuals in European Union member states. After a growing trend in Internet purchases in the EU27 was noticed, all possible regressions analysis was applied using nine independent variables for 2011. Finally, two linear regression models were studied in detail. The simple linear regression analysis confirmed the research hypothesis that Internet purchases in the analyzed EU countries are positively correlated with the statistically significant variable Gross Domestic Product per capita (GDPpc). In addition, the analyzed multiple linear regression model with four regressors describing the level of ICT development indicates that ICT development is crucial for explaining Internet purchases by individuals, confirming the research hypothesis.
Keywords: European Union, Internet purchases, multiple linear regression model, outlier
Downloads: 2955
3997 Mathematical Modeling to Predict Surface Roughness in CNC Milling
Authors: Ab. Rashid M.F.F., Gan S.Y., Muhammad N.Y.
Abstract:
Surface roughness (Ra) is one of the most important requirements in the machining process. In order to obtain better surface roughness, the proper setting of cutting parameters is crucial before the process takes place. This research presents the development of a mathematical model for predicting surface roughness before the milling process in order to evaluate the fitness of the machining parameters: spindle speed, feed rate and depth of cut. 84 samples were run in this study using a FANUC CNC Milling α-Τ14ιE. The samples were randomly divided into two data sets: a training set (m=60) and a testing set (m=24). ANOVA showed that at least one of the population regression coefficients was not zero. The Multiple Regression Method was used to determine the correlation between a criterion variable and a combination of predictor variables. It was established that surface roughness is most influenced by the feed rate. Using the Multiple Regression Method equation, the average percentage deviation was 9.8% for the testing set and 9.7% for the training set. This showed that the statistical model could predict the surface roughness with about 90.2% accuracy on the testing data set and 90.3% accuracy on the training data set.
Keywords: Surface roughness, regression analysis.
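The accuracy figures in the abstract above follow from the deviation figures by subtraction from 100% (9.8% deviation gives 90.2% accuracy). A minimal sketch of that arithmetic follows; the definition of average percentage deviation used here (mean absolute relative error, in percent) is an assumption, since the abstract does not spell it out, and the roughness values are made up for illustration.

```python
def average_percentage_deviation(measured, predicted):
    """Mean of |measured - predicted| / |measured|, expressed in percent.

    Assumed definition; the abstract does not state the exact formula.
    """
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(m - p) / abs(m) * 100.0 for m, p in zip(measured, predicted))
    return total / len(measured)

def accuracy_from_deviation(deviation_percent):
    """Accuracy as used in the abstract: 100% minus the average deviation."""
    return 100.0 - deviation_percent

# Hypothetical roughness values (micrometres), not taken from the study.
measured = [2.0, 4.0]
predicted = [1.8, 4.4]
deviation = average_percentage_deviation(measured, predicted)
print(round(deviation, 6), round(accuracy_from_deviation(deviation), 6))
```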
Downloads: 2130
3996 Artificial Neural Network based Modeling of Evaporation Losses in Reservoirs
Authors: Surinder Deswal, Mahesh Pal
Abstract:
An Artificial Neural Network based modeling technique has been used to study the influence of different combinations of meteorological parameters on evaporation from a reservoir. The data set used is taken from an earlier reported study. Several input combinations were tried in order to determine the importance of different input parameters in predicting evaporation. The prediction accuracy of the Artificial Neural Network has also been compared with the accuracy of linear regression for predicting evaporation. The comparison demonstrated superior performance of the Artificial Neural Network over the linear regression approach. The findings of the study also revealed that all input parameters need to be considered together, instead of individual parameters taken one at a time as reported in earlier studies, when predicting evaporation. The highest correlation coefficient (0.960), along with the lowest root mean square error (0.865), was obtained with the input combination of air temperature, wind speed, sunshine hours and mean relative humidity. A graph of actual against predicted values of evaporation suggests that most of the values lie within a scatter of ±15% when all input parameters are used. The findings of this study suggest the usefulness of the ANN technique in predicting evaporation losses from reservoirs.
Keywords: Artificial neural network, evaporation losses, multiple linear regression, modeling.
Downloads: 1976
3995 NFκB Pathway Modeling for Optimal Drug Combination Therapy on Multiple Myeloma
Authors: Huiming Peng, Jianguo Wen, Hongwei Li, Jeff Chang, Xiaobo Zhou
Abstract:
NFκB activation plays a crucial role in anti-apoptotic responses to apoptotic signaling during tumor necrosis factor (TNFa) stimulation in Multiple Myeloma (MM). Although several drugs have been found effective for the treatment of MM, mainly by inhibiting the NFκB pathway, there are no quantitative or qualitative results comparing the inhibition effect of different single drugs or drug combinations. Computational modeling is becoming increasingly indispensable for applied biological research, mainly because it can provide strong quantitative predictive power. In this study, a novel computational pathway modeling approach is employed to comparatively assess the inhibition effects of specific single drugs and drug combinations on the NFκB pathway in MM, and especially to predict synergistic drug combinations.
Keywords: Computational modeling, drug combination, inhibition effect, multiple myeloma, NFkB pathway.
Downloads: 3046
3994 Predicting Bridge Pier Scour Depth with SVM
Authors: Arun Goel
Abstract:
Prediction of maximum local scour is necessary for the safe and economical design of bridges. A number of equations have been developed over the years to predict local scour depth using laboratory data, and a few pier equations have also been proposed using field data. Most of these equations are empirical in nature, as indicated by past publications. In this paper, attempts have been made to compute the local depth of scour around a bridge pier in dimensional and non-dimensional form using linear regression, simple regression and SVM (Poly & Rbf) techniques, along with a few conventional empirical equations. The outcome of this study suggests that SVM (Poly & Rbf) based modeling can be employed as an alternative to linear regression, simple regression and the conventional empirical equations in predicting the scour depth of bridge piers. The results of the present study indicate that the performance of SVM (Poly & Rbf) improves with the non-dimensional form of bridge pier scour in comparison to the dimensional form.
Keywords: Modeling, pier scour, regression, prediction, SVM (Poly & Rbf kernels).
Downloads: 1543
3993 Hierarchically Modeling Cognition and Behavioral Problems of an Under-Represented Group
Authors: Zhidong Zhang, Zhi-Chao Zhang
Abstract:
This study examined mental health and behavioral problems in early adolescence using the Achenbach System of Empirically Based Assessment (ASEBA) instrument. A stratified sampling method was used to collect data from 1975 participants. Multiple regression models and hierarchical regression models were applied to examine the relations between the background variables and internalizing problems, and those between students' performance and internalizing problems. The results indicated that several background variables could significantly predict the anxious/depressed problem, as could reading and social study scores. However, the class, as a hierarchical macro factor, did not show a significant effect. In brief, the majority of these models indicated that the background variables, behaviors and academic performance were significantly related to the anxious/depressed problem.
Keywords: Behavioral problems, anxious/depression problems, empirical-based assessment, hierarchical modeling.
Downloads: 1759
3992 Using Combination of Optimized Recurrent Neural Network with Design of Experiments and Regression for Control Chart Forecasting
Authors: R. Behmanesh, I. Rahimi
Abstract:
A recurrent neural network (RNN) is an efficient tool for modeling production control processes as well as services. In this paper, an RNN was combined with a regression model to check whether the data obtained from the model, compared with the actual data, are valid for a variable process control chart. A maintenance process in a workshop of Esfahan Oil Refining Co. (EORC) was taken to illustrate the models. First, the regression was built to predict the response time of the process based on determined factors; then the error between the actual and predicted response time was used as the output, and the same factors as the input, of the RNN. Finally, the data predicted by the combined model were examined against test values in statistical process control to determine whether the forecasting efficiency is acceptable. In the RNN training process, design of experiments was used to optimize the RNN.
Keywords: RNN, DOE, regression, control chart.
Downloads: 1659
3991 Predictive Clustering Hybrid Regression (pCHR) Approach and Its Application to Sucrose-Based Biohydrogen Production
Authors: Nikhil, Ari Visa, Chin-Chao Chen, Chiu-Yue Lin, Jaakko A. Puhakka, Olli Yli-Harja
Abstract:
A predictive clustering hybrid regression (pCHR) approach was developed and evaluated using a dataset from an H2-producing sucrose-based bioreactor operated for 15 months. The aim was to model and predict the H2-production rate using the information available about the envirome and metabolome of the bioprocess. Self-organizing maps (SOM) and the Sammon map were used to visualize the dataset and to identify the main metabolic patterns and clusters in the bioprocess data. Three metabolic clusters were detected: acetate coupled with other metabolites, butyrate only, and transition phases. The developed pCHR model combines principles of k-means clustering, kNN classification and regression techniques. The model performed well in modeling and predicting the H2-production rate, with mean square error values of 0.0014 and 0.0032, respectively.
Keywords: Biohydrogen, bioprocess modeling, clustering hybrid regression.
Downloads: 1776
3990 Modeling and Optimization of Process Parameters in PMEDM by Genetic Algorithm
Authors: Farhad Kolahan, Mohammad Bironro
Abstract:
This paper addresses the modeling and optimization of process parameters in powder mixed electrical discharge machining (PMEDM). The process output characteristics include metal removal rate (MRR) and electrode wear rate (EWR). Grain size of the aluminum powder (S), concentration of the powder (C), discharge current (I) and pulse on time (T) are chosen as control variables to study the process performance. The experimental results are used to develop regression models based on second order polynomial equations for the different process characteristics. Then, a genetic algorithm (GA) is employed to determine the optimal process parameters for any desired output values of the machining characteristics.
Keywords: Regression modeling, PMEDM, Genetic Algorithm, Optimization.
Downloads: 1492
3989 Relationship between Sums of Squares in Linear Regression and Semi-parametric Regression
Authors: Dursun Aydın, Bilgin Senel
Abstract:
In this paper, the sum of squares in linear regression is reduced to the sum of squares in semi-parametric regression. We show that the different sums of squares in linear regression are similar to various deviance statements in semi-parametric regression. In addition, the coefficient of determination derived for the linear regression model is easily generalized to the coefficient of determination of the semi-parametric regression model. An application is then carried out to support the theory of linear regression and semi-parametric regression; the study is thus supported with a simulated data example.
Keywords: Semi-parametric regression, Penalized Least Squares, Residuals, Deviance, Smoothing Spline.
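As background for the linear-regression side of the abstract above, the standard decomposition of the total sum of squares into regression and residual parts, and the coefficient of determination built from it, can be checked numerically. This is a minimal sketch on simulated data with assumed parameters; it covers only ordinary least squares and does not implement the semi-parametric (penalized least squares) counterpart discussed in the paper.

```python
import numpy as np

# Simulated data (hypothetical parameters): y = 2 + 0.7 x + noise.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, size=40)
y = 2.0 + 0.7 * x + rng.normal(0.0, 1.0, size=40)

# Ordinary least squares fit with an intercept.
A = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ beta

sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)  # regression (explained) sum of squares
sse = np.sum((y - y_hat) ** 2)         # residual sum of squares

# With an intercept in the model, SST = SSR + SSE and R^2 = SSR / SST.
r_squared = ssr / sst
print(f"SST={sst:.3f}  SSR+SSE={ssr + sse:.3f}  R^2={r_squared:.3f}")
```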
Downloads: 1853
3988 Architectural Acoustic Modeling for Predicting Reverberation Time in Room Acoustic Design Using Multiple Criteria Decision Making Analysis
Authors: C. Ardil
Abstract:
This paper presents architectural acoustic modeling to estimate reverberation time in room acoustic design using multiple criteria decision making analysis. First, fundamental decision criteria were determined to evaluate the reverberation time in the room acoustic design problem. Then, the proposed model was applied to a practical decision problem to evaluate and select the optimal room acoustic design model. Finally, the optimal acoustic design of the rooms was analyzed and ranked using a multiple criteria decision making analysis method.
Keywords: Architectural acoustics, room acoustics, architectural acoustic modeling, reverberation time, room acoustic design, multiple criteria decision making analysis, decision analysis, MCDMA
Downloads: 550
3987 A Method for Modeling Multiple Antenna Channels
Authors: S. Rajabi, M. ArdebiliPoor, M. Shahabadi
Abstract:
In this paper we propose a method for modeling the correlation between the signals received by two or more antennas operating in a multipath environment. Considering the maximum excess delay in the channel being modeled, an elliptical region surrounding both the transmitter and receiver antennas is produced. A number of scatterers are randomly distributed in this region and scatter the incoming waves. The amplitude and phase of the incoming waves are computed and used to obtain statistical properties of the received signals. This model has the distinct advantage of being applicable to any configuration of antennas. Furthermore, the common PDF (Probability Distribution Function) of the received wave amplitudes for any pair of antennas can be calculated and used to produce statistical parameters of the received signals.
Keywords: MIMO (Multiple Input Multiple Output), SIMO (Single Input Multiple Output), GBSBEM (Geometrically Based Single Bounce Elliptical Model).
Downloads: 1421
3986 Extended Least Squares LS–SVM
Authors: József Valyon, Gábor Horváth
Abstract:
Among neural models, Support Vector Machine (SVM) solutions are attracting increasing attention, mostly because they eliminate certain crucial questions involved in neural network construction. The main drawback of the standard SVM is its high computational complexity; therefore a new technique, the Least Squares SVM (LS–SVM), has recently been introduced. In this paper we present an extended view of Least Squares Support Vector Regression (LS–SVR), which enables us to develop new formulations and algorithms for this regression technique. Based on manipulating the linear equation set, which embodies all information about the regression in the learning process, some new methods are introduced to simplify the formulations, speed up the calculations and/or provide better results.
Keywords: Function estimation, Least–Squares Support Vector Machines, Regression, System Modeling
Downloads: 2008
3985 Segmentation of Piecewise Polynomial Regression Model by Using Reversible Jump MCMC Algorithm
Authors: Suparman
Abstract:
The piecewise polynomial regression model is a very flexible model for data. When a piecewise polynomial regression model is fitted to data, its parameters are generally unknown. This paper studies the parameter estimation problem for the piecewise polynomial regression model. A Bayesian method is used to estimate the parameters. Unfortunately, the Bayes estimator cannot be found analytically, so a reversible jump MCMC algorithm is proposed to solve this problem. The reversible jump MCMC algorithm generates a Markov chain whose limiting distribution is the posterior distribution of the piecewise polynomial regression model parameters. The resulting Markov chain is used to calculate the Bayes estimator of the parameters.
Keywords: Piecewise, Bayesian, reversible jump MCMC, segmentation.
Downloads: 1668
3984 Quantitative Structure Activity Relationship and Insilco Docking of Substituted 1,3,4-Oxadiazole Derivatives as Potential Glucosamine-6-Phosphate Synthase Inhibitors
Authors: Suman Bala, Sunil Kamboj, Vipin Saini
Abstract:
A Quantitative Structure Activity Relationship (QSAR) analysis has been developed to relate the antifungal activity of novel substituted 1,3,4-oxadiazoles against Candida albicans and Aspergillus niger to molecular descriptors using computer assisted multiple regression analysis. The study showed a good relationship between antifungal activity and the various descriptors established by multiple regression analysis. The analysis showed statistically significant correlation, with R² values of 0.932 and 0.782 against Candida albicans and Aspergillus niger, respectively. These derivatives were further subjected to molecular docking studies to investigate the interactions between the target compounds and the amino acid residues present in the active site of glucosamine-6-phosphate synthase. All the synthesized compounds have a better docking score than the standard, fluconazole. Our results could be used for the further design and development of optimal and potential antifungal agents.
Keywords: 1,3,4-Oxadiazole, QSAR, Multiple linear regression, Docking, Glucosamine-6-Phosphate Synthase.
Downloads: 1596
3983 Optimization of Slider Crank Mechanism Using Design of Experiments and Multi-Linear Regression
Authors: Galal Elkobrosy, Amr M. Abdelrazek, Bassuny M. Elsouhily, Mohamed E. Khidr
Abstract:
Crank shaft length, connecting rod length, crank angle, engine rpm, cylinder bore, mass of piston and compression ratio are the inputs that control the performance of the slider crank mechanism and hence its efficiency. Several combinations of these seven inputs are used and compared. The throughput engine torque predicted by the simulation is analyzed through two different regression models, with and without interaction terms, developed according to multi-linear regression using LU decomposition to solve the system of algebraic equations. These models are validated. A regression model in the seven inputs including their interaction terms lowered the polynomial degree from 3rd degree to 1st degree and suggested valid predictions and stable explanations.
Keywords: Design of experiments, regression analysis, SI Engine, statistical modeling.
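The abstract above mentions regression models with and without interaction terms, fitted by solving a system of algebraic equations with LU decomposition. A minimal sketch of that idea follows, under several assumptions: only two hypothetical inputs are used instead of seven, the data are synthetic and noise-free, and the system solved is the least squares normal equations. The solver `numpy.linalg.solve` is used because it relies on an LU factorization with partial pivoting (LAPACK `gesv`); the paper's own implementation is not described in the abstract.

```python
import numpy as np

def design_matrix(x1, x2, interaction=True):
    """Columns: intercept, x1, x2, and optionally the interaction x1*x2."""
    cols = [np.ones_like(x1), x1, x2]
    if interaction:
        cols.append(x1 * x2)
    return np.column_stack(cols)

def solve_normal_equations(A, y):
    """Solve (A^T A) beta = A^T y with an LU-based dense solver."""
    return np.linalg.solve(A.T @ A, A.T @ y)

# Synthetic, noise-free response with a known interaction (hypothetical values).
rng = np.random.default_rng(2)
x1 = rng.uniform(0.0, 1.0, size=40)
x2 = rng.uniform(0.0, 1.0, size=40)
true_beta = np.array([1.0, 2.0, -3.0, 0.5])
y = design_matrix(x1, x2) @ true_beta

beta_with = solve_normal_equations(design_matrix(x1, x2, True), y)
beta_without = solve_normal_equations(design_matrix(x1, x2, False), y)
print("with interaction:   ", np.round(beta_with, 4))
print("without interaction:", np.round(beta_without, 4))
```

The normal equations are convenient for showing the LU step, but they square the condition number of the design matrix; for poorly conditioned data a QR- or SVD-based least squares routine is usually the safer choice.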
Downloads: 1252
3982 Categorical Data Modeling: Logistic Regression Software
Authors: Abdellatif Tchantchane
Abstract:
Matlab based software for logistic regression is developed to enhance the teaching of quantitative topics and to assist researchers in analyzing a wide range of applications involving categorical data. The software offers the option of performing stepwise logistic regression to select the most significant predictors. It includes a feature to detect influential observations in the data, and investigates the effect of dropping or misclassifying an observation on a predictor variable. The input data may be supplied either as a set of individual responses (yes/no) with the predictor variables or as grouped records summarizing the various categories for each unique set of predictor variable values. Graphical displays are used to output various statistical results and to assess the goodness of fit of the logistic regression model. The software recognizes possible convergence constraints when present in the data, and the user is notified accordingly.
Keywords: Logistic regression, Matlab, Categorical data, Influential observation.
Downloads: 1881
3981 Defect Cause Modeling with Decision Tree and Regression Analysis
Authors: B. Bakır, İ. Batmaz, F. A. Güntürkün, İ. A. İpekçi, G. Köksal, N. E. Özdemirel
Abstract:
The main aim of this study is to identify the most influential variables that cause defects in the items produced by a casting company located in Turkey. To this end, one of the items produced by the company with a high defective percentage rate is selected. Two approaches, regression analysis and decision trees, are used to model the relationship between process parameters and defect types. Although the logistic regression models failed, the decision tree model gives meaningful results. Based on these results, it can be claimed that the decision tree approach is a promising technique for determining the most important process variables.
Keywords: Casting industry, decision tree algorithm C5.0, logistic regression, quality improvement.
Downloads: 2517
3980 Climate Change in Albania and Its Effect on Cereal Yield
Abstract:
This study focuses on analyzing climate change in Albania and its potential effects on cereal yields. Initially, monthly temperature and rainfall in Albania were studied for the period 1960-2021. Climatic variables are important when modeling cereal yield behavior, especially when significant changes in weather conditions are observed. For this purpose, in the second part of the study, linear and nonlinear models explaining cereal yield are constructed for the same period, 1960-2021. Multiple linear regression analysis and the lasso regression method are applied to model cereal yield as a function of the independent variables: average temperature, average rainfall, fertilizer consumption, arable land, land under cereal production, and nitrous oxide emissions. In our regression model, heteroscedasticity is not observed, the data follow a normal distribution, and there is low correlation between the factors, so multicollinearity is not a problem. Machine learning methods, such as Random Forest (RF), are used to predict cereal yield responses to climatic and other variables. RF showed higher accuracy than the other statistical models in the prediction of cereal yield. We found that changes in average temperature negatively affect cereal yield, while the coefficients of fertilizer consumption, arable land, and land under cereal production are positive. Our results show that the RF method is an effective and versatile machine learning method for cereal yield prediction compared to the other two methods, multiple linear regression and lasso regression.
Keywords: Cereal yield, climate change, machine learning, multiple regression model, random forest.
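The multicollinearity check mentioned in the abstract can be illustrated with a small sketch. It assumes a model with only two predictors, in which case the variance inflation factor (VIF) reduces to 1 / (1 - r^2), where r is their Pearson correlation. The function names and the numbers below are hypothetical and are not taken from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)

# Made-up illustrative values (NOT data from the study).
temperature = [11.2, 11.5, 11.1, 11.9, 12.3, 12.0, 12.6, 12.8]
rainfall = [120.0, 95.0, 130.0, 110.0, 90.0, 125.0, 105.0, 115.0]

r = pearson(temperature, rainfall)
vif = vif_two_predictors(temperature, rainfall)
print(f"r = {r:.3f}, VIF = {vif:.3f}")
```

A VIF close to 1 suggests little collinearity; a common rule of thumb treats values above roughly 5 to 10 as a warning sign. With more than two predictors, each VIF would instead come from regressing one predictor on all the others.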
Downloads 247
3979 A Comparison of the Sum of Squares in Linear and Partial Linear Regression Models
Authors: Dursun Aydın
Abstract:
In this paper, the linear regression model is estimated by the ordinary least squares method, and the partially linear regression model is estimated by the penalized least squares method using a smoothing spline. We then investigate the differences and similarities between the sums of squares of the linear and partial linear (semi-parametric) regression models. It is shown that the sum of squares in linear regression reduces to the sum of squares in partial linear regression models. Furthermore, we indicate that the various sums of squares in linear regression are similar to the different deviance statements in partial linear regression. In addition, the coefficient of determination derived for the linear regression model is easily generalized to the partial linear regression model. To this end, two applications are presented: a simulated data set and a real data set are used to support these claims.
Keywords: Partial Linear Regression Model, Linear Regression Model, Residuals, Deviance, Smoothing Spline.
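As a point of reference for the linear side of this comparison, the following minimal sketch computes the total, regression, and residual sums of squares for a simple ordinary least squares fit, along with the coefficient of determination. It covers only the OLS model; the penalized spline fit of the partially linear model is not implemented here. The helper names and data are illustrative assumptions.

```python
def ols_simple(x, y):
    """Intercept and slope of a one-predictor least squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def sums_of_squares(x, y):
    """Return (SST, SSR, SSE) for the OLS fit of y on x."""
    b0, b1 = ols_simple(x, y)
    my = sum(y) / len(y)
    fitted = [b0 + b1 * a for a in x]
    sst = sum((b - my) ** 2 for b in y)                  # total
    ssr = sum((f - my) ** 2 for f in fitted)             # explained by the fit
    sse = sum((b - f) ** 2 for b, f in zip(y, fitted))   # residual
    return sst, ssr, sse

# Made-up illustrative data (not from the paper).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

sst, ssr, sse = sums_of_squares(x, y)
r2 = 1.0 - sse / sst
print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}, R^2 = {r2:.4f}")
```

For OLS with an intercept, SST = SSR + SSE holds up to rounding, so 1 - SSE/SST and SSR/SST give the same coefficient of determination.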
Downloads 1871
3978 New Approach for Load Modeling
Authors: S. Chokri
Abstract:
Load modeling is one of the central functions in power system operations. Electricity cannot be stored, so an electric utility needs an estimate of future demand to manage production and purchasing in an economically reasonable way. Most recently reported approaches are based on neural networks. The attraction of these methods lies in the assumption that neural networks are able to learn the properties of the load. However, the development of these methods is not finished, and the lack of comparative results on different model variations is a problem. This paper presents a new approach to predicting the daily peak load in Tunisia. The proposed method employs a computational intelligence scheme based on a fuzzy neural network (FNN) and support vector regression (SVR). Experimental results indicate that the proposed FNN-SVR technique gives good prediction accuracy compared to some classical techniques.
Keywords: Neural network, Load Forecasting, Fuzzy inference, Machine learning, Fuzzy modeling and rule extraction, Support Vector Regression.
Downloads 2198
3977 Fuzzy Logic Approach to Robust Regression Models of Uncertain Medical Categories
Authors: Arkady Bolotin
Abstract:
Dichotomization of the outcome by a single cut-off point is an important part of various medical studies. Usually the relationship between the resulting dichotomized dependent variable and the explanatory variables is analyzed with linear regression, probit regression, or logistic regression. However, in many real-life situations, the exact cut-off point dividing the outcome into two groups is unknown and can be specified only approximately, i.e., surrounded by some (small) uncertainty. This means that, to have any practical meaning, the regression model must be robust to this uncertainty. In this paper, we show that neither the beta in the linear regression model nor its significance level is robust to small variations in the dichotomization cut-off point. As an alternative robust approach to the problem of uncertain medical categories, we propose using the linear regression model with a fuzzy membership function as the dependent variable. This fuzzy membership function denotes the degree to which the value of the underlying (continuous) outcome falls below or above the dichotomization cut-off point. We demonstrate that the linear regression model of the fuzzy dependent variable can be insensitive to the uncertainty in the cut-off point location. We present modeling results from a real study of low hemoglobin levels in infants. We systematically test the robustness of the binomial regression model and of the linear regression model with the fuzzy dependent variable by changing the boundary for the category Anemia, and show that the behavior of the latter model persists over quite a wide interval.
Keywords: Categorization, Uncertain medical categories, Binomial regression model, Fuzzy dependent variable, Robustness.
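The contrast between a hard dichotomization and a fuzzy dependent variable can be sketched as follows. The abstract does not specify the form of the membership function, so the logistic shape, the `width` parameter, and the hemoglobin-like values used here are all assumptions made for illustration only.

```python
import math

def hard_label(value, cutoff):
    """Crisp category: 1 if the value falls below the cut-off, else 0."""
    return 1 if value < cutoff else 0

def fuzzy_membership(value, cutoff, width):
    """Degree in (0, 1) to which the value falls below the cut-off.

    Assumed logistic form; `width` controls how gradual the transition is.
    """
    return 1.0 / (1.0 + math.exp((value - cutoff) / width))

# Hypothetical outcome values (not data from the study).
values = [9.5, 10.8, 11.0, 11.2, 12.5]

# Shift the cut-off slightly and compare both encodings.
for cutoff in (10.9, 11.1):
    hard = [hard_label(v, cutoff) for v in values]
    fuzzy = [round(fuzzy_membership(v, cutoff, 0.5), 3) for v in values]
    print(cutoff, hard, fuzzy)
```

A small shift in the cut-off flips the crisp label of any observation lying between the two cut-off values, while the fuzzy membership of the same observation changes only slightly. This is the intuition behind using the membership degree as the dependent variable.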
Downloads 1558