Search results for: Regression analysis.
8954 Multivariate School Travel Demand Regression Based on Trip Attraction
Authors: Ben-Edigbe J, RahmanR
Abstract:
Since primary school trips usually start from home, attention by many scholars have been focused on the home end for data gathering. Thereafter category analysis has often been relied upon when predicting school travel demands. In this paper, school end was relied on for data gathering and multivariate regression for future travel demand prediction. 9859 pupils were surveyed by way of questionnaires at 21 primary schools. The town was divided into 5 zones. The study was carried out in Skudai Town, Malaysia. Based on the hypothesis that the number of primary school trip ends are expected to be the same because school trips are fixed, the choice of trip end would have inconsequential effect on the outcome. The study compared empirical data for home and school trip end productions and attractions. Variance from both data results was insignificant, although some claims from home based family survey were found to be grossly exaggerated. Data from the school trip ends was relied on for travel demand prediction because of its completeness. Accessibility, trip attraction and trip production were then related to school trip rates under daylight and dry weather conditions. The paper concluded that, accessibility is an important parameter when predicting demand for future school trip rates.Keywords: Trip generation, regression analysis, multiple linearregressions
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19058953 Institutional Efficiency of Commonhold Industrial Parks Using a Polynomial Regression Model
Authors: Jeng-Wen Lin, Simon Chien-Yuan Chen
Abstract:
Based on assumptions of neo-classical economics and rational choice / public choice theory, this paper investigates the regulation of industrial land use in Taiwan by homeowners associations (HOAs) as opposed to traditional government administration. The comparison, which applies the transaction cost theory and a polynomial regression analysis, manifested that HOAs are superior to conventional government administration in terms of transaction costs and overall efficiency. A case study that compares Taiwan-s commonhold industrial park, NangKang Software Park, to traditional government counterparts using limited data on the costs and returns was analyzed. This empirical study on the relative efficiency of governmental and private institutions justified the important theoretical proposition. Numerical results prove the efficiency of the established model.Keywords: Homeowners Associations, Institutional Efficiency, Polynomial Regression, Transaction Cost.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15808952 The Impact of Governance on Happiness: Evidence from Quantile Regressions
Authors: Chiung-Ju Huang
Abstract:
This study utilizes the quantile regression analysis to examine the impact of governance (including democratic quality and technical quality) on happiness in 101 countries worldwide, classified as “developed countries” and “developing countries”. The empirical results show that the impact of democratic quality and technical quality on happiness is significantly positive for “developed countries”, while is insignificant for “developing countries”. The results suggest that the authorities in developed countries can enhance the level of individual happiness by means of improving the democracy quality and technical quality. However, for developing countries, promoting the quality of governance in order to enhance the level of happiness may not be effective. Policy makers in developed countries may pay more attention on increasing real GDP per capita instead of promoting the quality of governance to enhance individual happiness.
Keywords: Governance, happiness, multiple regression, quantile regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17028951 Using Combination of Optimized Recurrent Neural Network with Design of Experiments and Regression for Control Chart Forecasting
Authors: R. Behmanesh, I. Rahimi
Abstract:
recurrent neural network (RNN) is an efficient tool for modeling production control process as well as modeling services. In this paper one RNN was combined with regression model and were employed in order to be checked whether the obtained data by the model in comparison with actual data, are valid for variable process control chart. Therefore, one maintenance process in workshop of Esfahan Oil Refining Co. (EORC) was taken for illustration of models. First, the regression was made for predicting the response time of process based upon determined factors, and then the error between actual and predicted response time as output and also the same factors as input were used in RNN. Finally, according to predicted data from combined model, it is scrutinized for test values in statistical process control whether forecasting efficiency is acceptable. Meanwhile, in training process of RNN, design of experiments was set so as to optimize the RNN.Keywords: RNN, DOE, regression, control chart.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 16598950 Estimate of Maximum Expected Intensity of One-Half-Wave Lines Dancing
Authors: A. Bekbaev, M. Dzhamanbaev, R. Abitaeva, A. Karbozova, G. Nabyeva
Abstract:
In this paper, the regression dependence of dancing intensity from wind speed and length of span was established due to the statistic data obtained from multi-year observations on line wires dancing accumulated by power systems of Kazakhstan and the Russian Federation. The lower and upper limitations of the equations parameters were estimated, as well as the adequacy of the regression model. The constructed model will be used in research of dancing phenomena for the development of methods and means of protection against dancing and for zoning plan of the territories of line wire dancing.Keywords: Power lines, line wire dancing, dancing intensity, regression equation, dancing area intensity.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 12108949 A Renovated Cook's Distance Based On The Buckley-James Estimate In Censored Regression
Authors: Nazrina Aziz, Dong Q. Wang
Abstract:
There have been various methods created based on the regression ideas to resolve the problem of data set containing censored observations, i.e. the Buckley-James method, Miller-s method, Cox method, and Koul-Susarla-Van Ryzin estimators. Even though comparison studies show the Buckley-James method performs better than some other methods, it is still rarely used by researchers mainly because of the limited diagnostics analysis developed for the Buckley-James method thus far. Therefore, a diagnostic tool for the Buckley-James method is proposed in this paper. It is called the renovated Cook-s Distance, (RD* i ) and has been developed based on the Cook-s idea. The renovated Cook-s Distance (RD* i ) has advantages (depending on the analyst demand) over (i) the change in the fitted value for a single case, DFIT* i as it measures the influence of case i on all n fitted values Yˆ∗ (not just the fitted value for case i as DFIT* i) (ii) the change in the estimate of the coefficient when the ith case is deleted, DBETA* i since DBETA* i corresponds to the number of variables p so it is usually easier to look at a diagnostic measure such as RD* i since information from p variables can be considered simultaneously. Finally, an example using Stanford Heart Transplant data is provided to illustrate the proposed diagnostic tool.
Keywords: Buckley-James estimators, censored regression, censored data, diagnostic analysis, product-limit estimator, renovated Cook's Distance.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14378948 Forecasting of Grape Juice Flavor by Using Support Vector Regression
Authors: Ren-Jieh Kuo, Chun-Shou Huang
Abstract:
The research of juice flavor forecasting has become more important in China. Due to the fast economic growth in China, many different kinds of juices have been introduced to the market. If a beverage company can understand their customers’ preference well, the juice can be served more attractive. Thus, this study intends to introducing the basic theory and computing process of grapes juice flavor forecasting based on support vector regression (SVR). Applying SVR, BPN, and LR to forecast the flavor of grapes juice in real data shows that SVR is more suitable and effective at predicting performance.
Keywords: Flavor forecasting, artificial neural networks, support vector regression, grape juice flavor.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22168947 Predictive Clustering Hybrid Regression(pCHR) Approach and Its Application to Sucrose-Based Biohydrogen Production
Authors: Nikhil, Ari Visa, Chin-Chao Chen, Chiu-Yue Lin, Jaakko A. Puhakka, Olli Yli-Harja
Abstract:
A predictive clustering hybrid regression (pCHR) approach was developed and evaluated using dataset from H2- producing sucrose-based bioreactor operated for 15 months. The aim was to model and predict the H2-production rate using information available about envirome and metabolome of the bioprocess. Selforganizing maps (SOM) and Sammon map were used to visualize the dataset and to identify main metabolic patterns and clusters in bioprocess data. Three metabolic clusters: acetate coupled with other metabolites, butyrate only, and transition phases were detected. The developed pCHR model combines principles of k-means clustering, kNN classification and regression techniques. The model performed well in modeling and predicting the H2-production rate with mean square error values of 0.0014 and 0.0032, respectively.Keywords: Biohydrogen, bioprocess modeling, clusteringhybrid regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17768946 Correlations between Cleaning Frequency of Reservoir and Water Tower and Parameters of Water Quality
Authors: Chen Bi-Hsiang, Yang Hung-Wen, Lou Jie-Chung, Han Jia-Yun
Abstract:
This study was investigated on sampling and analyzing water quality in water reservoir & water tower installed in two kind of residential buildings and school facilities. Data of water quality was collected for correlation analysis with frequency of sanitization of water reservoir through questioning managers of building about the inspection charts recorded on equipment for water reservoir. Statistical software packages (SPSS) were applied to the data of two groups (cleaning frequency and water quality) for regression analysis to determine the optimal cleaning frequency of sanitization. The correlation coefficient (R) in this paper represented the degree of correlation, with values of R ranging from +1 to -1.After investigating three categories of drinking water users; this study found that the frequency of sanitization of water reservoir significantly influenced the water quality of drinking water. A higher frequency of sanitization (more than four times per 1 year) implied a higher quality of drinking water. Results indicated that sanitizing water reservoir & water tower should at least twice annually for achieving the aim of safety of drinking water.Keywords: cleaning frequency of sanitization, parameters ofwater quality, regression analysis, water reservoir & water tower
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17348945 New Regression Model and I-Kaz Method for Online Cutting Tool Wear Monitoring
Authors: Jaharah A. Ghani, Muhammad Rizal, Ahmad Sayuti, Mohd Zaki Nuawi, Mohd Nizam Ab. Rahman, Che Hassan Che Haron
Abstract:
This study presents a new method for detecting the cutting tool wear based on the measured cutting force signals using the regression model and I-kaz method. The detection of tool wear was done automatically using the in-house developed regression model and 3D graphic presentation of I-kaz 3D coefficient during machining process. The machining tests were carried out on a CNC turning machine Colchester Master Tornado T4 in dry cutting condition, and Kistler 9255B dynamometer was used to measure the cutting force signals, which then stored and displayed in the DasyLab software. The progression of the cutting tool flank wear land (VB) was indicated by the amount of the cutting force generated. Later, the I-kaz was used to analyze all the cutting force signals from beginning of the cut until the rejection stage of the cutting tool. Results of the IKaz analysis were represented by various characteristic of I-kaz 3D coefficient and 3D graphic presentation. The I-kaz 3D coefficient number decreases when the tool wear increases. This method can be used for real time tool wear monitoring.Keywords: mathematical model, I-kaz method, tool wear
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 23988944 Detecting Earnings Management via Statistical and Neural Network Techniques
Authors: Mohammad Namazi, Mohammad Sadeghzadeh Maharluie
Abstract:
Predicting earnings management is vital for the capital market participants, financial analysts and managers. The aim of this research is attempting to respond to this query: Is there a significant difference between the regression model and neural networks’ models in predicting earnings management, and which one leads to a superior prediction of it? In approaching this question, a Linear Regression (LR) model was compared with two neural networks including Multi-Layer Perceptron (MLP), and Generalized Regression Neural Network (GRNN). The population of this study includes 94 listed companies in Tehran Stock Exchange (TSE) market from 2003 to 2011. After the results of all models were acquired, ANOVA was exerted to test the hypotheses. In general, the summary of statistical results showed that the precision of GRNN did not exhibit a significant difference in comparison with MLP. In addition, the mean square error of the MLP and GRNN showed a significant difference with the multi variable LR model. These findings support the notion of nonlinear behavior of the earnings management. Therefore, it is more appropriate for capital market participants to analyze earnings management based upon neural networks techniques, and not to adopt linear regression models.Keywords: Earnings management, generalized regression neural networks, linear regression, multi-layer perceptron, Tehran stock exchange.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 21048943 Optimization of the Transfer Molding Process by Implementation of Online Monitoring Techniques for Electronic Packages
Authors: Burcu Kaya, Jan-Martin Kaiser, Karl-Friedrich Becker, Tanja Braun, Klaus-Dieter Lang
Abstract:
Quality of the molded packages is strongly influenced by the process parameters of the transfer molding. To achieve a better package quality and a stable transfer molding process, it is necessary to understand the influence of the process parameters on the package quality. This work aims to comprehend the relationship between the process parameters, and to identify the optimum process parameters for the transfer molding process in order to achieve less voids and wire sweep. To achieve this, a DoE is executed for process optimization and a regression analysis is carried out. A systematic approach is represented to generate models which enable an estimation of the number of voids and wire sweep. Validation experiments are conducted to verify the model and the results are presented.Keywords: Epoxy molding compounds, optimization, regression analysis, transfer molding process, voids, wire sweep.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15298942 Regional Analysis of Streamflow Drought: A Case Study for Southwestern Iran
Authors: M. Byzedi, B. Saghafian
Abstract:
Droughts are complex, natural hazards that, to a varying degree, affect some parts of the world every year. The range of drought impacts is related to drought occurring in different stages of the hydrological cycle and usually different types of droughts, such as meteorological, agricultural, hydrological, and socioeconomical are distinguished. Streamflow drought was analyzed by the method of truncation level (at 70% level) on daily discharges measured in 54 hydrometric stations in southwestern Iran. Frequency analysis was carried out for annual maximum series (AMS) of drought deficit volume and duration series. Some factors including physiographic, climatic, geologic, and vegetation cover were studied as influential factors in the regional analysis. According to the results of factor analysis, six most effective factors were identified as area, rainfall from December to February, the percent of area with Normalized Difference Vegetation Index (NDVI) <0.1, the percent of convex area, drainage density and the minimum of watershed elevation that explained 90.9% of variance. The homogenous regions were determined by cluster analysis and discriminate function analysis. Suitable multivariate regression models were evaluated for streamflow drought deficit volume with 2 years return period. The significance level of regression models was 0.01. The results showed that the watershed area is the most effective factor with high correlation with deficit volume. Also, drought duration was not a suitable drought index for regional analysis.Keywords: Iran, Streamflow drought, truncation level method, regional analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17438941 Electron-Impact Excitation of Kr 5s, 5p Levels
Authors: Alla A. Mityureva
Abstract:
The available data on the cross sections of electronimpact excitation of krypton 5s and 5p configuration levels out of the ground state are represented in convenient and compact form. The results are obtained by regression through all known published data related to this process.Keywords: Cross section, electron excitation, krypton, regression
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 10868940 Comparison of Bayesian and Regression Schemes to Model Public Health Services
Authors: Sotirios Raptis
Abstract:
Bayesian reasoning (BR) or Linear (Auto) Regression (AR/LR) can predict different sources of data using priors or other data, and can link social service demands in cohorts, while their consideration in isolation (self-prediction) may lead to service misuse ignoring the context. The paper advocates that BR with Binomial (BD), or Normal (ND) models or raw data (.D) as probabilistic updates can be compared to AR/LR to link services in Scotland and reduce cost by sharing healthcare (HC) resources. Clustering, cross-correlation, along with BR, LR, AR can better predict demand. Insurance companies and policymakers can link such services, and examples include those offered to the elderly, and low-income people, smoking-related services linked to mental health services, or epidemiological weight in children. 22 service packs are used that are published by Public Health Services (PHS) Scotland and Scottish Government (SG) from 1981 to 2019, broken into 110 year series (factors), joined using LR, AR, BR. The Primary component analysis found 11 significant factors, while C-Means (CM) clustering gave five major clusters.
Keywords: Bayesian probability, cohorts, data frames, regression, services, prediction.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2238939 Zero Inflated Strict Arcsine Regression Model
Authors: Y. N. Phang, E. F. Loh
Abstract:
Zero inflated strict arcsine model is a newly developed model which is found to be appropriate in modeling overdispersed count data. In this study, we extend zero inflated strict arcsine model to zero inflated strict arcsine regression model by taking into consideration the extra variability caused by extra zeros and covariates in count data. Maximum likelihood estimation method is used in estimating the parameters for this zero inflated strict arcsine regression model.Keywords: Overdispersed count data, maximum likelihood estimation, simulated annealing.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 17558938 Competitors’ Influence Analysis of a Retailer by Using Customer Value and Huff’s Gravity Model
Authors: Yepeng Cheng, Yasuhiko Morimoto
Abstract:
Customer relationship analysis is vital for retail stores, especially for supermarkets. The point of sale (POS) systems make it possible to record the daily purchasing behaviors of customers as an identification point of sale (ID-POS) database, which can be used to analyze customer behaviors of a supermarket. The customer value is an indicator based on ID-POS database for detecting the customer loyalty of a store. In general, there are many supermarkets in a city, and other nearby competitor supermarkets significantly affect the customer value of customers of a supermarket. However, it is impossible to get detailed ID-POS databases of competitor supermarkets. This study firstly focused on the customer value and distance between a customer's home and supermarkets in a city, and then constructed the models based on logistic regression analysis to analyze correlations between distance and purchasing behaviors only from a POS database of a supermarket chain. During the modeling process, there are three primary problems existed, including the incomparable problem of customer values, the multicollinearity problem among customer value and distance data, and the number of valid partial regression coefficients. The improved customer value, Huff’s gravity model, and inverse attractiveness frequency are considered to solve these problems. This paper presents three types of models based on these three methods for loyal customer classification and competitors’ influence analysis. In numerical experiments, all types of models are useful for loyal customer classification. The type of model, including all three methods, is the most superior one for evaluating the influence of the other nearby supermarkets on customers' purchasing of a supermarket chain from the viewpoint of valid partial regression coefficients and accuracy.Keywords: Customer value, Huff's Gravity Model, POS, retailer.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6128937 Analytical Modelling of Surface Roughness during Compacted Graphite Iron Milling Using Ceramic Inserts
Authors: S. Karabulut, A. Güllü, A. Güldas, R. Gürbüz
Abstract:
This study investigates the effects of the lead angle and chip thickness variation on surface roughness during the machining of compacted graphite iron using ceramic cutting tools under dry cutting conditions. Analytical models were developed for predicting the surface roughness values of the specimens after the face milling process. Experimental data was collected and imported to the artificial neural network model. A multilayer perceptron model was used with the back propagation algorithm employing the input parameters of lead angle, cutting speed and feed rate in connection with chip thickness. Furthermore, analysis of variance was employed to determine the effects of the cutting parameters on surface roughness. Artificial neural network and regression analysis were used to predict surface roughness. The values thus predicted were compared with the collected experimental data, and the corresponding percentage error was computed. Analysis results revealed that the lead angle is the dominant factor affecting surface roughness. Experimental results indicated an improvement in the surface roughness value with decreasing lead angle value from 88° to 45°.Keywords: CGI, milling, surface roughness, ANN, regression, modeling, analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19698936 Comparison of Polynomial and Radial Basis Kernel Functions based SVR and MLR in Modeling Mass Transfer by Vertical and Inclined Multiple Plunging Jets
Abstract:
Presently various computational techniques are used in modeling and analyzing environmental engineering data. In the present study, an intra-comparison of polynomial and radial basis kernel functions based on Support Vector Regression and, in turn, an inter-comparison with Multi Linear Regression has been attempted in modeling mass transfer capacity of vertical (θ = 90O) and inclined (θ multiple plunging jets (varying from 1 to 16 numbers). The data set used in this study consists of four input parameters with a total of eighty eight cases, forty four each for vertical and inclined multiple plunging jets. For testing, tenfold cross validation was used. Correlation coefficient values of 0.971 and 0.981 along with corresponding root mean square error values of 0.0025 and 0.0020 were achieved by using polynomial and radial basis kernel functions based Support Vector Regression respectively. An intra-comparison suggests improved performance by radial basis function in comparison to polynomial kernel based Support Vector Regression. Further, an inter-comparison with Multi Linear Regression (correlation coefficient = 0.973 and root mean square error = 0.0024) reveals that radial basis kernel functions based Support Vector Regression performs better in modeling and estimating mass transfer by multiple plunging jets.Keywords: Mass transfer, multiple plunging jets, polynomial and radial basis kernel functions, Support Vector Regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 14328935 Solving Single Machine Total Weighted Tardiness Problem Using Gaussian Process Regression
Authors: Wanatchapong Kongkaew
Abstract:
This paper proposes an application of probabilistic technique, namely Gaussian process regression, for estimating an optimal sequence of the single machine with total weighted tardiness (SMTWT) scheduling problem. In this work, the Gaussian process regression (GPR) model is utilized to predict an optimal sequence of the SMTWT problem, and its solution is improved by using an iterated local search based on simulated annealing scheme, called GPRISA algorithm. The results show that the proposed GPRISA method achieves a very good performance and a reasonable trade-off between solution quality and time consumption. Moreover, in the comparison of deviation from the best-known solution, the proposed mechanism noticeably outperforms the recently existing approaches.
Keywords: Gaussian process regression, iterated local search, simulated annealing, single machine total weighted tardiness.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22358934 Neuro-fuzzy Model and Regression Model a Comparison Study of MRR in Electrical Discharge Machining of D2 Tool Steel
Authors: M. K. Pradhan, C. K. Biswas,
Abstract:
In the current research, neuro-fuzzy model and regression model was developed to predict Material Removal Rate in Electrical Discharge Machining process for AISI D2 tool steel with copper electrode. Extensive experiments were conducted with various levels of discharge current, pulse duration and duty cycle. The experimental data are split into two sets, one for training and the other for validation of the model. The training data were used to develop the above models and the test data, which was not used earlier to develop these models were used for validation the models. Subsequently, the models are compared. It was found that the predicted and experimental results were in good agreement and the coefficients of correlation were found to be 0.999 and 0.974 for neuro fuzzy and regression model respectively
Keywords: Electrical discharge machining, material removal rate, neuro-fuzzy model, regression model, mountain clustering.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 13888933 Non-Methane Hydrocarbons Emission during the Photocopying Process
Authors: Kiurski S. Jelena, Aksentijević M. Snežana, Kecić S. Vesna, Oros B. Ivana
Abstract:
Prosperity of electronic equipment in photocopying environment not only has improved work efficiency, but also has changed indoor air quality. Considering the number of photocopying employed, indoor air quality might be worse than in general office environments. Determining the contribution from any type of equipment to indoor air pollution is a complex matter. Non-methane hydrocarbons are known to have an important role on air quality due to their high reactivity. The presence of hazardous pollutants in indoor air has been detected in one photocopying shop in Novi Sad, Serbia. Air samples were collected and analyzed for five days, during 8-hr working time in three time intervals, whereas three different sampling points were determined. Using multiple linear regression model and software package STATISTICA 10 the concentrations of occupational hazards and microclimates parameters were mutually correlated. Based on the obtained multiple coefficients of determination (0.3751, 0.2389 and 0.1975), a weak positive correlation between the observed variables was determined. Small values of parameter F indicated that there was no statistically significant difference between the concentration levels of nonmethane hydrocarbons and microclimates parameters. The results showed that variable could be presented by the general regression model: y = b0 + b1xi1+ b2xi2. Obtained regression equations allow to measure the quantitative agreement between the variables and thus obtain more accurate knowledge of their mutual relations.Keywords: Indoor air quality, multiple regression analysis, nonmethane hydrocarbons, photocopying process.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 19748932 Comparison of Multivariate Adaptive Regression Splines and Random Forest Regression in Predicting Forced Expiratory Volume in One Second
Authors: P. V. Pramila, V. Mahesh
Abstract:
Pulmonary Function Tests are important non-invasive diagnostic tests to assess respiratory impairments and provides quantifiable measures of lung function. Spirometry is the most frequently used measure of lung function and plays an essential role in the diagnosis and management of pulmonary diseases. However, the test requires considerable patient effort and cooperation, markedly related to the age of patients resulting in incomplete data sets. This paper presents, a nonlinear model built using Multivariate adaptive regression splines and Random forest regression model to predict the missing spirometric features. Random forest based feature selection is used to enhance both the generalization capability and the model interpretability. In the present study, flow-volume data are recorded for N= 198 subjects. The ranked order of feature importance index calculated by the random forests model shows that the spirometric features FVC, FEF25, PEF, FEF25-75, FEF50 and the demographic parameter height are the important descriptors. A comparison of performance assessment of both models prove that, the prediction ability of MARS with the `top two ranked features namely the FVC and FEF25 is higher, yielding a model fit of R2= 0.96 and R2= 0.99 for normal and abnormal subjects. The Root Mean Square Error analysis of the RF model and the MARS model also shows that the latter is capable of predicting the missing values of FEV1 with a notably lower error value of 0.0191 (normal subjects) and 0.0106 (abnormal subjects) with the aforementioned input features. It is concluded that combining feature selection with a prediction model provides a minimum subset of predominant features to train the model, as well as yielding better prediction performance. This analysis can assist clinicians with a intelligence support system in the medical diagnosis and improvement of clinical care.
Keywords: FEV1, Multivariate Adaptive Regression Splines Pulmonary Function Test, Random Forest.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 37378931 Functional Decomposition Based Effort Estimation Model for Software-Intensive Systems
Authors: Nermin Sökmen
Abstract:
An effort estimation model is needed for softwareintensive projects that consist of hardware, embedded software or some combination of the two, as well as high level software solutions. This paper first focuses on functional decomposition techniques to measure functional complexity of a computer system and investigates its impact on system development effort. Later, it examines effects of technical difficulty and design team capability factors in order to construct the best effort estimation model. With using traditional regression analysis technique, the study develops a system development effort estimation model which takes functional complexity, technical difficulty and design team capability factors as input parameters. Finally, the assumptions of the model are tested.
Keywords: Functional complexity, functional decomposition, development effort, technical difficulty, design team capability, regression analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 22788930 Comparative Studies of Support Vector Regression between Reproducing Kernel and Gaussian Kernel
Authors: Wei Zhang, Su-Yan Tang, Yi-Fan Zhu, Wei-Ping Wang
Abstract:
Support vector regression (SVR) has been regarded as a state-of-the-art method for approximation and regression. The importance of kernel function, which is so-called admissible support vector kernel (SV kernel) in SVR, has motivated many studies on its composition. The Gaussian kernel (RBF) is regarded as a “best" choice of SV kernel used by non-expert in SVR, whereas there is no evidence, except for its superior performance on some practical applications, to prove the statement. Its well-known that reproducing kernel (R.K) is also a SV kernel which possesses many important properties, e.g. positive definiteness, reproducing property and composing complex R.K by simpler ones. However, there are a limited number of R.Ks with explicit forms and consequently few quantitative comparison studies in practice. In this paper, two R.Ks, i.e. SV kernels, composed by the sum and product of a translation invariant kernel in a Sobolev space are proposed. An exploratory study on the performance of SVR based general R.K is presented through a systematic comparison to that of RBF using multiple criteria and synthetic problems. The results show that the R.K is an equivalent or even better SV kernel than RBF for the problems with more input variables (more than 5, especially more than 10) and higher nonlinearity.Keywords: admissible support vector kernel, reproducing kernel, reproducing kernel Hilbert space, support vector regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 15948929 Developing Pedotransfer Functions for Estimating Some Soil Properties using Artificial Neural Network and Multivariate Regression Approaches
Authors: Fereydoon Sarmadian, Ali Keshavarzi
Abstract:
Study of soil properties like field capacity (F.C.) and permanent wilting point (P.W.P.) play important roles in study of soil moisture retention curve. Although these parameters can be measured directly, their measurement is difficult and expensive. Pedotransfer functions (PTFs) provide an alternative by estimating soil parameters from more readily available soil data. In this investigation, 70 soil samples were collected from different horizons of 15 soil profiles located in the Ziaran region, Qazvin province, Iran. The data set was divided into two subsets for calibration (80%) and testing (20%) of the models and their normality were tested by Kolmogorov-Smirnov method. Both multivariate regression and artificial neural network (ANN) techniques were employed to develop the appropriate PTFs for predicting soil parameters using easily measurable characteristics of clay, silt, O.C, S.P, B.D and CaCO3. The performance of the multivariate regression and ANN models was evaluated using an independent test data set. In order to evaluate the models, root mean square error (RMSE) and R2 were used. The comparison of RSME for two mentioned models showed that the ANN model gives better estimates of F.C and P.W.P than the multivariate regression model. The value of RMSE and R2 derived by ANN model for F.C and P.W.P were (2.35, 0.77) and (2.83, 0.72), respectively. The corresponding values for multivariate regression model were (4.46, 0.68) and (5.21, 0.64), respectively. Results showed that ANN with five neurons in hidden layer had better performance in predicting soil properties than multivariate regression.
Keywords: Artificial neural network, Field capacity, Permanentwilting point, Pedotransfer functions.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 18198928 Prediction Heating Values of Lignocellulosics from Biomass Characteristics
Authors: Kaltima Phichai, Pornchanoke Pragrobpondee, Thaweesak Khumpart, Samorn Hirunpraditkoon
Abstract:
The paper provides biomasses characteristics by proximate analysis (volatile matter, fixed carbon and ash) and ultimate analysis (carbon, hydrogen, nitrogen and oxygen) for the prediction of the heating value equations. The heating value estimation of various biomasses can be used as an energy evaluation. Thirteen types of biomass were studied. Proximate analysis was investigated by mass loss method and infrared moisture analyzer. Ultimate analysis was analyzed by CHNO analyzer. The heating values varied from 15 to 22.4MJ kg-1. Correlations of the calculated heating value with proximate and ultimate analyses were undertaken using multiple regression analysis and summarized into three and two equations, respectively. Correlations based on proximate analysis illustrated that deviation of calculated heating values from experimental heating values was higher than the correlations based on ultimate analysis.
Keywords: Heating value equation, Proximate analysis, Ultimate analysis.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 37238927 Statistical Analysis of the Impact of Maritime Transport Gross Domestic Product on Nigeria’s Economy
Authors: K. P. Oyeduntan, K. Oshinubi
Abstract:
Nigeria is referred as the ‘Giant of Africa’ due to high population, land mass and large economy. However, it still trails far behind many smaller economies in the continent in terms of maritime operations. As we have seen that the maritime industry is the sparkplug for national growth, because it houses the most crucial infrastructure that generates wealth for a nation, it is worrisome that a nation with six seaports lag in maritime activities. In this research, we have studied how the Gross Domestic Product (GDP) of the maritime transport influences the Nigerian economy. To do this, we applied Simple Linear Regression (SLR), Support Vector Machine (SVM), Polynomial Regression Model (PRM), Generalized Additive Model (GAM) and Generalized Linear Mixed Model (GLMM) to model the relationship between the nation’s Total GDP (TGDP) and the Maritime Transport GDP (MGDP) using a time series data of 20 years. The result showed that the MGDP is statistically significant to the Nigerian economy. Amongst the statistical tool applied, the PRM of order 4 describes the relationship better when compared to other methods. The recommendations presented in this study will guide policy makers and help improve the economy of Nigeria.
Keywords: Economy, GDP, maritime transport, port, regression.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1428926 The Profit Trend of Cosmetics Products Using Bootstrap Edgeworth Approximation
Authors: Edlira Donefski, Lorenc Ekonomi, Tina Donefski
Abstract:
Edgeworth approximation is one of the most important statistical methods that has a considered contribution in the reduction of the sum of standard deviation of the independent variables’ coefficients in a Quantile Regression Model. This model estimates the conditional median or other quantiles. In this paper, we have applied approximating statistical methods in an economical problem. We have created and generated a quantile regression model to see how the profit gained is connected with the realized sales of the cosmetic products in a real data, taken from a local business. The Linear Regression of the generated profit and the realized sales was not free of autocorrelation and heteroscedasticity, so this is the reason that we have used this model instead of Linear Regression. Our aim is to analyze in more details the relation between the variables taken into study: the profit and the finalized sales and how to minimize the standard errors of the independent variable involved in this study, the level of realized sales. The statistical methods that we have applied in our work are Edgeworth Approximation for Independent and Identical distributed (IID) cases, Bootstrap version of the Model and the Edgeworth approximation for Bootstrap Quantile Regression Model. The graphics and the results that we have presented here identify the best approximating model of our study.Keywords: Bootstrap, Edgeworth approximation, independent and Identical distributed, quantile.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4418925 The Effect of User Comments on Traffic Application Usage
Authors: I. Gokasar, G. Bakioglu
Abstract:
With the unprecedented rates of technological improvements, people start to solve their problems with the help of technological tools. According to application stores and websites in which people evaluate and comment on the traffic apps, there are more than 100 traffic applications which have different features with respect to their purpose of usage ranging from the features of traffic apps for public transit modes to the features of traffic apps for private cars. This study focuses on the top 30 traffic applications which were chosen with respect to their download counts. All data about the traffic applications were obtained from related websites. The purpose of this study is to analyze traffic applications in terms of their categorical attributes with the help of developing a regression model. The analysis results suggest that negative interpretations (e.g., being deficient) does not lead to lower star ratings of the applications. However, those negative interpretations result in a smaller increase in star rate. In addition, women use higher star rates than men for the evaluation of traffic applications.
Keywords: Traffic App, real–time information, traffic congestion, regression analysis, dummy variables.
Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1178