Search results for: shrinkage regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3305

3185 Machine Learning Assisted Prediction of Sintered Density of Binary W(MO) Alloys

Authors: Hexiong Liu

Abstract:

Powder metallurgy is the optimal method for the consolidation and preparation of W(Mo) alloys, which exhibit excellent application prospects at high temperatures. The properties of W(Mo) alloys are closely related to the sintered density. However, controlling the sintered density and porosity of these alloys is still challenging. In the past, the regulation methods mainly focused on time-consuming and costly trial-and-error experiments. In this study, the sintering data for more than a dozen W(Mo) alloys constituted a small-scale dataset, including both solid and liquid phases of sintering. Furthermore, simple descriptors were used to predict the sintered density of W(Mo) alloys based on the descriptor selection strategy and machine learning method (ML), where the ML algorithm included the least absolute shrinkage and selection operator (Lasso) regression, k-nearest neighbor (k-NN), random forest (RF), and multi-layer perceptron (MLP). The results showed that the interpretable descriptors extracted by our proposed selection strategy and the MLP neural network achieved a high prediction accuracy (R>0.950). By further predicting the sintered density of W(Mo) alloys using different sintering processes, the error between the predicted and experimental values was less than 0.063, confirming the application potential of the model.
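The Lasso regression named in this abstract can be illustrated with a minimal coordinate-descent sketch. It is not the authors' pipeline: the data, the function names and the penalty value `lam` are hypothetical, no intercept is fitted, and every column is assumed to be non-zero. It only shows how the L1 penalty shrinks a weak descriptor's coefficient to exactly zero, which is what makes Lasso usable for descriptor selection.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; return exactly 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, sweeps=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows and y a list of targets. No intercept is fitted and
    every column of X is assumed to contain at least one non-zero entry.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0  # correlation of column j with the residual that ignores feature j
            z = 0.0    # squared norm of column j
            for i in range(n):
                partial = y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * partial
                z += X[i][j] ** 2
            b[j] = soft_threshold(rho / n, lam) / (z / n)
    return b


# Hypothetical toy data: the target depends strongly on the first descriptor
# and only weakly on the second.
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y = [3.05, -2.95, 2.95, -3.05]
coefficients = lasso_coordinate_descent(X, y, lam=0.1)
print(coefficients)
```

In this toy case the second coefficient is set to zero while the first is only slightly shrunk; with real sintering data the penalty would normally be chosen by cross-validation.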

Keywords: sintered density, machine learning, interpretable descriptors, W(Mo) alloy

Procedia PDF Downloads 57
3184 Generalized Additive Model for Estimating Propensity Score

Authors: Tahmidul Islam

Abstract:

Propensity score matching (PSM) has been widely used to estimate the causal effect of a treatment in observational studies. A major step in implementing PSM is estimating the propensity score (PS). A logistic regression model with additive linear terms of the covariates is the most commonly used technique. Logistic regression is also used with cubic splines to retain flexibility in the model. However, the choice of functional form for the logistic regression model remains an open question, since the effectiveness of PSM depends on how accurately the PS has been estimated. In many situations, the linearity assumption of linear logistic regression may not hold, and a non-linear relation between the logit and the covariates may be more appropriate. The PS can also be estimated with machine learning techniques such as random forests and neural networks for greater accuracy in non-linear situations. This study examines the efficacy of the generalized additive model (GAM) in various linear and non-linear settings and compares its performance with that of the usual logistic regression. GAM is a non-parametric technique in which the functional form of the covariates can be left unspecified, so that a flexible regression model can be fitted. Various simple and complex models for treatment were considered under several situations (small/large sample, low/high number of treatment units), and the methods were compared by the covariate balance they achieve in the matched dataset. Logistic regression proved impressively robust to the inclusion of quadratic and interaction terms and reduced the mean difference between the treatment and control sets as efficiently as GAM. GAM provided no significantly better covariate balance than logistic regression in either simple or complex models. The analysis also suggests that a larger proportion of controls relative to treatment units leads to better balance for both methods.
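Covariate balance between treatment and control sets is often summarised with a standardized mean difference. The sketch below is one common way to compute it, not necessarily the balance measure used in this study; the function name and the age values are hypothetical, and each group is assumed to have at least two observations and non-zero pooled variance.

```python
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """Difference in covariate means divided by the pooled standard deviation.

    Uses sample variances; values near zero indicate good balance.
    """
    pooled_sd = ((variance(treated) + variance(control)) / 2) ** 0.5
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical covariate (age) before and after matching on the propensity score.
age_treated = [52, 60, 58, 65]
age_control_all = [30, 35, 41, 50, 55, 62, 64]
age_control_matched = [50, 62, 55, 64]

print("before matching:", standardized_mean_difference(age_treated, age_control_all))
print("after matching: ", standardized_mean_difference(age_treated, age_control_matched))
```

Computing this quantity for every covariate, once with a logistic-regression PS and once with a GAM-based PS, would give the kind of side-by-side balance comparison the abstract describes.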

Keywords: accuracy, covariate balances, generalized additive model, logistic regression, non-linearity, propensity score matching

Procedia PDF Downloads 340
3183 A Comparison of Neural Network and DOE-Regression Analysis for Predicting Resource Consumption of Manufacturing Processes

Authors: Frank Kuebler, Rolf Steinhilper

Abstract:

Artificial neural networks (ANN) and Design of Experiments (DOE) based regression analysis (RA) are mainly used for modeling complex systems. Both methodologies are commonly applied in process and quality control of manufacturing processes. Because resource efficiency has become a critical concern for manufacturing companies, these models need to be extended to predict the resource consumption of manufacturing processes. This paper describes an approach that uses neural networks as well as DOE-based regression analysis to predict the resource consumption of manufacturing processes and compares the achievable results in an industrial case study of a turning process.

Keywords: artificial neural network, design of experiments, regression analysis, resource efficiency, manufacturing process

Procedia PDF Downloads 501
3182 Logistic Regression Model versus Additive Model for Recurrent Event Data

Authors: Entisar A. Elgmati

Abstract:

Recurrent infant diarrhea is studied using daily data collected in Salvador, Brazil over one year and three months. A logistic regression model is fitted instead of Aalen's additive model, using the same covariates as in the analysis with the additive model. The logistic model gives results reasonably similar to those of the additive regression model. In addition, it solves the problem that the estimated conditional probabilities in the additive model are not constrained to lie between zero and one. Martingale residuals, which have been used to judge the goodness of fit of the additive model, are also shown to be useful for judging the goodness of fit of the logistic model.

Keywords: additive model, cumulative probabilities, infant diarrhoea, recurrent event

Procedia PDF Downloads 613
3181 Identifying Factors Contributing to the Spread of Lyme Disease: A Regression Analysis of Virginia’s Data

Authors: Fatemeh Valizadeh Gamchi, Edward L. Boone

Abstract:

This research focuses on Lyme disease, a widespread infectious condition in the United States caused by the bacterium Borrelia burgdorferi sensu stricto. It is critical to identify the environmental and economic factors that contribute to the spread of the disease. This study examined data from Virginia to identify a subset of explanatory variables that are significant for Lyme disease case numbers. To identify relevant variables and avoid overfitting, a Poisson linear model and regularized regression methods with ridge, lasso, and elastic net penalties were employed. Cross-validation was used to select the tuning parameters. The proposed methods can automatically identify relevant covariates for disease counts. The efficacy of the techniques was assessed using four criteria on three simulated datasets. Finally, using the Virginia Department of Health’s Lyme disease data set, the study successfully identified key factors, and the results were consistent with previous studies.

Keywords: Lyme disease, Poisson generalized linear model, ridge regression, lasso regression, elastic net regression

Procedia PDF Downloads 103
3180 An Analysis of the Effect of Sharia Financing and Work Relation Founding towards Non-Performing Financing in Islamic Banks in Indonesia

Authors: Muhammad Bahrul Ilmi

Abstract:

The purpose of this research is to analyze the influence of Islamic financing and work relation founding, jointly and individually, on non-performing financing in Islamic banks. This quantitative field study used regression analysis and was conducted at Muammalat Indonesia Bank and Islamic Danamon Bank over 3 months. The population consisted of 15 account officers of Muammalat Indonesia Bank and Islamic Danamon Bank in Surakarta, Indonesia. Data were collected through documentation, questionnaires, literature study and interviews. The regression analysis shows that Islamic financing and work relation founding jointly have a positive and significant effect on the non-performing financing of the two Islamic banks, with an F value of 9.584 and a probability value of 0.003, which is less than 0.05. Taken individually, Islamic financing has a significant effect on non-performing financing: the multiple linear regression analysis gives a probability value of 0.001, which is less than 0.05. Work relation founding, by contrast, has an insignificant effect on non-performing financing: the multiple linear regression analysis gives a probability value of 0.161, which is greater than 0.05.

Keywords: Syariah financing, work relation founding, non-performing financing (NPF), Islamic Bank

Procedia PDF Downloads 409
3179 A Kolmogorov-Smirnov Type Goodness-Of-Fit Test of Multinomial Logistic Regression Model in Case-Control Studies

Authors: Chen Li-Ching

Abstract:

The multinomial logistic regression model is widely used for inferring the relationship between risk factors and a disease with multiple categories. Based on the discrepancy between the nonparametric maximum likelihood estimator and the semiparametric maximum likelihood estimator of the cumulative distribution function, this study proposes a Kolmogorov-Smirnov type test statistic to assess the adequacy of the multinomial logistic regression model for case-control data. A bootstrap procedure is presented to calculate the critical value of the proposed test statistic. Empirical type I error rates and powers of the test are evaluated in simulation studies. Some examples illustrate the implementation of the test.

Keywords: case-control studies, goodness-of-fit test, Kolmogorov-Smirnov test, multinomial logistic regression

Procedia PDF Downloads 431
3178 Field Performance of Cement Treated Bases as a Reflective Crack Mitigation Technique for Flexible Pavements

Authors: Mohammad R. Bhuyan, Mohammad J. Khattak

Abstract:

Deterioration of flexible pavements due to crack reflection from the soil-cement base layer is a major concern around the globe. The service life of a flexible pavement diminishes significantly because of reflective cracks. Highway agencies have been struggling for decades to prevent or mitigate these cracks in order to increase pavement service lives. The root cause of reflective cracks is the shrinkage cracking that occurs in soil-cement bases during the cement hydration process. The primary factor that causes the shrinkage is the cement content of the soil-cement mixture. As the cement content increases, the soil-cement base gains the strength and durability necessary to withstand traffic loads. At the same time, higher cement content creates more shrinkage, resulting in more reflective cracks in pavements. Historically, various states in the USA have used soil-cement bases for constructing flexible pavements. The State of Louisiana (USA) had been using 8 to 10 percent cement content to manufacture soil-cement bases. Such traditional soil-cement bases yield a 7-day compressive strength of 2.0 MPa (300 psi) and are termed cement stabilized design (CSD). As these CSD bases generate significant reflective cracks, another soil-cement base design with 4 to 6 percent cement content, called cement treated design (CTD), has been utilized; it yields a 7-day compressive strength of 1.0 MPa (150 psi). The reduction of cement content in the CTD base is expected to minimize shrinkage cracks and thus increase pavement service lives. Hence, this research study evaluates the long-term field performance of CTD bases relative to CSD bases used in flexible pavements. The Pavement Management System of the state of Louisiana was utilized to select flexible pavement projects with CSD and CTD bases that had good historical records and time-series distress performance data.
It should be noted that the state collects roughness and distress data for every 1/10th-mile section every 2 years. In total, 120 CSD and CTD projects were analyzed in this research, where more than 145 miles (CTD) and 175 miles (CSD) of roadway data were accepted for performance evaluation and benefit-cost analyses. Here, the service life extension and area based on distress performance were considered as benefits. It was found that CTD bases increased pavement service lives by 1 to 5 years based on transverse cracking as compared to CSD bases. On the other hand, the service lives based on longitudinal and alligator cracking, rutting and roughness index remained the same. Hence, CTD bases provide some service life extension (2.6 years, on average) for the controlling distress, transverse cracking, and they are also less expensive owing to their lower cement content. Consequently, CTD bases are 20% more cost-effective than the traditional CSD bases when both are compared by the net benefit-cost ratio obtained from all distress types.

Keywords: cement treated base, cement stabilized base, reflective cracking, service life, flexible pavement

Procedia PDF Downloads 149
3177 A Study on Inference from Distance Variables in Hedonic Regression

Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro

Abstract:

In urban areas, several landmarks may affect housing prices and rents, so hedonic analysis should employ a distance variable for each landmark. Unfortunately, the estimated effects of distances to landmarks on housing prices are generally not consistent with the true price. These distance variables may cause magnitude errors in the regression, which points to a problem of spatial multicollinearity. In this paper, we provide approaches for obtaining less biased samples and a method for locating the specific sampling area that avoids the multicollinearity problem in the case of two specific landmarks.

Keywords: landmarks, hedonic regression, distance variables, collinearity, multicollinearity

Procedia PDF Downloads 437
3176 Forecasting of Grape Juice Flavor by Using Support Vector Regression

Authors: Ren-Jieh Kuo, Chun-Shou Huang

Abstract:

Research on juice flavor forecasting has become more important in China. Owing to China's fast economic growth, many different kinds of juice have been introduced to the market. If a beverage company understands its customers’ preferences well, the juice can be served more attractively. This study therefore introduces the basic theory and computing process of grape juice flavor forecasting based on support vector regression (SVR). When SVR, BPN and LR are applied to forecast the flavor of grape juice on real data, the results show that SVR is more suitable and more effective in terms of predictive performance.

Keywords: flavor forecasting, artificial neural networks, Support Vector Regression, China

Procedia PDF Downloads 463
3175 Future Applications of 4D Printing in Dentistry

Authors: Hosamuddin Hamza

Abstract:

The major concept of 4D printing is self-folding under thermal and humidity changes. This concept relies on understanding how the microstructures of 3D-printed models can undergo spontaneous shape transformation under thermal and moisture changes. The transformation mechanism could be achieved by mixing, in a controllable pattern, a number of materials within the printed model, each with known strain/shrinkage properties. 4D printing has strong potential in dentistry, as the technology could produce dynamic and adaptable materials to be used as functional objects in the oral environment under continuously changing thermal and humidity conditions. The motion criteria could override the undesired dimensional changes, thermal instability, polymerization shrinkage and microleakage. 4D printing could produce restorative materials that adjust themselves spontaneously without further intervention from the dentist or patient; that is, the materials could be capable of fixing their failed portions and compensating for some lost tooth structure while avoiding microleakage or overhangs at the margins. In prosthetic dentistry, 4D printing could provide an option to manage the influence of bone and soft tissue imbalance during mastication (and at rest) with high predictability of the type/direction of forces. It can also produce materials with better fitting and retention characteristics than conventional or 3D-printed materials. Nevertheless, it is important to highlight that 4D-printed objects, having dynamic properties, could provide some cushion as they undergo self-folding, compensating for any thermal changes or mechanical forces such as traumatic forces.

Keywords: functional material, self-folding material, 3D printing, 4D printing

Procedia PDF Downloads 452
3174 Optimization of Two Quality Characteristics in Injection Molding Processes via Taguchi Methodology

Authors: Joseph C. Chen, Venkata Karthik Jakka

Abstract:

The main objective of this research is to optimize tensile strength and dimensional accuracy in injection molding processes using Taguchi Parameter Design. An L16 orthogonal array (OA) is used in the Taguchi experimental design, with five control factors at four levels each and vibration as a non-controllable factor. A total of 32 experiments were designed to obtain the optimal parameter setting for the process. The optimal parameters identified for shrinkage are a shot volume of 1.7 cubic inches (A4), a mold temperature of 130 ºF (B1), a hold pressure of 3200 psi (C4), an injection speed of 0.61 in³/sec (D2), and a hold time of 14 seconds (E2). The optimal parameters identified for tensile strength are a shot volume of 1.7 cubic inches (A4), a mold temperature of 160 ºF (B4), a hold pressure of 3100 psi (C3), an injection speed of 0.69 in³/sec (D4), and a hold time of 14 seconds (E2). The Taguchi-based optimization framework was systematically and successfully implemented to obtain an adjusted optimal setting in this research. The mean shrinkage of the confirmation runs is 0.0031%, and the tensile strength was found to be 3148.1 psi. Both outcomes are far better than the baseline, and defects in the injection molding process have been further reduced.

Keywords: injection molding processes, Taguchi parameter design, tensile strength, high-density polyethylene (HDPE)

Procedia PDF Downloads 175
3173 Estimation of Coefficients of Ridge and Principal Components Regressions with Multicollinear Data

Authors: Rajeshwar Singh

Abstract:

Multicollinearity is common when several explanatory variables are handled simultaneously, because they often exhibit a linear relationship among themselves. It makes the impact of the individual explanatory variables on the dependent variable hard to understand, and the method of least squares then gives inexact estimates. It is therefore advisable to detect the presence of multicollinearity before proceeding further. Ridge regression reduces the degree of its occurrence, while principal components regression gives good estimates in this situation. This paper discusses the well-known techniques of ridge and principal components regression and applies both to obtain estimates of the coefficients. In addition, it discusses the conflicting claim on the discovery of the method of ridge regression, based on available documents.
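As a rough illustration of why ridge regression helps with multicollinear predictors, the sketch below solves the ridge normal equations (X'X + kI)β = X'y for the special case of two predictors and no intercept. The function name, the ridge constant `k` and the data are hypothetical and are not taken from the paper; setting k = 0 gives ordinary least squares, and the determinant is assumed to be non-zero.

```python
def ridge_two_predictors(x1, x2, y, k):
    """Solve (X'X + k*I) beta = X'y for two predictors without an intercept.

    k = 0 gives ordinary least squares; k > 0 shrinks the coefficients.
    """
    s11 = sum(a * a for a in x1) + k
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12  # close to zero when x1 and x2 are nearly collinear and k = 0
    beta1 = (s22 * t1 - s12 * t2) / det
    beta2 = (s11 * t2 - s12 * t1) / det
    return beta1, beta2


# Hypothetical, nearly collinear predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

print("least squares (k=0):", ridge_two_predictors(x1, x2, y, k=0.0))
print("ridge (k=1):        ", ridge_two_predictors(x1, x2, y, k=1.0))
```

Adding k to the diagonal moves the determinant away from zero, which is what stabilises the estimates when the predictors are almost linearly dependent.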

Keywords: conflicting claim on credit of discovery of ridge regression, multicollinearity, principal components and ridge regressions, variance inflation factor

Procedia PDF Downloads 389
3172 Improvement of Recycled Aggregate Concrete Properties by Controlling the Water Flow in the Interfacial Transition Zone

Authors: M. Eckert, M. Oliveira, A. Bettencourt Ribeiro

Abstract:

The intensive use of natural aggregate near towns, associated with the increase of the global population, leads to its depletion and increases transport distances. The uncontrolled deposition of construction and demolition waste in landfills and city outskirts causes pollution and takes up space that could serve nobler purposes. The main problem of recycled aggregate lies in its high water absorption, which is due to the porosity of the materials that constitute this type of aggregate. When the aggregates are dry, water flows from the inside to the surrounding cement paste matrix, and when they are saturated the inverse process occurs. This water flow breaks the aggregate-cement paste bonds, and the greater water concentration in the interfacial transition zone degrades the concrete properties in its fresh and hardened states. Based on the water absorption over time, a staged mixing method was optimized to regulate this flow and manufacture recycled aggregate concrete with levels of workability, strength and shrinkage equivalent to those of conventional concrete. The physical, mechanical and geometrical properties of the aggregates were related to the properties of concrete in its fresh and hardened states. Three types of commercial recycled aggregates and two types of natural aggregates were evaluated. Six compositions with different percentages of recycled coarse aggregate were tested.

Keywords: recycled aggregate, water absorption, interfacial transition zone, compressive-strength, shrinkage

Procedia PDF Downloads 429
3171 Estimate of Maximum Expected Intensity of One-Half-Wave Lines Dancing

Authors: A. Bekbaev, M. Dzhamanbaev, R. Abitaeva, A. Karbozova, G. Nabyeva

Abstract:

In this paper, the regression dependence of dancing intensity on wind speed and span length was established from statistical data obtained from multi-year observations of line wire dancing accumulated by the power systems of Kazakhstan and the Russian Federation. The lower and upper limits of the equation parameters were estimated, as well as the adequacy of the regression model. The constructed model will be used in research on dancing phenomena, for the development of methods and means of protection against dancing, and for zoning plans of territories subject to line wire dancing.

Keywords: power lines, line wire dancing, dancing intensity, regression equation, dancing area intensity

Procedia PDF Downloads 293
3170 Power Ultrasound Application on Convective Drying of Banana (Musa paradisiaca), Mango (Mangifera indica L.) and Guava (Psidium guajava L.)

Authors: Erika K. Méndez, Carlos E. Orrego, Diana L. Manrique, Juan D. Gonzalez, Doménica Vallejo

Abstract:

High moisture content in fruits generates post-harvest problems such as mechanical, biochemical, microbial and physical losses. Dehydration, which is based on reducing the water activity of the fruit, is a common option for overcoming such losses. However, regular hot air drying can negatively affect the quality properties of the fruit because of the long residence time at high temperature. Power ultrasound (US) application during convective drying has been used as a novel method to enhance the drying rate and, consequently, to decrease drying time. In the present study, a new approach was tested to evaluate the effect of US on the drying time, the final antioxidant activity (AA) and the total polyphenol content (TPC) of banana slices (BS), mango slices (MS) and guava slices (GS). The drying kinetics were also studied with nine different models, from which effective water diffusivities (Deff) (with or without shrinkage corrections) were calculated. Compared with the corresponding control tests, US-assisted drying of fruit slices showed reductions in drying time of between 16.23 and 30.19%, 11.34 and 32.73%, and 19.25 and 47.51% for the MS, BS and GS, respectively. Considering shrinkage effects, the calculated Deff values ranged from 1.67×10⁻¹⁰ to 3.18×10⁻¹⁰ m²/s, 3.96×10⁻¹⁰ to 5.57×10⁻¹⁰ m²/s and 4.61×10⁻¹⁰ to 8.16×10⁻¹⁰ m²/s for the BS, MS and GS samples, respectively. Reductions of TPC and AA (as DPPH) relative to the original content in fresh fruit were observed in all kinds of drying assays.

Keywords: banana, drying, effective diffusivity, guava, mango, ultrasound

Procedia PDF Downloads 510
3169 Incorporating Anomaly Detection in a Digital Twin Scenario Using Symbolic Regression

Authors: Manuel Alves, Angelica Reis, Armindo Lobo, Valdemar Leiras

Abstract:

In industry 4.0, it is common to have a lot of sensor data. In this deluge of data, hints of possible problems are difficult to spot. The digital twin concept aims to help answer this problem, but it is mainly used as a monitoring tool to handle the visualisation of data. Failure detection is of paramount importance in any industry, and it consumes a lot of resources. Any improvement in this regard is of tangible value to the organisation. The aim of this paper is to add the ability to forecast test failures, curtailing detection times. To achieve this, several anomaly detection algorithms (Isolation Forest, One-Class SVM and an auto-encoder) were explored and compared with a symbolic regression approach. For symbolic regression, the PySR library was used. The first results show that this approach is valid and can be added to the tools available in this context as a low-resource anomaly detection method since, after training, the only requirement is the calculation of a polynomial, a useful feature in the digital twin context.

Keywords: anomaly detection, digital twin, industry 4.0, symbolic regression

Procedia PDF Downloads 98
3168 Impact of Infrastructural Development on Socio-Economic Growth: An Empirical Investigation in India

Authors: Jonardan Koner

Abstract:

The study attempts to find out the impact of infrastructural investment on state economic growth in India. It further tries to determine the magnitude of the impact of infrastructural investment on an economic indicator, namely per-capita income (PCI), in Indian states. The study uses a panel regression technique to measure this impact, since panel regression incorporates both the cross-section and time-series aspects of the dataset. To analyze how the impact of the explanatory variables on the explained variable differs across states, the study uses a Fixed Effect Panel Regression Model. We analyze time series data (annual frequency) ranging from 1991 to 2010. The study concludes that infrastructural investment has a desirable impact on economic development, that the impact differs across states in India, and that infrastructural investment significantly explains the variation of the economic indicators.
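A fixed-effects panel regression with a single regressor can be sketched with the within (demeaning) estimator, which is numerically equivalent to including one dummy variable per state. The sketch below uses hypothetical state labels and numbers, not the study's data, and assumes the regressor varies within at least one state.

```python
from collections import defaultdict


def within_estimator(groups, x, y):
    """Fixed-effects slope for one regressor.

    Demeans x and y within each group (e.g. each state), then runs OLS on the
    demeaned values, so state-specific intercepts drop out.
    """
    by_group = defaultdict(list)
    for g, xi, yi in zip(groups, x, y):
        by_group[g].append((xi, yi))
    num = 0.0
    den = 0.0
    for obs in by_group.values():
        x_bar = sum(xi for xi, _ in obs) / len(obs)
        y_bar = sum(yi for _, yi in obs) / len(obs)
        for xi, yi in obs:
            num += (xi - x_bar) * (yi - y_bar)
            den += (xi - x_bar) ** 2
    return num / den


# Hypothetical panel: two states with very different income levels but the
# same response of per-capita income to investment.
states = ["A", "A", "A", "B", "B", "B"]
investment = [1, 2, 3, 4, 5, 6]
income = [12, 14, 16, 108, 110, 112]
print(within_estimator(states, investment, income))
```

A pooled regression on the same numbers would attribute the level gap between the two states to investment; removing state means is what isolates the within-state effect.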

Keywords: infrastructural investment, multiple regression, panel regression techniques, economic development, fixed effect dummy variable model

Procedia PDF Downloads 352
3167 A Quadratic Model to Early Predict the Blastocyst Stage with a Time Lapse Incubator

Authors: Cecile Edel, Sandrine Giscard D'Estaing, Elsa Labrune, Jacqueline Lornage, Mehdi Benchaib

Abstract:

Introduction: The use of incubators equipped with time-lapse technology in Assisted Reproductive Technology (ART) allows continuous surveillance. With morphokinetic parameters, algorithms are available to predict the potential outcome of an embryo. However, the proposed time-lapse algorithms do not take missing data into account, so some embryos cannot be classified. The aim of this work is to construct a predictive model that works even in the case of missing data. Materials and methods: Patients: A retrospective study was performed in the reproductive biology laboratory at the hospital ‘Femme Mère Enfant’ (Lyon, France) between 1 May 2013 and 30 April 2015. Embryos (n = 557) obtained from couples (n = 108) were cultured in a time-lapse incubator (Embryoscope®, Vitrolife, Goteborg, Sweden). Time-lapse incubator: The morphokinetic parameters obtained during the first three days of embryo life were used to build the predictive model. Predictive model: A quadratic regression was performed between the number of cells and time: N = a·T² + b·T + c, where N is the number of cells at time T (in hours). The regression coefficients were calculated with Excel software (Microsoft, Redmond, WA, USA); a program in Visual Basic for Applications (VBA) (Microsoft) was written for this purpose. The quadratic equation was used to find a value that allows the prediction of blastocyst formation: the synthetize value. The area under the curve (AUC) obtained from the ROC curve was used to assess the performance of the regression coefficients and the synthetize value. A cut-off value was calculated for each regression coefficient and for the synthetize value to obtain two groups for which the difference in blastocyst formation rate was maximal. The data were analyzed with SPSS (IBM, Chicago, IL, USA). Results: Among the 557 embryos, 79.7% reached the blastocyst stage.
The synthetize value corresponds to the value calculated with a time value of 99, for which the highest AUC was obtained. The AUC was 0.648 (p < 0.001) for regression coefficient ‘a’, 0.363 (p < 0.001) for regression coefficient ‘b’, 0.633 (p < 0.001) for regression coefficient ‘c’, and 0.659 (p < 0.001) for the synthetize value. The results are presented as follows: blastocyst formation rate under the cut-off value versus blastocyst formation rate above the cut-off value. The optimum cut-off value was -1.14×10⁻³ for regression coefficient ‘a’ (61.3% versus 84.3%, p < 0.001), 0.26 for regression coefficient ‘b’ (83.9% versus 63.1%, p < 0.001), -4.4 for regression coefficient ‘c’ (62.2% versus 83.1%, p < 0.001) and 8.89 for the synthetize value (58.6% versus 85.0%, p < 0.001). Conclusion: This quadratic regression allows the outcome of an embryo to be predicted even in the case of missing data. The three regression coefficients and the synthetize value could represent the identity card of an embryo. Regression coefficient ‘a’ represents the acceleration of cell division, and regression coefficient ‘b’ represents the speed of cell division. We hypothesize that regression coefficient ‘c’ could represent the intrinsic potential of an embryo, which could depend on the oocyte from which the embryo originated. These hypotheses should be confirmed by studies analyzing the relationship between the regression coefficients and ART parameters.
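The quadratic model N = a·T² + b·T + c can be fitted by ordinary least squares on whatever (T, N) observations are available for an embryo, which is why missing time points do not prevent a fit as long as at least three distinct times remain. The sketch below is a plain-Python illustration, not the authors' Excel/VBA program; the observation times and cell counts are hypothetical, and evaluating the curve at T = 99 follows the time value quoted in the abstract.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadratic(times, cells):
    """Least-squares fit of N = a*T^2 + b*T + c; returns (a, b, c).

    Solves the 3x3 normal equations with Cramer's rule and needs at least
    three distinct observation times.
    """
    s = [sum(t ** p for t in times) for p in range(5)]                     # s[p] = sum of T^p
    r = [sum(n * t ** p for t, n in zip(times, cells)) for p in range(3)]  # r[p] = sum of N*T^p
    A = [[s[4], s[3], s[2]],
         [s[3], s[2], s[1]],
         [s[2], s[1], s[0]]]
    rhs = [r[2], r[1], r[0]]
    d = det3(A)
    coeffs = []
    for col in range(3):
        Ai = [row[:] for row in A]
        for i in range(3):
            Ai[i][col] = rhs[i]  # replace one column by the right-hand side
        coeffs.append(det3(Ai) / d)
    return tuple(coeffs)


def value_at(coeffs, t):
    """Evaluate the fitted quadratic at time t."""
    a, b, c = coeffs
    return a * t * t + b * t + c


# Hypothetical embryo with one annotation missing (no observation near 48 h).
times = [0, 24, 36, 60, 72]
cells = [1, 2, 4, 8, 9]
coeffs = fit_quadratic(times, cells)
print("a, b, c:", coeffs)
print("value at T = 99:", value_at(coeffs, 99))
```

The value at T = 99 plays the role of the single summary number that the abstract compares against a cut-off.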

Keywords: ART procedure, blastocyst formation, time-lapse incubator, quadratic model

Procedia PDF Downloads 290
3166 Two-Phase Sampling for Estimating a Finite Population Total in Presence of Missing Values

Authors: Daniel Fundi Murithi

Abstract:

Missing data is a real bane in many surveys. To overcome the problems caused by missing data, partial deletion and single imputation methods, among others, have been proposed. However, these methods are associated with problems such as discarding usable data and inaccuracy in reproducing known population parameters and standard errors. Regression and stochastic imputation assume that there is a variable with complete cases that can be used as a predictor for estimating missing values in the other variable, and that the relationship between the two variables is linear, which might not be realistic in practice. In this project, we estimate the population total in the presence of missing values in two-phase sampling. Instead of regression or stochastic models, a nonparametric model-based regression is used to impute missing values. An empirical study showed that nonparametric model-based regression imputation is better than mean, median, regression, and stochastic imputation at reproducing the variance of the population total estimate obtained when there were no missing values. Although regression and stochastic imputation were better than nonparametric model-based imputation at reproducing the population total estimates obtained when there were no missing values for one of the sample sizes considered, nonparametric model-based imputation may be used when the relationship between outcome and predictor variables is not linear.

Keywords: finite population total, missing data, model-based imputation, two-phase sampling

Procedia PDF Downloads 110
3165 Heat and Mass Transfer Modelling of Industrial Sludge Drying at Different Pressures and Temperatures

Authors: L. Al Ahmad, C. Latrille, D. Hainos, D. Blanc, M. Clausse

Abstract:

A two-dimensional finite volume axisymmetric model is developed to predict the simultaneous heat and mass transfers during the drying of industrial sludge. The simulations were run using COMSOL-Multiphysics 3.5a. The input parameters of the numerical model were acquired from preliminary experimental work. The results make it possible to establish correlations describing the evolution of the various parameters as a function of the drying temperature and the sludge water content. The selection and coupling of the equations are validated against the drying kinetics acquired experimentally over a temperature range of 45-65 °C and an absolute pressure range of 200-1000 mbar. The model, which incorporates the heat and mass transfer mechanisms at different operating conditions, yields simulated values of temperature and water content. The simulated results agree with the experimental values only at the first and last drying stages, where sludge shrinkage is insignificant. Simulated and experimental results show that sludge drying is favored at high temperatures and low pressure. As experimentally observed, the drying time is reduced by 68% for drying at 65 °C compared to 45 °C under 1 atm. At 65 °C, a 200-mbar absolute pressure vacuum leads to an additional reduction in drying time estimated at 61%. However, the drying rate is underestimated in the intermediate stage. This underestimation could be reduced by accounting in the model for the shrinkage phenomenon that occurs during sludge drying.

Keywords: industrial sludge drying, heat transfer, mass transfer, mathematical modelling

Procedia PDF Downloads 107
3164 A Novel Approach towards Test Case Prioritization Technique

Authors: Kamna Solanki, Yudhvir Singh, Sandeep Dalal

Abstract:

Software testing is a time- and cost-intensive process. Scrutiny of the code and rigorous testing are required to identify and rectify the putative bugs. The process of bug identification and consequent correction is continuous in nature, and often some of the bugs are removed only after the software has been launched in the market. The process of validating the code of the altered software during the maintenance phase is termed regression testing. Regression testing is always subject to resource constraints; therefore, selecting an appropriate set of test cases from the entire gamut of test cases is a critical issue for regression test planning. This paper presents a novel method for designing a suitable prioritization process to optimize the fault detection rate and performance of regression testing under predefined constraints. The proposed method for test case prioritization, m-ACO, alters the food source selection criteria of natural ants and is essentially a modified version of Ant Colony Optimization (ACO). The proposed m-ACO approach has been coded in the Perl language, and the results are validated on three examples by computing the Average Percentage of Faults Detected (APFD) metric.
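The APFD metric mentioned above is commonly defined as APFD = 1 − (TF₁ + … + TFₘ)/(n·m) + 1/(2n), where n is the number of test cases, m the number of faults, and TFᵢ the position in the ordering of the first test that reveals fault i. The sketch below computes it for a given ordering. It is written in Python for illustration only (the paper's implementation is in Perl and is not shown here); the function name, test identifiers, and fault matrix are hypothetical, and every fault is assumed to be detected by at least one test in the ordering.

```python
def apfd(order, detecting_tests):
    """Average Percentage of Faults Detected for one test ordering.

    order: list of test ids in execution order.
    detecting_tests: dict mapping each fault to the set of test ids that reveal it.
    """
    n = len(order)
    m = len(detecting_tests)
    # 1-based position of each test in the ordering.
    position = {test: i + 1 for i, test in enumerate(order)}
    # Position of the first test that reveals each fault.
    first_hits = [
        min(position[t] for t in tests if t in position)
        for tests in detecting_tests.values()
    ]
    return 1 - sum(first_hits) / (n * m) + 1 / (2 * n)


# Hypothetical suite: five tests, two faults.
faults = {"f1": {"t1", "t4"}, "f2": {"t3"}}
prioritized = ["t1", "t2", "t3", "t4", "t5"]
reversed_order = list(reversed(prioritized))
print(apfd(prioritized, faults), apfd(reversed_order, faults))
```

A higher APFD means faults are revealed earlier in the run, so a prioritization technique can be compared against a baseline ordering by evaluating both with the same fault data.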

Keywords: regression testing, software testing, test case prioritization, test suite optimization

Procedia PDF Downloads 312
3163 Prediction of the Thermodynamic Properties of Hydrocarbons Using Gaussian Process Regression

Authors: N. Alhazmi

Abstract:

Knowing the thermodynamic properties of hydrocarbons is vital for analyzing the outcomes of the related chemical reactions and understanding the reaction process, especially in petrochemical industrial applications, combustion, and catalytic reactions. However, measuring the thermodynamic properties experimentally is time-consuming and costly. In this paper, Gaussian process regression (GPR) has been used to directly predict the main thermodynamic properties (standard enthalpy of formation, standard entropy, and heat capacity) for more than 360 cyclic and non-cyclic alkanes, alkenes, and alkynes. A simple workflow has been proposed that can be applied to directly predict the main properties of any hydrocarbon from its descriptors and chemical structure and that can be generalized to predict the main properties of any material. The model was evaluated by calculating the coefficient of determination R², which was more than 0.9794 for all the predicted properties.

Keywords: thermodynamic, Gaussian process regression, hydrocarbons, regression, supervised learning, entropy, enthalpy, heat capacity

Procedia PDF Downloads 199
3162 Numerical Analysis of the Aging Effects of RC Shear Walls Repaired by CFRP Sheets: Application of CEB-FIP MC 90 Model

Authors: Yeghnem Redha, Guerroudj Hicham Zakaria, Hanifi Hachemi Amar Lemiya, Meftah Sid Ahmed, Tounsi Abdelouahed, Adda Bedia El Abbas

Abstract:

Creep deformation of concrete is often responsible for excessive deflection at service loads, which can compromise the performance of elements within a structure. Although laboratory tests may be undertaken to determine the deformation properties of concrete, these are time-consuming, often expensive and generally not a practical option. Therefore, relatively simple empirical design code models are relied upon to predict the creep strain. This paper reviews the accuracy of creep and shrinkage predictions for reinforced concrete (RC) shear wall structures strengthened with carbon fibre reinforced polymer (CFRP) sheets, which are characterized by a widthwise varying fibre volume fraction. This review is based on the CEB-FIP MC90 model. The time-dependent behavior was investigated to analyze their static behavior. In the numerical formulation, the adherents and the adhesives are all modelled as shear wall elements, using the mixed finite element method. Several tests were used to demonstrate the accuracy and effectiveness of the proposed method. Numerical results from the present analysis are presented to illustrate the significance of the time-dependency of the lateral displacements.

Keywords: RC shear walls strengthened, CFRP sheets, creep and shrinkage, CEB-FIP MC90 model, finite element method, static behavior

Procedia PDF Downloads 282
3161 Solving Single Machine Total Weighted Tardiness Problem Using Gaussian Process Regression

Authors: Wanatchapong Kongkaew

Abstract:

This paper proposes an application of a probabilistic technique, namely Gaussian process regression, for estimating an optimal sequence for the single machine total weighted tardiness (SMTWT) scheduling problem. In this work, the Gaussian process regression (GPR) model is used to predict an optimal sequence for the SMTWT problem, and its solution is improved by an iterated local search based on a simulated annealing scheme; the resulting method is called the GPRISA algorithm. The results show that the proposed GPRISA method achieves very good performance and a reasonable trade-off between solution quality and time consumption. Moreover, in terms of deviation from the best-known solution, the proposed mechanism noticeably outperforms recent existing approaches.

Keywords: Gaussian process regression, iterated local search, simulated annealing, single machine total weighted tardiness

Procedia PDF Downloads 284
3160 The Profit Trend of Cosmetics Products Using Bootstrap Edgeworth Approximation

Authors: Edlira Donefski, Lorenc Ekonomi, Tina Donefski

Abstract:

Edgeworth approximation is one of the most important statistical methods and contributes considerably to reducing the sum of the standard deviations of the independent variables’ coefficients in a Quantile Regression Model. This model estimates the conditional median or other quantiles. In this paper, we have applied approximating statistical methods to an economic problem. We have built a quantile regression model to see how the profit gained is connected with the realized sales of cosmetic products in real data taken from a local business. The Linear Regression of the generated profit on the realized sales was not free of autocorrelation and heteroscedasticity, which is why we have used this model instead of Linear Regression. Our aim is to analyze in more detail the relation between the variables under study, the profit and the realized sales, and how to minimize the standard error of the independent variable involved in this study, the level of realized sales. The statistical methods that we have applied in our work are the Edgeworth approximation for independent and identically distributed (IID) cases, the Bootstrap version of the model, and the Edgeworth approximation for the Bootstrap Quantile Regression Model. The graphics and the results that we have presented here identify the best approximating model of our study.

Keywords: bootstrap, edgeworth approximation, IID, quantile

Procedia PDF Downloads 138
3159 Phosphate Sludge Ceramics: Effects of Firing Cycle Parameters on Technological Properties and Ceramic Suitability

Authors: Mohamed Loutou, Mohamed Hajjaji, Mohamed Ait Babram, Mohammed Mansori, Rachid Hakkou, Claude Favotto

Abstract:

More than 26.4 million tons of phosphates were produced by the phosphate industries in Morocco (2010), generating huge amounts of sludge by flocculation during ore beneficiation. The sludge is stored at the end of the process in open-air ponds. Its accumulation and storage may have an impact at several levels, such as on ground water and human beings. To address this, an efficient way to use it in the field of ceramics is proposed. The as-received sludge and a clay-rich sediment have been characterized chemically, mineralogically and microstructurally using various analytical methods. Several formulations were prepared by mixing the sludge with the binder and shaping the mixture into granules. After being dried at 105 °C, the samples were heated in the range of 900-1200 °C. In addition to the ceramic properties (firing shrinkage, water absorption, total porosity and compressive strength), the microstructure was investigated using X-ray diffraction, scanning electron microscopy and Fourier transform infrared spectroscopy. The relations between the properties and the operating factors were formulated using the design of experiments (DOE). Gehlenite was the only phase neo-formed in the sintered samples. SEM micrographs revealed the presence of nanometric stains. Based on the response surface methodology (RSM) results, all factors had positive effects on firing shrinkage, compressive strength and total porosity. However, they had the opposite effect on density and water absorption.

Keywords: phosphate sludge, clay, ceramic properties, granule

Procedia PDF Downloads 487
3158 Time Series Regression with Meta-Clusters

Authors: Monika Chuchro

Abstract:

This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed to obtain subgroups of normally distributed time series data from waste water treatment plant inflow data, which is composed of several groups differing by mean value. Two simple algorithms, K-means and EM, were chosen as clustering methods. The Rand index was used to measure the similarity. After simple meta-clustering, a regression model was fitted for each subgroup. The final model was the sum of the subgroup models. The quality of the obtained model was compared with that of a regression model built using the same explanatory variables but with no clustering of the data. Results were compared using the coefficient of determination (R²), the mean absolute percentage error (MAPE) as a measure of prediction accuracy, and a comparison on a line chart. The preliminary results point to the potential of the presented technique.
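The two numerical criteria named in this abstract have standard definitions: R² = 1 − SS_res/SS_tot and MAPE = (100/n)·Σ|(yᵢ − ŷᵢ)/yᵢ|. The sketch below shows how a clustered and an unclustered model could be scored side by side. The function names and the sample inflow values are hypothetical, and MAPE is assumed to be applied only to non-zero observations.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100 / n * sum(abs((y - p) / y) for y, p in zip(actual, predicted))


# Hypothetical inflow observations and predictions from two competing models.
observed = [100.0, 200.0, 300.0]
with_clustering = [110.0, 190.0, 300.0]
without_clustering = [130.0, 170.0, 280.0]

for name, pred in [("clustered", with_clustering), ("unclustered", without_clustering)]:
    print(name, r_squared(observed, pred), mape(observed, pred))
```

A higher R² together with a lower MAPE on the same data would indicate that the model built from subgroup regressions predicts better than the single unclustered regression.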

Keywords: clustering, data analysis, data mining, predictive models

Procedia PDF Downloads 447
3157 Efficient Principal Components Estimation of Large Factor Models

Authors: Rachida Ouysse

Abstract:

This paper proposes a constrained principal components (CnPC) estimator for efficient estimation of large-dimensional factor models when errors are cross-sectionally correlated and the number of cross-sections (N) may be larger than the number of observations (T). Although the principal components (PC) method is consistent for any path of the panel dimensions, it is inefficient because the errors are treated as homoskedastic and uncorrelated. The new CnPC exploits the assumption of bounded cross-sectional dependence, which defines Chamberlain and Rothschild’s (1983) approximate factor structure, as an explicit constraint and solves a constrained PC problem. The CnPC method is computationally equivalent to the PC method applied to a regularized form of the data covariance matrix. Unlike maximum likelihood type methods, the CnPC method does not require inverting a large covariance matrix and thus is valid for panels with N ≥ T. The paper derives a convergence rate and an asymptotic normality result for the CnPC estimators of the common factors. We provide feasible estimators and show in a simulation study that they are more accurate than the PC estimator, especially for panels with N larger than T, and than the generalized PC type estimators, especially for panels with N almost as large as T.

Keywords: high dimensionality, unknown factors, principal components, cross-sectional correlation, shrinkage regression, regularization, pseudo-out-of-sample forecasting

Procedia PDF Downloads 129
3156 Economic Analysis of Cowpea (Unguiculata spp) Production in Northern Nigeria: A Case Study of Kano Katsina and Jigawa States

Authors: Yakubu Suleiman, S. A. Musa

Abstract:

Nigeria is the largest cowpea producer in the world, accounting for about 45%, followed by Brazil with about 17%. Cowpea is grown in Kano, Bauchi, Katsina, and Borno in the north, Oyo in the west, and to a lesser extent in Enugu in the east. This study was conducted to determine the input–output relationship of cowpea production in Kano, Katsina, and Jigawa states of Nigeria. The data were collected with the aid of 1000 structured questionnaires that were randomly distributed to cowpea farmers in the three states of the study area. The data collected were analyzed using regression analysis (Cobb–Douglas production function model). The regression analysis gave a coefficient of multiple determination, R², of 72.5% and an F ratio of 106.20, which was found to be significant (P < 0.01). The regression coefficient of the constant is 0.5382 and is significant (P < 0.01). The regression coefficients with respect to labor and seeds were 0.65554 and 0.4336, respectively, and they are highly significant (P < 0.01). The regression coefficient with respect to fertilizer is 0.26341, which is significant (P < 0.05). This implies that a unit increase in any one of the variable inputs, while holding all other variable inputs constant, will significantly increase the total cowpea output by the corresponding coefficient. This indicated that farmers in the study area are operating in stage II of the production function. The result revealed that cowpea farmers in Kano, Jigawa and Katsina States realized a profit of N15,997, N34,016 and N19,788 per hectare, respectively. It is hereby recommended that government and research institutions give more attention to cowpea production.
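For reference, a Cobb–Douglas production function with the three inputs named in this abstract is usually estimated in log-linear form. The block below is a sketch under assumptions: the symbols Q (output), L (labor), S (seeds), F (fertilizer) and ε (error term) are introduced here for illustration, and the abstract does not state whether the reported constant, 0.5382, is β₀ itself or the untransformed scale factor.

```latex
% Log-linear Cobb-Douglas form (symbols are illustrative assumptions)
\ln Q = \beta_0 + \beta_L \ln L + \beta_S \ln S + \beta_F \ln F + \varepsilon

% Reported slope estimates from the abstract
\hat\beta_L = 0.65554, \qquad \hat\beta_S = 0.4336, \qquad \hat\beta_F = 0.26341

% In this form each slope is an output elasticity
\frac{\partial \ln Q}{\partial \ln L} = \beta_L
```

Under this specification, each slope is read as the approximate percentage change in output for a one percent change in that input, with the other inputs held constant.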

Keywords: coefficient, constant, inputs, regression

Procedia PDF Downloads 391