Search results for: polynomial regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3295

3175 Determining the Causality Variables in Female Genital Mutilation: A Factor Screening Approach

Authors: Ekele Alih, Enejo Jalija

Abstract:

Female Genital Mutilation (FGM) comprises three types, namely clitoridectomy, excision and infibulation. In this study, we examine the factors responsible for FGM in order to identify the causality variables in a logistic regression approach. Based on the survey conducted by the Public Health Division, Nigeria Institute of Medical Research, Yaba, Lagos State, the tau statistic, τ, was used to screen nine factors that cause FGM in order to select a few of the predictors before the multiple regression equation is obtained. This is needed because the sample size may not be able to sustain a regression with all the predictors, and it helps to avoid multicollinearity. A total of 300 respondents, comprising 150 adult males and 150 adult females, were selected for the household survey based on the multi-stage sampling procedure. The tau statistic, …
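
As a rough illustration of the screening idea (a sketch only; the paper's τ statistic and survey data are not reproduced here, so Kendall's tau on synthetic data stands in for both):

```python
import numpy as np
from scipy.stats import kendalltau
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 9))            # 9 candidate factors, 300 respondents
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(size=300) > 0).astype(int)  # binary outcome

# screen each factor by the strength of its tau association with the response
taus = np.array([kendalltau(X[:, j], y)[0] for j in range(9)])
keep = np.argsort(-np.abs(taus))[:3]     # retain only the few strongest predictors

model = LogisticRegression().fit(X[:, keep], y)
print("retained factors:", keep, "coefficients:", model.coef_.round(3))
```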

Keywords: female genital mutilation, logistic regression, tau statistic, African society

Procedia PDF Downloads 227
3174 A Monte Carlo Fuzzy Logistic Regression Framework against Imbalance and Separation

Authors: Georgios Charizanos, Haydar Demirhan, Duygu Icen

Abstract:

Two of the most impactful issues in classical logistic regression are class imbalance and complete separation. These can result in model predictions heavily leaning towards the imbalanced class on the binary response variable or over-fitting issues. Fuzzy methodology offers key solutions for handling these problems. However, most studies propose the transformation of the binary responses into a continuous format limited within [0,1]. This is called the possibilistic approach within fuzzy logistic regression. Following this approach is more aligned with straightforward regression since a logit-link function is not utilized, and fuzzy probabilities are not generated. In contrast, we propose a method of fuzzifying binary response variables that allows for the use of the logit-link function; hence, a probabilistic fuzzy logistic regression model with the Monte Carlo method. The fuzzy probabilities are then classified by selecting a fuzzy threshold. Different combinations of fuzzy and crisp input, output, and coefficients are explored, aiming to understand which of these perform better under different conditions of imbalance and separation. We conduct numerical experiments using both synthetic and real datasets to demonstrate the performance of the fuzzy logistic regression framework against seven crisp machine learning methods. The proposed framework shows better performance irrespective of the degree of imbalance and presence of separation in the data, while the considered machine learning methods are significantly impacted.
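
One possible reading of the Monte Carlo fuzzification step, sketched on synthetic data (the membership construction and the draw budget below are assumptions, not the authors' exact scheme):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 2))
y = (X[:, 0] - 1.5 > rng.normal(size=n)).astype(int)   # imbalanced binary response

# fuzzify the binary response: membership degrees near 0 or 1 instead of crisp labels
membership = np.clip(y + rng.uniform(-0.1, 0.1, size=n), 0.0, 1.0)

# Monte Carlo: draw crisp realizations from the memberships, fit a logit each time
probs, fits = np.zeros(n), 0
for _ in range(200):
    y_mc = (rng.uniform(size=n) < membership).astype(int)
    if y_mc.min() == y_mc.max():                       # skip degenerate draws
        continue
    probs += LogisticRegression().fit(X, y_mc).predict_proba(X)[:, 1]
    fits += 1
probs /= fits

threshold = 0.5                          # a fuzzy threshold would be tuned instead
print("accuracy:", ((probs > threshold).astype(int) == y).mean())
```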

Keywords: fuzzy logistic regression, fuzzy, logistic, machine learning

Procedia PDF Downloads 36
3173 Landslide Susceptibility Mapping: A Comparison between Logistic Regression and Multivariate Adaptive Regression Spline Models in the Municipality of Oudka, Northern Morocco

Authors: S. Benchelha, H. C. Aoudjehane, M. Hakdaoui, R. El Hamdouni, H. Mansouri, T. Benchelha, M. Layelmam, M. Alaoui

Abstract:

Logistic regression (LR) and the multivariate adaptive regression spline (MarSpline) are applied and verified for analysis of landslide susceptibility mapping in Oudka, Morocco, using a geographical information system. From a spatial database containing data such as landslide mapping, topography, soil, hydrology and lithology, the eight factors related to landslides (elevation, slope, aspect, distance to streams, distance to roads, distance to faults, lithology map and Normalized Difference Vegetation Index, NDVI) were calculated or extracted. Using these factors, landslide susceptibility indexes were calculated by the two mentioned methods. Before the calculation, the database was divided into two parts, the first for training the model and the second for validation. The results of the landslide susceptibility analysis were verified using success and prediction rates to evaluate the quality of these probabilistic models. This verification showed that the MarSpline model is the better model, with a success rate (AUC = 0.963) and a prediction rate (AUC = 0.951) higher than those of the LR model (success rate AUC = 0.918, prediction rate AUC = 0.901).
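
A minimal sketch of the success-rate/prediction-rate evaluation using scikit-learn (synthetic factors stand in for the GIS-derived layers):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 8))           # 8 conditioning factors per mapped cell
y = (X @ rng.normal(size=8) + rng.normal(size=1000) > 0).astype(int)  # landslide flag

# split: one part to form the model, the other to validate it
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)
lr = LogisticRegression().fit(X_tr, y_tr)

# success rate: AUC on the training part; prediction rate: AUC on the validation part
print("success rate AUC:   ", round(roc_auc_score(y_tr, lr.predict_proba(X_tr)[:, 1]), 3))
print("prediction rate AUC:", round(roc_auc_score(y_va, lr.predict_proba(X_va)[:, 1]), 3))
```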

Keywords: landslide susceptibility mapping, logistic regression, multivariate adaptive regression spline, Oudka, Taounate

Procedia PDF Downloads 160
3172 Degumming of Eri Silk Fabric with Ionic Liquid

Authors: Shweta K. Vyas, Rakesh Musale, Sanjeev R. Shukla

Abstract:

Eri silk is a non-mulberry silk which is obtained without killing the silkworms, and hence it is also known as Ahimsa silk. In the present study, the results of degumming eri silk with alkaline peroxide have been compared with those obtained by using the ionic liquid (IL) 1-butyl-3-methylimidazolium chloride [BMIM]Cl. Experiments were designed to find the optimum processing parameters for degumming of eri silk by response surface methodology. The statistical software Design-Expert 6.0 was used for regression analysis and graphical analysis of the responses obtained by running the set of designed experiments. Analysis of variance (ANOVA) was used to estimate the statistical parameters. A polynomial equation of quadratic order was employed to fit the experimental data. The quality of fit and the model terms were evaluated by the F-test. Three-dimensional surface plots were prepared to study the effect of the variables on the different responses. The optimum conditions for IL treatment were selected from the predicted combinations, and the experiments were repeated under these conditions to determine the reproducibility.
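
The quadratic response surface fit can be sketched with statsmodels in place of Design-Expert (the factors, ranges, and response below are hypothetical):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 30
df = pd.DataFrame({
    "temp": rng.uniform(60, 100, n),     # hypothetical degumming factors
    "time": rng.uniform(30, 90, n),
})
df["weight_loss"] = (0.02 * df.temp + 0.01 * df.time
                     - 1e-4 * (df.temp - 80) ** 2 + rng.normal(0, 0.05, n))

# second-order (quadratic) response surface model, as in RSM
model = smf.ols("weight_loss ~ temp + time + I(temp**2) + I(time**2) + temp:time",
                data=df).fit()
print(model.summary())                   # overall F-test and per-term p-values
# optimum settings can then be read off the fitted surface
```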

Keywords: silk degumming, ionic liquid, response surface methodology, ANOVA

Procedia PDF Downloads 560
3171 Robust Variable Selection Based on Schwarz Information Criterion for Linear Regression Models

Authors: Shokrya Saleh A. Alshqaq, Abdullah Ali H. Ahmadini

Abstract:

The Schwarz information criterion (SIC) is a popular tool for selecting the best variables in regression datasets. However, SIC is defined using an unbounded estimator, namely the least-squares (LS) estimator, which is highly sensitive to outlying observations, especially bad leverage points. A method for robust variable selection based on SIC for linear regression models is thus needed. This study investigates the robustness properties of SIC by deriving its influence function and proposes a robust SIC based on the MM-estimation scale. The aim of this study is to produce a criterion that can effectively select accurate models in the presence of vertical outliers and high leverage points. The advantages of the proposed robust SIC are demonstrated through a simulation study and an analysis of a real dataset.
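
A sketch of the idea, swapping the LS scale in the SIC for a robust one (note statsmodels provides M-estimation via RLM; the paper's full MM-estimation would need a dedicated implementation):

```python
import numpy as np
import statsmodels.api as sm

def sic(y, X, robust=False):
    """SIC = n*log(sigma^2) + p*log(n); the robust version swaps in a robust scale."""
    n, p = X.shape
    if robust:
        fit = sm.RLM(y, X, M=sm.robust.norms.TukeyBiweight()).fit()
        scale = fit.scale                # robust residual scale estimate
    else:
        fit = sm.OLS(y, X).fit()
        scale = np.sqrt(fit.ssr / n)     # classical LS scale
    return n * np.log(scale**2) + p * np.log(n)

rng = np.random.default_rng(4)
X = sm.add_constant(rng.normal(size=(100, 2)))
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(size=100)
y[:5] += 15                              # vertical outliers
print("classical SIC:", sic(y, X), " robust SIC:", sic(y, X, robust=True))
```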

Keywords: influence function, robust variable selection, robust regression, Schwarz information criterion

Procedia PDF Downloads 113
3170 Generalized Additive Model for Estimating Propensity Score

Authors: Tahmidul Islam

Abstract:

The propensity score matching (PSM) technique has been widely used for estimating the causal effect of a treatment in observational studies. One major step in implementing PSM is estimating the propensity score (PS). The logistic regression model with additive linear terms of the covariates is the technique most used in many studies. The logistic regression model is also used with cubic splines to retain flexibility in the model. However, choosing the functional form of the logistic regression model has been a question, since the effectiveness of PSM depends on how accurately the PS has been estimated. In many situations, the linearity assumption of linear logistic regression may not hold, and a non-linear relation between the logit and the covariates may be appropriate. One can estimate the PS using machine learning techniques such as random forests or neural networks for more accuracy in non-linear situations. In this study, an attempt has been made to compare the efficacy of the generalized additive model (GAM) in various linear and non-linear settings and to compare its performance with usual logistic regression. GAM is a non-parametric technique where the functional form of the covariates can be left unspecified and a flexible regression model can be fitted. In this study, various simple and complex models have been considered for the treatment under several situations (small/large sample, low/high number of treatment units), and we examined which method leads to more covariate balance in the matched dataset. It is found that the logistic regression model is impressively robust against the inclusion of quadratic and interaction terms and reduces the mean difference between treatment and control sets as efficiently as GAM does. GAM provided no significantly better covariate balance than logistic regression in either simple or complex models. The analysis also suggests that a larger proportion of controls than treatment units leads to better balance for both methods.
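
A sketch comparing GAM-based and logistic propensity scores on a non-linear assignment mechanism, using the pygam package (an assumption; any GAM implementation would do) and standardized mean differences as the balance measure:

```python
import numpy as np
from pygam import LogisticGAM, s
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n = 2000
X = rng.normal(size=(n, 2))
# non-linear treatment assignment: a plain logit misspecifies it
t = (np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 - 0.5 + rng.normal(size=n) > 0).astype(int)

ps_gam = LogisticGAM(s(0) + s(1)).fit(X, t).predict_proba(X)
ps_lr = LogisticRegression().fit(X, t).predict_proba(X)[:, 1]

def balance(ps, X, t):
    """Standardized mean differences after 1:1 nearest-neighbour matching."""
    treated, control = np.where(t == 1)[0], np.where(t == 0)[0]
    nearest = control[np.abs(ps[control][None, :] - ps[treated][:, None]).argmin(axis=1)]
    return np.abs((X[treated].mean(0) - X[nearest].mean(0)) / X.std(0))

print("GAM balance:", balance(ps_gam, X, t).round(3))
print("LR  balance:", balance(ps_lr, X, t).round(3))
```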

Keywords: accuracy, covariate balances, generalized additive model, logistic regression, non-linearity, propensity score matching

Procedia PDF Downloads 332
3169 A Comparison of Neural Network and DOE-Regression Analysis for Predicting Resource Consumption of Manufacturing Processes

Authors: Frank Kuebler, Rolf Steinhilper

Abstract:

Artificial neural networks (ANN) as well as Design of Experiments (DOE) based regression analysis (RA) are mainly used for modeling complex systems. Both methodologies are commonly applied in process and quality control of manufacturing processes. Since resource efficiency has become a critical concern for manufacturing companies, these models need to be extended to predict the resource consumption of manufacturing processes. This paper describes an approach that uses neural networks as well as DOE-based regression analysis for predicting the resource consumption of manufacturing processes and gives a comparison of the achievable results, based on an industrial case study of a turning process.
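
A sketch of the comparison with scikit-learn, where a second-order polynomial regression stands in for the DOE-based RA and the process factors and energy response are hypothetical:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)
# hypothetical turning-process factors: cutting speed, feed, depth of cut
X = rng.uniform([100, 0.05, 0.5], [300, 0.3, 3.0], size=(60, 3))
energy = 0.01 * X[:, 0] + 50 * X[:, 1] * X[:, 2] + rng.normal(0, 0.2, 60)  # synthetic kWh

ann = MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000, random_state=0)
doe_ra = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())

for name, model in [("ANN", ann), ("DOE-RA", doe_ra)]:
    score = cross_val_score(model, X, energy, cv=5, scoring="r2").mean()
    print(name, "cross-validated R^2:", round(float(score), 3))
```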

Keywords: artificial neural network, design of experiments, regression analysis, resource efficiency, manufacturing process

Procedia PDF Downloads 492
3168 A Hybrid Block Multistep Method for Direct Numerical Integration of Fourth Order Initial Value Problems

Authors: Adamu S. Salawu, Ibrahim O. Isah

Abstract:

Direct solution to several forms of fourth-order ordinary differential equations is not easily obtained without first reducing them to a system of first-order equations. Thus, numerical methods are being developed with the underlying techniques in the literature, which seek to approximate some classes of fourth-order initial value problems with admissible error bounds. Multistep methods present the great advantage of ease of implementation, but with the setback of several function evaluations at every stage of implementation. However, hybrid methods conventionally show a slightly higher order of truncation for any k-step linear multistep method, with the possibility of obtaining solutions at off-mesh points within the interval of solution. In light of the foregoing, we propose the continuous form of a hybrid multistep method with a Chebyshev polynomial as basis function for the numerical integration of fourth-order initial value problems of ordinary differential equations. The basis function is interpolated and collocated at some points on the interval [0, 2] to yield a system of equations, which is solved to obtain the unknowns of the approximating polynomial. The continuous form obtained and its first and second derivatives are evaluated at carefully chosen points to obtain the proposed block method needed to directly approximate fourth-order initial value problems. The method is analyzed for convergence. Implementation of the method is done by conducting numerical experiments on some test problems. The outcome of the implementation suggests that the method performs well on problems with oscillatory or trigonometric terms, since the approximations at several points on the solution domain did not deviate too far from the theoretical solutions. The method also shows better performance compared with an existing hybrid method when implemented on a larger interval of solution.
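
A simplified single-interval collocation sketch with a Chebyshev basis (not the authors' hybrid block multistep scheme): solve u'''' = f on [0, 2] by imposing the four initial conditions and collocating the fourth derivative:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

f = lambda x: np.sin(x)                  # right-hand side of u'''' = f(x)
n = 12                                   # number of Chebyshev coefficients
xs = np.linspace(0.1, 2.0, n - 4)        # collocation points on (0, 2]

A, b = np.zeros((n, n)), np.zeros(n)
for j in range(n):
    c = np.zeros(n)
    c[j] = 1.0
    for k in range(4):                   # rows for u(0), u'(0), u''(0), u'''(0)
        der = c if k == 0 else C.chebder(c, k)
        A[k, j] = C.chebval(0.0, der)
    A[4:, j] = C.chebval(xs, C.chebder(c, 4))   # collocate the fourth derivative
b[:4] = [0.0, 1.0, 0.0, 0.0]             # example initial values
b[4:] = f(xs)

coef = np.linalg.solve(A, b)             # unknowns of the approximating polynomial
u = lambda x: C.chebval(x, coef)         # continuous approximate solution on [0, 2]
print(u(np.array([0.5, 1.0, 2.0])).round(5))
```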

Keywords: Chebyshev polynomial, collocation, hybrid multistep method, initial value problems, interpolation

Procedia PDF Downloads 100
3167 Logistic Regression Model versus Additive Model for Recurrent Event Data

Authors: Entisar A. Elgmati

Abstract:

Recurrent infant diarrhea is studied using daily data collected in Salvador, Brazil over one year and three months. A logistic regression model is fitted instead of Aalen's additive model, using the same covariates that were used in the analysis with the additive model. The model gives reasonably similar results to those of the additive regression model. In addition, the problem of the estimated conditional probabilities not being constrained between zero and one in the additive model is solved here. Also, martingale residuals, which have been used to judge the goodness of fit of the additive model, are shown to be useful for judging the goodness of fit of the logistic model.

Keywords: additive model, cumulative probabilities, infant diarrhoea, recurrent event

Procedia PDF Downloads 604
3166 Extension and Closure of a Field for Engineering Purpose

Authors: Shouji Yujiro, Memei Dukovic, Mist Yakubu

Abstract:

Fields are important objects of study in algebra since they provide a useful generalization of many number systems, such as the rational numbers, real numbers, and complex numbers. In particular, the usual rules of associativity, commutativity and distributivity hold. Fields also appear in many other areas of mathematics. When abstract algebra was first being developed, the definition of a field usually did not include commutativity of multiplication, and what we today call a field would have been called either a commutative field or a rational domain. In contemporary usage, a field is always commutative. A structure which satisfies all the properties of a field except possibly for commutativity is today called a division ring, a division algebra, or sometimes a skew field. The term non-commutative field is also still widely used. In French, fields are called corps (literally, body), generally regardless of their commutativity. When necessary, a (commutative) field is called a corps commutatif and a skew field a corps gauche. The German word for body is Körper, and this word is also used to denote fields; hence the use of blackboard bold to denote a field. The concept of fields was first (implicitly) used to prove that there is no general formula expressing in terms of radicals the roots of a polynomial with rational coefficients of degree 5 or higher. An extension of a field k is just a field K containing k as a subfield. One distinguishes between extensions having various qualities. For example, an extension K of a field k is called algebraic if every element of K is a root of some polynomial with coefficients in k. Otherwise, the extension is called transcendental. The aim of Galois theory is the study of algebraic extensions of a field. Given a field k, various kinds of closures of k may be introduced: for example, the algebraic closure, the separable closure, the cyclic closure, et cetera. The idea is always the same: if P is a property of fields, then a P-closure of k is a field K containing k, having property P, and which is minimal in the sense that no proper subfield of K that contains k has property P. For example, if we take P(K) to be the property 'every non-constant polynomial f in K[t] has a root in K', then a P-closure of k is just an algebraic closure of k. In general, if P-closures exist for some property P and field k, they are all isomorphic. However, there is in general no preferred isomorphism between two closures.
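
The algebraic/transcendental distinction can be made concrete with SymPy, which computes the minimal polynomial of an algebraic element:

```python
from sympy import sqrt, symbols, minimal_polynomial

x = symbols('x')
# sqrt(2) + sqrt(3) is algebraic over Q: it satisfies a rational polynomial,
# so Q(sqrt(2) + sqrt(3)) is an algebraic extension of Q
print(minimal_polynomial(sqrt(2) + sqrt(3), x))   # x**4 - 10*x**2 + 1
# pi, by contrast, is transcendental over Q: no such polynomial exists
```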

Keywords: field theory, mechanic maths, supertech, rolltech

Procedia PDF Downloads 339
3165 A Characterization of Skew Cyclic Code with Complementary Dual

Authors: Eusebio Jr. Lina, Ederlina Nocon

Abstract:

Cyclic codes are a fundamental subclass of linear codes that enjoy a very interesting algebraic structure. The class of skew cyclic codes (or θ-cyclic codes) is a generalization of the notion of cyclic codes. This is a very large class of linear codes which can be used to systematically search for codes with good properties. A linear code with complementary dual (LCD code) is a linear code C satisfying C ∩ C^⊥ = {0}. This subclass of linear codes provides an optimum linear coding solution for a two-user binary adder channel and plays an important role in countermeasures to passive and active side-channel analyses on embedded cryptosystems. This paper aims to identify LCD codes from the class of skew cyclic codes. Let F_q be a finite field of order q, and θ be an automorphism of F_q. Some conditions for a skew cyclic code to be LCD were given. To this end, the properties of the noncommutative skew polynomial ring F_q[x, θ] of automorphism type were revisited, and the algebraic structure of a skew cyclic code was examined using its skew polynomial representation. Using the result that skew cyclic codes are left ideals of the ring F_q[x, θ]/〈x^n-1〉, a characterization of a skew cyclic LCD code of length n was derived. A necessary condition for a skew cyclic code to be LCD was also given.

Keywords: LCD cyclic codes, skew cyclic LCD codes, skew cyclic complementary dual codes, theta-cyclic codes with complementary duals

Procedia PDF Downloads 304
3164 Identifying Factors Contributing to the Spread of Lyme Disease: A Regression Analysis of Virginia’s Data

Authors: Fatemeh Valizadeh Gamchi, Edward L. Boone

Abstract:

This research focuses on Lyme disease, a widespread infectious condition in the United States caused by the bacterium Borrelia burgdorferi sensu stricto. It is critical to identify the environmental and economic elements that are contributing to the spread of the disease. This study examined data from Virginia to identify a subset of explanatory variables significant for Lyme disease case numbers. To identify relevant variables and avoid overfitting, linear Poisson regression and regularization methods such as ridge, lasso, and elastic net penalties were employed. Cross-validation was performed to acquire the tuning parameters. The proposed methods can automatically identify relevant disease count covariates. The efficacy of the techniques was assessed using four criteria on three simulated datasets. Finally, using the Virginia Department of Health's Lyme disease dataset, the study successfully identified key factors, and the results were consistent with previous studies.
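
A sketch of the regularized Poisson fits with statsmodels (synthetic counts; the alpha and L1_wt values below are illustrative, whereas the paper tunes them by cross-validation):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n, p = 120, 10                            # e.g. county-level records, 10 covariates
X = rng.normal(size=(n, p))
beta = np.array([0.5, -0.3, 0.2] + [0.0] * (p - 3))   # only 3 covariates relevant
y = rng.poisson(np.exp(X @ beta))         # Lyme-disease-like case counts

glm = sm.GLM(y, sm.add_constant(X), family=sm.families.Poisson())
# L1_wt = 0 gives ridge, 1 gives lasso, in between gives elastic net
fit = glm.fit_regularized(alpha=0.05, L1_wt=0.5)
print(np.round(fit.params, 3))            # irrelevant covariates shrink toward zero
```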

Keywords: lyme disease, Poisson generalized linear model, ridge regression, lasso regression, elastic net regression

Procedia PDF Downloads 96
3163 An Analysis of the Effect of Sharia Financing and Work Relation Founding towards Non-Performing Financing in Islamic Banks in Indonesia

Authors: Muhammad Bahrul Ilmi

Abstract:

The purpose of this research is to analyze the influence of Islamic financing and work relation founding, simultaneously and partially, on non-performing financing in Islamic banks. This research was quantitative field research using regression, conducted at Bank Muammalat Indonesia and Islamic Danamon Bank over three months. The population of this research comprised 15 account officers of Bank Muammalat Indonesia and Islamic Danamon Bank in Surakarta, Indonesia. The techniques used to collect data were documentation, questionnaires, literature study and interviews. The regression analysis shows that Islamic financing and work relation founding simultaneously have a positive and significant effect on the non-performing financing of the two Islamic banks, with a probability value of 0.003 (less than 0.05) and an F value of 9.584. The regression analysis of Islamic financing against non-performing financing shows a significant effect, supported by the multiple linear regression analysis with a probability value of 0.001 (less than 0.05). The regression analysis of the effect of work relation founding on non-performing financing shows an insignificant effect, as seen in the multiple linear regression analysis with a probability value of 0.161 (greater than 0.05).

Keywords: Syariah financing, work relation founding, non-performing financing (NPF), Islamic Bank

Procedia PDF Downloads 404
3162 A Kolmogorov-Smirnov Type Goodness-Of-Fit Test of Multinomial Logistic Regression Model in Case-Control Studies

Authors: Chen Li-Ching

Abstract:

The multinomial logistic regression model is popularly used for inferring the relationship between risk factors and a disease with multiple categories. Based on the discrepancy between the nonparametric maximum likelihood estimator and the semiparametric maximum likelihood estimator of the cumulative distribution function, this study proposes a Kolmogorov-Smirnov type test statistic to assess the adequacy of the multinomial logistic regression model for case-control data. A bootstrap procedure is presented to calculate the critical value of the proposed test statistic. Empirical type I error rates and powers of the test are obtained by simulation studies. Some examples illustrate the implementation of the test.

Keywords: case-control studies, goodness-of-fit test, Kolmogorov-Smirnov test, multinomial logistic regression

Procedia PDF Downloads 427
3161 A Study on Inference from Distance Variables in Hedonic Regression

Authors: Yan Wang, Yasushi Asami, Yukio Sadahiro

Abstract:

In urban areas, several landmarks may affect housing prices and rents, so hedonic analysis should employ distance variables corresponding to each landmark. Unfortunately, the estimated effects of distances to landmarks on housing prices are generally not consistent with the true prices. These distance variables may cause magnitude errors in the regression, pointing to a problem of spatial multicollinearity. In this paper, we provide approaches for obtaining samples with less bias and a method for locating the specific sampling area that avoids the multicollinearity problem in the case of two specific landmarks.
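
The spatial multicollinearity the paper warns about is easy to diagnose with variance inflation factors; in this synthetic sketch, two nearby landmarks produce nearly collinear distance variables:

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(8)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))                 # dwelling locations
landmark_a, landmark_b = np.array([2.0, 2.0]), np.array([2.5, 2.5])  # two close landmarks

df = pd.DataFrame({
    "dist_a": np.linalg.norm(coords - landmark_a, axis=1),
    "dist_b": np.linalg.norm(coords - landmark_b, axis=1),
    "floor_area": rng.uniform(30, 150, n),
})

# nearby landmarks yield nearly collinear distance variables; VIF flags it
X = df.assign(const=1.0).values
for i, name in enumerate(df.columns):
    print(name, round(variance_inflation_factor(X, i), 1))
```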

Keywords: landmarks, hedonic regression, distance variables, collinearity, multicollinearity

Procedia PDF Downloads 425
3160 Forecasting of Grape Juice Flavor by Using Support Vector Regression

Authors: Ren-Jieh Kuo, Chun-Shou Huang

Abstract:

Research on juice flavor forecasting has become more important in China. Due to the fast economic growth in China, many different kinds of juices have been introduced to the market. If a beverage company understands its customers' preferences well, its juice can be marketed more attractively. Thus, this study introduces the basic theory and computing process of grape juice flavor forecasting based on support vector regression (SVR). Applying SVR, back-propagation networks (BPN) and linear regression (LR) to forecast the flavor of grape juice on real data, the results show that SVR is more suitable and effective in prediction performance.
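
A sketch of the three-way comparison with scikit-learn (the juice attributes and flavor response are synthetic stand-ins for the real data):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(9)
# hypothetical juice attributes: sugar content, acidity, aroma score
X = rng.uniform([10, 2.5, 1], [20, 4.5, 10], size=(80, 3))
flavor = 2 * np.sin(X[:, 0] / 3) - X[:, 1] + 0.3 * X[:, 2] + rng.normal(0, 0.2, 80)

models = {
    "SVR": make_pipeline(StandardScaler(), SVR(C=10.0)),
    "BPN": make_pipeline(StandardScaler(),
                         MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000,
                                      random_state=0)),
    "LR":  LinearRegression(),
}
for name, m in models.items():
    print(name, round(float(cross_val_score(m, X, flavor, cv=5, scoring="r2").mean()), 3))
```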

Keywords: flavor forecasting, artificial neural networks, Support Vector Regression, China

Procedia PDF Downloads 452
3159 Estimation of Coefficients of Ridge and Principal Components Regressions with Multicollinear Data

Authors: Rajeshwar Singh

Abstract:

The presence of multicollinearity is common when handling several explanatory variables simultaneously, since they may exhibit a linear relationship among themselves. A great problem then arises in understanding the impact of the explanatory variables on the dependent variable, and the method of least squares estimation gives inexact estimates. In this case, it is advised to detect its presence first before proceeding further. Ridge regression reduces the degree of its occurrence, while principal components regression gives good estimates in this situation. This paper discusses the well-known techniques of ridge and principal components regressions and applies them to obtain the estimates of the coefficients by both techniques. In addition, this paper also discusses the conflicting claims on the discovery of the method of ridge regression, based on available documents.
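
A sketch of both techniques on deliberately collinear data (scikit-learn's PCA-plus-regression pipeline serving as the principal components regression):

```python
import numpy as np
from sklearn.linear_model import Ridge, LinearRegression
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(10)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # nearly collinear with x1
X = np.column_stack([x1, x2, rng.normal(size=n)])
y = X @ np.array([1.0, 1.0, 0.5]) + rng.normal(size=n)

ridge = Ridge(alpha=1.0).fit(X, y)
# PCR: regress on the leading principal components instead of the raw variables
pcr = make_pipeline(PCA(n_components=2), LinearRegression()).fit(X, y)

print("ridge coefficients:", ridge.coef_.round(2))
print("PCR R^2:", round(pcr.score(X, y), 3))
```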

Keywords: conflicting claim on credit of discovery of ridge regression, multicollinearity, principal components and ridge regressions, variance inflation factor

Procedia PDF Downloads 374
3158 A Polynomial Time Clustering Algorithm for Solving the Assignment Problem in the Vehicle Routing Problem

Authors: Lydia Wahid, Mona F. Ahmed, Nevin Darwish

Abstract:

The vehicle routing problem (VRP) consists of a group of customers that need to be served. Each customer has a certain demand for goods. A central depot having a fleet of vehicles is responsible for supplying the customers with their demands. The problem is composed of two subproblems: the first subproblem is an assignment problem, where the number of vehicles that will be used, as well as the customers assigned to each vehicle, is determined. The second subproblem is the routing problem, in which, for each vehicle having a number of customers assigned to it, the order of visits to the customers is determined. The optimal number of vehicles, as well as the optimal total distance, should be achieved. In this paper, an approach for solving the first subproblem (the assignment problem) is presented. In the approach, a clustering algorithm is proposed for finding the optimal number of vehicles by grouping the customers into clusters, where each cluster is visited by one vehicle. Finding the optimal number of clusters is NP-hard. This work presents a polynomial time clustering algorithm for finding the optimal number of clusters and solving the assignment problem.
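
A greedy stand-in for the clustering idea (not the paper's algorithm): grow the number of agglomerative clusters until each cluster's demand fits one vehicle's capacity:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(11)
coords = rng.uniform(0, 100, size=(40, 2))       # customer locations
demand = rng.integers(1, 10, size=40)            # demand per customer
capacity = 60                                    # vehicle capacity

# each cluster becomes one vehicle's set of customers
for k in range(1, 41):
    labels = AgglomerativeClustering(n_clusters=k).fit_predict(coords)
    loads = [demand[labels == c].sum() for c in range(k)]
    if max(loads) <= capacity:                   # feasible assignment found
        break

print("vehicles needed:", k, "cluster loads:", loads)
```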

Keywords: vehicle routing problems, clustering algorithms, Clarke and Wright Saving Method, agglomerative hierarchical clustering

Procedia PDF Downloads 357
3157 Estimate of Maximum Expected Intensity of One-Half-Wave Lines Dancing

Authors: A. Bekbaev, M. Dzhamanbaev, R. Abitaeva, A. Karbozova, G. Nabyeva

Abstract:

In this paper, the regression dependence of dancing intensity on wind speed and span length was established from the statistical data obtained from multi-year observations of line wire dancing accumulated by the power systems of Kazakhstan and the Russian Federation. The lower and upper limits of the equation parameters were estimated, as well as the adequacy of the regression model. The constructed model will be used in research on dancing phenomena for the development of methods and means of protection against dancing and for zoning plans of the territories subject to line wire dancing.

Keywords: power lines, line wire dancing, dancing intensity, regression equation, dancing area intensity

Procedia PDF Downloads 287
3156 A Robust Optimization of Chassis Durability/Comfort Compromise Using Chebyshev Polynomial Chaos Expansion Method

Authors: Hanwei Gao, Louis Jezequel, Eric Cabrol, Bernard Vitry

Abstract:

The chassis system is composed of complex elements that take up all the loads from the tire-ground contact area, and thus it plays an important role in numerous specifications such as durability, comfort, crash, etc. During the development of new vehicle projects at Renault, durability validation is always the main focus, while deployment of comfort comes later in the project. Therefore, design choices sometimes have to be reconsidered because of the natural incompatibility between these two specifications. Besides, robustness is also an important point of concern, as it is related to manufacturing costs as well as to performance after the ageing of components like shock absorbers. In this paper, an approach is proposed aiming to realize a multi-objective optimization between chassis endurance and comfort while taking random factors into consideration. The adaptive-sparse polynomial chaos expansion (PCE) method with a Chebyshev polynomial series has been applied to predict the uncertainty intervals of a system's responses according to its uncertain-but-bounded parameters. The approach can be divided into three steps. First, an initial design of experiments is realized to build the response surfaces, which statistically represent a black-box system. Secondly, within several iterations, an optimum set is proposed and validated, which will form a Pareto front. At the same time, the robustness of each response, serving as an additional objective, is calculated from the pre-defined parameter intervals and the response surfaces obtained in the first step. Finally, an inverse strategy is carried out to determine the parameters' tolerance combination with a maximally acceptable degradation of the responses in terms of manufacturing costs. A quarter-car model has been tested as an example, applying road excitations from actual road measurements for both endurance and comfort calculations. One indicator based on Basquin's law is defined to compare the global chassis durability of different parameter settings. Another indicator, related to comfort, is obtained from the vertical acceleration of the sprung mass. An optimum set with the best robustness has finally been obtained, and the reference tests prove good robustness prediction by the Chebyshev PCE method. This example demonstrates the effectiveness and reliability of the approach, in particular its ability to save computational costs for a complex system.
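
A plain one-dimensional Chebyshev surrogate sketch (the paper's adaptive-sparse, multi-parameter PCE is more involved; the response function and parameter bounds here are made up):

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def suspension_response(k):
    """Stand-in black box: e.g. sprung-mass acceleration vs. spring stiffness."""
    return 1.0 / np.sqrt((1 - 0.04 * k) ** 2 + 0.1 * k)

lo, hi = 10.0, 20.0                       # uncertain-but-bounded parameter interval
deg = 6

# fit a Chebyshev series on the mapped interval [-1, 1] (a simple PCE surrogate)
xi = np.cos(np.pi * (np.arange(deg + 1) + 0.5) / (deg + 1))   # Chebyshev nodes
k_nodes = lo + (xi + 1) * (hi - lo) / 2
coef = C.chebfit(xi, suspension_response(k_nodes), deg)

# a cheap surrogate sweep gives the response's uncertainty interval
xs = np.linspace(-1, 1, 2001)
vals = C.chebval(xs, coef)
print("response interval: [%.4f, %.4f]" % (vals.min(), vals.max()))
```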

Keywords: chassis durability, Chebyshev polynomials, multi-objective optimization, polynomial chaos expansion, ride comfort, robust design

Procedia PDF Downloads 129
3155 Impact of Infrastructural Development on Socio-Economic Growth: An Empirical Investigation in India

Authors: Jonardan Koner

Abstract:

The study attempts to find out the impact of infrastructural investment on state economic growth in India. It further tries to determine the magnitude of the impact of infrastructural investment on an economic indicator, i.e., per-capita income (PCI), in Indian states. The study uses the panel regression technique to measure the impact of infrastructural investment on the per-capita income (PCI) of Indian states. The panel regression technique helps incorporate both the cross-section and time-series aspects of the dataset. In order to analyze the difference in the impact of the explanatory variables on the explained variable across states, the study uses a fixed effect panel regression model. The conclusions of the study are that infrastructural investment has a desirable impact on economic development and that the impact differs across states in India. We analyze time series data (annual frequency) ranging from 1991 to 2010. The study reveals that infrastructural investment significantly explains the variation in the economic indicators.
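
A sketch of the fixed-effect panel regression using the linearmodels package (an assumption; the 15-state, 1991-2010 panel below is synthetic):

```python
import numpy as np
import pandas as pd
from linearmodels.panel import PanelOLS

rng = np.random.default_rng(12)
states, years = [f"state_{i}" for i in range(15)], range(1991, 2011)
idx = pd.MultiIndex.from_product([states, years], names=["state", "year"])
df = pd.DataFrame(index=idx)
df["infra"] = rng.uniform(1, 10, len(df))              # infrastructural investment
state_effect = np.repeat(rng.normal(0, 2, 15), 20)     # unobserved state heterogeneity
df["pci"] = 3 + 0.8 * df["infra"] + state_effect + rng.normal(0, 1, len(df))

# fixed-effect panel regression: state dummies absorbed via EntityEffects
res = PanelOLS.from_formula("pci ~ 1 + infra + EntityEffects", data=df).fit()
print(res.params)
```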

Keywords: infrastructural investment, multiple regression, panel regression techniques, economic development, fixed effect dummy variable model

Procedia PDF Downloads 345
3154 A Quadratic Model to Early Predict the Blastocyst Stage with a Time Lapse Incubator

Authors: Cecile Edel, Sandrine Giscard D'Estaing, Elsa Labrune, Jacqueline Lornage, Mehdi Benchaib

Abstract:

Introduction: The use of incubators equipped with time-lapse technology in assisted reproductive technology (ART) allows continuous surveillance. With morphokinetic parameters, algorithms are available to predict the potential outcome of an embryo. However, the proposed time-lapse algorithms do not take missing data into account, so some embryos cannot be classified. The aim of this work is to construct a predictive model that works even in the case of missing data. Materials and methods: Patients: A retrospective study was performed in the reproductive biology laboratory of the hospital 'Femme Mère Enfant' (Lyon, France) between 1 May 2013 and 30 April 2015. Embryos (n = 557) obtained from couples (n = 108) were cultured in a time-lapse incubator (Embryoscope®, Vitrolife, Goteborg, Sweden). Time-lapse incubator: The morphokinetic parameters obtained during the first three days of embryo life were used to build the predictive model. Predictive model: A quadratic regression was performed between the number of cells and time: N = a·T² + b·T + c, where N is the number of cells at time T (in hours). The regression coefficients were calculated with Excel software (Microsoft, Redmond, WA, USA); a program in Visual Basic for Applications (VBA) (Microsoft) was written for this purpose. The quadratic equation was used to find a value that allows prediction of blastocyst formation: the synthetize value. The area under the curve (AUC) obtained from the ROC curve was used to assess the performance of the regression coefficients and of the synthetize value. A cut-off value was calculated for each regression coefficient and for the synthetize value to obtain two groups for which the difference in blastocyst formation rate according to the cut-off value was maximal. The data were analyzed with SPSS (IBM, Chicago, IL, USA). Results: Among the 557 embryos, 79.7% reached the blastocyst stage. The synthetize value corresponds to the value calculated at a time value of 99, for which the highest AUC was obtained. The AUC was 0.648 (p < 0.001) for regression coefficient 'a', 0.363 (p < 0.001) for regression coefficient 'b', 0.633 (p < 0.001) for regression coefficient 'c', and 0.659 (p < 0.001) for the synthetize value. The results are presented as blastocyst formation rate below the cut-off value versus above the cut-off value. For regression coefficient 'a', the optimum cut-off value was -1.14×10⁻³ (61.3% versus 84.3%, p < 0.001); for regression coefficient 'b', 0.26 (83.9% versus 63.1%, p < 0.001); for regression coefficient 'c', -4.4 (62.2% versus 83.1%, p < 0.001); and for the synthetize value, 8.89 (58.6% versus 85.0%, p < 0.001). Conclusion: This quadratic regression allows the outcome of an embryo to be predicted even in case of missing data. The three regression coefficients and the synthetize value could represent the identity card of an embryo. Regression coefficient 'a' represents the acceleration of cell division, and regression coefficient 'b' the speed of cell division. We hypothesize that regression coefficient 'c' could represent the intrinsic potential of an embryo, which could depend on the oocyte from which the embryo originated. These hypotheses should be confirmed by studies analyzing the relationship between the regression coefficients and ART parameters.
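
A sketch of the coefficient extraction and AUC evaluation on synthetic division curves (the cell-count dynamics below are invented; note how the quadratic fit tolerates missing observations):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(13)
n_embryos = 557
blast = rng.uniform(size=n_embryos) < 0.8      # reached blastocyst stage or not

T = np.linspace(0, 72, 20)                     # observation times over three days (h)
coefs = np.empty((n_embryos, 3))
for i in range(n_embryos):
    rate = 0.0015 if blast[i] else 0.0010      # synthetic cell-division dynamics
    N = 1 + rate * T**2 + rng.normal(0, 0.3, T.size)
    mask = rng.uniform(size=T.size) > 0.2      # ~20% of time points missing
    coefs[i] = np.polyfit(T[mask], N[mask], 2) # quadratic fit: a, b, c

a, b, c = coefs.T
synthetize = a * 99**2 + b * 99 + c            # N predicted at T = 99 h
for name, score in [("a", a), ("b", b), ("c", c), ("synthetize", synthetize)]:
    print(name, "AUC:", round(roc_auc_score(blast, score), 3))
```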

Keywords: ART procedure, blastocyst formation, time-lapse incubator, quadratic model

Procedia PDF Downloads 284
3153 A Polynomial Approach for a Graphical-based Integrated Production and Transport Scheduling with Capacity Restrictions

Authors: M. Ndeley

Abstract:

The performance of global manufacturing supply chains depends on the interaction of production and transport processes. Currently, the scheduling of these processes is done separately, without considering mutual requirements, which leads to suboptimal solutions. An integrated scheduling of both processes enables the improvement of supply chain performance. The integrated production and transport scheduling problem (PTSP) is NP-hard, so heuristic methods are necessary to efficiently solve large problem instances, as in the case of global manufacturing supply chains. This paper presents a heuristic scheduling approach which handles the integration of flexible production processes with intermodal transport, incorporating flexible land transport. The method is based on a graph that allows a reformulation of the PTSP as a shortest path problem for each job, which can be solved in polynomial time. The proposed method is applied to a supply chain scenario with a manufacturing facility in South Africa and shipments of finished product to customers within the country. The obtained results show that the approach is suitable for the scheduling of large-scale problems and can be flexibly adapted to different scenarios.
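
A toy version of the graph reformulation with networkx: one job's production and transport options become a layered graph, and its schedule is a shortest source-to-sink path (the nodes and weights are made up):

```python
import networkx as nx

# nodes are (stage, option) pairs; edge weights are processing or transport costs
G = nx.DiGraph()
G.add_edge("start", ("prod", "plant_A"), weight=4)
G.add_edge("start", ("prod", "plant_B"), weight=3)
G.add_edge(("prod", "plant_A"), ("ship", "road"), weight=2)
G.add_edge(("prod", "plant_A"), ("ship", "rail"), weight=1)
G.add_edge(("prod", "plant_B"), ("ship", "road"), weight=4)
G.add_edge(("ship", "road"), "customer", weight=2)
G.add_edge(("ship", "rail"), "customer", weight=3)

# the job's schedule is the cheapest source-to-sink path (Dijkstra, polynomial time)
path = nx.shortest_path(G, "start", "customer", weight="weight")
cost = nx.shortest_path_length(G, "start", "customer", weight="weight")
print(path, cost)
```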

Keywords: production and transport scheduling problem, graph based scheduling, integrated scheduling

Procedia PDF Downloads 450
3152 Two-Phase Sampling for Estimating a Finite Population Total in Presence of Missing Values

Authors: Daniel Fundi Murithi

Abstract:

Missing data is a real bane in many surveys. To overcome the problems caused by missing data, partial deletion and single imputation methods, among others, have been proposed. However, problems such as discarding usable data and inaccuracy in reproducing known population parameters and standard errors are associated with them. For regression and stochastic imputation, it is assumed that there is a variable with complete cases to be used as a predictor in estimating missing values in the other variable, and that the relationship between the two variables is linear, which might not be realistic in practice. In this project, we estimate the population total in the presence of missing values in two-phase sampling. Instead of regression or stochastic models, a nonparametric model-based regression model is used to impute the missing values. An empirical study showed that nonparametric model-based regression imputation is better at reproducing the variance of the population total estimate obtained when there were no missing values, compared to the mean, median, regression, and stochastic imputation methods. Although regression and stochastic imputation were better than nonparametric model-based imputation at reproducing the population total estimates obtained when there were no missing values for one of the sample sizes considered, nonparametric model-based imputation may be used when the relationship between the outcome and predictor variables is not linear.
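
A sketch of the nonparametric model-based imputation step, with k-nearest-neighbour regression standing in for the paper's nonparametric model:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(14)
N, n = 5000, 400                                 # population and sample sizes
x = rng.uniform(0, 10, n)                        # auxiliary variable (complete cases)
y = np.sin(x) * 10 + x**2 + rng.normal(0, 1, n)  # study variable, non-linear in x

missing = rng.uniform(size=n) < 0.2              # 20% of y values missing
# nonparametric model-based imputation: local regression of y on x
knn = KNeighborsRegressor(n_neighbors=15).fit(x[~missing, None], y[~missing])
y_imp = y.copy()
y_imp[missing] = knn.predict(x[missing, None])

total_hat = N * y_imp.mean()                     # expansion estimator of the total
print("estimated population total:", round(total_hat, 1))
```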

Keywords: finite population total, missing data, model-based imputation, two-phase sampling

Procedia PDF Downloads 104
3151 A Theorem Related to Sample Moments and Two Types of Moment-Based Density Estimates

Authors: Serge B. Provost

Abstract:

Numerous statistical inference and modeling methodologies are based on sample moments rather than the actual observations. A result justifying the validity of this approach is introduced. More specifically, it will be established that given the first n moments of a sample of size n, one can recover the original n sample points. This implies that a sample of size n and its first associated n moments contain precisely the same amount of information. However, it is efficient to make use of a limited number of initial moments as most of the relevant distributional information is included in them. Two types of density estimation techniques that rely on such moments will be discussed. The first one expresses a density estimate as the product of a suitable base density and a polynomial adjustment whose coefficients are determined by equating the moments of the density estimate to the sample moments. The second one assumes that the derivative of the logarithm of a density function can be represented as a rational function. This gives rise to a system of linear equations involving sample moments, the density estimate is then obtained by solving a differential equation. Unlike kernel density estimation, these methodologies are ideally suited to model ‘big data’ as they only require a limited number of moments, irrespective of the sample size. What is more, they produce simple closed form expressions that are amenable to algebraic manipulations. They also turn out to be more accurate as will be shown in several illustrative examples.
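
A sketch of the first technique (base density times moment-matched polynomial adjustment), using a standard normal base density so that its moments are known in closed form:

```python
import numpy as np
from math import prod
from scipy.stats import norm
from scipy.integrate import quad

def normal_moment(i):
    """Non-central moments of N(0,1): 0 for odd i, (i-1)!! for even i."""
    return 0.0 if i % 2 else float(prod(range(1, i, 2)))

rng = np.random.default_rng(15)
data = rng.gamma(shape=3, scale=1, size=500)
z = (data - data.mean()) / data.std()            # standardize toward the base density

d = 4                                            # degree of the polynomial adjustment
m = np.array([np.mean(z**k) for k in range(d + 1)])          # sample moments
mu = np.array([normal_moment(i) for i in range(2 * d + 1)])  # base-density moments

# moment matching: solve sum_j c_j * mu_{k+j} = m_k for k = 0..d
M = np.array([[mu[k + j] for j in range(d + 1)] for k in range(d + 1)])
c = np.linalg.solve(M, m)

f_hat = lambda t: norm.pdf(t) * np.polyval(c[::-1], t)       # base density x polynomial
print("integrates to ~1:", round(quad(f_hat, -6, 8)[0], 4))
```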

Keywords: density estimation, log-density, polynomial adjustments, sample moments

Procedia PDF Downloads 129
3150 A Novel Approach towards Test Case Prioritization Technique

Authors: Kamna Solanki, Yudhvir Singh, Sandeep Dalal

Abstract:

Software testing is a time- and cost-intensive process. A scrutiny of the code and rigorous testing are required to identify and rectify putative bugs. The process of bug identification and its consequent correction is continuous in nature, and often some of the bugs are removed after the software has been launched in the market. This process of code validation of the altered software during the maintenance phase is termed regression testing. Regression testing ubiquitously faces resource constraints; therefore, the selection of an appropriate set of test cases from the entire gamut of test cases is a critical issue for regression test planning. This paper presents a novel method for designing a suitable prioritization process to optimize the fault detection rate and performance of regression tests under predefined constraints. The proposed method for test case prioritization, m-ACO, alters the food source selection criteria of natural ants and is basically a modified version of Ant Colony Optimization (ACO). The proposed m-ACO approach has been coded in the Perl language, and the results are validated using three examples by computation of the Average Percentage of Faults Detected (APFD) metric.
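
The APFD metric the validation relies on is straightforward to compute; a sketch (the test suite and fault matrix are hypothetical):

```python
def apfd(order, faults_detected_by):
    """APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where TF_i is the
    position in `order` of the first test that detects fault i."""
    n, m = len(order), len(faults_detected_by)
    position = {t: i + 1 for i, t in enumerate(order)}
    tf = [min(position[t] for t in tests) for tests in faults_detected_by]
    return 1 - sum(tf) / (n * m) + 1 / (2 * n)

# fault -> set of tests that expose it (hypothetical suite of 5 tests, 4 faults)
faults = [{"t3"}, {"t1", "t4"}, {"t2"}, {"t3", "t5"}]
print(apfd(["t1", "t2", "t3", "t4", "t5"], faults))   # unprioritized: 0.65
print(apfd(["t3", "t2", "t1", "t4", "t5"], faults))   # prioritized: 0.75, faults found sooner
```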

Keywords: regression testing, software testing, test case prioritization, test suite optimization

Procedia PDF Downloads 304
3149 Prediction of the Thermodynamic Properties of Hydrocarbons Using Gaussian Process Regression

Authors: N. Alhazmi

Abstract:

Knowing the thermodynamic properties of hydrocarbons is vital when it comes to analyzing the related chemical reaction outcomes and understanding the reaction process, especially in terms of petrochemical industrial applications, combustion, and catalytic reactions. However, measuring thermodynamic properties experimentally is time-consuming and costly. In this paper, Gaussian process regression (GPR) has been used to directly predict the main thermodynamic properties (standard enthalpy of formation, standard entropy, and heat capacity) for more than 360 cyclic and non-cyclic alkanes, alkenes, and alkynes. A simple workflow has been proposed that can be applied to directly predict the main properties of any hydrocarbon by knowing its descriptors and chemical structure, and that can be generalized to predict the main properties of any material. The model was evaluated by calculating the R² statistic, which was more than 0.9794 for all the predicted properties.
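
A sketch of the GPR workflow with scikit-learn (the four descriptors and the enthalpy response are synthetic placeholders):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(16)
# hypothetical descriptors: carbon count, hydrogen count, ring flag, unsaturation count
X = rng.integers(1, 12, size=(360, 4)).astype(float)
h_form = -20.0 * X[:, 0] + 5.0 * X[:, 3] + rng.normal(0, 2, 360)  # synthetic kJ/mol

gpr = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True)
gpr.fit(X[:300], h_form[:300])                   # train on most molecules
print("R^2 on held-out hydrocarbons:", round(gpr.score(X[300:], h_form[300:]), 4))
```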

Keywords: thermodynamic, Gaussian process regression, hydrocarbons, regression, supervised learning, entropy, enthalpy, heat capacity

Procedia PDF Downloads 188
3148 A Biometric Template Security Approach to Fingerprints Based on Polynomial Transformations

Authors: Ramon Santana

Abstract:

The use of biometric identifiers in the field of information security, access control to resources, and authentication in ATMs and banking, among others, is of great concern because of the safety of the biometric data. Eight vulnerabilities have been detected in the general architecture of a biometric system; six of them allow obtaining the minutiae template in plain text. The main consequence of obtaining minutiae templates is the loss of the biometric identifier for life. To mitigate these vulnerabilities, several models to protect minutiae templates have been proposed. Still, vulnerabilities in the cryptographic security of these models allow biometric data to be obtained in plain text. In order to increase the cryptographic security and the ease of reversibility, a minutiae template protection model is proposed. The model aims to provide cryptographic protection and facilitate the reversibility of data using two levels of security. The first level of security is the data transformation level. This level generates data invariant to rotation and translation; the further transformation is irreversible. The second level of security is the evaluation level, where the encryption key is generated and the data are evaluated using a defined evaluation function. The model is aimed at mitigating known vulnerabilities of the previously proposed models, basing its security on the impossibility of polynomial reconstruction.

Keywords: fingerprint, template protection, bio-cryptography, minutiae protection

Procedia PDF Downloads 139
3147 Solving Single Machine Total Weighted Tardiness Problem Using Gaussian Process Regression

Authors: Wanatchapong Kongkaew

Abstract:

This paper proposes an application of a probabilistic technique, namely Gaussian process regression, for estimating an optimal sequence for the single machine total weighted tardiness (SMTWT) scheduling problem. In this work, the Gaussian process regression (GPR) model is utilized to predict an optimal sequence for the SMTWT problem, and its solution is improved by using an iterated local search based on a simulated annealing scheme, called the GPRISA algorithm. The results show that the proposed GPRISA method achieves very good performance and a reasonable trade-off between solution quality and time consumption. Moreover, in terms of deviation from the best-known solution, the proposed mechanism noticeably outperforms recent existing approaches.
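
A sketch of the objective and the simulated-annealing improvement step (the GPR-predicted starting sequence is replaced by the identity permutation here; the job data are made up):

```python
import math
import random

jobs = [(3, 2, 6), (2, 1, 4), (4, 3, 5), (1, 2, 3)]   # (processing time, weight, due date)

def twt(seq):
    """Total weighted tardiness of a job sequence on a single machine."""
    t = total = 0
    for j in seq:
        p, w, d = jobs[j]
        t += p
        total += w * max(0, t - d)
    return total

random.seed(0)
seq, temp = list(range(len(jobs))), 10.0   # identity start; GPR would supply this
while temp > 0.01:
    i = random.randrange(len(seq) - 1)
    cand = seq[:]
    cand[i], cand[i + 1] = cand[i + 1], cand[i]       # adjacent-swap neighbour
    delta = twt(cand) - twt(seq)
    if delta < 0 or random.random() < math.exp(-delta / temp):
        seq = cand                                    # accept improving/uphill move
    temp *= 0.95                                      # geometric cooling schedule
print("sequence:", seq, "total weighted tardiness:", twt(seq))
```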

Keywords: Gaussian process regression, iterated local search, simulated annealing, single machine total weighted tardiness

Procedia PDF Downloads 279
3146 The Profit Trend of Cosmetics Products Using Bootstrap Edgeworth Approximation

Authors: Edlira Donefski, Lorenc Ekonomi, Tina Donefski

Abstract:

Edgeworth approximation is one of the most important statistical methods, with a considerable contribution to reducing the standard deviations of the independent variables' coefficients in a quantile regression model. This model estimates the conditional median or other quantiles. In this paper, we apply approximating statistical methods to an economic problem. We have created and generated a quantile regression model to see how the profit gained is connected with the realized sales of cosmetic products, using real data taken from a local business. The linear regression of the generated profit on the realized sales was not free of autocorrelation and heteroscedasticity, which is the reason we used this model instead of linear regression. Our aim is to analyze in more detail the relation between the variables under study, the profit and the realized sales, and how to minimize the standard errors of the independent variable involved in this study, the level of realized sales. The statistical methods that we apply in our work are the Edgeworth approximation for independent and identically distributed (IID) cases, the bootstrap version of the model, and the Edgeworth approximation for the bootstrap quantile regression model. The graphics and the results presented here identify the best approximating model of our study.
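
A sketch of the quantile regression fit with a plain bootstrap of the slope, using statsmodels' QuantReg (the Edgeworth refinement itself is not reproduced, and the profit/sales data are synthetic):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(17)
n = 200
sales = rng.uniform(100, 1000, n)
profit = 0.2 * sales + rng.standard_t(3, n) * 0.05 * sales   # heteroscedastic errors

df = pd.DataFrame({"profit": profit, "sales": sales})
median_fit = smf.quantreg("profit ~ sales", df).fit(q=0.5)   # conditional median model

# plain bootstrap of the slope; the paper's Edgeworth machinery refines this step
slopes = [smf.quantreg("profit ~ sales", df.sample(n, replace=True)).fit(q=0.5)
          .params["sales"] for _ in range(200)]
print("median slope:", round(median_fit.params["sales"], 4),
      " bootstrap SE:", round(float(np.std(slopes)), 4))
```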

Keywords: bootstrap, edgeworth approximation, IID, quantile

Procedia PDF Downloads 127