Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 123

Search results for: non-parametric bootstrapping

123 Handling Missing Data by Using Expectation-Maximization and Expectation-Maximization with Bootstrapping for Linear Functional Relationship Model

Authors: Adilah Abdul Ghapor, Yong Zulina Zubairi, A. H. M. R. Imon

Abstract:

Missing value problem is common in statistics and has been of interest for years. This article considers two modern techniques in handling missing data for linear functional relationship model (LFRM) namely the Expectation-Maximization (EM) algorithm and Expectation-Maximization with Bootstrapping (EMB) algorithm using three performance indicators; namely the mean absolute error (MAE), root mean square error (RMSE) and estimated biased (EB). In this study, we applied the methods of imputing missing values in two types of LFRM namely the full model of LFRM and in LFRM when the slope is estimated using a nonparametric method. Results of the simulation study suggest that EMB algorithm performs much better than EM algorithm in both models. We also illustrate the applicability of the approach in a real data set.

Keywords: expectation-maximization, expectation-maximization with bootstrapping, linear functional relationship model, performance indicators

Procedia PDF Downloads 429

122 A Comparative Study of Additive and Nonparametric Regression Estimators and Variable Selection Procedures

Authors: Adriano Z. Zambom, Preethi Ravikumar

Abstract:

One of the biggest challenges in nonparametric regression is the curse of dimensionality. Additive models are known to overcome this problem by estimating only the individual additive effects of each covariate. However, if the model is misspecified, the accuracy of the estimator compared to the fully nonparametric one is unknown. In this work the efficiency of completely nonparametric regression estimators such as the Loess is compared to the estimators that assume additivity in several situations, including additive and non-additive regression scenarios. The comparison is done by computing the oracle mean square error of the estimators with regards to the true nonparametric regression function. Then, a backward elimination selection procedure based on the Akaike Information Criteria is proposed, which is computed from either the additive or the nonparametric model. Simulations show that if the additive model is misspecified, the percentage of time it fails to select important variables can be higher than that of the fully nonparametric approach. A dimension reduction step is included when nonparametric estimator cannot be computed due to the curse of dimensionality. Finally, the Boston housing dataset is analyzed using the proposed backward elimination procedure and the selected variables are identified.

Keywords: additive model, nonparametric regression, variable selection, Akaike Information Criteria

Procedia PDF Downloads 238

121 A Comparison of Smoothing Spline Method and Penalized Spline Regression Method Based on Nonparametric Regression Model

Authors: Autcha Araveeporn

Abstract:

This paper presents a study about a nonparametric regression model consisting of a smoothing spline method and a penalized spline regression method. We also compare the techniques used for estimation and prediction of nonparametric regression model. We tried both methods with crude oil prices in dollars per barrel and the Stock Exchange of Thailand (SET) index. According to the results, it is concluded that smoothing spline method performs better than that of penalized spline regression method.

Keywords: nonparametric regression model, penalized spline regression method, smoothing spline method, Stock Exchange of Thailand (SET)

Procedia PDF Downloads 394

120 Distribution-Free Exponentially Weighted Moving Average Control Charts for Monitoring Process Variability

Authors: Chen-Fang Tsai, Shin-Li Lu

Abstract:

Distribution-free control chart is an oncoming area from the statistical process control charts in recent years. Some researchers have developed various nonparametric control charts and investigated the detection capability of these charts. The major advantage of nonparametric control charts is that the underlying process is not specifically considered the assumption of normality or any parametric distribution. In this paper, two nonparametric exponentially weighted moving average (EWMA) control charts based on nonparametric tests, namely NE-S and NE-M control charts, are proposed for monitoring process variability. Generally, weighted moving average (GWMA) control charts are extended by utilizing design and adjustment parameters for monitoring the changes in the process variability, namely NG-S and NG-M control charts. Statistical performance is also investigated on NG-S and NG-M control charts with run rules. Moreover, sensitivity analysis is performed to show the effects of design parameters under the nonparametric NG-S and NG-M control charts.

Keywords: Distribution-free control chart, EWMA control charts, GWMA control charts

Procedia PDF Downloads 229

119 Nonparametric Truncated Spline Regression Model on the Data of Human Development Index in Indonesia

Authors: Kornelius Ronald Demu, Dewi Retno Sari Saputro, Purnami Widyaningsih

Abstract:

Human Development Index (HDI) is a standard measurement for a country's human development. Several factors may have influenced it, such as life expectancy, gross domestic product (GDP) based on the province's annual expenditure, the number of poor people, and the percentage of an illiterate people. The scatter plot between HDI and the influenced factors show that the plot does not follow a specific pattern or form. Therefore, the HDI's data in Indonesia can be applied with a nonparametric regression model. The estimation of the regression curve in the nonparametric regression model is flexible because it follows the shape of the data pattern. One of the nonparametric regression's method is a truncated spline. Truncated spline regression is one of the nonparametric approach, which is a modification of the segmented polynomial functions. The estimator of a truncated spline regression model was affected by the selection of the optimal knots point. Knot points is a focus point of spline truncated functions. The optimal knots point was determined by the minimum value of generalized cross validation (GCV). In this article were applied the data of Human Development Index with a truncated spline nonparametric regression model. The results of this research were obtained the best-truncated spline regression model to the HDI's data in Indonesia with the combination of optimal knots point 5-5-5-4. Life expectancy and the percentage of an illiterate people were the significant factors depend to the HDI in Indonesia. The coefficient of determination is 94.54%. This means the regression model is good enough to applied on the data of HDI in Indonesia.

Keywords: generalized cross validation (GCV), Human Development Index (HDI), knots point, nonparametric regression, truncated spline

Procedia PDF Downloads 302

118 A Bathtub Curve from Nonparametric Model

Authors: Eduardo C. Guardia, Jose W. M. Lima, Afonso H. M. Santos

Abstract:

This paper presents a nonparametric method to obtain the hazard rate “Bathtub curve” for power system components. The model is a mixture of the three known phases of a component life, the decreasing failure rate (DFR), the constant failure rate (CFR) and the increasing failure rate (IFR) represented by three parametric Weibull models. The parameters are obtained from a simultaneous fitting process of the model to the Kernel nonparametric hazard rate curve. From the Weibull parameters and failure rate curves the useful lifetime and the characteristic lifetime were defined. To demonstrate the model the historic time-to-failure of distribution transformers were used as an example. The resulted “Bathtub curve” shows the failure rate for the equipment lifetime which can be applied in economic and replacement decision models.

Keywords: bathtub curve, failure analysis, lifetime estimation, parameter estimation, Weibull distribution

Procedia PDF Downloads 411

117 Nonparametric Specification Testing for the Drift of the Short Rate Diffusion Process Using a Panel of Yields

Authors: John Knight, Fuchun Li, Yan Xu

Abstract:

Based on a new method of the nonparametric estimator of the drift function, we propose a consistent test for the parametric specification of the drift function in the short rate diffusion process using observations from a panel of yields. The test statistic is shown to follow an asymptotic normal distribution under the null hypothesis that the parametric drift function is correctly specified, and converges to infinity under the alternative. Taking the daily 7-day European rates as a proxy of the short rate, we use our test to examine whether the drift of the short rate diffusion process is linear or nonlinear, which is an unresolved important issue in the short rate modeling literature. The testing results indicate that none of the drift functions in this literature adequately captures the dynamics of the drift, but nonlinear specification performs better than the linear specification.

Keywords: diffusion process, nonparametric estimation, derivative security price, drift function and volatility function

Procedia PDF Downloads 338

116 Application of Nonparametric Geographically Weighted Regression to Evaluate the Unemployment Rate in East Java

Authors: Sifriyani Sifriyani, I Nyoman Budiantara, Sri Haryatmi, Gunardi Gunardi

Abstract:

East Java Province has a first rank as a province that has the most counties and cities in Indonesia and has the largest population. In 2015, the population reached 38.847.561 million, this figure showed a very high population growth. High population growth is feared to lead to increase the levels of unemployment. In this study, the researchers mapped and modeled the unemployment rate with 6 variables that were supposed to influence. Modeling was done by nonparametric geographically weighted regression methods with truncated spline approach. This method was chosen because spline method is a flexible method, these models tend to look for its own estimation. In this modeling, there were point knots, the point that showed the changes of data. The selection of the optimum point knots was done by selecting the most minimun value of Generalized Cross Validation (GCV). Based on the research, 6 variables were declared to affect the level of unemployment in eastern Java. They were the percentage of population that is educated above high school, the rate of economic growth, the population density, the investment ratio of total labor force, the regional minimum wage and the ratio of the number of big industry and medium scale industry from the work force. The nonparametric geographically weighted regression models with truncated spline approach had a coefficient of determination 98.95% and the value of MSE equal to 0.0047.

Keywords: East Java, nonparametric geographically weighted regression, spatial, spline approach, unemployed rate

Procedia PDF Downloads 286

115 Median-Based Nonparametric Estimation of Returns in Mean-Downside Risk Portfolio Frontier

Authors: H. Ben Salah, A. Gannoun, C. de Peretti, A. Trabelsi

Abstract:

The Downside Risk (DSR) model for portfolio optimisation allows to overcome the drawbacks of the classical mean-variance model concerning the asymetry of returns and the risk perception of investors. This model optimization deals with a positive definite matrix that is endogenous with respect to portfolio weights. This aspect makes the problem far more difficult to handle. For this purpose, Athayde (2001) developped a new recurcive minimization procedure that ensures the convergence to the solution. However, when a finite number of observations is available, the portfolio frontier presents an appearance which is not very smooth. In order to overcome that, Athayde (2003) proposed a mean kernel estimation of the returns, so as to create a smoother portfolio frontier. This technique provides an effect similar to the case in which we had continuous observations. In this paper, taking advantage on the the robustness of the median, we replace the mean estimator in Athayde's model by a nonparametric median estimator of the returns. Then, we give a new version of the former algorithm (of Athayde (2001, 2003)). We eventually analyse the properties of this improved portfolio frontier and apply this new method on real examples.

Keywords: Downside Risk, Kernel Method, Median, Nonparametric Estimation, Semivariance

Procedia PDF Downloads 455

114 Adaptive Nonparametric Approach for Guaranteed Real-Time Detection of Targeted Signals in Multichannel Monitoring Systems

Authors: Andrey V. Timofeev

Abstract:

An adaptive nonparametric method is proposed for stable real-time detection of seismoacoustic sources in multichannel C-OTDR systems with a significant number of channels. This method guarantees given upper boundaries for probabilities of Type I and Type II errors. Properties of the proposed method are rigorously proved. The results of practical applications of the proposed method in a real C-OTDR-system are presented in this report.

Keywords: guaranteed detection, multichannel monitoring systems, change point, interval estimation, adaptive detection

Procedia PDF Downloads 419

113 A Brief Study about Nonparametric Adherence Tests

Authors: Vinicius R. Domingues, Luan C. S. M. Ozelim

Abstract:

The statistical study has become indispensable for various fields of knowledge. Not any different, in Geotechnics the study of probabilistic and statistical methods has gained power considering its use in characterizing the uncertainties inherent in soil properties. One of the situations where engineers are constantly faced is the definition of a probability distribution that represents significantly the sampled data. To be able to discard bad distributions, goodness-of-fit tests are necessary. In this paper, three non-parametric goodness-of-fit tests are applied to a data set computationally generated to test the goodness-of-fit of them to a series of known distributions. It is shown that the use of normal distribution does not always provide satisfactory results regarding physical and behavioral representation of the modeled parameters.

Keywords: Kolmogorov-Smirnov test, Anderson-Darling test, Cramer-Von-Mises test, nonparametric adherence tests

Procedia PDF Downloads 412

112 Modern Imputation Technique for Missing Data in Linear Functional Relationship Model

Authors: Adilah Abdul Ghapor, Yong Zulina Zubairi, Rahmatullah Imon

Abstract:

Missing value problem is common in statistics and has been of interest for years. This article considers two modern techniques in handling missing data for linear functional relationship model (LFRM) namely the Expectation-Maximization (EM) algorithm and Expectation-Maximization with Bootstrapping (EMB) algorithm using three performance indicators; namely the mean absolute error (MAE), root mean square error (RMSE) and estimated biased (EB). In this study, we applied the methods of imputing missing values in the LFRM. Results of the simulation study suggest that EMB algorithm performs much better than EM algorithm in both models. We also illustrate the applicability of the approach in a real data set.

Keywords: expectation-maximization, expectation-maximization with bootstrapping, linear functional relationship model, performance indicators

Procedia PDF Downloads 362

111 Orthogonal Regression for Nonparametric Estimation of Errors-In-Variables Models

Authors: Anastasiia Yu. Timofeeva

Abstract:

Two new algorithms for nonparametric estimation of errors-in-variables models are proposed. The first algorithm is based on penalized regression spline. The spline is represented as a piecewise-linear function and for each linear portion orthogonal regression is estimated. This algorithm is iterative. The second algorithm involves locally weighted regression estimation. When the independent variable is measured with error such estimation is a complex nonlinear optimization problem. The simulation results have shown the advantage of the second algorithm under the assumption that true smoothing parameters values are known. Nevertheless the use of some indexes of fit to smoothing parameters selection gives the similar results and has an oversmoothing effect.

Keywords: grade point average, orthogonal regression, penalized regression spline, locally weighted regression

Procedia PDF Downloads 381

110 Quantile Smoothing Splines: Application on Productivity of Enterprises

Authors: Semra Turkan

Abstract:

In this paper, we have examined the factors that affect the productivity of Turkey’s Top 500 Industrial Enterprises in 2014. The labor productivity of enterprises is taken as an indicator of productivity of industrial enterprises. When the relationships between some financial ratios and labor productivity, it is seen that there is a nonparametric relationship between labor productivity and return on sales. In addition, the distribution of labor productivity of enterprises is right-skewed. If the dependent distribution is skewed, the quantile regression is more suitable for this data. Hence, the nonparametric relationship between labor productivity and return on sales by quantile smoothing splines.

Keywords: quantile regression, smoothing spline, labor productivity, financial ratios

Procedia PDF Downloads 261

109 Two-Phase Sampling for Estimating a Finite Population Total in Presence of Missing Values

Authors: Daniel Fundi Murithi

Abstract:

Missing data is a real bane in many surveys. To overcome the problems caused by missing data, partial deletion, and single imputation methods, among others, have been proposed. However, problems such as discarding usable data and inaccuracy in reproducing known population parameters and standard errors are associated with them. For regression and stochastic imputation, it is assumed that there is a variable with complete cases to be used as a predictor in estimating missing values in the other variable, and the relationship between the two variables is linear, which might not be realistic in practice. In this project, we estimate population total in presence of missing values in two-phase sampling. Instead of regression or stochastic models, non-parametric model based regression model is used in imputing missing values. Empirical study showed that nonparametric model-based regression imputation is better in reproducing variance of population total estimate obtained when there were no missing values compared to mean, median, regression, and stochastic imputation methods. Although regression and stochastic imputation were better than nonparametric model-based imputation in reproducing population total estimates obtained when there were no missing values in one of the sample sizes considered, nonparametric model-based imputation may be used when the relationship between outcome and predictor variables is not linear.

Keywords: finite population total, missing data, model-based imputation, two-phase sampling

Procedia PDF Downloads 100

108 Nonparametric Estimation of Risk-Neutral Densities via Empirical Esscher Transform

Authors: Manoel Pereira, Alvaro Veiga, Camila Epprecht, Renato Costa

Abstract:

This paper introduces an empirical version of the Esscher transform for risk-neutral option pricing. Traditional parametric methods require the formulation of an explicit risk-neutral model and are operational only for a few probability distributions for the returns of the underlying. In our proposal, we make only mild assumptions on the pricing kernel and there is no need for the formulation of the risk-neutral model for the returns. First, we simulate sample paths for the returns under the physical distribution. Then, based on the empirical Esscher transform, the sample is reweighted, giving rise to a risk-neutralized sample from which derivative prices can be obtained by a weighted sum of the options pay-offs in each path. We compare our proposal with some traditional parametric pricing methods in four experiments with artificial and real data.

Keywords: esscher transform, generalized autoregressive Conditional Heteroscedastic (GARCH), nonparametric option pricing

Procedia PDF Downloads 456

107 Development of Generalized Correlation for Liquid Thermal Conductivity of N-Alkane and Olefin

Authors: A. Ishag Mohamed, A. A. Rabah

Abstract:

The objective of this research is to develop a generalized correlation for the prediction of thermal conductivity of n-Alkanes and Alkenes. There is a minority of research and lack of correlation for thermal conductivity of liquids in the open literature. The available experimental data are collected covering the groups of n-Alkanes and Alkenes.The data were assumed to correlate to temperature using Filippov correlation. Nonparametric regression of Grace Algorithm was used to develop the generalized correlation model. A spread sheet program based on Microsoft Excel was used to plot and calculate the value of the coefficients. The results obtained were compared with the data that found in Perry's Chemical Engineering Hand Book. The experimental data correlated to the temperature ranged "between" 273.15 to 673.15 K, with R2 = 0.99.The developed correlation reproduced experimental data that which were not included in regression with absolute average percent deviation (AAPD) of less than 7 %. Thus the spread sheet was quite accurate which produces reliable data.

Keywords: N-Alkanes, N-Alkenes, nonparametric, regression

Procedia PDF Downloads 622

106 Spatial Rank-Based High-Dimensional Monitoring through Random Projection

Authors: Chen Zhang, Nan Chen

Abstract:

High-dimensional process monitoring becomes increasingly important in many application domains, where usually the process distribution is unknown and much more complicated than the normal distribution, and the between-stream correlation can not be neglected. However, since the process dimension is generally much bigger than the reference sample size, most traditional nonparametric multivariate control charts fail in high-dimensional cases due to the curse of dimensionality. Furthermore, when the process goes out of control, the influenced variables are quite sparse compared with the whole dimension, which increases the detection difficulty. Targeting at these issues, this paper proposes a new nonparametric monitoring scheme for high-dimensional processes. This scheme first projects the high-dimensional process into several subprocesses using random projections for dimension reduction. Then, for every subprocess with the dimension much smaller than the reference sample size, a local nonparametric control chart is constructed based on the spatial rank test to detect changes in this subprocess. Finally, the results of all the local charts are fused together for decision. Furthermore, after an out-of-control (OC) alarm is triggered, a diagnostic framework is proposed. using the square-root LASSO. Numerical studies demonstrate that the chart has satisfactory detection power for sparse OC changes and robust performance for non-normally distributed data, The diagnostic framework is also effective to identify truly changed variables. Finally, a real-data example is presented to demonstrate the application of the proposed method.

Keywords: random projection, high-dimensional process control, spatial rank, sequential change detection

Procedia PDF Downloads 273

105 Nonparametric Quantile Regression for Multivariate Spatial Data

Authors: S. H. Arnaud Kanga, O. Hili, S. Dabo-Niang

Abstract:

Spatial prediction is an issue appealing and attracting several fields such as agriculture, environmental sciences, ecology, econometrics, and many others. Although multiple non-parametric prediction methods exist for spatial data, those are based on the conditional expectation. This paper took a different approach by examining a non-parametric spatial predictor of the conditional quantile. The study especially observes the stationary multidimensional spatial process over a rectangular domain. Indeed, the proposed quantile is obtained by inverting the conditional distribution function. Furthermore, the proposed estimator of the conditional distribution function depends on three kernels, where one of them controls the distance between spatial locations, while the other two control the distance between observations. In addition, the almost complete convergence and the convergence in mean order q of the kernel predictor are obtained when the sample considered is alpha-mixing. Such approach of the prediction method gives the advantage of accuracy as it overcomes sensitivity to extreme and outliers values.

Keywords: conditional quantile, kernel, nonparametric, stationary

Procedia PDF Downloads 120

104 The Classification Performance in Parametric and Nonparametric Discriminant Analysis for a Class- Unbalanced Data of Diabetes Risk Groups

Authors: Lily Ingsrisawang, Tasanee Nacharoen

Abstract:

Introduction: The problems of unbalanced data sets generally appear in real world applications. Due to unequal class distribution, many research papers found that the performance of existing classifier tends to be biased towards the majority class. The k -nearest neighbors’ nonparametric discriminant analysis is one method that was proposed for classifying unbalanced classes with good performance. Hence, the methods of discriminant analysis are of interest to us in investigating misclassification error rates for class-imbalanced data of three diabetes risk groups. Objective: The purpose of this study was to compare the classification performance between parametric discriminant analysis and nonparametric discriminant analysis in a three-class classification application of class-imbalanced data of diabetes risk groups. Methods: Data from a healthy project for 599 staffs in a government hospital in Bangkok were obtained for the classification problem. The staffs were diagnosed into one of three diabetes risk groups: non-risk (90%), risk (5%), and diabetic (5%). The original data along with the variables; diabetes risk group, age, gender, cholesterol, and BMI was analyzed and bootstrapped up to 50 and 100 samples, 599 observations per sample, for additional estimation of misclassification error rate. Each data set was explored for the departure of multivariate normality and the equality of covariance matrices of the three risk groups. Both the original data and the bootstrap samples show non-normality and unequal covariance matrices. The parametric linear discriminant function, quadratic discriminant function, and the nonparametric k-nearest neighbors’ discriminant function were performed over 50 and 100 bootstrap samples and applied to the original data. In finding the optimal classification rule, the choices of prior probabilities were set up for both equal proportions (0.33: 0.33: 0.33) and unequal proportions with three choices of (0.90:0.05:0.05), (0.80: 0.10: 0.10) or (0.70, 0.15, 0.15). Results: The results from 50 and 100 bootstrap samples indicated that the k-nearest neighbors approach when k = 3 or k = 4 and the prior probabilities of {non-risk:risk:diabetic} as {0.90:0.05:0.05} or {0.80:0.10:0.10} gave the smallest error rate of misclassification. Conclusion: The k-nearest neighbors approach would be suggested for classifying a three-class-imbalanced data of diabetes risk groups.

Keywords: error rate, bootstrap, diabetes risk groups, k-nearest neighbors

Procedia PDF Downloads 406

103 The Sequential Estimation of the Seismoacoustic Source Energy in C-OTDR Monitoring Systems

Authors: Andrey V. Timofeev, Dmitry V. Egorov

Abstract:

The practical efficient approach is suggested for estimation of the seismoacoustic sources energy in C-OTDR monitoring systems. This approach represents the sequential plan for confidence estimation both the seismoacoustic sources energy, as well the absorption coefficient of the soil. The sequential plan delivers the non-asymptotic guaranteed accuracy of obtained estimates in the form of non-asymptotic confidence regions with prescribed sizes. These confidence regions are valid for a finite sample size when the distributions of the observations are unknown. Thus, suggested estimates are non-asymptotic and nonparametric, and also these estimates guarantee the prescribed estimation accuracy in the form of the prior prescribed size of confidence regions, and prescribed confidence coefficient value.

Keywords: nonparametric estimation, sequential confidence estimation, multichannel monitoring systems, C-OTDR-system, non-lineary regression

Procedia PDF Downloads 319

102 Identifying Psychosocial, Autonomic, and Pain Sensitivity Risk Factors of Chronic Temporomandibular Disorder by Using Ridge Logistic Regression and Bootstrapping

Authors: Haolin Li, Eric Bair, Jane Monaco, Quefeng Li

Abstract:

The temporomandibular disorder (TMD) is a series of musculoskeletal disorders ranging from jaw pain to chronic debilitating pain, and the risk factors for the onset and maintenance of TMD are still unclear. Prior researches have shown that the potential risk factors for chronic TMD are related to psychosocial factors, autonomic functions, and pain sensitivity. Using data from the Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA) study’s baseline case-control study, we examine whether the risk factors identified by prior researches are still statistically significant after taking all of the risk measures into account in one single model, and we also compare the relative influences of the risk factors in three different perspectives (psychosocial factors, autonomic functions, and pain sensitivity) on the chronic TMD. The statistical analysis is conducted by using ridge logistic regression and bootstrapping, in which the performance of the algorithms has been assessed using extensive simulation studies. The results support most of the findings of prior researches that there are many psychosocial and pain sensitivity measures that have significant associations with chronic TMD. However, it is surprising that most of the risk factors of autonomic functions have not presented significant associations with chronic TMD, as described by a prior research.

Keywords: autonomic function, OPPERA study, pain sensitivity, psychosocial measures, temporomandibular disorder

Procedia PDF Downloads 148

101 Air Pollution and Respiratory-Related Restricted Activity Days in Tunisia

Authors: Mokhtar Kouki Inès Rekik

Abstract:

This paper focuses on the assessment of the air pollution and morbidity relationship in Tunisia. Air pollution is measured by ozone air concentration and the morbidity is measured by the number of respiratory-related restricted activity days during the 2-week period prior to the interview. Socioeconomic data are also collected in order to adjust for any confounding covariates. Our sample is composed by 407 Tunisian respondents; 44.7% are women, the average age is 35.2, near 69% are living in a house built after the 1980, and 27.8% have reported at least one day of respiratory-related restricted activity. The model consists on the regression of the number of respiratory-related restricted activity days on the air quality measure and the socioeconomic covariates. In order to correct for zero-inflation and heterogeneity, we estimate several models (Poisson, Negative binomial, Zero inflated Poisson, Poisson hurdle, Negative binomial hurdle and finite mixture Poisson models). Bootstrapping and post-stratification techniques are used in order to correct for any sample bias. According to the Akaike information criteria, the hurdle negative binomial model has the greatest goodness of fit. The main result indicates that, after adjusting for socioeconomic data, the ozone concentration increases the probability of positive number of restricted activity days.

Keywords: bootstrapping, hurdle negbin model, overdispersion, ozone concentration, respiratory-related restricted activity days

Procedia PDF Downloads 228

100 Non-Parametric Regression over Its Parametric Couterparts with Large Sample Size

Authors: Jude Opara, Esemokumo Perewarebo Akpos

Abstract:

This paper is on non-parametric linear regression over its parametric counterparts with large sample size. Data set on anthropometric measurement of primary school pupils was taken for the analysis. The study used 50 randomly selected pupils for the study. The set of data was subjected to normality test, and it was discovered that the residuals are not normally distributed (i.e. they do not follow a Gaussian distribution) for the commonly used least squares regression method for fitting an equation into a set of (x,y)-data points using the Anderson-Darling technique. The algorithms for the nonparametric Theil’s regression are stated in this paper as well as its parametric OLS counterpart. The use of a programming language software known as “R Development” was used in this paper. From the analysis, the result showed that there exists a significant relationship between the response and the explanatory variable for both the parametric and non-parametric regression. To know the efficiency of one method over the other, the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) are used, and it is discovered that the nonparametric regression performs better than its parametric regression counterparts due to their lower values in both the AIC and BIC. The study however recommends that future researchers should study a similar work by examining the presence of outliers in the data set, and probably expunge it if detected and re-analyze to compare results.

Keywords: Theil’s regression, Bayesian information criterion, Akaike information criterion, OLS

Procedia PDF Downloads 274

99 Asymmetrical Informative Estimation for Macroeconomic Model: Special Case in the Tourism Sector of Thailand

Authors: Chukiat Chaiboonsri, Satawat Wannapan

Abstract:

This paper used an asymmetric informative concept to apply in the macroeconomic model estimation of the tourism sector in Thailand. The variables used to statistically analyze are Thailand international and domestic tourism revenues, the expenditures of foreign and domestic tourists, service investments by private sectors, service investments by the government of Thailand, Thailand service imports and exports, and net service income transfers. All of data is a time-series index which was observed between 2002 and 2015. Empirically, the tourism multiplier and accelerator were estimated by two statistical approaches. The first was the result of the Generalized Method of Moments model (GMM) based on the assumption which the tourism market in Thailand had perfect information (Symmetrical data). The second was the result of the Maximum Entropy Bootstrapping approach (MEboot) based on the process that attempted to deal with imperfect information and reduced uncertainty in data observations (Asymmetrical data). In addition, the tourism leakages were investigated by a simple model based on the injections and leakages concept. The empirical findings represented the parameters computed from the MEboot approach which is different from the GMM method. However, both of the MEboot estimation and GMM model suggests that Thailand’s tourism sectors are in a period capable of stimulating the economy.

Keywords: TThailand tourism, Maximum Entropy Bootstrapping approach, macroeconomic model, asymmetric information

Procedia PDF Downloads 268

98 A Quinary Coding and Matrix Structure Based Channel Hopping Algorithm for Blind Rendezvous in Cognitive Radio Networks

Authors: Qinglin Liu, Zhiyong Lin, Zongheng Wei, Jianfeng Wen, Congming Yi, Hai Liu

Abstract:

The multi-channel blind rendezvous problem in distributed cognitive radio networks (DCRNs) refers to how users in the network can hop to the same channel at the same time slot without any prior knowledge (i.e., each user is unaware of other users' information). The channel hopping (CH) technique is a typical solution to this blind rendezvous problem. In this paper, we propose a quinary coding and matrix structure-based CH algorithm called QCMS-CH. The QCMS-CH algorithm can guarantee the rendezvous of users using only one cognitive radio in the scenario of the asynchronous clock (i.e., arbitrary time drift between the users), heterogeneous channels (i.e., the available channel sets of users are distinct), and symmetric role (i.e., all users play a same role). The QCMS-CH algorithm first represents a randomly selected channel (denoted by R) as a fixed-length quaternary number. Then it encodes the quaternary number into a quinary bootstrapping sequence according to a carefully designed quaternary-quinary coding table with the prefix "R00". Finally, it builds a CH matrix column by column according to the bootstrapping sequence and six different types of elaborately generated subsequences. The user can access the CH matrix row by row and accordingly perform its channel, hoping to attempt rendezvous with other users. We prove the correctness of QCMS-CH and derive an upper bound on its Maximum Time-to-Rendezvous (MTTR). Simulation results show that the QCMS-CH algorithm outperforms the state-of-the-art in terms of the MTTR and the Expected Time-to-Rendezvous (ETTR).

Keywords: channel hopping, blind rendezvous, cognitive radio networks, quaternary-quinary coding

Procedia PDF Downloads 58

97 Two-Stage Hospital Efficiency Analysis Including Qualitative Evidence: A Greek Case

Authors: Panos Xenos, Milton Nektarios, John Yfantopoulos

Abstract:

Background: Policy makers, professional organizations and payers have introduced a variety of initiatives and reforms for the health systems worldwide, aimed at improving hospital efficiency. Their efforts are concentrated in two main categories: to constrain increasing healthcare costs and to enhance quality of services provided. Research Objectives: This study examines the efficiency of 112 Greek public hospitals for the year 2009, evaluates the importance of bootstrapping techniques and investigates the effect of contextual factors on hospital efficiency. Furthermore, the effect of qualitative evidence, on hospital efficiency is explored using data from 28 large hospitals. Methods: We applied Data Envelopment Analysis, augmented by bootstrapping techniques, to estimate efficiency scores. In order to measure the effect of environmental factors on hospital efficiency we used Tobit regression analysis. The significance of our models is evaluated using statistical tests to compare distributions. Results: The Kolmogorov-Smirnov test between the original and the bootstrap-corrected efficiency indicates that their distributions are significantly different (p-value<0.01). The environmental factors, that seem to influence efficiency, are Occupancy Rating and the ratio between Outpatient Visits and Inpatient Days. Results indicate that the inclusion of the quality variable in DEA modelling generates statistically significant variations in efficiency scores (p-value<0.05). Conclusions: The inclusion of quality variables and the use of bootstrap resampling in efficiency analysis impose a statistically significant effect on the distribution of efficiency scores. As a policy conclusion we highlight the importance of these methods on hospital efficiency analysis and, by implication, on healthcare resource allocation.

Keywords: hospitals, efficiency, quality, data envelopment analysis, Greek public hospital sector

Procedia PDF Downloads 273

96 Using the Bootstrap for Problems Statistics

Authors: Brahim Boukabcha, Amar Rebbouh

Abstract:

The bootstrap method based on the idea of exploiting all the information provided by the initial sample, allows us to study the properties of estimators. In this article we will present a theoretical study on the different methods of bootstrapping and using the technique of re-sampling in statistics inference to calculate the standard error of means of an estimator and determining a confidence interval for an estimated parameter. We apply these methods tested in the regression models and Pareto model, giving the best approximations.

Keywords: bootstrap, error standard, bias, jackknife, mean, median, variance, confidence interval, regression models

Procedia PDF Downloads 351

95 Kernel-Based Double Nearest Proportion Feature Extraction for Hyperspectral Image Classiﬁcation

Authors: Hung-Sheng Lin, Cheng-Hsuan Li

Abstract:

Over the past few years, kernel-based algorithms have been widely used to extend some linear feature extraction methods such as principal component analysis (PCA), linear discriminate analysis (LDA), and nonparametric weighted feature extraction (NWFE) to their nonlinear versions, kernel principal component analysis (KPCA), generalized discriminate analysis (GDA), and kernel nonparametric weighted feature extraction (KNWFE), respectively. These nonlinear feature extraction methods can detect nonlinear directions with the largest nonlinear variance or the largest class separability based on the given kernel function. Moreover, they have been applied to improve the target detection or the image classification of hyperspectral images. The double nearest proportion feature extraction (DNP) can effectively reduce the overlap effect and have good performance in hyperspectral image classification. The DNP structure is an extension of the k-nearest neighbor technique. For each sample, there are two corresponding nearest proportions of samples, the self-class nearest proportion and the other-class nearest proportion. The term “nearest proportion” used here consider both the local information and other more global information. With these settings, the effect of the overlap between the sample distributions can be reduced. Usually, the maximum likelihood estimator and the related unbiased estimator are not ideal estimators in high dimensional inference problems, particularly in small data-size situation. Hence, an improved estimator by shrinkage estimation (regularization) is proposed. Based on the DNP structure, LDA is included as a special case. In this paper, the kernel method is applied to extend DNP to kernel-based DNP (KDNP). In addition to the advantages of DNP, KDNP surpasses DNP in the experimental results. According to the experiments on the real hyperspectral image data sets, the classification performance of KDNP is better than that of PCA, LDA, NWFE, and their kernel versions, KPCA, GDA, and KNWFE.

Keywords: feature extraction, kernel method, double nearest proportion feature extraction, kernel double nearest feature extraction

Procedia PDF Downloads 296

94 Spatiotemporal Variability in Rainfall Trends over Sinai Peninsula Using Nonparametric Methods and Discrete Wavelet Transforms

Authors: Mosaad Khadr

Abstract:

Knowledge of the temporal and spatial variability of rainfall trends has been of great concern for efficient water resource planning, management. In this study annual, seasonal and monthly rainfall trends over the Sinai Peninsula were analyzed by using absolute homogeneity tests, nonparametric Mann–Kendall (MK) test and Sen’s slope estimator methods. The homogeneity of rainfall time-series was examined using four absolute homogeneity tests namely, the Pettitt test, standard normal homogeneity test, Buishand range test, and von Neumann ratio test. Further, the sequential change in the trend of annual and seasonal rainfalls is conducted using sequential MK (SQMK) method. Then the trend analysis based on discrete wavelet transform technique (DWT) in conjunction with SQMK method is performed. The spatial patterns of the detected rainfall trends were investigated using a geostatistical and deterministic spatial interpolation technique. The results achieved from the Mann–Kendall test to the data series (using the 5% significance level) highlighted that rainfall was generally decreasing in January, February, March, November, December, wet season, and annual rainfall. A significant decreasing trend in the winter and annual rainfall with significant levels were inferred based on the Mann-Kendall rank statistics and linear trend. Further, the discrete wavelet transform (DWT) analysis reveal that in general, intra- and inter-annual events (up to 4 years) are more influential in affecting the observed trends. The nature of the trend captured by both methods is similar for all of the cases. On the basis of spatial trend analysis, significant rainfall decreases were also noted in the investigated stations. Overall, significant downward trends in winter and annual rainfall over the Sinai Peninsula was observed during the study period.

Keywords: trend analysis, rainfall, Mann–Kendall test, discrete wavelet transform, Sinai Peninsula

Procedia PDF Downloads 134