Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2579

Search results for: Poisson regression

2579 Model Averaging for Poisson Regression

Authors: Zhou Jianhong

Abstract:

Model averaging is a desirable approach to deal with model uncertainty, which, however, has rarely been explored for Poisson regression. In this paper, we propose a model averaging procedure based on an unbiased estimator of the expected Kullback-Leibler distance for the Poisson regression. Simulation study shows that the proposed model average estimator outperforms some other commonly used model selection and model average estimators in some situations. Our proposed methods are further applied to a real data example and the advantage of this method is demonstrated again.

Keywords: model averaging, poission regression, Kullback-Leibler distance, statistics

Procedia PDF Downloads 393
2578 Analysis of Factors Affecting the Number of Infant and Maternal Mortality in East Java with Geographically Weighted Bivariate Generalized Poisson Regression Method

Authors: Luh Eka Suryani, Purhadi

Abstract:

Poisson regression is a non-linear regression model with response variable in the form of count data that follows Poisson distribution. Modeling for a pair of count data that show high correlation can be analyzed by Poisson Bivariate Regression. Data, the number of infant mortality and maternal mortality, are count data that can be analyzed by Poisson Bivariate Regression. The Poisson regression assumption is an equidispersion where the mean and variance values are equal. However, the actual count data has a variance value which can be greater or less than the mean value (overdispersion and underdispersion). Violations of this assumption can be overcome by applying Generalized Poisson Regression. Characteristics of each regency can affect the number of cases occurred. This issue can be overcome by spatial analysis called geographically weighted regression. This study analyzes the number of infant mortality and maternal mortality based on conditions in East Java in 2016 using Geographically Weighted Bivariate Generalized Poisson Regression (GWBGPR) method. Modeling is done with adaptive bisquare Kernel weighting which produces 3 regency groups based on infant mortality rate and 5 regency groups based on maternal mortality rate. Variables that significantly influence the number of infant and maternal mortality are the percentages of pregnant women visit health workers at least 4 times during pregnancy, pregnant women get Fe3 tablets, obstetric complication handled, clean household and healthy behavior, and married women with the first marriage age under 18 years.

Keywords: adaptive bisquare kernel, GWBGPR, infant mortality, maternal mortality, overdispersion

Procedia PDF Downloads 66
2577 Regression for Doubly Inflated Multivariate Poisson Distributions

Authors: Ishapathik Das, Sumen Sen, N. Rao Chaganty, Pooja Sengupta

Abstract:

Dependent multivariate count data occur in several research studies. These data can be modeled by a multivariate Poisson or Negative binomial distribution constructed using copulas. However, when some of the counts are inflated, that is, the number of observations in some cells are much larger than other cells, then the copula based multivariate Poisson (or Negative binomial) distribution may not fit well and it is not an appropriate statistical model for the data. There is a need to modify or adjust the multivariate distribution to account for the inflated frequencies. In this article, we consider the situation where the frequencies of two cells are higher compared to the other cells, and develop a doubly inflated multivariate Poisson distribution function using multivariate Gaussian copula. We also discuss procedures for regression on covariates for the doubly inflated multivariate count data. For illustrating the proposed methodologies, we present a real data containing bivariate count observations with inflations in two cells. Several models and linear predictors with log link functions are considered, and we discuss maximum likelihood estimation to estimate unknown parameters of the models.

Keywords: copula, Gaussian copula, multivariate distributions, inflated distributios

Procedia PDF Downloads 63
2576 A Survey on Quasi-Likelihood Estimation Approaches for Longitudinal Set-ups

Authors: Naushad Mamode Khan

Abstract:

The Com-Poisson (CMP) model is one of the most popular discrete generalized linear models (GLMS) that handles both equi-, over- and under-dispersed data. In longitudinal context, an integer-valued autoregressive (INAR(1)) process that incorporates covariate specification has been developed to model longitudinal CMP counts. However, the joint likelihood CMP function is difficult to specify and thus restricts the likelihood based estimating methodology. The joint generalized quasilikelihood approach (GQL-I) was instead considered but is rather computationally intensive and may not even estimate the regression effects due to a complex and frequently ill conditioned covariance structure. This paper proposes a new GQL approach for estimating the regression parameters (GQLIII) that are based on a single score vector representation. The performance of GQL-III is compared with GQL-I and separate marginal GQLs (GQL-II) through some simulation experiments and is proved to yield equally efficient estimates as GQL-I and is far more computationally stable.

Keywords: longitudinal, com-Poisson, ill-conditioned, INAR(1), GLMS, GQL

Procedia PDF Downloads 281
2575 Integrated Nested Laplace Approximations For Quantile Regression

Authors: Kajingulu Malandala, Ranganai Edmore

Abstract:

The asymmetric Laplace distribution (ADL) is commonly used as the likelihood function of the Bayesian quantile regression, and it offers different families of likelihood method for quantile regression. Notwithstanding their popularity and practicality, ADL is not smooth and thus making it difficult to maximize its likelihood. Furthermore, Bayesian inference is time consuming and the selection of likelihood may mislead the inference, as the Bayes theorem does not automatically establish the posterior inference. Furthermore, ADL does not account for greater skewness and Kurtosis. This paper develops a new aspect of quantile regression approach for count data based on inverse of the cumulative density function of the Poisson, binomial and Delaporte distributions using the integrated nested Laplace Approximations. Our result validates the benefit of using the integrated nested Laplace Approximations and support the approach for count data.

Keywords: quantile regression, Delaporte distribution, count data, integrated nested Laplace approximation

Procedia PDF Downloads 69
2574 Effect of the Poisson’s Ratio on the Behavior of Epoxy Microbeam

Authors: Mohammad Tahmasebipour, Hosein Salarpour

Abstract:

Researchers suggest that variations in Poisson’s ratio affect the behavior of Timoshenko micro beam. Therefore, in this study, two epoxy Timoshenko micro beams with different dimensions were modeled using the finite element method considering all boundary conditions and initial conditions that govern the problem. The effect of Poisson’s ratio on the resonant frequency, maximum deflection, and maximum rotation of the micro beams was examined. The analyses suggest that an increased Poisson’s ratio reduces the maximum rotation and the maximum rotation and increases the resonant frequency. Results were consistent with those obtained using the couple stress, classical, and strain gradient elasticity theories.

Keywords: microbeam, microsensor, epoxy, poisson’s ratio, dynamic behavior, static behavior, finite element method

Procedia PDF Downloads 379
2573 Count Regression Modelling on Number of Migrants in Households

Authors: Tsedeke Lambore Gemecho, Ayele Taye Goshu

Abstract:

The main objective of this study is to identify the determinants of the number of international migrants in a household and to compare regression models for count response. This study is done by collecting data from total of 2288 household heads of 16 randomly sampled districts in Hadiya and Kembata-Tembaro zones of Southern Ethiopia. The Poisson mixed models, as special cases of the generalized linear mixed model, is explored to determine effects of the predictors: age of household head, farm land size, and household size. Two ethnicities Hadiya and Kembata are included in the final model as dummy variables. Stepwise variable selection has indentified four predictors: age of head, farm land size, family size and dummy variable ethnic2 (0=other, 1=Kembata). These predictors are significant at 5% significance level with count response number of migrant. The Poisson mixed model consisting of the four predictors with random effects districts. Area specific random effects are significant with the variance of about 0.5105 and standard deviation of 0.7145. The results show that the number of migrant increases with heads age, family size, and farm land size. In conclusion, there is a significantly high number of international migration per household in the area. Age of household head, family size, and farm land size are determinants that increase the number of international migrant in households. Community-based intervention is needed so as to monitor and regulate the international migration for the benefits of the society.

Keywords: Poisson regression, GLM, number of migrant, Hadiya and Kembata Tembaro zones

Procedia PDF Downloads 199
2572 Population Size Estimation Based on the GPD

Authors: O. Anan, D. Böhning, A. Maruotti

Abstract:

The purpose of the study is to estimate the elusive target population size under a truncated count model that accounts for heterogeneity. The purposed estimator is based on the generalized Poisson distribution (GPD), which extends the Poisson distribution by adding a dispersion parameter. Thus, it becomes an useful model for capture-recapture data where concurrent events are not homogeneous. In addition, it can account for over-dispersion and under-dispersion. The ratios of neighboring frequency counts are used as a tool for investigating the validity of whether generalized Poisson or Poisson distribution. Since capture-recapture approaches do not provide the zero counts, the estimated parameters can be achieved by modifying the EM-algorithm technique for the zero-truncated generalized Poisson distribution. The properties and the comparative performance of proposed estimator were investigated through simulation studies. Furthermore, some empirical examples are represented insights on the behavior of the estimators.

Keywords: capture, recapture methods, ratio plot, heterogeneous population, zero-truncated count

Procedia PDF Downloads 360
2571 An Algorithm for Removal of Noise from X-Ray Images

Authors: Sajidullah Khan, Najeeb Ullah, Wang Yin Chai, Chai Soo See

Abstract:

In this paper, we propose an approach to remove impulse and Poisson noise from X-ray images. Many filters have been used for impulse noise removal from color and gray scale images with their own strengths and weaknesses but X-ray images contain Poisson noise and unfortunately there is no intelligent filter which can detect impulse and Poisson noise from X-ray images. Our proposed filter uses the upgraded layer discrimination approach to detect both Impulse and Poisson noise corrupted pixels in X-ray images and then restores only those detected pixels with a simple efficient and reliable one line equation. Our Proposed algorithms are very effective and much more efficient than all existing filters used only for Impulse noise removal. The proposed method uses a new powerful and efficient noise detection method to determine whether the pixel under observation is corrupted or noise free. Results from computer simulations are used to demonstrate pleasing performance of our proposed method.

Keywords: X-ray image de-noising, impulse noise, poisson noise, PRWF

Procedia PDF Downloads 280
2570 Air Pollution and Respiratory-Related Restricted Activity Days in Tunisia

Authors: Mokhtar Kouki Inès Rekik

Abstract:

This paper focuses on the assessment of the air pollution and morbidity relationship in Tunisia. Air pollution is measured by ozone air concentration and the morbidity is measured by the number of respiratory-related restricted activity days during the 2-week period prior to the interview. Socioeconomic data are also collected in order to adjust for any confounding covariates. Our sample is composed by 407 Tunisian respondents; 44.7% are women, the average age is 35.2, near 69% are living in a house built after the 1980, and 27.8% have reported at least one day of respiratory-related restricted activity. The model consists on the regression of the number of respiratory-related restricted activity days on the air quality measure and the socioeconomic covariates. In order to correct for zero-inflation and heterogeneity, we estimate several models (Poisson, Negative binomial, Zero inflated Poisson, Poisson hurdle, Negative binomial hurdle and finite mixture Poisson models). Bootstrapping and post-stratification techniques are used in order to correct for any sample bias. According to the Akaike information criteria, the hurdle negative binomial model has the greatest goodness of fit. The main result indicates that, after adjusting for socioeconomic data, the ozone concentration increases the probability of positive number of restricted activity days.

Keywords: bootstrapping, hurdle negbin model, overdispersion, ozone concentration, respiratory-related restricted activity days

Procedia PDF Downloads 173
2569 A Nonlocal Means Algorithm for Poisson Denoising Based on Information Geometry

Authors: Dongxu Chen, Yipeng Li

Abstract:

This paper presents an information geometry NonlocalMeans(NLM) algorithm for Poisson denoising. NLM estimates a noise-free pixel as a weighted average of image pixels, where each pixel is weighted according to the similarity between image patches in Euclidean space. In this work, every pixel is a Poisson distribution locally estimated by Maximum Likelihood (ML), all distributions consist of a statistical manifold. A NLM denoising algorithm is conducted on the statistical manifold where Fisher information matrix can be used for computing distribution geodesics referenced as the similarity between patches. This approach was demonstrated to be competitive with related state-of-the-art methods.

Keywords: image denoising, Poisson noise, information geometry, nonlocal-means

Procedia PDF Downloads 214
2568 Count Data Regression Modeling: An Application to Spontaneous Abortion in India

Authors: Prashant Verma, Prafulla K. Swain, K. K. Singh, Mukti Khetan

Abstract:

Objective: In India, around 20,000 women die every year due to abortion-related complications. In the modelling of count variables, there is sometimes a preponderance of zero counts. This article concerns the estimation of various count regression models to predict the average number of spontaneous abortion among women in the Punjab state of India. It also assesses the factors associated with the number of spontaneous abortions. Materials and methods: The study included 27,173 married women of Punjab obtained from the DLHS-4 survey (2012-13). Poisson regression (PR), Negative binomial (NB) regression, zero hurdle negative binomial (ZHNB), and zero-inflated negative binomial (ZINB) models were employed to predict the average number of spontaneous abortions and to identify the determinants affecting the number of spontaneous abortions. Results: Statistical comparisons among four estimation methods revealed that the ZINB model provides the best prediction for the number of spontaneous abortions. Antenatal care (ANC) place, place of residence, total children born to a woman, woman's education and economic status were found to be the most significant factors affecting the occurrence of spontaneous abortion. Conclusions: The study offers a practical demonstration of techniques designed to handle count variables. Statistical comparisons among four estimation models revealed that the ZINB model provided the best prediction for the number of spontaneous abortions and is recommended to be used to predict the number of spontaneous abortions. The study suggests that women receive institutional Antenatal care to attain limited parity. It also advocates promoting higher education among women in Punjab, India.

Keywords: count data, spontaneous abortion, Poisson model, negative binomial model, zero hurdle negative binomial, zero-inflated negative binomial, regression

Procedia PDF Downloads 70
2567 Behind Fuzzy Regression Approach: An Exploration Study

Authors: Lavinia B. Dulla

Abstract:

The exploration study of the fuzzy regression approach attempts to present that fuzzy regression can be used as a possible alternative to classical regression. It likewise seeks to assess the differences and characteristics of simple linear regression and fuzzy regression using the width of prediction interval, mean absolute deviation, and variance of residuals. Based on the simple linear regression model, the fuzzy regression approach is worth considering as an alternative to simple linear regression when the sample size is between 10 and 20. As the sample size increases, the fuzzy regression approach is not applicable to use since the assumption regarding large sample size is already operating within the framework of simple linear regression. Nonetheless, it can be suggested for a practical alternative when decisions often have to be made on the basis of small data.

Keywords: fuzzy regression approach, minimum fuzziness criterion, interval regression, prediction interval

Procedia PDF Downloads 123
2566 Risk Factors for Defective Autoparts Products Using Bayesian Method in Poisson Generalized Linear Mixed Model

Authors: Pitsanu Tongkhow, Pichet Jiraprasertwong

Abstract:

This research investigates risk factors for defective products in autoparts factories. Under a Bayesian framework, a generalized linear mixed model (GLMM) in which the dependent variable, the number of defective products, has a Poisson distribution is adopted. Its performance is compared with the Poisson GLM under a Bayesian framework. The factors considered are production process, machines, and workers. The products coded RT50 are observed. The study found that the Poisson GLMM is more appropriate than the Poisson GLM. For the production Process factor, the highest risk of producing defective products is Process 1, for the Machine factor, the highest risk is Machine 5, and for the Worker factor, the highest risk is Worker 6.

Keywords: defective autoparts products, Bayesian framework, generalized linear mixed model (GLMM), risk factors

Procedia PDF Downloads 478
2565 Relation Between Traffic Mix and Traffic Accidents in a Mixed Industrial Urban Area

Authors: Michelle Eliane Hernández-García, Angélica Lozano

Abstract:

The traffic accidents study usually contemplates the relation between factors such as the type of vehicle, its operation, and the road infrastructure. Traffic accidents can be explained by different factors, which have a greater or lower relevance. Two zones are studied, a mixed industrial zone and the extended zone of it. The first zone has mainly residential (57%), and industrial (23%) land uses. Trucks are mainly on the roads where industries are located. Four sensors give information about traffic and speed on the main roads. The extended zone (which includes the first zone) has mainly residential (47%) and mixed residential (43%) land use, and just 3% of industrial use. The traffic mix is composed mainly of non-trucks. 39 traffic and speed sensors are located on main roads. The traffic mix in a mixed land use zone, could be related to traffic accidents. To understand this relation, it is required to identify the elements of the traffic mix which are linked to traffic accidents. Models that attempt to explain what factors are related to traffic accidents have faced multiple methodological problems for obtaining robust databases. Poisson regression models are used to explain the accidents. The objective of the Poisson analysis is to estimate a vector to provide an estimate of the natural logarithm of the mean number of accidents per period; this estimate is achieved by standard maximum likelihood procedures. For the estimation of the relation between traffic accidents and the traffic mix, the database is integrated of eight variables, with 17,520 observations and six vectors. In the model, the dependent variable is the occurrence or non-occurrence of accidents, and the vectors that seek to explain it, correspond to the vehicle classes: C1, C2, C3, C4, C5, and C6, respectively, standing for car, microbus, and van, bus, unitary trucks (2 to 6 axles), articulated trucks (3 to 6 axles) and bi-articulated trucks (5 to 9 axles); in addition, there is a vector for the average speed of the traffic mix. A Poisson model is applied, using a logarithmic link function and a Poisson family. For the first zone, the Poisson model shows a positive relation among traffic accidents and C6, average speed, C3, C2, and C1 (in a decreasing order). The analysis of the coefficient shows a high relation with bi-articulated truck and bus (C6 and the C3), indicating an important participation of freight trucks. For the expanded zone, the Poisson model shows a positive relation among traffic accidents and speed average, biarticulated truck (C6), and microbus and vans (C2). The coefficients obtained in both Poisson models shows a higher relation among freight trucks and traffic accidents in the first industrial zone than in the expanded zone.

Keywords: freight transport, industrial zone, traffic accidents, traffic mix, trucks

Procedia PDF Downloads 43
2564 Detecting Overdispersion for Mortality AIDS in Zero-inflated Negative Binomial Death Rate (ZINBDR) Co-infection Patients in Kelantan

Authors: Mohd Asrul Affedi, Nyi Nyi Naing

Abstract:

Overdispersion is present in count data, and basically when a phenomenon happened, a Negative Binomial (NB) is commonly used to replace a standard Poisson model. Analysis of count data event, such as mortality cases basically Poisson regression model is appropriate. Hence, the model is not appropriate when existing a zero values. The zero-inflated negative binomial model is appropriate. In this article, we modelled the mortality cases as a dependent variable by age categorical. The objective of this study to determine existing overdispersion in mortality data of AIDS co-infection patients in Kelantan.

Keywords: negative binomial death rate, overdispersion, zero-inflation negative binomial death rate, AIDS

Procedia PDF Downloads 385
2563 New High Order Group Iterative Schemes in the Solution of Poisson Equation

Authors: Sam Teek Ling, Norhashidah Hj. Mohd. Ali

Abstract:

We investigate the formulation and implementation of new explicit group iterative methods in solving the two-dimensional Poisson equation with Dirichlet boundary conditions. The methods are derived from a fourth order compact nine point finite difference discretization. The methods are compared with the existing second order standard five point formula to show the dramatic improvement in computed accuracy. Numerical experiments are presented to illustrate the effectiveness of the proposed methods.

Keywords: explicit group iterative method, finite difference, fourth order compact, Poisson equation

Procedia PDF Downloads 343
2562 Exploration and Evaluation of the Effect of Multiple Countermeasures on Road Safety

Authors: Atheer Al-Nuaimi, Harry Evdorides

Abstract:

Every day many people die or get disabled or injured on roads around the world, which necessitates more specific treatments for transportation safety issues. International road assessment program (iRAP) model is one of the comprehensive road safety models which accounting for many factors that affect road safety in a cost-effective way in low and middle income countries. In iRAP model road safety has been divided into five star ratings from 1 star (the lowest level) to 5 star (the highest level). These star ratings are based on star rating score which is calculated by iRAP methodology depending on road attributes, traffic volumes and operating speeds. The outcome of iRAP methodology are the treatments that can be used to improve road safety and reduce fatalities and serious injuries (FSI) numbers. These countermeasures can be used separately as a single countermeasure or mix as multiple countermeasures for a location. There is general agreement that the adequacy of a countermeasure is liable to consistent losses when it is utilized as a part of mix with different countermeasures. That is, accident diminishment appraisals of individual countermeasures cannot be easily added together. The iRAP model philosophy makes utilization of a multiple countermeasure adjustment factors to predict diminishments in the effectiveness of road safety countermeasures when more than one countermeasure is chosen. A multiple countermeasure correction factors are figured for every 100-meter segment and for every accident type. However, restrictions of this methodology incorporate a presumable over-estimation in the predicted crash reduction. This study aims to adjust this correction factor by developing new models to calculate the effect of using multiple countermeasures on the number of fatalities for a location or an entire road. Regression models have been used to establish relationships between crash frequencies and the factors that affect their rates. Multiple linear regression, negative binomial regression, and Poisson regression techniques were used to develop models that can address the effectiveness of using multiple countermeasures. Analyses are conducted using The R Project for Statistical Computing showed that a model developed by negative binomial regression technique could give more reliable results of the predicted number of fatalities after the implementation of road safety multiple countermeasures than the results from iRAP model. The results also showed that the negative binomial regression approach gives more precise results in comparison with multiple linear and Poisson regression techniques because of the overdispersion and standard error issues.

Keywords: international road assessment program, negative binomial, road multiple countermeasures, road safety

Procedia PDF Downloads 159
2561 Modeling of Maximum Rainfall Using Poisson-Generalized Pareto Distribution in Kigali, Rwanda

Authors: Emmanuel Iyamuremye

Abstract:

Extreme rainfall events have caused significant damage to agriculture, ecology, and infrastructure, disruption of human activities, injury, and loss of life. They also have significant social, economic, and environmental consequences because they considerably damage urban as well as rural areas. Early detection of extreme maximum rainfall helps to implement strategies and measures, before they occur, hence mitigating the consequences. Extreme value theory has been used widely in modeling extreme rainfall and in various disciplines, such as financial markets, the insurance industry, failure cases. Climatic extremes have been analyzed by using either generalized extreme value (GEV) or generalized Pareto (GP) distributions, which provides evidence of the importance of modeling extreme rainfall from different regions of the world. In this paper, we focused on Peak Over Thresholds approach, where the Poisson-generalized Pareto distribution is considered as the proper distribution for the study of the exceedances. This research also considers the use of the generalized Pareto (GP) distribution with a Poisson model for arrivals to describe peaks over a threshold. The research used statistical techniques to fit models that used to predict extreme rainfall in Kigali. The results indicate that the proposed Poisson-GP distribution provides a better fit to maximum monthly rainfall data. Further, the Poisson-GP models are able to estimate various return levels. The research also found a slow increase in return levels for maximum monthly rainfall for higher return periods, and further, the intervals are increasingly wider as the return period is increasing.

Keywords: exceedances, extreme value theory, generalized Pareto distribution, Poisson generalized Pareto distribution

Procedia PDF Downloads 49
2560 Characterization of the Upper Crust in Botswana Using Vp/Vs and Poisson's Ratios from Body Waves

Authors: Rapelang E. Simon, Thebeetsile A. Olebetse, Joseph R. Maritinkole, Ruth O. Moleleke

Abstract:

The P and S wave seismic velocity ratios (Vp/Vs) of some aftershocks are investigated using the method ofWadati diagrams. These aftershocks occurred after the 3rdApril 2017 Botswana’s Mw 6.5 earthquake and were recorded by the Network of Autonomously Recording Seismographs (NARS)-Botswana temporary network deployed from 2013 to 2018. In this paper, P and S wave data with good signal-to-noise ratiofrom twenty events of local magnitude greater or equal to 4.0are analysed with the Seisan software and used to infer properties of the upper crust in Botswana. The Vp/Vsratiosare determined from the travel-times of body waves and then converted to Poisson’s ratio, which is useful in determining the physical state of the subsurface materials. The Vp/Vs ratios of the upper crust in Botswana show regional variations from 1.70 to 1.77, with an average of 1.73. The Poisson’s ratios range from 0.24to 0.27 with an average of 0.25 and correlate well with the geological structures in Botswana.

Keywords: Botswana, earthquake, poisson's ratio, seismic velocity, Vp/Vs ratio

Procedia PDF Downloads 47
2559 Optimization of Machine Learning Regression Results: An Application on Health Expenditures

Authors: Songul Cinaroglu

Abstract:

Machine learning regression methods are recommended as an alternative to classical regression methods in the existence of variables which are difficult to model. Data for health expenditure is typically non-normal and have a heavily skewed distribution. This study aims to compare machine learning regression methods by hyperparameter tuning to predict health expenditure per capita. A multiple regression model was conducted and performance results of Lasso Regression, Random Forest Regression and Support Vector Machine Regression recorded when different hyperparameters are assigned. Lambda (λ) value for Lasso Regression, number of trees for Random Forest Regression, epsilon (ε) value for Support Vector Regression was determined as hyperparameters. Study results performed by using 'k' fold cross validation changed from 5 to 50, indicate the difference between machine learning regression results in terms of R², RMSE and MAE values that are statistically significant (p < 0.001). Study results reveal that Random Forest Regression (R² ˃ 0.7500, RMSE ≤ 0.6000 ve MAE ≤ 0.4000) outperforms other machine learning regression methods. It is highly advisable to use machine learning regression methods for modelling health expenditures.

Keywords: machine learning, lasso regression, random forest regression, support vector regression, hyperparameter tuning, health expenditure

Procedia PDF Downloads 95
2558 A Hyperexponential Approximation to Finite-Time and Infinite-Time Ruin Probabilities of Compound Poisson Processes

Authors: Amir T. Payandeh Najafabadi

Abstract:

This article considers the problem of evaluating infinite-time (or finite-time) ruin probability under a given compound Poisson surplus process by approximating the claim size distribution by a finite mixture exponential, say Hyperexponential, distribution. It restates the infinite-time (or finite-time) ruin probability as a solvable ordinary differential equation (or a partial differential equation). Application of our findings has been given through a simulation study.

Keywords: ruin probability, compound poisson processes, mixture exponential (hyperexponential) distribution, heavy-tailed distributions

Procedia PDF Downloads 187
2557 Proficient Estimation Procedure for a Rare Sensitive Attribute Using Poisson Distribution

Authors: S. Suman, G. N. Singh

Abstract:

The present manuscript addresses the estimation procedure of population parameter using Poisson probability distribution when characteristic under study possesses a rare sensitive attribute. The generalized form of unrelated randomized response model is suggested in order to acquire the truthful responses from respondents. The resultant estimators have been proposed for two situations when the information on an unrelated rare non-sensitive characteristic is known as well as unknown. The properties of the proposed estimators are derived, and the measure of confidentiality of respondent is also suggested for respondents. Empirical studies are carried out in the support of discussed theory.

Keywords: Poisson distribution, randomized response model, rare sensitive attribute, non-sensitive attribute

Procedia PDF Downloads 119
2556 A Comparison of Smoothing Spline Method and Penalized Spline Regression Method Based on Nonparametric Regression Model

Authors: Autcha Araveeporn

Abstract:

This paper presents a study about a nonparametric regression model consisting of a smoothing spline method and a penalized spline regression method. We also compare the techniques used for estimation and prediction of nonparametric regression model. We tried both methods with crude oil prices in dollars per barrel and the Stock Exchange of Thailand (SET) index. According to the results, it is concluded that smoothing spline method performs better than that of penalized spline regression method.

Keywords: nonparametric regression model, penalized spline regression method, smoothing spline method, Stock Exchange of Thailand (SET)

Procedia PDF Downloads 300
2555 Statistical Analysis for Overdispersed Medical Count Data

Authors: Y. N. Phang, E. F. Loh

Abstract:

Many researchers have suggested the use of zero inflated Poisson (ZIP) and zero inflated negative binomial (ZINB) models in modeling over-dispersed medical count data with extra variations caused by extra zeros and unobserved heterogeneity. The studies indicate that ZIP and ZINB always provide better fit than using the normal Poisson and negative binomial models in modeling over-dispersed medical count data. In this study, we proposed the use of Zero Inflated Inverse Trinomial (ZIIT), Zero Inflated Poisson Inverse Gaussian (ZIPIG) and zero inflated strict arcsine models in modeling over-dispersed medical count data. These proposed models are not widely used by many researchers especially in the medical field. The results show that these three suggested models can serve as alternative models in modeling over-dispersed medical count data. This is supported by the application of these suggested models to a real life medical data set. Inverse trinomial, Poisson inverse Gaussian, and strict arcsine are discrete distributions with cubic variance function of mean. Therefore, ZIIT, ZIPIG and ZISA are able to accommodate data with excess zeros and very heavy tailed. They are recommended to be used in modeling over-dispersed medical count data when ZIP and ZINB are inadequate.

Keywords: zero inflated, inverse trinomial distribution, Poisson inverse Gaussian distribution, strict arcsine distribution, Pearson’s goodness of fit

Procedia PDF Downloads 397
2554 Tests for Zero Inflation in Count Data with Measurement Error in Covariates

Authors: Man-Yu Wong, Siyu Zhou, Zhiqiang Cao

Abstract:

In quality of life, health service utilization is an important determinant of medical resource expenditures on Colorectal cancer (CRC) care, a better understanding of the increased utilization of health services is essential for optimizing the allocation of healthcare resources to services and thus for enhancing the service quality, especially for high expenditure on CRC care like Hong Kong region. In assessing the association between the health-related quality of life (HRQOL) and health service utilization in patients with colorectal neoplasm, count data models can be used, which account for over dispersion or extra zero counts. In our data, the HRQOL evaluation is a self-reported measure obtained from a questionnaire completed by the patients, misreports and variations in the data are inevitable. Besides, there are more zero counts from the observed number of clinical consultations (observed frequency of zero counts = 206) than those from a Poisson distribution with mean equal to 1.33 (expected frequency of zero counts = 156). This suggests that excess of zero counts may exist. Therefore, we study tests for detecting zero-inflation in models with measurement error in covariates. Method: Under classical measurement error model, the approximate likelihood function for zero-inflation Poisson regression model can be obtained, then Approximate Maximum Likelihood Estimation(AMLE) can be derived accordingly, which is consistent and asymptotically normally distributed. By calculating score function and Fisher information based on AMLE, a score test is proposed to detect zero-inflation effect in ZIP model with measurement error. The proposed test follows asymptotically standard normal distribution under H0, and it is consistent with the test proposed for zero-inflation effect when there is no measurement error. Results: Simulation results show that empirical power of our proposed test is the highest among existing tests for zero-inflation in ZIP model with measurement error. In real data analysis, with or without considering measurement error in covariates, existing tests, and our proposed test all imply H0 should be rejected with P-value less than 0.001, i.e., zero-inflation effect is very significant, ZIP model is superior to Poisson model for analyzing this data. However, if measurement error in covariates is not considered, only one covariate is significant; if measurement error in covariates is considered, only another covariate is significant. Moreover, the direction of coefficient estimations for these two covariates is different in ZIP regression model with or without considering measurement error. Conclusion: In our study, compared to Poisson model, ZIP model should be chosen when assessing the association between condition-specific HRQOL and health service utilization in patients with colorectal neoplasm. and models taking measurement error into account will result in statistically more reliable and precise information.

Keywords: count data, measurement error, score test, zero inflation

Procedia PDF Downloads 204
2553 Using Nonhomogeneous Poisson Process with Compound Distribution to Price Catastrophe Options

Authors: Rong-Tsorng Wang

Abstract:

In this paper, we derive a pricing formula for catastrophe equity put options (or CatEPut) with non-homogeneous loss and approximated compound distributions. We assume that the loss claims arrival process is a nonhomogeneous Poisson process (NHPP) representing the clustering occurrences of loss claims, the size of loss claims is a sequence of independent and identically distributed random variables, and the accumulated loss distribution forms a compound distribution and is approximated by a heavy-tailed distribution. A numerical example is given to calibrate parameters, and we discuss how the value of CatEPut is affected by the changes of parameters in the pricing model we provided.

Keywords: catastrophe equity put options, compound distributions, nonhomogeneous Poisson process, pricing model

Procedia PDF Downloads 51
2552 Modeling Karachi Dengue Outbreak and Exploration of Climate Structure

Authors: Syed Afrozuddin Ahmed, Junaid Saghir Siddiqi, Sabah Quaiser

Abstract:

Various studies have reported that global warming causes unstable climate and many serious impact to physical environment and public health. The increasing incidence of dengue incidence is now a priority health issue and become a health burden of Pakistan. In this study it has been investigated that spatial pattern of environment causes the emergence or increasing rate of dengue fever incidence that effects the population and its health. The climatic or environmental structure data and the Dengue Fever (DF) data was processed by coding, editing, tabulating, recoding, restructuring in terms of re-tabulating was carried out, and finally applying different statistical methods, techniques, and procedures for the evaluation. Five climatic variables which we have studied are precipitation (P), Maximum temperature (Mx), Minimum temperature (Mn), Humidity (H) and Wind speed (W) collected from 1980-2012. The dengue cases in Karachi from 2010 to 2012 are reported on weekly basis. Principal component analysis is applied to explore the climatic variables and/or the climatic (structure) which may influence in the increase or decrease in the number of dengue fever cases in Karachi. PC1 for all the period is General atmospheric condition. PC2 for dengue period is contrast between precipitation and wind speed. PC3 is the weighted difference between maximum temperature and wind speed. PC4 for dengue period contrast between maximum and wind speed. Negative binomial and Poisson regression model are used to correlate the dengue fever incidence to climatic variable and principal component score. Relative humidity is estimated to positively influence on the chances of dengue occurrence by 1.71% times. Maximum temperature positively influence on the chances dengue occurrence by 19.48% times. Minimum temperature affects positively on the chances of dengue occurrence by 11.51% times. Wind speed is effecting negatively on the weekly occurrence of dengue fever by 7.41% times.

Keywords: principal component analysis, dengue fever, negative binomial regression model, poisson regression model

Procedia PDF Downloads 351
2551 Characterization of 3D Printed Re-Entrant Chiral Auxetic Geometries

Authors: Tatheer Zahra

Abstract:

Auxetic materials have counteractive properties due to re-entrant geometry that enables them to possess Negative Poisson’s Ratio (NPR). These materials have better energy absorbing and shock resistance capabilities as compared to conventional positive Poisson’s ratio materials. The re-entrant geometry can be created through 3D printing for convenient application of these materials. This paper investigates the mechanical properties of 3D printed chiral auxetic geometries of various sizes. Small scale samples were printed using an ordinary 3D printer and were tested under compression and tension to ascertain their strength and deformation characteristics. A maximum NPR of -9 was obtained under compression and tension. The re-entrant chiral cell size has been shown to affect the mechanical properties of the re-entrant chiral auxetics.

Keywords: auxetic materials, 3D printing, Negative Poisson’s Ratio, re-entrant chiral auxetics

Procedia PDF Downloads 49
2550 Statistical Modeling of Local Area Fading Channels Based on Triply Stochastic Filtered Marked Poisson Point Processes

Authors: Jihad Daba, Jean-Pierre Dubois

Abstract:

Multi path fading noise degrades the performance of cellular communication, most notably in femto- and pico-cells in 3G and 4G systems. When the wireless channel consists of a small number of scattering paths, the statistics of fading noise is not analytically tractable and poses a serious challenge to developing closed canonical forms that can be analysed and used in the design of efficient and optimal receivers. In this context, noise is multiplicative and is referred to as stochastically local fading. In many analytical investigation of multiplicative noise, the exponential or Gamma statistics are invoked. More recent advances by the author of this paper have utilized a Poisson modulated and weighted generalized Laguerre polynomials with controlling parameters and uncorrelated noise assumptions. In this paper, we investigate the statistics of multi-diversity stochastically local area fading channel when the channel consists of randomly distributed Rayleigh and Rician scattering centers with a coherent specular Nakagami-distributed line of sight component and an underlying doubly stochastic Poisson process driven by a lognormal intensity. These combined statistics form a unifying triply stochastic filtered marked Poisson point process model.

Keywords: cellular communication, femto and pico-cells, stochastically local area fading channel, triply stochastic filtered marked Poisson point process

Procedia PDF Downloads 384