Zero Inflated Models for Overdispersed Count Data
Authors: Y. N. Phang, E. F. Loh
Abstract:
Zero inflated models are commonly used to model count data with excess zeros, where the excess zeros may be structural zeros or zeros that occur by chance. Data of this type are found in many disciplines, including finance, insurance, biomedicine, econometrics, ecology, and the health sciences, for example sexual health and dental epidemiology. The most popular zero inflated models are the zero inflated Poisson and zero inflated negative binomial models. Zero inflated generalized Poisson and zero inflated double Poisson models are also discussed in the literature. More recently, the zero inflated inverse trinomial and zero inflated strict arcsine models have been advocated and shown to serve as alternative models for overdispersed count data caused by excess zeros and unobserved heterogeneity. The purpose of this paper is to review the related literature and to provide a variety of examples of the application of zero inflated models in different disciplines. Different model selection methods used in model comparison are also discussed.
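The distinction between structural zeros and zeros that occur by chance can be made concrete with the zero inflated Poisson model. The sketch below is illustrative only: the names `pi_zero` (probability of a structural zero) and `lam` (Poisson mean), and the values assigned to them, are assumptions chosen for the example and are not taken from the paper.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Ordinary Poisson probability P(Y = k) with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k: int, pi_zero: float, lam: float) -> float:
    """Zero inflated Poisson probability P(Y = k).

    With probability pi_zero the observation is a structural zero;
    otherwise it is drawn from Poisson(lam), which can also yield a zero.
    """
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base


# Hypothetical parameter values, for illustration only.
pi_zero, lam = 0.3, 2.0

# The two sources of zeros named in the abstract.
p_zero_structural = pi_zero
p_zero_chance = (1.0 - pi_zero) * poisson_pmf(0, lam)

# Mean and variance of the zero inflated Poisson distribution.
zip_mean = (1.0 - pi_zero) * lam
zip_var = (1.0 - pi_zero) * lam * (1.0 + pi_zero * lam)

print("P(structural zero):", p_zero_structural)
print("P(chance zero):    ", p_zero_chance)
print("mean, variance:    ", zip_mean, zip_var)
```

Whenever `pi_zero` is strictly between 0 and 1, the variance exceeds the mean, which is one way excess zeros give rise to overdispersion.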
Keywords: Overdispersed count data, model selection methods, likelihood ratio, AIC, BIC.
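The model selection criteria named in the keywords can be sketched for a Poisson model and a zero inflated Poisson model. This is a minimal sketch under stated assumptions: the count data are hypothetical, the helper names are invented for the example, and the zero inflated model is fitted with a simple EM iteration, which is one possible estimation method and not necessarily the one used in the paper.

```python
import math

# Hypothetical data: observed count value -> frequency (not from the paper).
frequencies = {0: 60, 1: 10, 2: 12, 3: 10, 4: 5, 5: 3}
counts = [y for y, f in frequencies.items() for _ in range(f)]
n = len(counts)


def log_poisson(y: int, lam: float) -> float:
    """Log of the Poisson probability P(Y = y)."""
    return -lam + y * math.log(lam) - math.lgamma(y + 1)


def poisson_loglik(data, lam):
    return sum(log_poisson(y, lam) for y in data)


def zip_loglik(data, pi_zero, lam):
    total = 0.0
    for y in data:
        if y == 0:
            total += math.log(pi_zero + (1.0 - pi_zero) * math.exp(-lam))
        else:
            total += math.log(1.0 - pi_zero) + log_poisson(y, lam)
    return total


def fit_zip_em(data, iterations=500):
    """EM estimates of (pi_zero, lam) for the zero inflated Poisson model."""
    size = len(data)
    n_zero = sum(1 for y in data if y == 0)
    total = sum(data)
    pi_zero, lam = 0.5, total / size  # starting values
    for _ in range(iterations):
        # E-step: probability that an observed zero is structural.
        z = pi_zero / (pi_zero + (1.0 - pi_zero) * math.exp(-lam))
        expected_structural = n_zero * z
        # M-step: update the mixing probability and the Poisson mean.
        pi_zero = expected_structural / size
        lam = total / (size - expected_structural)
    return pi_zero, lam


def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik


def bic(loglik, n_params, n_obs):
    return n_params * math.log(n_obs) - 2 * loglik


# Poisson: the maximum likelihood estimate of the mean is the sample mean.
lam_pois = sum(counts) / n
ll_pois = poisson_loglik(counts, lam_pois)

pi_hat, lam_hat = fit_zip_em(counts)
ll_zip = zip_loglik(counts, pi_hat, lam_hat)

aic_pois, aic_zip = aic(ll_pois, 1), aic(ll_zip, 2)
bic_pois, bic_zip = bic(ll_pois, 1, n), bic(ll_zip, 2, n)
lr_statistic = 2.0 * (ll_zip - ll_pois)  # likelihood ratio statistic

print("AIC Poisson / ZIP:", aic_pois, aic_zip)
print("BIC Poisson / ZIP:", bic_pois, bic_zip)
print("LR statistic:     ", lr_statistic)
```

Smaller AIC and BIC values indicate the preferred model. Because the Poisson model corresponds to `pi_zero = 0`, a boundary of the parameter space, the usual chi-square reference distribution for the likelihood ratio statistic holds only approximately.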
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1086727
References:
[1] M. Ridout, C. G. B. Demetrio, and J. Hinde, “Models for count data with
many zeros”, in: Invited Paper Presented at the 19th International
Biometric Conference, Cape Town, South Africa, 1998, 178.
[2] A. C. Cameron and P. K. Trivedi, “Regression analysis of count data”.
Cambridge University Press. 1998
[3] A. C. Cameron and P. K. Trivedi, “Microeconometrics: Methods and
Applications”. Cambridge University Press. 2005
[4] D. Lambert, “Zero-inflated Poisson regression, with an application to
random defects in manufacturing”. Technometrics, 34, 1992, 1-14
[5] L. Tom, M. Beatrijs, and D.S. Olivia, “The analysis of zero-inflated
count data: Beyond zero-inflated Poisson regression”, British Journal of
Mathematical and Statistical Psychology, 65, 163-180.
[6] Y. Xia, M. Dianne, J. Ma, C. Feng, C. Wendy, and X. Tu, “Modeling
count outcomes from HIV risk reduction interventions: A comparison of
competing statistical models for count responses”, AIDS Research and
Treatment, Vol. 2012, 1-11.
[7] B. M. Golam Kibria, “Applications of some discrete regression models
for count data”, Pakistan Journal of Statistics and Operation Research,
Vol11 No. 1, 2006, 1-16.
[8] A. Bilgic, W. J. Florkowski, and C. Akbay, “Demand for cigarettes in
Turkey: an application of count data models”, Empirical Economics, 39, 2010, 733-765.
[9] A. Khan, S. Ullah, and J. Nitz, “Statistical modelling of falls
count data with excess zeros”, Journal of the International Society for
Child and Adolescent Injury Prevention (Inj Prev), 17(4), 2011,
266-270.
[10] S. Ullah, C. F. Finch, and L. Day, “Statistical modelling for falls count
data”. Accident Analysis and Prevention (Accid Anal Prev), 42(2),
2010, 384-392.
[11] K. K. W. Yau and K. C. H. Yip, “On modeling claim frequency data in
general insurance with extra zeros”. Insurance: Mathematics and
Economics Vol. 36, Issue 2, 2005, 153-163.
[12] A. H. Lee, M. R. Stevenson, K. Wang, and K. K. W. Yau, “Modelling
young driver motor vehicle crashes: data with extra zeros”, Accident
Analysis and Prevention, 34, 2002, 515-521.
[13] M. L. Dalrymple, I. L. Hudson, and R. P. K. Ford, “Finite mixture,
zero-inflated Poisson and hurdle models with application to SIDS”,
Computational Statistics & Data Analysis, 41, 2003, 491-504
[14] A. C. Mehmet, “Zero-inflated regression models for modeling the effect
of air pollutants on hospital admissions”, Polish Journal of Environmental
Studies, Vol. 21, No. 3, 2012, 565-568.
[15] R. Winkelmann, Econometric Analysis of Count Data. Springer Verlag,
Berlin, Heidelberg, 2008.
[16] R. Winkelmann, “Health care reform and the number of doctor visits –
An econometric analysis,” Journal of Applied Econometrics 19, 2004,
455-472
[17] K. K. W. Yau, K. Wang, and A. H. Lee, “Zero-inflated negative
binomial mixed regression modeling of overdispersed count data with
extra zeros”. Biometrical Journal, 45, 4, 2003, 437-452.
[18] S. Gurmu and P. K. Trivedi, “Excess zeros in count models for
recreational trips”, Journal of Business and Economic Statistics, 14,
1996, 469-477.
[19] D. B. Hall, “Zero inflated Poisson and binomial with random effects: a
case study,” Biometrics, 56, 2000, 1030-1039
[20] D. Bohning, E. Dietz, P. Schlattman, L. Mendonca and U. Kirchner,
“The zero-inflated Poisson model and the decayed, missing and filled
teeth index in dental epidemiology”. Journal of the Royal Statistical
Society, Series A, 1999, 162-209
[21] P. J. W. Carrivick, A. H. Lee, and K. K. W. Yau, “Zero inflated Poisson
modeling to evaluate occupational safety interventions”, Safety Science,
41, 2002, 53-63.
[22] A. H. Welsh, R. B. Cunningham, C. F. Donnelly and D. B. Lindenmayer,
“Modelling the abundance of rare species: statistical models for counts
with extra zeros”, Ecolog Modell, 88, 1996, 297-308.
[23] F. Famoye and P. S. Karan, “Zero-Inflated Generalized Poisson
Regression Model with an Application to Domestic Violence Data,” J of
Data Science 4, 2006, 117-130.
[24] Z. Yang, J. W. Hardin, and C. L. Addy, “Score test for zero inflation in
overdispersed count data”, Communications in Statistics – Theory and
Methods, 39, 2010, 2008-2030.
[25] C. Czado, V. Erhardt, A. Min, and S. Wagner, “Zero-inflated
generalized Poisson models with regression effects on the mean,
dispersion and zero-inflation level applied to patent outsourcing rates”,
Statistical Modelling, 7, 2, 2007, 125-153.
[26] P. L. Gupta, R. C. Gupta, and R. C. Tripathi, “Score test for zero inflated
generalized model”, Communications in Statistics – Theory and Methods,
Vol.33, No.1, 2004, 47-64.
[27] Y. N. Phang, “Statistical inference for a family of discrete distribution
with cubic variance functions”, Unpublished PhD thesis, University
Malaya, Malaysia, 2007
[28] Y. N. Phang, and E. R. Loh. Proceedings: IASC 2008: Joint Meeting of
4th World Conference of the IASC and 6th Conference of the Asian
Regional Section of the IASC on Computational Statistics and Data
Analysis, Yokohama, Japan, 2008
[29] N. Jasakul, and P. H. John, “Score tests for extra-zero models in
zero-inflated negative binomial models”, Communications in Statistics –
Simulation and Computation, 38, 2009, 92-108.
[30] H. Akaike, “A new look at the statistical model identification”. IEEE
Transactions on Automatic Control, 19(6), 1974, 716-724.
[31] G. Schwarz, “Estimating the dimension of a model”, Annals of
Statistics, 6, 1978, 461-464
[32] Q. H. Vuong, “Likelihood ratio tests for model selection and non-nested
hypotheses”, Econometrica, 57, 1989, 307-333.
[33] J. Neyman, and E. S. Pearson, “On the use and interpretation of certain
test criteria for purposes of statistical inference”, Biometrika, 20, 1928,
175-240.
[34] G. Sileshi, “Selecting the right statistical model for analysis of insect
count data by using information theoretic measures”, Bulletin of
Entomological Research, 96, 2006, 479-488.