Statistical Analysis for Overdispersed Medical Count Data
Authors: Y. N. Phang, E. F. Loh
Abstract:
Many researchers have suggested the use of zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models for overdispersed medical count data in which the extra variation is caused by excess zeros and unobserved heterogeneity. These studies indicate that ZIP and ZINB always provide a better fit than the standard Poisson and negative binomial models for such data. In this study, we propose the zero-inflated inverse trinomial (ZIIT), zero-inflated Poisson inverse Gaussian (ZIPIG), and zero-inflated strict arcsine (ZISA) models for overdispersed medical count data. These models are not widely used, especially in the medical field. The results show that the three proposed models can serve as alternatives for modeling overdispersed medical count data, as supported by their application to a real-life medical data set. The inverse trinomial, Poisson inverse Gaussian, and strict arcsine distributions are discrete distributions whose variance is a cubic function of the mean. ZIIT, ZIPIG, and ZISA can therefore accommodate data with excess zeros and very heavy tails. They are recommended for modeling overdispersed medical count data when ZIP and ZINB are inadequate.
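The zero-inflation construction that these models share can be illustrated with a minimal sketch. It is not the authors' code. It mixes a point mass at zero with a base count distribution and uses the Poisson case (ZIP) because its moments are easy to check. The function names and the parameters `omega` (extra-zero probability) and `lam` (Poisson rate) are assumptions chosen for illustration. A ZIIT, ZIPIG, or ZISA model would substitute its own base probability function for the Poisson one.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability P(Y = y) for rate lam > 0, computed on the log scale."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def zero_inflated_pmf(y, omega, base_pmf):
    """Mix a point mass at zero (weight omega) with a base count distribution."""
    p = (1.0 - omega) * base_pmf(y)
    return omega + p if y == 0 else p


def zip_moments(omega, lam, y_max=200):
    """Mean and variance of a ZIP distribution by direct summation up to y_max."""
    probs = [
        zero_inflated_pmf(y, omega, lambda k: poisson_pmf(k, lam))
        for y in range(y_max + 1)
    ]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


# Illustrative values only: 30% extra zeros on top of a Poisson(2) count.
mean, var = zip_moments(omega=0.3, lam=2.0)
print(mean, var)
```

With these illustrative values the variance exceeds the mean, which is the overdispersion that a plain Poisson model cannot represent. The closed forms for ZIP are mean (1 - omega) * lam and variance (1 - omega) * lam * (1 + omega * lam).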
Keywords: Zero inflated, inverse trinomial distribution, Poisson inverse Gaussian distribution, strict arcsine distribution, Pearson’s goodness of fit.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1090701