A Comparison of Models for Count Data with an Application to Over-Dispersion Data

Authors

  • Chadarat Tapan, Department of Mathematics, Faculty of Science, Naresuan University, Thailand
  • Anamai Na-udom, Department of Mathematics, Faculty of Science, Naresuan University, Thailand
  • Jaratsri Rungrattanaubol, Department of Computer Science and Information Technology, Faculty of Science, Naresuan University, Thailand

Keywords:

Count models, Over-dispersion data, Poisson regression, Negative binomial regression, Discrete Weibull regression

Abstract

Count models are widely used in fields such as medicine, biology, and public health. The most frequently used count models are the Poisson, negative binomial, and discrete Weibull regression models. The objective of this study was to compare the performance of these three models on two different over-dispersed data sets. The AIC, BIC, and log-likelihood fit statistics were used as the criteria for comparison. The results revealed that the negative binomial and discrete Weibull regression models fit best, as they produced the smallest AIC, BIC, and log-likelihood fit statistics.
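
The comparison criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code or data: the counts below are made up, the models are intercept-only (no covariates), the negative binomial dispersion parameter is only approximately maximised on a grid, and the discrete Weibull model is omitted for brevity. The sketch reports -2 log-likelihood, on the assumption that this is the form in which "smaller is better" applies alongside AIC = 2k - 2 log L and BIC = k log n - 2 log L, where k is the number of parameters and n the sample size.

```python
import math

# Hypothetical over-dispersed counts, invented for illustration only.
counts = [0, 0, 0, 1, 1, 2, 0, 5, 9, 0, 3, 14, 0, 1, 7, 0, 2, 0, 22, 4]


def poisson_loglik(y, mu):
    """Log-likelihood of i.i.d. Poisson(mu) counts."""
    return sum(v * math.log(mu) - mu - math.lgamma(v + 1) for v in y)


def negbin_loglik(y, mu, r):
    """Log-likelihood of i.i.d. negative binomial counts with mean mu
    and dispersion r, so that the variance is mu + mu**2 / r."""
    return sum(
        math.lgamma(v + r) - math.lgamma(r) - math.lgamma(v + 1)
        + r * math.log(r / (r + mu)) + v * math.log(mu / (r + mu))
        for v in y
    )


def aic(loglik, k):
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik


n = len(counts)
mean = sum(counts) / n
variance = sum((v - mean) ** 2 for v in counts) / (n - 1)
# A variance-to-mean ratio well above 1 signals over-dispersion.
dispersion_index = variance / mean

# Poisson: the maximum-likelihood estimate of mu is the sample mean.
ll_pois = poisson_loglik(counts, mean)

# Negative binomial: for any fixed r the MLE of mu is also the sample mean,
# so only r is searched, here over a coarse geometric grid (approximate MLE).
r_grid = [0.01 * 1.05 ** i for i in range(240)]
r_hat = max(r_grid, key=lambda r: negbin_loglik(counts, mean, r))
ll_nb = negbin_loglik(counts, mean, r_hat)

print(f"dispersion index (variance / mean): {dispersion_index:.2f}")
for name, ll, k in [("Poisson", ll_pois, 1), ("Negative binomial", ll_nb, 2)]:
    print(f"{name}: -2logL={-2 * ll:.2f}  AIC={aic(ll, k):.2f}  "
          f"BIC={bic(ll, k, n):.2f}")
```

On data like these, where the variance far exceeds the mean, the extra dispersion parameter of the negative binomial model is expected to lower both AIC and BIC relative to the Poisson model, which is the pattern the abstract describes.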

Published

2023-08-26

Section

Research Articles