Tweedie Distribution: A Statistical Solution for Unusually Dispersed Data

Zainol Mustafa, Universiti Kebangsaan Malaysia, Malaysia
Keywords: insurance claims, overdispersion, Tweedie distribution

Abstract

The Tweedie distribution is an effective statistical model for data with unusual dispersion characteristics, especially data that mix a discrete component with a continuous one. In this study, the Tweedie distribution is applied to insurance claims data, which contain many exact zeros alongside positive, continuous claim amounts, some of them large. With parameters estimated by the iteratively reweighted least squares (IRLS) algorithm in R, the results show that the Tweedie distribution handles the high variability (overdispersion) in the data accurately. The estimated power parameter of 1.7 lies between 1 and 2, the range in which the Tweedie distribution is a compound Poisson-Gamma distribution, a form well suited to highly dispersed claims data. The study also shows that the Tweedie distribution gives better and more realistic predictions than the traditional Poisson or Gamma distributions, neither of which handles mixed, overdispersed data well. These findings contribute to insurance claims modeling and suggest wider applications in other fields that face highly variable data with mixed patterns.
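
To make the Poisson-Gamma connection concrete, the following minimal Python sketch converts a Tweedie mean, dispersion, and power parameter into the rate of the underlying Poisson claim count and the shape and scale of the Gamma claim sizes, using the standard compound Poisson-Gamma parameterization with variance equal to dispersion times mean raised to the power parameter. Only the power value 1.7 comes from the abstract. The symbols `mu`, `phi`, and `p`, the function name, and the numeric values of `mu` and `phi` are assumptions chosen for illustration, not results of the study.

```python
import math


def tweedie_to_compound_poisson_gamma(mu, phi, p):
    """Map Tweedie (mean mu, dispersion phi, power p) to Poisson-Gamma parameters.

    Assumes the variance function Var(Y) = phi * mu**p and is valid only
    for 1 < p < 2, where the Tweedie distribution is compound Poisson-Gamma.
    """
    if not 1.0 < p < 2.0:
        raise ValueError("compound Poisson-Gamma form requires 1 < p < 2")
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))   # Poisson rate: expected number of claims
    shape = (2.0 - p) / (p - 1.0)               # Gamma shape of a single claim size
    scale = phi * (p - 1.0) * mu ** (p - 1.0)   # Gamma scale of a single claim size
    return lam, shape, scale


# p = 1.7 is the estimate reported in the abstract; mu and phi are hypothetical.
mu, phi, p = 500.0, 20.0, 1.7
lam, shape, scale = tweedie_to_compound_poisson_gamma(mu, phi, p)

# Probability of no claim at all: the point mass at zero.
prob_zero = math.exp(-lam)

# Moments of the total claim amount Y = X_1 + ... + X_N, N ~ Poisson(lam).
mean_total = lam * shape * scale                       # E[N] * E[X]
var_total = lam * shape * (1.0 + shape) * scale ** 2   # E[N] * E[X^2]

print(f"Poisson rate: {lam:.4f}, Gamma shape: {shape:.4f}, Gamma scale: {scale:.4f}")
print(f"P(Y = 0): {prob_zero:.4f}")
print(f"mean: {mean_total:.4f}, variance: {var_total:.4f}")
```

The recovered mean equals `mu` and the recovered variance equals `phi * mu**p`, which is how a single distribution can carry both a positive probability of exactly zero and a heavy, continuous right tail.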

Published: 2025-01-29

How to cite: Zainol Mustafa. (2025). Tweedie Distribution: A Statistical Solution for Unusually Dispersed Data. Sciencestatistics: Journal of Statistics, Probability, and Its Application, 3(1), 29–37. https://doi.org/10.24127/sciencestatistics.v3i1.8003