Tweedie Distribution: A Statistical Solution for Unusually Dispersed Data
Abstract
The Tweedie distribution is an effective statistical approach for modeling data with unusual dispersion characteristics, especially data that mix discrete and continuous components. In this study, the Tweedie distribution is applied to insurance claims data, which contain many zero values alongside large claim amounts that are continuous in nature. Parameters are estimated with the iteratively reweighted least squares (IRLS) algorithm in R, and the results show that the Tweedie distribution handles the high variability (overdispersion) in the data accurately. The estimated power parameter (p) of 1.7 lies between 1 and 2, which indicates that the fitted Tweedie distribution is a compound of the Poisson and Gamma distributions, a form that is effective for modeling claims data with high dispersion. The study also shows that the Tweedie distribution provides better and more realistic predictions than traditional distributions such as the Poisson or Gamma, which cannot handle mixed characteristics and overdispersion well. These findings contribute to insurance claims modeling and open up potential applications in other fields that face data with high variability and mixed patterns.
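To make the compound Poisson-Gamma reading of a power parameter between 1 and 2 concrete, the following is a minimal Python sketch, not the study's R code. It converts a Tweedie mean, dispersion, and power parameter into the rate of a Poisson claim count and the shape and scale of Gamma claim sizes, then simulates totals. Only p = 1.7 comes from the abstract; the mean (2.0), the dispersion (1.5), and all function names are hypothetical values chosen for illustration.

```python
import math
import random


def compound_poisson_gamma_params(mu, phi, p):
    """Map Tweedie (mean mu, dispersion phi, power 1 < p < 2) to the
    Poisson rate and the Gamma shape/scale of its compound form."""
    if not 1.0 < p < 2.0:
        raise ValueError("compound Poisson-Gamma form requires 1 < p < 2")
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))      # expected number of claims
    shape = (2.0 - p) / (p - 1.0)                  # Gamma shape of one claim
    scale = phi * (p - 1.0) * mu ** (p - 1.0)      # Gamma scale of one claim
    return lam, shape, scale


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small rates."""
    limit = math.exp(-lam)
    count = 0
    prod = rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def sample_tweedie(mu, phi, p, rng):
    """One draw: sum of a Poisson number of Gamma claims (0.0 if no claims)."""
    lam, shape, scale = compound_poisson_gamma_params(mu, phi, p)
    n_claims = sample_poisson(lam, rng)
    return sum(rng.gammavariate(shape, scale) for _ in range(n_claims))


# Hypothetical mean and dispersion; only p = 1.7 is taken from the abstract.
MU, PHI, P = 2.0, 1.5, 1.7
rng = random.Random(0)
draws = [sample_tweedie(MU, PHI, P, rng) for _ in range(20000)]

lam, _, _ = compound_poisson_gamma_params(MU, PHI, P)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
zero_share = sum(d == 0.0 for d in draws) / len(draws)

print("sample mean     ", mean, " theory:", MU)
print("sample variance ", var, " theory:", PHI * MU ** P)
print("share of zeros  ", zero_share, " theory:", math.exp(-lam))
```

Under these assumptions the simulated totals have a point mass at exactly zero (no claims) and a continuous positive part, with variance growing as phi times mu to the power p. This is the mixed, overdispersed pattern that a pure Poisson or pure Gamma model cannot reproduce.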