International Conference on Actuarial Sciences [ICActS] 2025

Permanent URI for this collection: https://rda.sliit.lk/handle/123456789/4496

  • Publication (Open Access)
    Latent Structures in Zero-Inflated Risk Domains: An Elastic–Tweedie Synergy for Claim Forecasting
    (Department of Mathematics and Statistics, Faculty of Humanities and Sciences, SLIIT, 2025-10-10) Kumarasinghe, P. B. W. S. R.; Napagoda, N. A. D. N.
    The frequency of insurance claims presents a unique modeling challenge due to high-dimensional inputs, strong feature correlations, and the dominance of zero-inflated outcomes. Conventional statistical models often fall short under these conditions, failing to capture the underlying structure of complex data sets. This study proposes an advanced predictive framework integrating Elastic Net regularization and a Tweedie-distribution-based XGBoost algorithm to address these issues in the context of motor insurance. These methods were applied to the French Motor Claims data set, which contains over 678,000 policies, to distill influential variables while suppressing redundancy and noise. Lasso Regression, Elastic Net, and the Boruta algorithm were employed to select relevant features. Elastic Net in particular proved effective in identifying critical predictors, including Exposure, Vehicle Age, Driver Age, BonusMalus, Area, and Fuel Type, by balancing sparsity and multicollinearity. These features were used to train both standard and Tweedie-distribution-based XGBoost models. Performance was evaluated using RMSE, MAE, and R², where the Tweedie XGBoost model guided by Elastic Net-selected features achieved the highest accuracy and explanatory power. The proposed architecture not only offers superior generalization and interpretability but also exhibits robustness in modeling skewed, zero-dominated distributions inherent to claim data. Beyond predictive enhancement, this framework has practical implications for actuarial science, particularly in dynamic pricing strategies, refined segmentation, and adaptive underwriting. By integrating statistically grounded feature selection with distribution-aware boosting, this approach marks a shift toward more nuanced and scalable machine learning paradigms in insurance analytics.
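
The abstract names RMSE, MAE, and R² as its evaluation metrics, but this listing gives neither their definitions nor any values. The following is a minimal sketch, assuming the standard textbook definitions of the three metrics; it is not the authors' evaluation code. The function names and the toy claim counts are hypothetical and are not drawn from the French Motor Claims data set.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical, zero-dominated claim counts (illustration only).
claims = [0, 0, 0, 0, 0, 0, 0, 1, 0, 2]
# Hypothetical model output: small positive expected counts.
predicted = [0.1, 0.1, 0.2, 0.1, 0.1, 0.3, 0.1, 0.6, 0.2, 1.2]

print("RMSE:", rmse(claims, predicted))
print("MAE: ", mae(claims, predicted))
print("R^2: ", r2(claims, predicted))
```

On zero-dominated targets such as these, a predictor that always outputs zero can score a deceptively low MAE, which is one reason to read the three metrics together rather than in isolation.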