Research Publications

Open Access
Latent Structures in Zero-Inflated Risk Domains: An Elastic–Tweedie Synergy for Claim Forecasting
(Department of Mathematics and Statistics, Faculty of Humanities and Sciences, SLIIT, 2025-10-10) Kumarasinghe, P. B. W. S. R.; Napagoda, N. A. D. N.
The frequency of insurance claims presents a unique modeling challenge due to high-dimensional inputs, strong feature correlations, and the dominance of zero-inflated outcomes. Conventional statistical models often fall short under these conditions, failing to capture the underlying structure of complex data sets. This study proposes an advanced predictive framework integrating Elastic Net regularization and a Tweedie-distribution-based XGBoost algorithm to address these issues in the context of motor insurance. These methods were applied to the French Motor Claims data set, which contains over 678,000 policies, to distill influential variables while suppressing redundancy and noise. Lasso Regression, Elastic Net, and the Boruta algorithm were employed to select relevant features. Elastic Net, in particular, proved effective in identifying critical predictors, including Exposure, Vehicle Age, Driver Age, BonusMalus, Area, and Fuel Type, by balancing sparsity and multicollinearity. These features were used to train both standard and Tweedie-distribution-based XGBoost models. Performance was evaluated using RMSE, MAE, and R², where the Tweedie XGBoost model guided by Elastic Net-selected features achieved the highest accuracy and explanatory power. The proposed architecture not only offers superior generalization and interpretability but also exhibits robustness in modeling the skewed, zero-dominated distributions inherent to claim data. Beyond predictive enhancement, this framework has practical implications for actuarial science, particularly in dynamic pricing strategies, refined segmentation, and adaptive underwriting. By integrating statistically grounded feature selection with distribution-aware boosting, this approach marks a shift toward more nuanced and scalable machine learning paradigms in insurance analytics.
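The three evaluation metrics named in the abstract (RMSE, MAE, and R²) can be sketched in a few lines of standard-library Python. This is only an illustration of the metric definitions, not the authors' evaluation code: the function name `regression_metrics` and the toy claim counts below are assumptions made up for the example and do not come from the French Motor Claims data set.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(r * r for r in residuals)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)                    # root mean squared error
    mae = sum(abs(r) for r in residuals) / n        # mean absolute error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                      # coefficient of determination
    return rmse, mae, r2


# Hypothetical, zero-dominated claim counts and model predictions
# (made up for illustration only).
y_true = [0, 0, 0, 0, 1, 0, 2, 0]
y_pred = [0.1, 0.0, 0.2, 0.1, 0.8, 0.1, 1.5, 0.0]

rmse, mae, r2 = regression_metrics(y_true, y_pred)
print(f"RMSE={rmse:.4f}  MAE={mae:.4f}  R2={r2:.4f}")
```

Because RMSE squares each residual before averaging, it is never smaller than MAE on the same predictions, so a large gap between the two usually points to a few policies with large errors.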
Open Access
Recommendations for Students in Higher Education: A Machine Learning Approach.
(International Postgraduate Research Conference (IPRC), Faculty of Graduate Studies, University of Kelaniya, 2017) Kasthuriarachchi, K. S. T.; Liyanage, S. R.
Educational Data Mining is an emerging discipline within data mining that focuses on developing methods for exploring the unique data that originates from educational settings, and on using those methods to better understand students and the settings in which they learn. There are numerous potential applications of data mining in education, such as predicting student performance, developing student models, designing strategies for instructional support, making decisions to build better learning systems, improving student performance, and reducing student dropout rates. Several studies have analyzed student data to predict performance with data mining approaches based on machine learning algorithms. However, only a few of them guided students, through educators' recommendations, toward success in their academic lives. The key objective of this research is to provide educators' recommendations to students in higher education through data analysis using machine learning algorithms. The research sample comprised data on more than 3000 students who had registered for, and were following, the first academic year of an Information Technology degree at an institute. Each student was described by eight attributes: age, gender, A/L Stream, A/L English Grade, whether the student has repeat modules, GPA of Semester 1, GPA of Semester 2, and pass status of Year 1. Three classification-type machine learning algorithms were used to build the predictive model: Naïve Bayes, Decision Tree, and Support Vector Machine. The accuracies of the models built by each algorithm were compared to identify the best model, and the most influential attributes in that model were extracted to predict the students' final grade (pass/fail) at the end of the first year.
Accordingly, the accuracies of Naïve Bayes, Decision Tree, and Support Vector Machine were recorded as 74.67%, 74.01%, and 74.01% respectively, showing that all three algorithms achieved almost the same level of accuracy. However, the model generated by the Naïve Bayes algorithm was selected since it outperformed the rest. A rank-features-by-importance method was then used for feature selection to identify the most influential factors in the predictive model. As a result, past repeat modules, GPA of Semester 1, and GPA of Semester 2 were identified as the most influential attributes. Furthermore, these attributes were tested using correlation analysis to measure the significance of their relationship with the target attribute. Based on this study, educators will be able to advise students to score good marks in subject assessments so as to obtain a better GPA in Semester 1 and Semester 2 without failing modules, and thereby successfully complete the first year of the degree course, which benefits both educators and students.
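The model comparison described above comes down to computing classification accuracy and choosing the highest value. A minimal sketch follows, assuming accuracy means the fraction of correctly predicted pass/fail labels. The `reported` figures are the percentages quoted in the abstract, while the `accuracy` helper and the four toy labels are hypothetical and are not taken from the study's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Accuracies (in percent) as reported in the abstract.
reported = {
    "Naive Bayes": 74.67,
    "Decision Tree": 74.01,
    "Support Vector Machine": 74.01,
}

# Pick the model with the highest reported accuracy and measure its
# lead over the best of the remaining models, in percentage points.
best_model = max(reported, key=reported.get)
runner_up = max(v for k, v in reported.items() if k != best_model)
gap_points = round(reported[best_model] - runner_up, 2)

# Hypothetical pass/fail labels, made up to show the helper in use.
toy_true = ["pass", "pass", "fail", "pass"]
toy_pred = ["pass", "fail", "fail", "pass"]
toy_accuracy = accuracy(toy_true, toy_pred)

print(f"best: {best_model} (+{gap_points} points), toy accuracy: {toy_accuracy}")
```

The lead of the selected model over the other two is 0.66 percentage points, which is consistent with the abstract's remark that all three algorithms perform at almost the same level.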
