Performance Analysis of Text Classification Algorithms for Dhivehi Language Documents

Mohamed, F.R; Haddela, P.S

Files

Performance_Analysis_of_Text_Classification_Algorithms_for_Dhivehi_Language_Documents.pdf (504.75 KB)

Date

2025

Authors

Mohamed, F.R

Haddela, P.S

Publisher

Institute of Electrical and Electronics Engineers Inc.

Abstract

This study examines the effectiveness of various machine learning algorithms in classifying text written in 'Dhivehi,' the official language of the Maldives. As a low-resource language with limited research in text analytics, 'Dhivehi' poses unique challenges due to its distinctive linguistic properties. To address these challenges, this research evaluates the performance of algorithms including Support Vector Machines (SVM), Naive Bayes, Decision Trees, Neural Networks, XGBoost, Random Forest, and K-Neighbors on a newly curated 'Dhivehi' language dataset. K-Neighbors achieved the highest performance, with an accuracy of 64.7% and F1 scores of 0.640 (macro) and 0.642 (weighted), indicating a balance between precision and recall. SVM (accuracy: 63.9%) and XGBoost (accuracy: 62.8%) showed competitive results, with SVM slightly outperforming XGBoost on the F1 metrics. Decision Tree had the lowest performance across all metrics. The findings provide insights into improving text classification for low-resource languages and contribute to developing natural language processing tools adapted specifically for 'Dhivehi.' The dataset is publicly available on Mendeley Data under the name 'Dhivehi Categories data set' to foster future research in this domain.
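
The abstract reports accuracy alongside macro and weighted F1 scores. As a rough illustration of how those three figures differ, the following minimal sketch computes them from scratch using the standard definitions. The category names and predictions are invented toy values, not taken from the paper or its dataset, and the function names are hypothetical.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return a dict mapping each label to its F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of true examples per class."""
    scores = f1_per_class(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Toy labels (assumption: illustrative only, not from the Dhivehi dataset).
y_true = ["sports", "sports", "sports", "politics", "politics", "business"]
y_pred = ["sports", "sports", "politics", "politics", "politics", "sports"]

acc = accuracy(y_true, y_pred)
macro = macro_f1(y_true, y_pred)
weighted = weighted_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={macro:.3f} weighted_f1={weighted:.3f}")
```

In this toy case the rare class is never predicted correctly, so macro F1 falls well below weighted F1. When the two values are close, as with the 0.640 and 0.642 reported above, per-class performance and class sizes are not pulling the averages far apart.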

Keywords

Asian Linguistics, Dhivehi Language, Low-Resource Languages, Machine Learning, Text Classification

URI

https://rda.sliit.lk/handle/123456789/4887

Collections

Faculty of Computing
Faculty of Computing-Scopus
