Performance Analysis of Text Classification Algorithms for Dhivehi Language Documents

dc.contributor.author: Mohamed, F.R
dc.contributor.author: Haddela, P.S
dc.date.accessioned: 2026-03-21T10:08:15Z
dc.date.issued: 2025
dc.description.abstract: This study examines the effectiveness of various machine learning algorithms in classifying text written in 'Dhivehi,' the official language of the Maldives. As a low-resource language with limited research in text analytics, 'Dhivehi' poses unique challenges due to its distinctive linguistic properties. To address these challenges, this research evaluates the performance of algorithms including Support Vector Machines, Naive Bayes, Decision Trees, Neural Networks, XGBoost, and Random Forest on a newly curated 'Dhivehi' language dataset. In the evaluation, K-Neighbors achieved the highest performance, with an accuracy of 64.7% and F1 scores of 0.640 (macro) and 0.642 (weighted), demonstrating a balance between precision and recall. Support Vector Machines (accuracy: 63.9%) and XGBoost (accuracy: 62.8%) also showed competitive results, with SVM slightly outperforming XGBoost in F1 metrics. Decision Tree exhibited the lowest performance across all metrics. The findings provide insights into improving text classification for low-resource languages and contribute to developing natural language processing tools tailored specifically to 'Dhivehi.' Furthermore, the dataset is publicly available on Mendeley Data under the name 'Dhivehi Categories data set' to foster future research and innovation in this domain.
dc.identifier.doi: 10.1109/ICARC64760.2025.10963084
dc.identifier.isbn: 979-833153098-3
dc.identifier.uri: https://rda.sliit.lk/handle/123456789/4887
dc.language.iso: en
dc.publisher: Institute of Electrical and Electronics Engineers Inc.
dc.relation.ispartofseries: 2025 5th International Conference on Advanced Research in Computing: Converging Horizons: Uniting Disciplines in Computing Research through AI Innovation, ICARC 2025 - Proceedings
dc.subject: Asian Linguistics
dc.subject: Dhivehi Language
dc.subject: Low-Resource Languages
dc.subject: Machine Learning
dc.subject: Text Classification
dc.title: Performance Analysis of Text Classification Algorithms for Dhivehi Language Documents
dc.type: Article
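
The abstract reports accuracy alongside macro-averaged and weighted-averaged F1 scores. As a minimal sketch of how these three metrics differ, the following Python computes them from scratch. The category names and label lists are hypothetical toy data, not taken from the study's dataset, and the function names are illustrative only; the record does not state which tooling the authors used.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted category matches the true one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every category counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each category's true frequency."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


if __name__ == "__main__":
    # Hypothetical labels for six documents in two made-up categories.
    y_true = ["news", "news", "news", "news", "sports", "sports"]
    y_pred = ["news", "news", "news", "sports", "sports", "news"]
    print("accuracy   :", round(accuracy(y_true, y_pred), 3))
    print("macro F1   :", round(macro_f1(y_true, y_pred), 3))
    print("weighted F1:", round(weighted_f1(y_true, y_pred), 3))
```

Macro F1 treats every category equally, while weighted F1 scales each category's score by its number of true documents, so a gap between the two values usually indicates that performance differs between frequent and rare categories.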

Files

Original bundle

Name: Performance_Analysis_of_Text_Classification_Algorithms_for_Dhivehi_Language_Documents.pdf
Size: 504.75 KB
Format: Adobe Portable Document Format
