Performance Analysis of Text Classification Algorithms for Dhivehi Language Documents
| dc.contributor.author | Mohamed, F.R | |
| dc.contributor.author | Haddela, P.S | |
| dc.date.accessioned | 2026-03-21T10:08:15Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | This study examines the effectiveness of various machine learning algorithms in classifying text written in 'Dhivehi,' the official language of the Maldives. As a low-resource language with limited research in text analytics, 'Dhivehi' poses unique challenges due to its distinctive linguistic properties. To address these challenges, this research evaluates the performance of algorithms, including Support Vector Machines, Naive Bayes, Decision Trees, Neural Networks, XGBoost, and Random Forest, leveraging a newly curated 'Dhivehi' language dataset. The evaluation highlights that K-Nearest Neighbors achieved the highest performance, with an accuracy of 64.7% and F1 scores (macro: 0.640, weighted: 0.642), demonstrating a strong balance between precision and recall. Support Vector Machines (accuracy: 63.9%) and XGBoost (accuracy: 62.8%) also showed competitive results, with SVM slightly outperforming XGBoost in F1 metrics. Decision Tree exhibited the lowest performance across all metrics. The findings provide critical insights into improving text classification for low-resource languages and contribute to developing natural language processing tools adapted explicitly for 'Dhivehi.' Furthermore, the dataset is publicly available on Mendeley Data under the name 'Dhivehi Categories data set' to foster future research and innovation in this domain. | |
| dc.identifier.doi | 10.1109/ICARC64760.2025.10963084 | |
| dc.identifier.isbn | 979-833153098-3 | |
| dc.identifier.uri | https://rda.sliit.lk/handle/123456789/4887 | |
| dc.language.iso | en | |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | |
| dc.relation.ispartofseries | 2025 5th International Conference on Advanced Research in Computing: Converging Horizons: Uniting Disciplines in Computing Research through AI Innovation, ICARC 2025 - Proceedings | |
| dc.subject | Asian Linguistics | |
| dc.subject | Dhivehi Language | |
| dc.subject | Low-Resource Languages | |
| dc.subject | Machine Learning | |
| dc.subject | Text Classification | |
| dc.title | Performance Analysis of Text Classification Algorithms for Dhivehi Language Documents | |
| dc.type | Article | |
Files
Original bundle
- Name: Performance_Analysis_of_Text_Classification_Algorithms_for_Dhivehi_Language_Documents.pdf
- Size: 504.75 KB
- Format: Adobe Portable Document Format
License bundle
- Name: license.txt
- Size: 1.69 KB
- Format: Item-specific license agreed upon to submission