Publication: Rule-Based Translation of Sinhala Slang and Colloquial Expressions into English for Enhanced Cross-Cultural Communication
DOI
Type:
Thesis
Date
2024-12
Authors
Publisher
SLIIT
Abstract
This research paper presents the development and evaluation of a hybrid translation system
designed to translate Sinhala slang and colloquial expressions into English. Recognizing the
challenges of accurately conveying informal language in cross-cultural communication, the study
addresses the need for precise and contextually relevant translations. Sinhala slang and
colloquialisms often contain nuanced meanings, cultural references, and idiomatic expressions that
are challenging to capture with conventional machine translation approaches. To address these
complexities, the proposed system combines linguistic rules, pattern recognition algorithms, and
context-based translation models, specifically tailored for Sinhala slang and colloquial styles. A
novel component of this research is the combination of rule-based matching (for known slang)
with unsupervised learning using Word2Vec embeddings and K-Means clustering for new slang
detection. This hybrid approach enhances the system’s ability to differentiate formal from informal
language by leveraging predefined rules for familiar slang while dynamically identifying emerging
slang patterns through clustering. This integration enables more targeted translation processing by
identifying slang, which is subsequently handled by rule-based frameworks developed through
data collection and analysis of Sinhala slang expressions. The system is evaluated using accuracy,
precision, recall, F1-score, and the Bilingual Evaluation Understudy (BLEU) score, and the results
support its utility in promoting cross-cultural understanding through
culturally sensitive translations. The outcomes of this research advance rule-based and
unsupervised learning-supported machine translation techniques for informal language, fostering
communication across diverse linguistic and cultural contexts.
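The hybrid idea described in the abstract (rule-based matching for known slang, clustering of embeddings to flag new slang) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the two-entry rule table, the hand-written 2-D vectors standing in for trained Word2Vec embeddings, the placeholder token `newword`, and the small pure-Python K-Means with farthest-first initialisation.

```python
import math

# Hypothetical rule table: known slang -> English gloss (illustrative entries only).
SLANG_RULES = {"machan": "buddy", "ela": "great"}

# Toy 2-D vectors standing in for Word2Vec embeddings (assumed values, not trained).
TOY_VECTORS = {
    "machan": (0.9, 1.0), "ela": (1.0, 0.9),    # known slang
    "oya": (0.1, 0.0), "gedara": (0.0, 0.1),    # ordinary words
    "newword": (0.95, 1.05),                    # unseen placeholder token
}

def translate_known(tokens):
    """Apply the rule table; return (output tokens, tokens with no rule)."""
    out, unmatched = [], []
    for tok in tokens:
        if tok in SLANG_RULES:
            out.append(SLANG_RULES[tok])
        else:
            out.append(tok)
            unmatched.append(tok)
    return out, unmatched

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, k, iters=20):
    """Plain K-Means with deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids

def detect_new_slang(vectors, rules, k=2):
    """Flag words without a rule that fall in the cluster holding most known slang."""
    words = list(vectors)
    centroids = kmeans([vectors[w] for w in words], k)
    label = {w: nearest(vectors[w], centroids) for w in words}
    known = [label[w] for w in words if w in rules]
    slang_cluster = max(set(known), key=known.count)
    return [w for w in words if w not in rules and label[w] == slang_cluster]

print(translate_known(["machan", "newword"]))
print(detect_new_slang(TOY_VECTORS, SLANG_RULES))
```

In this sketch the rule table handles tokens it already knows, and tokens that share a cluster with known slang are only flagged as candidates; how the actual system turns such candidates into new rules is not specified in the abstract.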
Keywords
Rule-Based Translation, Sinhala Slang, Colloquial Expressions, Enhanced Cross-Cultural Communication
