Research Publications Authored by SLIIT Staff

Permanent URI for this community: https://rda.sliit.lk/handle/123456789/4195

This collection includes all SLIIT staff publications presented at external conferences and published in external journals. The materials are organized by faculty to facilitate easy retrieval.

Search Results

Now showing 1 - 5 of 5

Embargo
Sinhala Part of Speech Tagger using Deep Learning Techniques
(IEEE, 2022-12-21) Sathsarani, M.W.A.R.; Thalawaththa, T.P.A.B.; Galappaththi, N.K.; Danthanarayana, J.N.; Gamage, A
Natural Language Processing (NLP) is a sub-field of Artificial Intelligence (AI) that consists of a collection of computational methods motivated by theory for the automated classification and reflection of human languages. The foundation for many sophisticated applications of NLP, including named entity recognition, sentiment analysis, machine translation, information retrieval, and information processing, is laid by Part of Speech (POS) tagging, which is part of the lexical layer of NLP systems. In contrast to English, French, German, and other languages from the same geographical region, the development of high-accuracy, stable POS taggers for the Sinhala language is still in its early stages. Hence, Sinhala is identified as a low-resource language. The main objective of this research is to create a POS tagger for the Sinhala language to solve this issue. An innovative and novel strategy that has never been used with the Sinhala language has been designed. This approach has been suggested specifically to evaluate the possibility of enhancing the accuracy compared to other methodologies. So, deep learning algorithms have been applied in this study, which has a significant impact on improving tagger performance. First, highly accurate individual classifiers for primary POS tags were implemented, and then they were combined into one composite model. As expected, all individual classifiers and the final composite model have achieved a higher accuracy level. Thus, it demonstrates that the proposed solution using deep learning algorithms outperformed other methods, such as rule-based and stochastic, in terms of accuracy.
Embargo
Evolutionary Algorithm for Sinhala to English Translation
(IEEE, 2019-10-08) Nugaliyadde, A; Joseph, J. K; Chathurika, W. M. T; Mallawarachchi, Y
Machine Translation (MT) is an area of natural language processing that focuses on translating from one language to another. Many approaches, ranging from statistical methods to deep learning, are used to achieve MT. However, these methods require either a large amount of data or a clear understanding of the language. The Sinhala language has little digital text that could be used to train a deep neural network. Furthermore, Sinhala has complex rules, and therefore it is harder to create statistical rules in order to apply statistical methods in MT. This research focuses on Sinhala to English translation using an Evolutionary Algorithm (EA). The EA is used to identify the correct meaning of Sinhala text and to translate it into English. The Sinhala text is first passed through a step that identifies the correct meaning of the sentence. The translation is then carried out using the EA. Finally, the translated text is passed on for grammatical correction of the sentence. This has been shown to achieve accurate results.
Embargo
Sinhala to English language translator
(IEEE, 2008-12-12) De Silva, D; Alahakoon, A; Udayangani, I; Kumara, V; Kolonnage, D; Perera, H; Thelijjagoda, S
This paper describes a machine translation system that is capable of translating a grammatically correct Sinhala sentence into its corresponding English sentence. This is the first Sinhala to English machine translation system, which comes with features such as an inbuilt keyboard, an inbuilt dictionary, an integrated word processor based on Unicode fonts, a grammar tool, a Sinhalese grammar checker, an add word tool, and a debugging tool. With the expansion of the world, English has become an important language that people should learn, as the majority of the worldwide population understand and carry out their day-to-day work in English. In addressing this need, we took up the challenge of building a Sinhala to English language translator. To build this system, we used the transfer-based machine translation approach, which is a rule-based approach. At present, the system has achieved a success rate of 75% with a corpus of 150 sentences.
Embargo
Conditional Random Fields based named entity recognition for Sinhala
(IEEE, 2015-12-18) Senevirathne, K. U; Attanayake, N. S; Dhananjanie, A. W. M. H; Weragoda, W. A. S. U; Nugaliyadde, A; Thelijjagoda, S
Named Entity Recognition (NER) plays an important role in Natural Language Processing (NLP). Named Entities (NEs) are special atomic elements in natural languages belonging to predefined categories such as persons, organizations, locations, expressions of time, quantities, monetary values and percentages. These refer to specific things and are not listed in grammars or lexicons. NER is the task of identifying such NEs. This task is entwined with a number of challenges. Entities may be difficult to find at first and, once found, difficult to classify. For instance, locations and person names can be the same and follow similar formatting. This becomes tougher when it comes to South and South East Asian languages, mainly due to the nature of these languages. Even though Latin languages have accurate NER solutions, those cannot be directly applied to Indic languages, because the features found in those languages are different from English. Therefore, the research was based on producing a mathematical model that acts as the integral part of the Sinhala NER system. The researchers used a Sinhala news corpus as the data set to train the Conditional Random Fields (CRFs) algorithm. 90% of the corpus was used in training the model, and 10% was used in testing the resulting model. The research makes use of orthographic word-level features along with contextual information, which are helpful in predicting three different NE classes, namely Persons, Locations and Organizations. The findings of the research were applied in developing the NE Annotator, which identified NE classes from unstructured Sinhala text. The prominent contribution of this research to NER could benefit Sinhala NLP application developers and NLP-related researchers in the near future.
Embargo
A translator from Sinhala to English and English to Sinhala (SEES)
(IEEE, 2012-12-12) Wijerathna, L.; Pulasinghe, K; Somaweera, W. L. S. L; Kaduruwana, S. L; Wijesinghe, Y. U; De Silva, D. I; Thellijjagoda, S
This paper presents a rule-based machine translation system which is capable of translating sentences from Sinhala to English and vice versa. This is the first Sinhala to English and English to Sinhala machine translation system which comes with features such as a Sinhalese font translator, which is capable of converting Sinhalese words written in English characters (Singlish) to Sinhala characters, and an English grammar and spell checker. A sentence entered into the system is tokenized and translated according to a rule. When translating Sinhala sentences to English, the user input should be in Singlish, and when translating English sentences to Sinhala, the input should be in English. The main objective of this translator is to enable smooth translation of words, sentences and paragraphs for locals as well as foreigners and thereby eliminate the language barrier. A considerable number of rules, patterns and words of both languages were used to develop this system. With 87% accuracy, this pilot machine translation system translated 500 grammatically well-structured Sinhala sentences to English and 150 grammatically well-structured English sentences to Sinhala. The system is capable of translating approximately 70 sentences in one minute.
