Browsing by Author "Nugaliyadde, A"

Embargo
Conditional Random Fields based named entity recognition for Sinhala
(IEEE, 2015-12-18) Senevirathne, K. U; Attanayake, N. S; Dhananjanie, A. W. M. H; Weragoda, W. A. S. U; Nugaliyadde, A; Thelijjagoda, S
Named Entity Recognition (NER) plays an important role in Natural Language Processing (NLP). Named Entities (NEs) are special atomic elements in natural languages that belong to predefined categories such as persons, organizations, locations, expressions of time, quantities, monetary values and percentages. They refer to specific things and are not listed in grammars or lexicons. NER is the task of identifying such NEs, and it comes with a number of challenges. Entities may be difficult to find at first and, once found, difficult to classify. For instance, location names and person names can be identical and follow similar formatting. The task becomes harder for South and South East Asian languages, mainly because of the nature of these languages. Even though Latin languages have accurate NER solutions, those cannot be directly applied to Indic languages, because the features found in those languages are different from English. The research was therefore based on producing a mathematical model that acts as the integral part of the Sinhala NER system. The researchers used a Sinhala news corpus as the data set to train the Conditional Random Fields (CRFs) algorithm: 90% of the corpus was used to train the model and 10% was used to test the resulting model. The research makes use of orthographic word-level features along with contextual information, which are helpful in predicting three different NE classes, namely Persons, Locations and Organizations. The findings were applied in developing the NE Annotator, which identifies NE classes in unstructured Sinhala text. This contribution to NER could benefit Sinhala NLP application developers and NLP researchers in the near future.
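The abstract above mentions orthographic word-level features, contextual information and a 90%/10% train/test split. The following is a minimal sketch of what such inputs to a CRF could look like. The specific features, the function names and the placeholder tokens are assumptions made for illustration; the abstract does not list the features the authors actually used.

```python
import random


def word_features(tokens, i):
    """Build a feature dict for the token at position i (illustrative features only)."""
    word = tokens[i]
    feats = {
        # Orthographic word-level features (assumed, not taken from the paper).
        "word": word,
        "length": len(word),
        "prefix2": word[:2],
        "suffix2": word[-2:],
        "is_digit": word.isdigit(),
        "is_first": i == 0,
        "is_last": i == len(tokens) - 1,
    }
    # Contextual information: the neighbouring words, when they exist.
    if i > 0:
        feats["prev_word"] = tokens[i - 1]
    if i < len(tokens) - 1:
        feats["next_word"] = tokens[i + 1]
    return feats


def sentence_features(tokens):
    """One feature dict per token, the usual input shape for a CRF toolkit."""
    return [word_features(tokens, i) for i in range(len(tokens))]


def split_corpus(sentences, train_fraction=0.9, seed=0):
    """Shuffle the sentences and split them into training and test portions."""
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Each sentence would be converted with `sentence_features` and paired with its label sequence (Person, Location, Organization or other) before being handed to a CRF trainer of choice.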
Open Access
D-REHABIA: A Drug Addiction Recovery Through Mobile Based Application
(SLIIT, 2016-04-06) Somasiri, L. U; Galabada, S. S. G; Wijethunga, H. M; Dayananda, H. M; Nugaliyadde, A; Thelijjagoda, S; Rajasuriya, M
Drug addiction has become a major issue in the world. Certain governmental and non-governmental organizations provide various programs to prevent addiction and to recover and rehabilitate drug addicts. Patients in the recovery process have a higher tendency to relapse after being released into society. The objective of this research is to produce a mobile-based drug recovery application that prevents patients from relapsing during the recovery process and involves both the family and the rehabilitation center in the recovery of the patient. To accomplish this objective, the application contains an artificially intelligent assistant that guides and helps the patient with issues that occur during the recovery process; a location tracking mechanism that identifies the movements of the patient and possible high-risk places where drugs can circulate; a voice analysis mechanism that analyzes the voice of the patient and identifies emotional states which might cause the patient to relapse; and treatments to reduce the stress, anxiety and depression level of the patient. The field of drug rehabilitation has barely been addressed with a proper technological solution; hence the system implemented as a result of this research can be effectively used for the recovery of the patient.
Embargo
Deep learning approach to classify Tiger beetles of Sri Lanka
(Elsevier, 2021-05-01) Abeywardhana, D. L; Dangalle, C. D; Nugaliyadde, A; Mallawarachchi, Y
Deep learning has been shown to achieve dramatic results in image classification tasks. However, deep learning models require large amounts of data to train. Most real-world datasets, insect classification data in particular, do not contain a large number of training images. These images have a large amount of noise and various differences. The paper proposes a novel architectural model which removes the background noise and classifies the tiger beetles. Here the object location is identified using contours, by converting the original coloured image to white on a black background. The remaining background is then eliminated using the GrabCut algorithm. Later the extracted images are classified using a modified SqueezeNet transfer learning model to identify the tiger beetle class up to genus level. Transfer learning models with fewer trainable parameters than the total number of parameters in the original model performed well. When evaluating the results it was identified that freezing the uppermost layers of the SqueezeNet model yields better accuracy, while freezing the lowermost layers reduces the validation accuracy. The proposed model achieved more than 90% accuracy on the test set in 40 epochs using 701,481 trainable parameters by freezing the top 19 layers of the original model. Improving the pre-processing to localize the insect improved the accuracy.
Embargo
Dynamic 3D model construction using architectural house plans
(IEEE, 2017-01-27) Ruwanthika, R. G. N; Amarasekera, P. A. D. B. M; Chandrasiri, R. U. I. B; Rangana, D. M. A. I; Nugaliyadde, A; Mallawarachchi, Y
The paper presents a complete approach to dynamic 3D model construction from 2D house plans. The tool assembles 3D models and overlays the virtual model on the real 2D blueprint of a house (architectural or hand-drawn). The key content of this research covers three areas: wall detection and wall modeling, roof detection and roof modeling, and template matching of doors and windows. The end result is mainly based on Image Processing and Augmented Reality technologies. The tool lets users easily manipulate 3D models in real time through their smartphones and showcase architectural models in an entirely new way.
Open Access
Evolutionary algorithm for Sinhala to English translation
(arXiv preprint arXiv:1907.03202, 2019-07-06) Joseph, JK; Chathurika, W. M. T; Nugaliyadde, A; Mallawarachchi, Y
Machine Translation (MT) is an area in natural language processing which focuses on translating from one language to another. Many approaches, ranging from statistical methods to deep learning, are used to achieve MT. However, these methods require either a large amount of data or a clear understanding of the language. The Sinhala language has little digital text that could be used to train a deep neural network. Furthermore, Sinhala has complex rules, and it is therefore harder to create the statistical rules needed to apply statistical methods in MT. This research focuses on Sinhala to English translation using an Evolutionary Algorithm (EA). The EA is used to identify the correct meaning of Sinhala text and to translate it to English. The Sinhala text is first processed to identify the correct meaning of the sentence. The translation is then carried out with the use of the EA. The translated text is passed on for grammatical correction. This approach has been shown to achieve accurate results.
Embargo
Evolutionary Algorithm for Sinhala to English Translation
(IEEE, 2019-10-08) Nugaliyadde, A; Joseph, J. K; Chathurika, W. M. T; Mallawarachchi, Y
Machine Translation (MT) is an area in natural language processing which focuses on translating from one language to another. Many approaches, ranging from statistical methods to deep learning, are used to achieve MT. However, these methods require either a large amount of data or a clear understanding of the language. The Sinhala language has little digital text that could be used to train a deep neural network. Furthermore, Sinhala has complex rules, and it is therefore harder to create the statistical rules needed to apply statistical methods in MT. This research focuses on Sinhala to English translation using an Evolutionary Algorithm (EA). The EA is used to identify the correct meaning of Sinhala text and to translate it into English. The Sinhala text is first processed to identify the correct meaning of the sentence. The translation is then carried out with the use of the EA. The translated text is passed on for grammatical correction. This approach has been shown to achieve accurate results.
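The abstract does not describe how the EA is set up, so the following is only a generic sketch of an evolutionary search over word choices, not the authors' method. Everything in it is hypothetical: the candidate English glosses, the toy fitness function based on word pairs, and the population settings.

```python
import random

# Hypothetical candidate English glosses for each source word (illustrative only).
CANDIDATES = [
    ["I", "me"],
    ["go", "goes", "going"],
    ["home", "house"],
]

# Toy stand-in for a fitness measure: adjacent word pairs assumed to read well.
GOOD_PAIRS = {("I", "go"), ("go", "home")}


def fitness(choice):
    """Count how many adjacent word pairs in the candidate sentence are 'good'."""
    words = [CANDIDATES[i][g] for i, g in enumerate(choice)]
    return sum((a, b) in GOOD_PAIRS for a, b in zip(words, words[1:]))


def random_individual(rng):
    """Pick one gloss index at random for every source word."""
    return [rng.randrange(len(options)) for options in CANDIDATES]


def mutate(choice, rng):
    """Re-draw the gloss at one random position."""
    child = list(choice)
    pos = rng.randrange(len(child))
    child[pos] = rng.randrange(len(CANDIDATES[pos]))
    return child


def crossover(a, b, rng):
    """Single-point crossover between two parents."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(generations=30, pop_size=8, seed=0):
    """Return the best word sequence found and the best fitness per generation."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # the fitter half survives unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    best = max(population, key=fitness)
    return [CANDIDATES[i][g] for i, g in enumerate(best)], history
```

Because the fitter half of the population is carried over unchanged, the best fitness recorded in `history` never decreases from one generation to the next.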
Embargo
Linguistic features based personality recognition using social media data
(IEEE, 2017-01-27) Sewwandi, D; Perera, K; Sandaruwan, S; Lakchani, O; Nugaliyadde, A; Thelijjagoda, S
Social media has become a prominent platform for opinions and thoughts. This suggests that the characteristics of a person can be assessed through social media status updates. The purpose of this research article is to provide a web application that detects a person's personality using linguistic feature analysis. The personality of a person is classified according to Eysenck's Three Factor personality model. The proposed technique is based on ontology-based text classification, a linguistic feature-vector matrix using LIWC (Linguistic Inquiry and Word Count) features, semantic analysis using supervised machine learning algorithms, and questionnaire-based personality detection. This is valuable for HR management systems when recruiting and promoting employees; R&D psychologists can use the dynamic ontology for storage purposes, and other API users include universities and sports clubs. According to the test results, the proposed system achieves an accuracy of 91% when tested against a real-world questionnaire-based personality detection application, and the results demonstrate that the proposed technique can detect the personality of a person with considerable accuracy and speed.
Open Access
“Mahoshadha”, the Sinhala Tagged Corpus Based Question Answering System
(Springer, Cham, 2016) Jayakody, J. A. T. K; Gamlath, T. S. K; Lasantha, W. A. N; Premachandra, K. M. K. P; Nugaliyadde, A; Mallawarachchi, Y
“Mahoshadha”, the Sinhala Question Answering System, aims at retrieving precise information from a large Sinhala tagged corpus. This paper describes a novel architecture for a Question Answering System which summarizes a tagged corpus and uses the summarization to generate the answers for a query. The summarized corpora are categorized according to a set of topics, enabling fast search for information. The K-Nearest Neighbor algorithm is used to cluster the summarized corpora. The query is tagged, and the tagged query is used to get more accurate results. Through the tagged query, the question is identified clearly along with the category of the query. A Support Vector Machine is used to automate both the summarization and the question understanding. This enables “Mahoshadha” to answer any type of query as well as summarize any type of Sinhala corpus, and makes the Question Answering System more usable across many applications.
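As a rough illustration of assigning a query to a topic with K-Nearest Neighbors, the sketch below votes among the most similar labelled summaries. The bag-of-words representation, the cosine similarity measure, the function names and the English placeholder data are all assumptions; the abstract does not say how the summaries are represented or compared.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-cased word counts for a whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_topic(query, labelled_summaries, k=3):
    """Return the majority topic among the k summaries most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(
        labelled_summaries,
        key=lambda item: cosine_similarity(q, bag_of_words(item[0])),
        reverse=True,
    )
    votes = Counter(topic for _, topic in ranked[:k])
    return votes.most_common(1)[0][0]


# Placeholder summaries with hypothetical topic labels.
SUMMARIES = [
    ("cricket match score team", "sports"),
    ("football team goal match", "sports"),
    ("rain weather storm wind", "weather"),
    ("sunny weather heat wind", "weather"),
    ("election vote parliament", "politics"),
]
```

Calling `knn_topic("team match result", SUMMARIES)` selects the two sports summaries as the closest neighbours, so a search could then be restricted to that category.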
Embargo
Semantic video search by automatic video annotation using TensorFlow
(IEEE, 2016-10-22) Ashangani, K; Wickramasinghe, K. U; De Silva, D. W. N; Gamwara, V. M; Nugaliyadde, A; Mallawarachchi, Y
The paper discusses a tool for video structure analysis, feature extraction, classification and semantic querying that is suitable for an extremely broad range of video data sets. The tool analyses the video structure to detect shot boundaries, where the shots in each video are identified using image duplication techniques. A single frame from each shot is passed to a deep learning model, implemented using TensorFlow, that is trained for feature extraction and classification of the objects in each frame. Subsequently, an automatic textual annotation is generated for each video, and finally, with the aid of an ontology, semantic searching is done using NLP, which gives efficient results compared with manual video annotation of a large-scale dataset. With automatic video content analysis and annotation, and semantic searching at an accuracy rate of around seventy-four percent, this becomes a useful tool for video tagging and annotation.
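The shot-boundary step described above can be sketched as follows: consecutive frames that are near-duplicates are grouped into the same shot, and one frame per shot is kept for classification. The pixel-difference measure, the threshold value and the choice of the middle frame are assumptions made for this sketch; the abstract does not specify the duplication technique used.

```python
def frame_difference(a, b):
    """Mean absolute difference between two frames given as flat grayscale lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def split_into_shots(frames, threshold=30.0):
    """Group frame indices into shots; a large jump between frames starts a new shot."""
    if not frames:
        return []
    shots = [[0]]
    for i in range(1, len(frames)):
        if frame_difference(frames[i - 1], frames[i]) > threshold:
            shots.append([i])  # boundary: the frame is not a near-duplicate
        else:
            shots[-1].append(i)  # near-duplicate: same shot continues
    return shots


def key_frames(shots):
    """Pick the middle frame index of each shot as its representative."""
    return [shot[len(shot) // 2] for shot in shots]
```

The indices returned by `key_frames` identify the frames that would be passed on to the classification model.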
Open Access
Simplifying Law Statements Using Natural Language Processing
(SLIIT, 2016-11-16) Dharmasiri, N; Gunathilake, B; Pathirana, U; Senevirathne, S; Nugaliyadde, A; Thelijjagoda, S
Understanding law statements is evidently complex for the general public. The research derives a computational solution for reducing the complexity of law statements. Given a law statement, the research uses both WordNet and “LawNet” to create a simpler meaning. The research focuses on information extraction, information retrieval, question analysis and answer generation techniques to derive a better meaning of law statements. The law statement is treated as a question, and “LawNet” and WordNet are used as information extraction points. The law statement is analyzed as a question, and more information is retrieved through WordNet and “LawNet”. This process acts much like a search engine’s process. The results show on average 80% accuracy for a dataset of size 1500.
Open Access
Solving Sinhala Language Arithmetic Problems using Neural Networks
(arXiv:1809.04557, 2018-09-11) Chathurika, W. M. T; De Silva, K. C; Raddella, A. M; Ekanayake, E. M. R. S; Nugaliyadde, A; Mallawarachchi, Y
A methodology is presented to solve arithmetic problems in the Sinhala language using a neural network. The system comprises (a) keyword identification, (b) question identification and (c) mathematical operation identification, which are combined using a neural network. Naïve Bayes classification is used to identify keywords, and a Conditional Random Field is used to identify the question and the operation which should be performed on the identified keywords to achieve the expected result. “One vs. all” classification of sentences is done using a neural network. All functions are combined through the neural network, which builds an equation to solve the problem. The paper compares each methodology in ARIS and Mahoshadha to the method presented in the paper. Mahoshadha2 learns to solve arithmetic problems with an accuracy of 76%.
Embargo
An ultra-specific image dataset for automated insect identification
(Springer Nature, 2022-01) Abeywardhana, L; Dangalle, C; Nugaliyadde, A; Mallawarachchi, Y
Automated identification of insects is a tough task in which many challenges, such as data limitation, imbalanced data counts and background noise, need to be overcome for better performance. This paper describes such an image dataset, which consists of a limited, imbalanced number of images of six genera of the subfamily Cicindelinae (tiger beetles) of the order Coleoptera. The diversity of the image collection is high, as the images were taken from different sources, from different angles and at different scales. Thus, the salient regions of the images have a large variation. Therefore, one of the main intentions in this process was to get an idea about the image dataset while comparing different unique patterns and features in the images. The dataset was evaluated on different classification algorithms, including deep learning models based on different approaches, to provide a benchmark. The dynamic nature of the dataset poses a challenge to image classification algorithms. However, transfer learning models using a softmax classifier performed well on the current dataset. Tiger beetle classification can be challenging even to a trained human eye; therefore, this dataset opens a new avenue for classification algorithms to develop and to identify features which human eyes have not identified.
Open Access
An ultra-specific image dataset for automated insect identification
(Springer US, 2022-01-09) Abeywardhana, D. L; Dangalle, C. D; Nugaliyadde, A; Mallawarachchi, Y
Automated identification of insects is a tough task in which many challenges, such as data limitation, imbalanced data counts and background noise, need to be overcome for better performance. This paper describes such an image dataset, which consists of a limited, imbalanced number of images of six genera of the subfamily Cicindelinae (tiger beetles) of the order Coleoptera. The diversity of the image collection is high, as the images were taken from different sources, from different angles and at different scales. Thus, the salient regions of the images have a large variation. Therefore, one of the main intentions in this process was to get an idea about the image dataset while comparing different unique patterns and features in the images. The dataset was evaluated on different classification algorithms, including deep learning models based on different approaches, to provide a benchmark. The dynamic nature of the dataset poses a challenge to image classification algorithms. However, transfer learning models using a softmax classifier performed well on the current dataset. Tiger beetle classification can be challenging even to a trained human eye; therefore, this dataset opens a new avenue for classification algorithms to develop and to identify features which human eyes have not identified.