Research Publications Authored by SLIIT Staff
Permanent URI for this community: https://rda.sliit.lk/handle/123456789/4195
This collection includes all SLIIT staff publications presented at external conferences and published in external journals. The materials are organized by faculty to facilitate easy retrieval.
Publication Embargo: Graph Neural Network based Child Activity Recognition (IEEE, 2022-08-25)
Mohottala, S; Samarasinghe, P; Kasthurirathna, D; Abhayaratne, C
This paper presents an implementation of child activity recognition (CAR) with a graph convolutional network (GCN) based deep learning model, since prior work in this domain has been dominated by CNNs, LSTMs and other methods despite the superior performance of GCNs. To the best of our knowledge, we are the first to use a GCN model in the child activity recognition domain. To overcome the challenge posed by the small size of publicly available child action datasets, several learning methods such as feature extraction, fine-tuning and curriculum learning were implemented to improve model performance. Inspired by the contradicting claims made on the use of transfer learning in CAR, we conducted a detailed implementation and analysis of transfer learning, together with a study of the negative-transfer effect on CAR, as it has not been addressed previously. As the principal contribution, we developed an ST-GCN based CAR model which, despite the small size of the dataset, obtained around 50% accuracy in vanilla implementations. With feature extraction and fine-tuning, accuracy improved by 20%-30%, with the highest accuracy being 82.24%. Furthermore, the results on activity datasets empirically demonstrate that careful selection of pre-training datasets, through methods such as curriculum learning, can enhance accuracy.
Finally, we provide preliminary evidence of a possible frame-rate effect on the accuracy of CAR models, a direction future research can explore.

Publication Embargo: Continuous American Sign Language Recognition Using Computer Vision and Deep Learning Technologies (IEEE, 2022-08-29)
Senanayaka, S.A.M.A.S; Perera, R.A.D.B.S; Rankothge, W.; Usgalhewa, S.S.; Hettihewa, H.D
Sign language is a non-verbal communication method used between deaf or hard-of-hearing people and hearing people. Automatic sign language detection is a complex computer vision problem due to the diversity of modern sign languages and variations in gesture positions, hand and finger form, and body-part placement. This research paper conducts a systematic experimental evaluation of computer vision-based approaches to sign language recognition, focusing on mapping non-segmented video streams to glosses to gain insights into sign language recognition. The proposed machine learning model consists of Recurrent Neural Network (RNN) layers such as Long Short-Term Memory (LSTM), implemented using current deep learning frameworks such as Google TensorFlow and the Keras API.

Publication Embargo: Review on Hand Gesture Recognition for Bengali Sign Language (IEEE, 2022-04-14)
Perera, D; Kanchana, B; Peiris, R; Madushan, K; Kasthurirathna, D
Communication becomes difficult when interaction between the disabled and the general public is required. People with disabilities communicate using various sign languages. For persons who are deaf or hard of hearing, sign language is their primary mode of communication. However, because the majority of our community does not understand sign language, going out in public is incredibly challenging for them. To make sign language understandable to the general public, computer vision-based methods are now widely used.
Hand gesture recognition is one of the computer vision-based technologies for recognizing sign language, and it is attracting a lot of attention from researchers; it has long been a popular research area. Some recent research on hand gesture recognition in computer vision has achieved outstanding improvements by employing deep learning techniques. In this paper we discuss the previous research methods, technologies, datasets and models used for Bengali sign language gestures, which are interconnected in terms of achieving a successful result. This review article therefore tries to reveal the individual techniques used to overcome the challenges in this research area.

Publication Open Access: Combined Approach of Supervised and Unsupervised Learning for Dog Face Recognition (IEEE, 2021-04-02)
Weerasekara, D. T; Gamage, A; Kulasooriya, K. S. A. F
One would be surprised to hear the lost-dog rates around the world. Even though it is not something one ponders much, lost dogs are a problem that most dog owners fear. Dogs provide humans with companionship, protection, and unconditional love, and to the dogs, their whole world revolves around their owner and family members. Therefore, when a pet dog goes missing, not only the owner but also the dog is affected. Unfortunately, in Sri Lanka a lost dog being found is a very rare occurrence. One reason for this is the lack of an easily accessible public platform for lost dogs. In this research project, a solution to this problem has been implemented using image processing. The study addresses image classification and recognition using a Convolutional Neural Network (CNN), also known as a Shift Invariant or Space Invariant Artificial Neural Network (SIANN), implemented with the TensorFlow framework and the Keras library. The VGG16 model was customized for feature extraction, and the implementation combines both machine learning and deep learning. The platform for uploading a found dog is also a continuous, inter-related subcomponent that gives stray dogs a higher chance of finding a safe place to survive and a home where they will be loved. The results are discussed in terms of the accuracy of image recognition and classification in percentage terms.
Each group of dogs achieves around 90% accuracy or above.

Publication Embargo: Road Navigation System Using Automatic Speech Recognition (ASR) and Natural Language Processing (NLP) (IEEE, 2019-01-31)
Withanage, P; Liyanage, T; Deeyakaduwe, N; Dias, E; Thelijjagoda, S
In a rapidly evolving technical era, voice-based navigation systems play a major role in bridging the gap between human and machine. To overcome the difficulty of taking and understanding a user's voice commands, we propose a user-centric road-map navigation mobile application called "Direct Me" that simulates natural language, processes the route with turn-by-turn directions mentioning key entities such as street names, landmarks, points of interest and junctions, and maps the route in an interactive interface. To generate the user's preferred route, the system first converts the audio stream into text through an Automatic Speech Recognizer (ASR) using the Pocket Sphinx library, then applies Natural Language Processing (NLP) using the Stanford CoreNLP framework to retrieve the navigation-associated information, and finally plots the route on the map using the Google Map API on the user's request. The system provides an efficient approach to translating natural language directions into a machine-understandable format and will benefit the development of voice-based, navigation-oriented human-machine interfaces.

Publication Embargo: Recognition and Translation of Ancient Brahmi Letters Using Deep Learning and NLP (IEEE, 2019-12)
Wijerathna, K. A. S. A. N; Sepalitha, R; Thuiyadura, I; Athauda, H; Suranjini, P. D; Silva, J. A. D. C; Jayakodi, A
Inscriptions are major resources for studying the ancient history and culture of any country's civilization. Analyzing, recognizing and translating the ancient (Brahmi) letters in inscriptions is very difficult work for the present generation, and there is no automatic system for translating Brahmi letters into the Sinhala language.
Instead, a manual method is used for translating inscriptions, and the method used in epigraphy takes a long period to decipher, analyze and translate the inscribed text. This research mainly focuses on the recognition of ancient Brahmi characters written between the 3rd century B.C. and the 1st century A.D. First, we remove noise, segment the letters from the inscription image and convert it into a binary image using image processing techniques. Secondly, we recognize the correct Brahmi letters and broken letters, and then identify the time period of the inscription using convolutional neural networks in deep learning. Finally, the Brahmi letters are translated into modern Sinhala letters and the meaning of the inscription is provided using Natural Language Processing. The proposed system builds a solution to overcome the existing problems in epigraphy.

Publication Open Access: Advance Technology for Kids to Improve Knowledge and Skills using Motion Gesture Recognition – Leap Mania (SLIIT, 2014-12-16)
Nandasiri, K. G. M. P; Nawarathna, N. H. C. E. M; Mohamad, M. M. R; Herath, H. M. C. K; Kasthuriarachchi, K. T. S; Wijendra, D
Leap Mania is a gesture-controlled e-learning system that targets nursery-level kids to improve their knowledge and skills in a pleasurable learning environment. Game-based learning is becoming popular in the academic discussion of learning technologies. However, even though the educational potential of games has been thoroughly discussed in modern days, teaching small kids is difficult due to their short attention spans. In addition to traditional methods of learning and teaching, such as reading books and newspapers, a huge variety of online educational resources is available to provide an atmosphere of fun and interactive design to keep children engaged. However, no proper e-learning game tool with a gesture-control mechanism is found among the tools and computer-based applications for kids.
This research focuses on building an enthusiastic and pleasurable learning environment to enhance the knowledge and skills of kids by implementing a game-based learning application using the Leap Motion controller.

Publication Open Access: Bidirectional LSTM-CRF for Named Entity Recognition (32nd Pacific Asia Conference on Language, Information and Computation, 2018-12-01)
Panchendrarajan, R; Amaresan, A
Named Entity Recognition (NER) is a challenging sequence labeling task which requires a deep understanding of the orthographic and distributional representation of words. In this paper, we propose a novel neural architecture that benefits from word- and character-level information and from dependencies across adjacent labels. The model combines a bidirectional LSTM (BI-LSTM) with a bidirectional Conditional Random Field (BI-CRF) layer. Our work is the first to experiment with BI-CRF in neural architectures for sequence labeling. We show that the CRF can be extended to capture the dependencies between labels in both the right and left directions of the sequence; this variation is referred to as BI-CRF. Our results show that BI-CRF improves the performance of the NER model compared to a unidirectional CRF, and that the backward CRF captures the most difficult entities better than the forward CRF. Our system is competitive on the CoNLL-2003 dataset for English and outperforms most of the existing approaches which do not use any external labeled data.

Publication Open Access: Snap & Hear: Comic Book Analyst for Children Having Literacy and Visual Barriers (CSEDU 2020 - 12th International Conference on Computer Supported Education, 2020)
Yapa, R. B. D; Kahaduwa Arachchi, T. L; Suriyarachchi, V. S; Abegunasekara, U. D; Thelijjagoda, S
Comic books are very popular across the world due to the unique experience they provide for everyone in society, without any age limitation.
Because of this attraction, comic literature has proved that it will be able to survive in the twenty-first century, even with multidimensional movie theatres as competitors. While the biggest global filmmakers are busy making movies from comic books, many researchers have been investing their time in digitizing comic stories as they are, expecting to create a new era in the comic world; however, most have focused on only one or a few components of the story. This paper is based on research which aims to give everyone the full experience of enjoying comic books, despite the visual and literacy barriers people may have. The proposed solution is a web application that translates an input image of a comic story into text and delivers it to the user as an audio story. The story is created from extracted components such as characters, objects, speech text and balloons, considering the associations among them, using image processing and deep learning technologies.
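To illustrate the graph-convolution operation at the core of the ST-GCN model described in the child activity recognition record above, the following is a minimal hypothetical NumPy sketch, not the authors' code: it applies one layer H' = ReLU(Â H W) to a toy three-joint skeleton graph, where the normalization scheme, feature sizes and weights are all illustrative assumptions.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetrically normalize an adjacency matrix with self-loops added."""
    A_tilde = A + np.eye(A.shape[0])          # add self-loops (A + I)
    d = A_tilde.sum(axis=1)                   # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D^(-1/2)
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt  # D^(-1/2) (A + I) D^(-1/2)

def gcn_layer(A_hat, H, W):
    """One graph-convolution layer with ReLU activation."""
    return np.maximum(A_hat @ H @ W, 0.0)

# Toy skeleton graph: 3 joints in a chain (0-1-2), 4 features per joint.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
H = rng.standard_normal((3, 4))   # per-joint input features
W = rng.standard_normal((4, 2))   # learnable projection to 2 features

out = gcn_layer(normalize_adjacency(A), H, W)
print(out.shape)  # (3, 2): a new 2-dimensional feature per joint
```

In an ST-GCN this spatial step is interleaved with temporal convolutions across frames, which the sketch omits.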
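The continuous sign language recognition record above builds its model from Keras LSTM layers. As a hedged sketch of the recurrence such a layer computes, here is a single LSTM time step in plain NumPy; the gate ordering, dimensions and random weights are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step. W: (4n, d) input weights, U: (4n, n) recurrent
    weights, b: (4n,) bias; gates stacked as [input, forget, cell, output]."""
    n = h.shape[0]
    z = W @ x + U @ h + b
    i = sigmoid(z[:n])             # input gate
    f = sigmoid(z[n:2 * n])        # forget gate
    g = np.tanh(z[2 * n:3 * n])    # candidate cell state
    o = sigmoid(z[3 * n:])         # output gate
    c_new = f * c + i * g          # updated cell state
    h_new = o * np.tanh(c_new)     # updated hidden state
    return h_new, c_new

# Run a toy sequence of 5 video frames with 3 features through a 2-unit LSTM.
rng = np.random.default_rng(0)
d, n = 3, 2
W = rng.standard_normal((4 * n, d))
U = rng.standard_normal((4 * n, n))
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for x in rng.standard_normal((5, d)):
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)  # (2,): final hidden state summarizing the sequence
```

In the described system this final hidden state would feed a classifier over glosses; TensorFlow/Keras provides the same recurrence as `tf.keras.layers.LSTM`.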
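The BI-LSTM-CRF record above relies on a CRF layer to capture dependencies between adjacent labels. A common way to decode such a linear-chain CRF is Viterbi search; the following is a hypothetical pure-Python sketch of forward (left-to-right) Viterbi decoding with made-up emission and transition scores, not the paper's BI-CRF implementation.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence for a linear-chain CRF.
    emissions: list of {label: score} dicts, one per time step.
    transitions: {(prev_label, label): score} dict.
    """
    labels = list(emissions[0])
    # best[t][y] = (score of best path ending in label y at step t, backpointer)
    best = [{y: (emissions[0][y], None) for y in labels}]
    for t in range(1, len(emissions)):
        step = {}
        for y in labels:
            prev, score = max(
                ((p, best[t - 1][p][0] + transitions[(p, y)]) for p in labels),
                key=lambda pair: pair[1],
            )
            step[y] = (score + emissions[t][y], prev)
        best.append(step)
    # Backtrack from the best final label.
    y = max(labels, key=lambda l: best[-1][l][0])
    path = [y]
    for t in range(len(emissions) - 1, 0, -1):
        y = best[t][y][1]
        path.append(y)
    return list(reversed(path))

# Toy 2-step tagging problem with labels B (entity) and O (outside).
emissions = [{"B": 2.0, "O": 0.0}, {"B": 0.0, "O": 1.0}]
transitions = {("B", "B"): -1.0, ("B", "O"): 1.0,
               ("O", "B"): 0.0, ("O", "O"): 0.0}
print(viterbi(emissions, transitions))  # ['B', 'O']
```

The paper's BI-CRF additionally scores the sequence right-to-left; running the same decoder over reversed emissions with transposed transitions would give the backward direction.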
