Faculty of Computing

Permanent URI for this community: https://rda.sliit.lk/handle/123456789/4202

Search Results

Embargo
Speech Master: Natural Language Processing and Deep Learning Approach for Automated Speech Evaluation
(IEEE, 2021-12-06) Kooragama, K.G.C.M; Jayashanka, L. R. W. D; Munasinghe, J. A; Jayawardana, K. W; Tissera, M; Jayasingha, T. B
Every English speaker wishes to sharpen his or her public speaking skills. However, doing so is extremely difficult and requires a significant amount of individual practice and experience. This paper introduces a novel online tool, “Speech Master”, for practicing and improving English public speaking skills in a professional manner. Using natural language processing, machine learning, and deep learning approaches, the proposed system analyzes the user's speech in terms of content, grammatical accuracy, grammatical richness, facial expressions, and flow. Accuracy was checked by comparing the results predicted by the tool with actual results obtained from experts. “Speech Master” achieves an average accuracy of more than 80% and produces a better overall result. The tool benefits English speakers all over the world by meeting the demand for a simple, easy-to-use way to practice and improve English speech delivery: enhancing oratory skills, boosting confidence, and delivering well-articulated speeches.
Embargo
Fuzzy Based User-Centric Smart Approach to Prevent Unhealthy Eating Habit Crisis
(IEEE, 2021-12-06) Malshani, H.K.N; Sasanka, D; Wickramaratne, U. I; Kavindi, Y; Tissera, M; Attanayaka, B. L
Cholesterol plays a major role in keeping the human heart healthy. High blood cholesterol is a key risk factor for cardiovascular diseases such as coronary heart disease and stroke, and it has become a severe health problem. The risk factors that affect a person's cholesterol level include, but are not limited to, unhealthy diet, physical inactivity, high stress, and genetic factors. However, many studies have found a direct association between elevated serum cholesterol and unhealthy eating habits. This situation is driven by busy lifestyles, a lack of time to pay attention to meals, and a lack of knowledge about removing trans fats from daily diets. The primary goal of this study is to provide personalized guidance, via a mobile application, to prevent unhealthy eating habits. The application lets users create a virtual food plate, analyzes the risk status of that plate, and predicts a balanced diet as a replacement for the original plate. The proposed approach uses a food knowledge base constructed for this study and fuzzy logic to predict a balanced diet. The outcomes were tested using a manual inter-rater agreement method, yielding an accuracy of 80%. The computed Inter-Rater Reliability (IRR) score was 75%, which indicates high reliability among the testers. Finally, this approach will assist people in controlling their own cholesterol levels and preventing deadly heart diseases.
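The abstract above reports an inter-rater agreement check but does not say which statistic was used. As a minimal sketch, assuming two raters who each assign one categorical label per food plate, the two most common measures (raw percent agreement and Cohen's kappa) could be computed as follows. The function names and the sample labels are illustrative only and do not come from the paper.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of items on which the two raters gave the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick the same label at random,
    # given each rater's own label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Illustrative labels only (not data from the study).
rater_a = ["balanced", "balanced", "risky", "risky"]
rater_b = ["balanced", "risky", "risky", "risky"]
print(percent_agreement(rater_a, rater_b))
print(cohens_kappa(rater_a, rater_b))
```

Percent agreement is easy to read but ignores agreement that would occur by chance, which is why kappa is often reported alongside it.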
Embargo
Auto Generation of Gold Standard, Class Labeled Data Set and Ontology Extension Tool [QuadW]
(IEEE, 2019-02-25) Tissera, M; Weerasinghe, R
Automatic Knowledge Extraction (AKE) from domain-independent, unstructured text sources is a challenging task in Natural Language Processing and text analytics. Although supervised learning mechanisms give promising results, they are painful to apply because they require a class-labeled training data set, which involves expensive and time-consuming manual effort. As a solution to this problem, this paper introduces a novel mechanism for building a self-learned classifier model that can automatically generate a class-labeled training data set for Knowledge/Information Extraction from domain-independent unstructured text. Sri Lankan English newspapers (which comprise unstructured text in unconstrained domains) are the main data source for this study, and a prototype was built for Professional Information Extraction with the semantic pattern Who holds/held What position, Where and When (four words starting with 'W', hence the name 'QuadW'). The methodology uses advanced machine learning techniques, such as a Random Forest with the AdaBoost ensemble algorithm, to build a composite classification model. The classifier is called self-learned because it generates its own training data set automatically. The composite model improved accuracy and avoided overfitting to the data. The rule-based feature extraction algorithm and the hand-crafted ontology can also be considered novel components of this study. The self-learned classifier has been extensively improved and tested, and shows high accuracy, with precision and recall close to one. Therefore, the classified output of the self-learned classifier can be used as a gold-standard data set for future research in Professional Information Extraction. The constructed ontology, with approximately 400 facts, can also be used effectively in future research. Further, the introduced classifier can be used as a tool to extend the existing ontology. This novel application of machine learning algorithms to text classification demonstrates that the study is in line with state-of-the-art technologies.
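To make the idea of automatically generated class labels concrete, here is a minimal sketch of a single rule that labels a sentence when it matches a Who/What/Where/When pattern. The regular expression, the label names, and the sample sentences are all hypothetical; the abstract does not disclose the paper's actual rules, features, or ontology, and the Random Forest and AdaBoost stage is not shown.

```python
import re

# Hypothetical rule, NOT one of the paper's extraction rules:
# "<Who> was appointed|served as <what> of <Where> in <year>".
QUADW_PATTERN = re.compile(
    r"(?P<who>[A-Z][a-z]+(?: [A-Z][a-z]+)+)"
    r" (?:was appointed|served) as "
    r"(?P<what>[a-z ]+?)"
    r" of (?P<where>[A-Z][A-Za-z ]+?)"
    r" in (?P<when>\d{4})"
)


def auto_label(sentence):
    """Assign a class label and extracted fields without manual annotation."""
    match = QUADW_PATTERN.search(sentence)
    if match is None:
        return {"label": "other", "fields": {}}
    return {"label": "professional_info", "fields": match.groupdict()}


# Invented sentences for illustration only.
sentences = [
    "Nimal Perera was appointed as chairman of Example Bank in 2015.",
    "The weather was warm in Colombo.",
]
training_rows = [auto_label(s) for s in sentences]
for row in training_rows:
    print(row)
```

Rows labeled this way could then serve as training data for a supervised classifier, which is the general shape of the self-learning loop the abstract describes.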
Embargo
Deepfake Audio Detection: A Deep Learning Based Solution for Group Conversations
(IEEE, 2020-12-10) Wijethunga, R. L. M. A. P. C; Matheesha, D. M. K; Noman, A. A; De Silva, K. H. V. T. A; Tissera, M; Rupasinghe, L
The recent advancements in deep learning and related technologies have led to improvements in areas such as computer vision, bioinformatics, and speech recognition. This research focuses mainly on a problem involving synthetic speech and speaker diarization. Developments in audio have resulted in deep learning models capable of replicating natural-sounding voices, also known as text-to-speech (TTS) systems. This technology could be misused for malicious purposes such as deepfakes, impersonation, or spoofing attacks. We propose a system capable of distinguishing between real and synthetic speech in group conversations. We built Deep Neural Network models and integrated them into a single solution using different datasets, including but not limited to Urban-Sound8K (5.6GB), Conversational (12.2GB), AMI-Corpus (5GB), and FakeOrReal (4GB). Our proposed approach consists of four main components. The speech-denoising component cleans and preprocesses the audio using Multilayer Perceptron and Convolutional Neural Network architectures, with 93% and 94% accuracy respectively. Speaker diarization was implemented using two different approaches: Natural Language Processing for text conversion, with 93% accuracy, and a Recurrent Neural Network model for speaker labeling, with 80% accuracy and a Diarization Error Rate of 0.52. The final component distinguishes between real and fake audio using a CNN architecture with 94% accuracy. With these findings, this research will contribute immensely to the domain of speech analysis.
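For readers unfamiliar with the Diarization Error Rate quoted above, the following sketch shows the standard definition: the sum of false-alarm, missed-speech, and speaker-confusion time divided by the total reference speech time. The function name and the sample durations are illustrative; the abstract does not describe how the authors measured these quantities.

```python
def diarization_error_rate(false_alarm, missed, confusion, total_speech):
    """Standard DER: erroneous speech time over total reference speech time.

    All arguments are durations in the same unit (for example, seconds).
    """
    if total_speech <= 0:
        raise ValueError("total reference speech time must be positive")
    if min(false_alarm, missed, confusion) < 0:
        raise ValueError("error durations cannot be negative")
    return (false_alarm + missed + confusion) / total_speech


# Illustrative durations in seconds (not measurements from the paper).
der = diarization_error_rate(
    false_alarm=2.0, missed=3.0, confusion=5.0, total_speech=100.0
)
print(der)
```

Lower values are better. Because false alarms are counted in the numerator but not in the denominator, DER can exceed 1.0.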