Browsing by Author "Jayasuriya, P"

Embargo
Sentiment classification of Sinhala content in social media
(IEEE, 2020-09-24) Jayasuriya, P; Ekanayake, S; Munasinghe, R; Kumarasinghe, B; Weerasinghe, I; Thelijjagoda, S
In this study, we classify Sinhala social media sentiments into positive and negative classes for a particular domain (sports). We employ machine learning algorithms and lexicon-based sentiment classification methods. We also consider a hybrid approach, constructing an ensemble classifier that combines the machine learning and lexicon-based methods. Among the individual methods, machine learning algorithms performed best in terms of accuracy. The ensemble classifier improved performance further.
Embargo
Sentiment classification of Sinhala content in social media
(IEEE, 2020-09-24) Jayasuriya, P; Ekanayake, S; Munasinghe, R; Munasinghe, B; Weerasinghe, I; Thelijjagoda, S
In this study, we classify Sinhala social media sentiments into positive and negative classes for a particular domain (sports). We employ machine learning algorithms and lexicon-based sentiment classification methods. We also consider a hybrid approach, constructing an ensemble classifier that combines the machine learning and lexicon-based methods. Among the individual methods, machine learning algorithms performed best in terms of accuracy. The ensemble classifier improved performance further.
Embargo
Sentiment Classification of Sinhala Content in Social Media: A Comparison between Stemmers and N-gram Features
(IEEE, 2021-12-09) Jayasuriya, P; Munasinghe, R; Thelijjagoda, S
Sentiment classification for non-English languages has gained significant attention from researchers in the past few years, driven by the increasing use of non-English and Romanized scripts for expressing sentiments on social media. In this study, we begin by classifying Sinhala sentiments on social media into positive and negative polarity classes using N-gram feature extraction. An N-gram is a contiguous sequence of words or characters in a text. We then focus on improving the classification accuracy by employing different stemming methods. Stemming is generally used to reduce the dimensionality of the feature set, which must be done with great care because over-reducing the feature dimensionality lowers classification accuracy. Finally, we compare the accuracy and efficiency of sentiment analysis models based on N-gram feature extraction and on stemming.
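To make the N-gram definition above concrete, here is a minimal sketch of word and character N-gram extraction. The function names `word_ngrams` and `char_ngrams`, the whitespace tokenization, and the English sample sentence are assumptions made for illustration; the abstract does not describe the authors' actual tokenizer or feature pipeline.

```python
def word_ngrams(text, n):
    """Return contiguous word N-grams, assuming whitespace tokenization."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return contiguous character N-grams over the raw string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


if __name__ == "__main__":
    sample = "the match was great"  # illustrative sentence, not from the study
    print(word_ngrams(sample, 2))
    print(char_ngrams("goal", 3))
```

Character N-grams here are taken over code points, so for Sinhala text a grapheme-aware split may be preferable; that choice is left open in this sketch.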
Embargo
Sentiment Classification of Sinhala Content in Social Media: An Ensemble Approach
(IEEE, 2021-12-09) Jayasuriya, P; Munasinghe, R; Thelijjagoda, S
We focus on the binary classification of Sinhala social media content in the sports domain using machine learning algorithms. In particular, we improve upon the accuracy achieved in a previous study of ours that utilized word and character N-grams. We use the base learners from that study to implement a probability-based stacking ensemble approach. To do so, we create a library of 1066 base learners using 13 different algorithms and different N-gram feature extraction methods. Different combinations of base learners from the library are then stacked together to find the best stacking ensemble model. The best stacking ensemble model achieves an accuracy of 83.8%, an improvement of more than 1.5% over our previous study.
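The probability-based stacking idea described above can be sketched as follows: each base learner contributes its predicted probability of the positive class, a meta-learner is fitted on those probabilities, and candidate combinations of base learners are compared on held-out data. Everything in this sketch is an assumption made for illustration: the logistic-regression meta-learner, the exhaustive subset search (feasible only for a tiny library), the learner names, and the toy probabilities. The abstract does not state which meta-learner or search strategy the authors used.

```python
import math
from itertools import combinations


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta(rows, labels, epochs=500, lr=0.5):
    """Fit a logistic-regression meta-learner on base-learner probabilities."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(model, x):
    """Return 1 (positive) or 0 (negative) for one row of probabilities."""
    weights, bias = model
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias >= 0 else 0


def accuracy(model, rows, labels):
    hits = sum(predict(model, x) == y for x, y in zip(rows, labels))
    return hits / len(labels)


def select(rows, cols):
    """Keep only the columns (base learners) listed in cols."""
    return [[row[c] for c in cols] for row in rows]


def best_stack(train_rows, train_y, val_rows, val_y, names):
    """Try every non-empty subset of base learners; keep the most accurate."""
    best_acc, best_names = -1.0, ()
    for k in range(1, len(names) + 1):
        for cols in combinations(range(len(names)), k):
            model = fit_meta(select(train_rows, cols), train_y)
            acc = accuracy(model, select(val_rows, cols), val_y)
            if acc > best_acc:
                best_acc, best_names = acc, tuple(names[c] for c in cols)
    return best_acc, best_names


# Hypothetical base learners and made-up P(positive) values, one column each.
NAMES = ["nb_word_unigram", "svm_char_trigram", "noisy_learner"]
TRAIN_ROWS = [[0.9, 0.8, 0.4], [0.8, 0.7, 0.6], [0.2, 0.3, 0.5], [0.1, 0.2, 0.5]]
TRAIN_Y = [1, 1, 0, 0]
VAL_ROWS = [[0.85, 0.75, 0.5], [0.15, 0.25, 0.5]]
VAL_Y = [1, 0]

if __name__ == "__main__":
    print(best_stack(TRAIN_ROWS, TRAIN_Y, VAL_ROWS, VAL_Y, NAMES))
```

In practice the meta-learner would be fitted on out-of-fold base-learner predictions, and with a library of over a thousand learners an exhaustive search over subsets would not be feasible, so some restricted or greedy search would be needed.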