Publication:
Sentiment Classification of Sinhala Content in Social Media: A Comparison between Stemmers and N-gram Features

dc.contributor.author: Jayasuriya, P
dc.contributor.author: Munasinghe, R
dc.contributor.author: Thelijjagoda, S
dc.date.accessioned: 2022-03-03T07:46:47Z
dc.date.available: 2022-03-03T07:46:47Z
dc.date.issued: 2021-12-09
dc.description.abstract: Sentiment classification for non-English languages has gained significant attention from researchers in the past few years with the increasing use of non-English scripts and Romanized scripts for expressing sentiments over social media. In this study, we begin by classifying Sinhala sentiments on social media into positive and negative polarity classes using N-gram feature extraction. An N-gram is a contiguous sequence of words or characters in a text. Then we focus on improving the classification accuracy by employing different stemming methods. Stemming is generally used to reduce the dimensionality of the feature set, which must be done with great care because over-reducing feature dimensionality decreases classification accuracy. Finally, we compare the accuracy and efficiency of N-gram feature extraction and stemming-based sentiment analysis models. [en_US]
dc.identifier.citation: P. Jayasuriya, R. Munasinghe and S. Thelijjagoda, "Sentiment Classification of Sinhala Content in Social Media: A Comparison between Stemmers and N-gram Features," 2021 IEEE 16th International Conference on Industrial and Information Systems (ICIIS), 2021, pp. 134-139, doi: 10.1109/ICIIS53135.2021.9660711. [en_US]
dc.identifier.doi: 10.1109/ICIIS53135.2021.9660711 [en_US]
dc.identifier.issn: 2164-7011
dc.identifier.uri: https://rda.sliit.lk/handle/123456789/1453
dc.language.iso: en [en_US]
dc.publisher: IEEE [en_US]
dc.relation.ispartofseries: 2021 IEEE 16th International Conference on Industrial and Information Systems (ICIIS); Pages 134-139
dc.subject: Sentiment Classification [en_US]
dc.subject: Sinhala Content [en_US]
dc.subject: Social Media [en_US]
dc.subject: Comparison between Stemmers [en_US]
dc.subject: N-gram Features [en_US]
dc.title: Sentiment Classification of Sinhala Content in Social Media: A Comparison between Stemmers and N-gram Features [en_US]
dc.type: Article [en_US]
dspace.entity.type: Publication
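
The abstract describes an N-gram as a contiguous sequence of words or characters. The sketch below illustrates that definition only; it is not the authors' implementation. The function names `word_ngrams` and `char_ngrams` and the sample string are hypothetical, and whitespace tokenization is an assumption made for brevity.

```python
from typing import List, Tuple


def word_ngrams(text: str, n: int) -> List[Tuple[str, ...]]:
    """Return contiguous word n-grams from whitespace-tokenized text."""
    tokens = text.split()  # assumption: whitespace tokenization is enough
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text: str, n: int) -> List[str]:
    """Return contiguous character n-grams, spaces included."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


if __name__ == "__main__":
    sample = "the film was good"  # placeholder text, not data from the paper
    print(word_ngrams(sample, 2))
    print(char_ngrams("good", 3))
```

Note that `char_ngrams` slices by Unicode code point. For Sinhala text this can separate a consonant from its vowel sign, so a real pipeline might segment by grapheme cluster instead.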

Files

Original bundle

Name: Sentiment_Classification_of_Sinhala_Content_in_Social_Media_A_Comparison_between_Stemmers_and_N-gram_Features.pdf
Size: 449.18 KB
Format: Adobe Portable Document Format
