Please use this identifier to cite or link to this item: https://rda.sliit.lk/handle/123456789/2003
Full metadata record
DC Field | Value | Language
dc.contributor.author | Gunasekara, S. V. S | -
dc.contributor.author | Haddela, P. S | -
dc.date.accessioned | 2022-04-22T02:41:21Z | -
dc.date.available | 2022-04-22T02:41:21Z | -
dc.date.issued | 2018-10-02 | -
dc.identifier.citation | S. V. S. Gunasekara and P. S. Haddela, "Context aware stopwords for Sinhala Text classification," 2018 National Information Technology Conference (NITC), 2018, pp. 1-6, doi: 10.1109/NITC.2018.8550073. | en_US
dc.identifier.issn | 2279-3895 | -
dc.identifier.uri | http://rda.sliit.lk/handle/123456789/2003 | -
dc.description.abstract | The term "stopword" comes up often in Text Classification (TC). Words that occur frequently in a document but carry no meaning for Information Retrieval (IR) are called stopwords. Stopword lists are available for many languages. To the best of the authors' knowledge, no generic stopword list has been built for the Sinhala language. This paper demonstrates how to generate a domain-specific stopword list from a given data set of Sinhala newspapers. Seven stopword identification methods previously applied to other languages are presented and used to remove stopwords. Then, a new algorithm for building a domain-specific stopword list is proposed. For this method, the average F-measure and average accuracy obtained with each of the different stopword lists are measured using two classifiers. Based on this comparative study, the most effective method for identifying stopwords in a Sinhala corpus can be determined. | en_US
dc.language.iso | en | en_US
dc.publisher | IEEE | en_US
dc.relation.ispartofseries | 2018 National Information Technology Conference (NITC); Pages 1-6 | -
dc.subject | Text classification | en_US
dc.subject | Sinhala Text | en_US
dc.subject | Context aware | en_US
dc.subject | stopwords | en_US
dc.title | Context aware stopwords for Sinhala Text classification | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.1109/NITC.2018.8550073 | en_US
Appears in Collections:
Department of Information Technology-Scopes
Research Papers - IEEE
Research Papers - SLIIT Staff Publications

Files in This Item:
File: Context_aware_stopwords_for_Sinhala_Text_classification.pdf (Until 2050-12-31)
Size: 336.94 kB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.