Publication:
“Mahoshadha”, the Sinhala Tagged Corpus Based Question Answering System

Abstract

“Mahoshadha”, the Sinhala Question Answering System, aims to retrieve precise information from a large Sinhala tagged corpus. This paper describes a novel architecture for a Question Answering System that summarizes a tagged corpus and uses the summaries to generate answers to a query. The summarized corpora are categorized according to a set of topics, enabling fast search for information. The K-Nearest Neighbor algorithm is used to assign the summarized corpora to these topic categories. The query is also tagged, and the tagged query is used to obtain more accurate results: the tags make it possible to identify the question clearly, together with its category. A Support Vector Machine is used to automate both the summarization and the question understanding. This enables “Mahoshadha” to answer any type of query and to summarize any type of Sinhala corpus, which makes the Question Answering System usable in many applications.
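
The topic-categorization step mentioned in the abstract can be illustrated with a minimal k-nearest-neighbor sketch. This is not the authors' implementation: the bag-of-words features, the cosine similarity, the function names (`bag_of_words`, `cosine`, `knn_topic`), and the English toy data are all assumptions made for illustration, since the abstract does not specify the features or the distance measure used.

```python
from collections import Counter
from math import sqrt


def bag_of_words(text):
    """Lowercased whitespace tokens with counts (a stand-in for real tagged features)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def knn_topic(query, labeled_summaries, k=3):
    """Return the majority topic among the k summaries most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(
        labeled_summaries,
        key=lambda item: cosine(q, bag_of_words(item[0])),
        reverse=True,
    )
    votes = Counter(topic for _, topic in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical English toy data; the corpus described in the paper is Sinhala.
SUMMARIES = [
    ("the team won the cricket match", "sports"),
    ("the bowler took five wickets", "sports"),
    ("cricket batsman scored a century", "sports"),
    ("parliament passed the budget bill", "politics"),
    ("the minister announced a new policy", "politics"),
    ("election results were announced", "politics"),
]
```

With this toy data, `knn_topic("who won the cricket match", SUMMARIES)` selects the "sports" category, so an answer search could then be restricted to summaries filed under that topic.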

Keywords

Question answering, Document summarization, Document categorization, SVM algorithm, k-NN classification

Citation

Jayakody, J.A.T.K., Gamlath, T.S.K., Lasantha, W.A.N., Premachandra, K.M.K.P., Nugaliyadde, A., Mallawarachchi, Y. (2016). “Mahoshadha”, the Sinhala Tagged Corpus Based Question Answering System. In: Satapathy, S., Das, S. (eds) Proceedings of First International Conference on Information and Communication Technology for Intelligent Systems: Volume 1. Smart Innovation, Systems and Technologies, vol 50. Springer, Cham. https://doi.org/10.1007/978-3-319-30933-0_32
