Publication:
Intelligent Digitalization of the Sinhala Form Templates

Abstract

In Sri Lanka, most of the population uses Sinhala as their first language, and it is the language of documentation in most government departments. Digitalization of Sinhala text is therefore essential for the country. Sinhala characters differ from one another only in very small features, and the number of distinct characters formed from the letters of the Sinhala alphabet and its elements is relatively high, which makes classifying Sinhala letters a complex task. Previous studies relied on machine learning based feature detection built on rule-based theories and geometric features, and they achieved only average accuracy rates, which indicates that further improvement with new features is required. This paper therefore proposes a Deep Learning Character Classification method for Sinhala OCR that handles both printed and handwritten Sinhala text, together with an Intelligent Sinhala Form Automation technique that reads both the questions and the answers in an application form and converts them into e-text. The converted e-text is then refined and corrected by an intelligent Sinhala Spelling & Grammar checking feature built into the system. Considering all components, the work achieved an overall accuracy of more than 90%.

Keywords

Intelligent Digitalization, Sinhala Form, Templates

Citation

K. Gomez, M. Jinadasa, V. Dantanarayana, S. Dissanayake, N. Kodagoda and T. Kuruppu, "Intelligent Digitalization of the Sinhala Form Templates," TENCON 2021 - 2021 IEEE Region 10 Conference (TENCON), 2021, pp. 527-532, doi: 10.1109/TENCON54134.2021.9707186.
