Publication:
Algorithmically Navigating Complex Tabular Structures in Images for Information Extraction

dc.contributor.author: Nugawela, M
dc.contributor.author: Abeywardena, K. Y
dc.contributor.author: Mahaadikara, H
dc.date.accessioned: 2023-02-11T08:56:26Z
dc.date.available: 2023-02-11T08:56:26Z
dc.date.issued: 2022-12-26
dc.description.abstract: Computer vision has been at the forefront of automating workflows, replacing repetitive manual tasks with convenience and accuracy. Recognizing text in images of commercial documents through optical character recognition (OCR) forms the initial step of most such workflows, where the majority of the information is held in complex data structures such as tables and nested tables. Although OCR technology has evolved to capture text from images effectively, there is still room for improvement in recognizing complex data structures and extracting tabular data from images. This paper proposes an algorithmic approach, based on keyword detection and the position of words relative to each other, that recognizes nested structures and extracts tabular data into a format readable by both programs and humans. It takes a different approach from machine learning models or pre-defined templates for layout recognition. Furthermore, the approach is shown to correctly interpret the layout and data of nested table structures that occur in multiple rows of a table. [en_US]
dc.identifier.citation: M. Nugawela, K. Y. Abeywardena and H. Mahaadikara, "Algorithmically Navigating Complex Tabular Structures in Images for Information Extraction," 2022 3rd International Informatics and Software Engineering Conference (IISEC), Ankara, Turkey, 2022, pp. 1-6, doi: 10.1109/IISEC56263.2022.9998220. [en_US]
dc.identifier.doi: 10.1109/IISEC56263.2022.9998220 [en_US]
dc.identifier.issn: 978-1-6654-5995-2
dc.identifier.uri: https://rda.sliit.lk/handle/123456789/3264
dc.language.iso: en [en_US]
dc.publisher: IEEE [en_US]
dc.relation.ispartofseries: 2022 3rd International Informatics and Software Engineering Conference (IISEC);
dc.subject: Algorithmically [en_US]
dc.subject: Navigating Complex [en_US]
dc.subject: Tabular Structures [en_US]
dc.subject: Information Extraction [en_US]
dc.subject: Images [en_US]
dc.title: Algorithmically Navigating Complex Tabular Structures in Images for Information Extraction [en_US]
dc.type: Article [en_US]
dspace.entity.type: Publication
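
Illustrative sketch (not taken from the paper): the abstract describes extracting tabular data from OCR output using keyword detection and the relative position of words. The Python below is one minimal way such an idea could look for a flat, single-level table. It is not the authors' algorithm and does not handle nested tables. The `Word` type, the `extract_table` function, the `row_tolerance` parameter, and the sample coordinates are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One OCR token with the top-left corner of its bounding box (assumed layout)."""
    text: str
    x: float
    y: float


def extract_table(words, header_keywords, row_tolerance=5.0):
    """Group OCR words into rows keyed by the header keyword nearest in x."""
    # Keyword detection: header words are those matching a known keyword.
    keywords = {k.lower() for k in header_keywords}
    headers = [w for w in words if w.text.lower() in keywords]
    if not headers:
        return []

    # Relative position: body words lie clearly below the lowest header word.
    header_bottom = max(h.y for h in headers)
    body = sorted(
        (w for w in words if w.y > header_bottom + row_tolerance),
        key=lambda w: (w.y, w.x),
    )

    rows = []
    current_y = None
    for w in body:
        # Start a new row when the vertical gap exceeds the tolerance.
        if current_y is None or abs(w.y - current_y) > row_tolerance:
            rows.append({h.text: "" for h in headers})
            current_y = w.y
        # Assign the word to the column whose header is horizontally closest.
        nearest = min(headers, key=lambda h: abs(h.x - w.x))
        cell = rows[-1][nearest.text]
        rows[-1][nearest.text] = f"{cell} {w.text}".strip()
    return rows


if __name__ == "__main__":
    sample = [
        Word("Item", 10, 0), Word("Qty", 100, 0), Word("Price", 200, 0),
        Word("Widget", 10, 20), Word("2", 102, 20), Word("9.50", 198, 21),
        Word("Gadget", 12, 40), Word("1", 100, 40), Word("4.00", 200, 41),
    ]
    for row in extract_table(sample, ["Item", "Qty", "Price"]):
        print(row)
```

A nested table would need an additional step, for example re-running keyword detection inside a cell region, which this sketch leaves out.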

Files

Original bundle

Name:
Algorithmically_Navigating_Complex_Tabular_Structures_in_Images_for_Information_Extraction.pdf
Size:
510.92 KB
Format:
Adobe Portable Document Format

License bundle

Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission