documents in tabular format or incidentally in data blocks without distinctive graphical borders. A borderless table may make semi-structured data easier for us humans to read. From a machine-reading point of view, however, presenting information this way has quite a few shortcomings: without visible borders, it is difficult to separate the data belonging to a presumed table structure from the surrounding text.
Tabular data extraction as a business challenge may have several ad hoc or heuristic rule-based solutions, but these will almost certainly fail on a table with a slightly different layout or style. At scale, one should use a more general approach for identifying table-like structures in an image, specifically deep learning-based object detection.
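To make the fragility of heuristics concrete, here is a minimal sketch of one such rule, written for illustration only and not taken from any particular product. It assumes the page has already been converted to plain text lines (for example by OCR) and guesses column boundaries from character positions that are blank in every line. The function names `find_column_gaps` and `split_row` and the sample rows are hypothetical.

```python
def find_column_gaps(lines, min_gap=2):
    """Return (start, end) character ranges that are blank in every line."""
    width = max(len(line) for line in lines)
    padded = [line.ljust(width) for line in lines]
    # A position counts as blank only if it is a space in all rows.
    blank = [all(row[i] == " " for row in padded) for i in range(width)]

    gaps, start = [], None
    for i, is_blank in enumerate(blank):
        if is_blank and start is None:
            start = i
        elif not is_blank and start is not None:
            # Keep only runs wide enough to look like a column separator.
            if i - start >= min_gap:
                gaps.append((start, i))
            start = None
    return gaps


def split_row(line, gaps):
    """Cut one text line into cells at the detected gaps."""
    cells, prev = [], 0
    for start, end in gaps:
        cells.append(line[prev:start].strip())
        prev = end
    cells.append(line[prev:].strip())
    return cells


# Hypothetical, neatly aligned rows: the heuristic finds two separators.
aligned = [
    f"{a:<10}{b:<6}{c}"
    for a, b, c in [
        ("Item", "Qty", "Price"),
        ("Widget", "2", "9.99"),
        ("Gadget", "10", "24.50"),
    ]
]
gaps = find_column_gaps(aligned)
table = [split_row(row, gaps) for row in aligned]

# One row with a longer first cell spills over the separators,
# and the same rule no longer finds any column boundary.
shifted = aligned + ["Extra-long widget name 3 1.00"]
broken_gaps = find_column_gaps(shifted)
```

On the aligned rows the rule recovers three columns, but a single row with a slightly different layout is enough to make it find no separators at all. This is the kind of brittleness that motivates a learned detector over hand-written rules.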