Today I would like to share a simple solution to the image deskewing problem (straightening a rotated image). If you’re working on anything that involves extracting text from images, you will have to deal with image deskewing in one form or another. From camera pictures to scanned documents, deskewing is a mandatory step in image pre-processing before feeding the cleaned-up image to an OCR tool.

As I was learning and experimenting with image processing in OpenCV, I found that the majority of tutorials just give you a copy-pasted code solution, with barely any explanation of the logic behind it. That’s just not right. We need to understand the algorithms and how we can combine various image transformations to solve a given problem. Otherwise we won’t make any progress as software engineers. So in this tutorial I will try to keep the code snippets to a bare minimum and concentrate on explaining the ideas that make it work. But don’t worry, you can always find the complete code in my GitHub repo via the link at the end of this article.


Deskewing algorithm

Let’s start by discussing the general idea of the deskewing algorithm. Our main goal will be to split the rotated image into text blocks and determine the skew angle from them. Here is a detailed breakdown of the approach that I’ll use:

  1. As usual, convert the image to grayscale.
  2. Apply slight blurring to decrease noise in the image.
  3. Now our goal is to find areas with text, i.e. the text blocks of the image. To make text block detection easier we will invert the image and push its colors to the extremes, which is achieved via thresholding. So now text becomes pure white (pixel value exactly 255) and the background is pure black (exactly 0).
  4. To find text blocks we need to merge all printed characters of a block into one shape. We achieve this via dilation (expansion of white pixels), using a larger kernel on the X axis to close all the spaces between words, and a smaller kernel on the Y axis to blend the lines of one block into each other while keeping the larger gaps between text blocks intact.
  5. Now simple contour detection, with a minimum-area rectangle enclosing each contour, gives us all the text blocks that we need.
  6. There are various ways to determine the skew angle, but we’ll stick to the simple one: take the largest text block and use its angle.


How to automatically deskew (straighten) a text image using OpenCV