OCR and the WordSearch solver AI

Introduction

Recently, I have seen a lot of posts on Sodoku solvers. The algorithm for it is quite simple, usually a backtracking recursion algorithm in only a few lines of code in Python. My favorite video is from Computerphile and Professor Thorsten Altenkirch, who I could watch all day. If using a computer vision approach, a simple model already trained on MNIST is all you need, or something equivalent. A really great video explaining all the steps can be found here.

So where does that leave us with Wordsearch. Well, I was getting bored with seeing multiple versions and videos of Sodoku puzzle solvers. So I instead thought of seeing if anyone had done a Wordsearch solver. To my surprise there wasn’t something quite what I was looking for. Now, that is not to say it people haven’t done this, but I felt like it would be a good project to practice and so decided to not look further into it as I wanted to solve this myself. For this I am using WordSearch.com

After completing this project I did stumble across a nice post from Martin Charles, which is typical to most I have seen where the text needs to be entered by hand.

This was a good project because it touched on the following topics:

Custom OCR — Training (including getting images) while handling small and imbalanced data, Testing, Deploying
Image Processing — Finding the grid etc.
Path finding — The brain. This lead me down a long and interesting path although I stuck with my original simple solution in the end
Automated control — Getting the computer to solve the puzzle

For the sake of brevity, I will refrain my discussion to the OCR, Path finding and automated control. After all, finding the grid was easy and the same conditions used in the Sodoku solvers, for which there are many, can be used. Here is the result of the image processing which finds the grid box (outlined in red), the words box (outlined in yellow) and each letter bounding box outlined in green.

#ai

Introduction

towardsdatascience.com

OCR and the WordSearch solver AI