In 1985, computer scientists and researchers at Hewlett-Packard began working on an Optical Character Recognition (OCR) engine, called Tesseract, to help computers pick out the text in an image. The project was open-sourced in 2005, and in 2006 Google adopted it and has been its primary sponsor ever since.

In its current state, Tesseract is supported by hundreds of freelance developers and works with more than 100 languages, from English to Mandarin to Yiddish. Its first iteration was written in C, and it was ported to C++ in 1998. The software is run almost entirely from the command line, but several developers have created GUIs for beginners.

Let’s talk about how to install Tesseract on our machines, ensure it’s working well, and give it a test run on an image or two.

I’m using a Mac, so my instructions will be for macOS (formerly OS X). There are plenty of guides available for Linux and Windows.

Step One: Installing Tesseract

The easiest way to do this is to use Homebrew:

$ brew install tesseract

Once the install finishes, confirm that it worked by printing the version:

$ tesseract -v
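For the test run on an image, something like the script below should work. It is only a minimal sketch: `check_tesseract` is a hypothetical helper of my own, not part of Tesseract, and `sample.png` is a placeholder for any image of text you have on hand. The script prints the version banner and, if the image exists, sends the recognized text straight to the terminal.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tesseract): succeeds only if the
# tesseract binary is on PATH, and prints the first line of its version banner.
check_tesseract() {
  if command -v tesseract >/dev/null 2>&1; then
    tesseract --version 2>&1 | head -n 1
  else
    echo "tesseract not found; try: brew install tesseract" >&2
    return 1
  fi
}

# sample.png is a placeholder; pass a real image path as the first argument.
image="${1:-sample.png}"

if check_tesseract && [ -f "$image" ]; then
  # Using "stdout" as the output base prints the recognized text to the
  # terminal instead of writing it to a .txt file.
  tesseract "$image" stdout
fi
```

To save the text instead, replace `stdout` with an output name such as `result`, and Tesseract will write `result.txt`. Running `tesseract --list-langs` shows which language packs are installed.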
