What is the Google Vision API? And how to use it

Extract text from an image with OCR using a service account.


This post has its roots in an interesting knowledge extraction project. The first step was to extract the text from PDF documents. The company I work for is built on the Google platform, so naturally I wanted to use the OCR of the Vision API, but I couldn't find an easy way to use the API to extract text. Hence this post.

_The notebook for this post is available on [GitHub](https://github.com/Christophe-pere/API_vision_google)._

Google Vision API

Google released this API so that individuals, companies, and researchers can use its machine learning capabilities.

_Google Cloud's Vision API offers powerful pre-trained machine learning models through REST and RPC APIs. Tag images and quickly organize them into millions of predefined categories. You will be able to detect objects and faces, read printed or handwritten text, and integrate useful metadata into your image catalog. (source: [Vision API](https://cloud.google.com/vision))_

The part of the API that interests us in this post is OCR.


Optical Character Recognition (OCR) is a technology that detects and recognizes characters inside an image. Most of the time, Convolutional Neural Networks (CNNs) are trained on a very large dataset of letters and digits in different fonts and colors. You can picture a small window sliding over each pixel or group of pixels to detect characters or partial characters, spaces, shapes, lines, etc.
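The sliding-window picture can be made concrete with a toy sketch. This is only an illustration of the idea in pure Python, not how the Vision API works internally; the function name `sliding_windows` and its `size` and `stride` parameters are hypothetical.

```python
def sliding_windows(image, size, stride=1):
    """Yield (row, col, patch) for every size x size window of a 2D image.

    `image` is a list of equal-length rows of pixel values.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    for row in range(0, height - size + 1, stride):
        for col in range(0, width - size + 1, stride):
            patch = [line[col:col + size] for line in image[row:row + size]]
            yield row, col, patch


# A 4x4 toy "image": 1 marks ink, 0 marks background.
toy_image = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

# In a real OCR model, each patch would be scored by a trained classifier.
# Here we only count the ink pixels seen through each window.
ink_per_window = {
    (row, col): sum(map(sum, patch))
    for row, col, patch in sliding_windows(toy_image, size=2)
}
```

A 2x2 window moved one pixel at a time over a 4x4 image visits nine positions, and each position sees only a fragment of the character, which is why such models combine many overlapping views.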

Service Account

A service account is a special type of Google account intended to represent a non-human user that needs to authenticate and be authorized to access data in Google APIs. (source: Google Cloud IAM)

Basically, you can think of it as an RSA key (a cryptographic key used for secure communication between machines over the internet) with which you can connect to Google services (APIs, GCS, IAM…). In its basic form, it is a JSON file.
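To make the JSON file less abstract, here is a minimal sketch that checks whether a key file looks like a service account key before using it. The field list is an assumption based on the keys typically found in a downloaded key file, and `check_service_account_info` and `load_credentials` are hypothetical helper names; `from_service_account_info` comes from the `google-auth` package.

```python
import json

# Fields typically present in a downloaded service account key file
# (assumption: a subset for illustration, not an official schema).
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")


def check_service_account_info(info):
    """Return the account email if `info` looks like a service account key."""
    missing = [field for field in EXPECTED_FIELDS if field not in info]
    if missing:
        raise ValueError(f"missing fields in key file: {missing}")
    if info["type"] != "service_account":
        raise ValueError(f"unexpected key type: {info['type']!r}")
    return info["client_email"]


def load_credentials(key_path):
    """Read a JSON key file and build google-auth credentials from it."""
    # Imported lazily so the check above works without google-auth installed.
    from google.oauth2 import service_account

    with open(key_path, encoding="utf-8") as handle:
        info = json.load(handle)
    check_service_account_info(info)
    return service_account.Credentials.from_service_account_info(info)
```

Calling `load_credentials` with the path to your own key file would return a credentials object that can be passed to a Google client library.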


Here, I will show you the functions needed to call the API and automatically extract the text from an image.

The following libraries need to be installed:

!pip install google-cloud
!pip install google-cloud-storage
!pip install google-cloud-pubsub
!pip install google-cloud-vision
!pip install pdf2image
!pip install google-api-python-client
!pip install google-auth

The libraries used:

import base64
import glob
import io
import json
import os
from io import BytesIO

import googleapiclient.discovery
import numpy as np
from google.cloud import pubsub_v1
from google.cloud import vision
from google.oauth2 import service_account
from pdf2image import convert_from_bytes
from PIL import Image
from tqdm import tqdm  # to see a progress bar
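With these libraries, the overall flow from a PDF to text could look like the sketch below. It is not the notebook's code: the helper names (`pdf_to_png_bytes`, `make_vision_annotator`, `ocr_pages`) are hypothetical, and it assumes `google-cloud-vision` 2.x or later, where `vision.Image` is available, plus a Poppler installation for `pdf2image`.

```python
from io import BytesIO


def pdf_to_png_bytes(pdf_bytes, dpi=200):
    """Render each PDF page to PNG bytes (needs pdf2image and Poppler)."""
    from pdf2image import convert_from_bytes  # lazy: optional dependency

    png_pages = []
    for page in convert_from_bytes(pdf_bytes, dpi=dpi):
        buffer = BytesIO()
        page.save(buffer, format="PNG")
        png_pages.append(buffer.getvalue())
    return png_pages


def make_vision_annotator(credentials):
    """Return a function that sends image bytes to the Vision API for OCR."""
    from google.cloud import vision  # lazy: optional dependency

    client = vision.ImageAnnotatorClient(credentials=credentials)

    def annotate(image_bytes):
        response = client.document_text_detection(
            image=vision.Image(content=image_bytes)
        )
        if response.error.message:
            raise RuntimeError(response.error.message)
        return response.full_text_annotation.text

    return annotate


def ocr_pages(annotate, page_images):
    """Run `annotate` on every page image and join the page texts."""
    return "\n".join(annotate(image) for image in page_images)
```

Given credentials built from a service account key, `ocr_pages(make_vision_annotator(credentials), pdf_to_png_bytes(pdf_bytes))` would return the text of the whole document. Passing the annotator in as a function also makes it easy to swap in a stub when you want to test the flow without calling the API.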
