Extract text from image with OCR using a service account
This post grew out of an interesting knowledge-extraction project. The first step was to extract the text from PDF documents. The company I work for is built on the Google platform, so naturally I wanted to use the OCR of the Vision API, but I couldn't find an easy way to use the API to extract text. Hence this post.
_The notebook for this post is available on [GitHub](https://github.com/Christophe-pere/API_vision_google)._
Google released the API so that individuals, industry, and researchers can use its functionality.
_Google Cloud's Vision API offers powerful pre-trained machine learning models through REST and RPC APIs. Tag images and quickly organize them into millions of predefined categories. You will be able to detect objects and faces, read printed or handwritten text, and integrate useful metadata into your image catalog. (source: [Vision API](https://cloud.google.com/vision))_
The part of the API that interests us in this post is OCR.
Optical Character Recognition, or OCR, is a technology that detects and recognizes characters inside an image. Typically, convolutional neural networks (CNNs) are trained on very large datasets of letters and digits in different typefaces and colors. You can imagine a small window sliding over each pixel or group of pixels to detect characters or partial characters, spaces, shapes, lines, etc.
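To make the sliding-window picture concrete, here is a minimal sketch in plain Python. It is only an illustration of the idea, not how the Vision API works internally. The names `sliding_windows` and `ink_ratio` are hypothetical, and the "dark pixel ratio" stands in for what a trained CNN would compute on each window.

```python
from typing import Iterator, List, Tuple

Grid = List[List[int]]  # grayscale image as rows of pixel intensities (0-255)


def sliding_windows(image: Grid, size: int, stride: int = 1) -> Iterator[Tuple[int, int, Grid]]:
    """Yield (row, col, patch) for every size x size window in the image."""
    height = len(image)
    width = len(image[0]) if image else 0
    for row in range(0, height - size + 1, stride):
        for col in range(0, width - size + 1, stride):
            patch = [line[col:col + size] for line in image[row:row + size]]
            yield row, col, patch


def ink_ratio(patch: Grid, threshold: int = 128) -> float:
    """Fraction of dark pixels in a patch (placeholder for a trained classifier)."""
    pixels = [p for line in patch for p in line]
    return sum(p < threshold for p in pixels) / len(pixels)


# Toy 4x6 image: 255 is white background, 0 is dark "ink".
toy = [
    [255, 255, 255, 255, 255, 255],
    [255,   0,   0, 255, 255, 255],
    [255,   0,   0, 255, 255, 255],
    [255, 255, 255, 255, 255, 255],
]

# Keep the top-left corner of every window that is entirely ink.
candidates = [
    (row, col)
    for row, col, patch in sliding_windows(toy, size=2)
    if ink_ratio(patch) == 1.0
]
print(candidates)
```

A real OCR model replaces `ink_ratio` with a learned classifier and works at several window sizes, but the scanning loop follows the same principle.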
A service account is a special type of Google account intended to represent a non-human user that needs to authenticate and be authorized to access data in Google APIs. (source: Google Cloud IAM documentation)
In practice, you can think of it as an RSA key (a cryptographic key used for secure machine-to-machine communication over the internet) with which you can authenticate to Google services (APIs, GCS, IAM…). In its basic form it is a JSON file.
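As a rough sketch of how such a JSON key file can be used, the snippet below first checks that the file looks like a service account key and then builds a Vision client from it. The helper names `check_key_file` and `build_vision_client` are my own, and the key path you pass in is whatever file you downloaded from the Google Cloud console. The Google calls assume the `google-auth` and `google-cloud-vision` (2.x or later) packages.

```python
import json

# Fields present in a service account key file downloaded from the Cloud console.
REQUIRED_FIELDS = {"type", "project_id", "private_key_id", "private_key", "client_email"}


def check_key_file(key_path: str) -> str:
    """Validate the JSON key file and return the service account e-mail."""
    with open(key_path, encoding="utf-8") as handle:
        info = json.load(handle)
    missing = REQUIRED_FIELDS - info.keys()
    if missing:
        raise ValueError(f"key file is missing fields: {sorted(missing)}")
    if info["type"] != "service_account":
        raise ValueError(f"unexpected credential type: {info['type']!r}")
    return info["client_email"]


def build_vision_client(key_path: str):
    """Create a Vision client authenticated with the service account key."""
    # Imported here so that check_key_file works without the Google libraries.
    from google.cloud import vision
    from google.oauth2 import service_account

    check_key_file(key_path)
    credentials = service_account.Credentials.from_service_account_file(key_path)
    return vision.ImageAnnotatorClient(credentials=credentials)
```

Keep the key file out of version control: anyone who holds it can act as the service account.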
Here, I will show you the different functions needed to use the API and extract the text from an image automatically.
Libraries to install:
!pip install google-cloud
!pip install google-cloud-storage
!pip install google-cloud-pubsub
!pip install google-cloud-vision
!pip install pdf2image
!pip install google-api-python-client
!pip install google-auth
The libraries used:
from pdf2image import convert_from_bytes
import glob
from tqdm import tqdm
import base64
import json
import os
from io import BytesIO
import numpy as np
import io
from PIL import Image
from google.cloud import pubsub_v1
from google.cloud import vision
from google.oauth2 import service_account
import googleapiclient.discovery
## to see a progress bar on pandas operations (requires pandas to be installed)
tqdm.pandas()
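With these libraries in place, the core of the extraction can be sketched as follows: convert each PDF page to an image, send the image bytes to the Vision API, and read back the detected text. This is a minimal sketch rather than the exact code of the notebook. The function names are my own, `client` is assumed to be an already authenticated `vision.ImageAnnotatorClient`, and the calls assume `google-cloud-vision` 2.x or later plus `pdf2image` (which needs poppler installed on the machine).

```python
from io import BytesIO
from typing import List


def image_to_png_bytes(image) -> bytes:
    """Serialize a PIL image to PNG bytes in memory."""
    buffer = BytesIO()
    image.save(buffer, format="PNG")
    return buffer.getvalue()


def ocr_image_bytes(client, content: bytes) -> str:
    """Send image bytes to the Vision API and return the detected text."""
    # Imported here so the helper above stays usable without the Google libraries.
    from google.cloud import vision

    response = client.document_text_detection(image=vision.Image(content=content))
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.full_text_annotation.text


def ocr_pdf(client, pdf_path: str) -> List[str]:
    """Return the OCR text of each page of a PDF, one string per page."""
    from pdf2image import convert_from_bytes

    with open(pdf_path, "rb") as handle:
        pages = convert_from_bytes(handle.read())  # list of PIL images
    return [ocr_image_bytes(client, image_to_png_bytes(page)) for page in pages]
```

I use `document_text_detection` here because it is the variant aimed at dense text such as scanned documents; `text_detection` is the alternative for short text in photos. Each page is one API call, so large PDFs are billed and rate-limited per page.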