What you see is what you guess

Browsing through the menu of services that Microsoft’s cloud platform offers, I came across Computer Vision (CV). I decided to play with it a bit to get a feel for what the service has to offer.

So I created a free account using my GitHub credentials and was successfully onboarded (link for the US) after the usual verification steps. Then I followed the instructions to create a CV instance and get my API key.

The service analyzes content in images and video either supplied by the user or publicly available on the internet. It is free for the first 12 months, with an upper limit of 5,000 transactions per month at a maximum rate of 20 per minute, which is more than enough to carry out some tests.
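If you plan to send a batch of images, the 20-per-minute rate is the limit you are most likely to hit. Here is a minimal sketch of a client-side throttle, assuming a sliding 60-second window. The class name MinuteThrottle and its parameters are made up for this example and are not part of any Azure SDK.

```python
import time
from collections import deque


class MinuteThrottle:
    """Blocks so that at most `max_calls` happen within any `period` seconds."""

    def __init__(self, max_calls=20, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable, so the class can be tested without waiting
        self.sleep = sleep
        self.calls = deque()    # timestamps of the most recent calls

    def wait(self):
        """Call this right before each request to the service."""
        now = self.clock()
        # Forget calls that have already left the sliding window
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            # Window is full: sleep until the oldest call drops out of it
            self.sleep(self.period - (now - self.calls[0]))
            now = self.clock()
            self.calls.popleft()
        self.calls.append(now)
```

Calling throttle.wait() before every request keeps a loop over many images under the per-minute rate. The monthly quota is not tracked here.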

Technically, the service is delivered through a REST web service, which made me think that I could start testing it pretty fast. And so I did.
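To give an idea of what such a REST call involves, here is a minimal sketch that builds (but does not send) a request for a publicly available image. It assumes the analyze operation accepts a JSON body of the form {"url": ...} for remote images. The endpoint, key and image URL are placeholders, and build_analyze_request is a name invented for this example. It uses only the standard library so it can be tried before installing anything.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_analyze_request(endpoint, api_key, image_url,
                          features="Description", language="en"):
    """Build a POST request asking CV to analyze an image found at a public URL."""
    query = urlencode({"visualFeatures": features, "language": language})
    url = "https://{}/vision/v3.1/analyze?{}".format(endpoint, query)
    # Assumption: remote images are passed as a small JSON document
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {"Ocp-Apim-Subscription-Key": api_key,
               "Content-Type": "application/json"}
    return Request(url, data=body, headers=headers, method="POST")


# Placeholders only: substitute your own endpoint, key and image
req = build_analyze_request("somesubdomain.cognitiveservices.azure.com",
                            "<your-api-key>",
                            "https://example.com/some-public-image.jpg")
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req)
```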

What it delivers

Of the various services within CV, I will focus on image analysis. In this case, CV provides you with the following information:

  • Tags: From a universe of thousands of tags, CV lists the identified object types in your image, such as dog, tree or car.
  • Objects: Whenever possible, CV also provides a list of objects bounded by rectangles in your picture. So if there are three identified bicycles, it will give the coordinates of the bounding box for each of the three.
  • Brands: It can detect logos from a set of thousands of well-known brands.
  • Category: From a fixed list of 86 predefined categories, CV will assign to your picture the category that fits best (example: food_grilled or outdoor_street).
  • Description: It gives you a description of the whole image in the language you select. Actually this is the feature I was interested in, so I will stop the enumeration here. For the full list see this.
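To make the list above more concrete, here is a sketch of how those pieces could be pulled out of a parsed response. The sample dictionary is written by hand in the shape I assume the service returns (field names such as rectangle, score and confidence are assumptions here), and its values are made up for illustration. The summarize helper is hypothetical as well.

```python
# Hand-written sample in the assumed shape of an "analyze" response;
# the values are invented for illustration, not real service output.
sample = {
    "categories": [{"name": "outdoor_street", "score": 0.92}],
    "tags": [{"name": "bicycle", "confidence": 0.98}],
    "objects": [
        {"object": "bicycle", "confidence": 0.91,
         "rectangle": {"x": 10, "y": 20, "w": 100, "h": 60}},
    ],
    "description": {"captions": [{"text": "a bicycle parked on a street",
                                  "confidence": 0.87}]},
}


def summarize(result):
    """Collect caption, tags, categories and bounding boxes, tolerating missing keys."""
    captions = result.get("description", {}).get("captions", [])
    return {
        "caption": captions[0]["text"] if captions else None,
        "tags": [t["name"] for t in result.get("tags", [])],
        "categories": [c["name"] for c in result.get("categories", [])],
        # One (label, x, y, width, height) tuple per detected object
        "boxes": [(o["object"],
                   o["rectangle"]["x"], o["rectangle"]["y"],
                   o["rectangle"]["w"], o["rectangle"]["h"])
                  for o in result.get("objects", [])],
    }


print(summarize(sample))
```

Only the features you ask for come back, so the helper falls back to empty lists instead of failing on a missing key.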

Enough literature. Let us write some Python code and throw some Unsplash images at CV to start the fun.

A simplified Computer Vision client

REST web services can be consumed in basically any general-purpose programming language. We’ll be using Python here with the image above these lines. Have a look and read the comments:

import requests

## connection details
## Replace this silly pun with your API key
azure_cv_api_key = "MyAPI Heat"
## same here: use the endpoint of your own CV instance
azure_cv_endpoint = "somesubdomain.cognitiveservices.azure.com"
azure_cv_resource = "vision/v3.1/analyze"
language = "en"
## We just ask for some features
visual_features = "Objects,Categories,Description"
image_path = "c:/work/images/tobias-adam-Twm64rH8wdc-unsplash.jpg"

azure_cv_url = "https://{}/{}".format(azure_cv_endpoint,
                                      azure_cv_resource)
headers = {'Ocp-Apim-Subscription-Key': azure_cv_api_key,
           'Content-Type': 'application/octet-stream'}

params = {"visualFeatures": visual_features, "language": language}

## We need to read the image as a byte stream
with open(image_path, "rb") as image_file:
    image_data = image_file.read()

response = requests.post(azure_cv_url, params=params, data=image_data, headers=headers)

## Stop here with an exception unless we get a 2xx status
response.raise_for_status()
content = response.json()

## This is where the picture description can be found
print("Description\n{}".format(content["description"]["captions"][0]["text"]))
## Which objects have you found?
## (assumes each object has a parent and a grandparent, as in this image)
for o in content["objects"]:
    print("Object {} Parent {} Grandparent {}".format(
        o["object"], o["parent"]["object"], o["parent"]["parent"]["object"]))

We run it and get:

Description
a baby elephant walks next to its mother
Object African elephant Parent elephant Grandparent mammal
Object African elephant Parent elephant Grandparent mammal

Wow! I know, the bigger elephant could be the father or the aunt, but it sounds really good.
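One caveat about the loop that prints the objects: it takes for granted that every detected object comes with a parent and a grandparent. That held for the two elephants, but I would not count on it for every image. Here is a small sketch of a more tolerant approach that follows the parent links for as long as they exist. The ancestry helper and the sample dictionary are written by hand for this example and only mirror the output above.

```python
def ancestry(detected_object):
    """Return the object's label followed by its parents, most specific first."""
    chain = []
    node = detected_object
    while node is not None:
        chain.append(node["object"])
        node = node.get("parent")  # None once the top of the hierarchy is reached
    return chain


# Hand-written sample mirroring the output above (not a captured response)
sample_object = {
    "object": "African elephant",
    "parent": {"object": "elephant", "parent": {"object": "mammal"}},
}
print(" > ".join(ancestry(sample_object)))
```

In the client, the loop body would then become print(" > ".join(ancestry(o))), which copes with hierarchies of any depth.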
