In this blog post, I will walk through an introduction to instance-level recognition, its use cases and challenges, the currently available datasets, and state-of-the-art results (recent winning solutions) on these challenges and datasets.

Introduction

Instance-level recognition (ILR) is a visual recognition task in which the goal is to recognize a specific instance of an object, not just its object class.

For example, as shown in the above image, painting is an object class, and “Mona Lisa” by Leonardo da Vinci is an instance of that class. Similarly, the Taj Mahal in India is an instance of the object class building.

Use cases

  • Landmark Recognition: Recognize landmarks in images.
  • Landmark Retrieval: Retrieve relevant landmark images from a large-scale database.
  • Artwork Recognition: Recognize artworks in images.
  • Product Retrieval: Retrieve relevant product images from a large-scale database.
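To make the retrieval use cases a bit more concrete, here is a minimal sketch of how retrieval is commonly framed: every image is mapped to an embedding vector, and database images are ranked by their cosine similarity to the query. The embeddings below are made-up toy vectors, and the names `cosine_similarity` and `retrieve` are hypothetical helpers written for this illustration. A real system would take its embeddings from a trained model and would likely use an approximate nearest-neighbor index instead of a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query, database, top_k=2):
    """Return the top_k (image_id, score) pairs, most similar first."""
    scored = [
        (image_id, cosine_similarity(query, embedding))
        for image_id, embedding in database.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy 3-dimensional embeddings, invented for illustration only.
database = {
    "taj_mahal_01": [0.9, 0.1, 0.0],
    "taj_mahal_02": [0.8, 0.2, 0.1],
    "eiffel_tower_01": [0.1, 0.9, 0.2],
}
query = [0.85, 0.15, 0.05]  # pretend this is a new photo of the Taj Mahal

results = retrieve(query, database)
for image_id, score in results:
    print(image_id, round(score, 3))
```

Recognition can be framed on top of the same ranking, for example by assigning the query the label of its most similar database images.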

Challenges

  • Large scale: Most current state-of-the-art recognition results are measured on a very limited number of categories, e.g. ~1,000 image classes in ImageNet and ~80 categories in COCO. Use cases like landmark retrieval and recognition, however, involve 200K+ classes, e.g. in Google Landmarks Dataset v2 (GLDv2), and there are 100K+ product classes on Amazon.
  • Long-tailed distribution: A few popular places have more than 1,000 images, but many lesser-known places have fewer than 5 images in GLDv2.

Google Landmarks Dataset v2 (GLDv2) Class Distribution, Image from https://arxiv.org/pdf/2004.01804.pdf

  • Intra-class variability: Landmarks are mostly spread across a wide region and have very high intra-class variability, as shown in the image below.


Images from Google Landmarks Dataset v2 (GLDv2)

  • Noisy labels: The success of machine learning models depends on high-quality labeled training data, and label errors can greatly reduce a model’s performance. Unfortunately, noisy labels are part of large training sets and need additional learning steps.
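The long-tailed distribution described above is easy to quantify once you have the list of training labels. The snippet below is a minimal sketch on a made-up label list, not the real GLDv2 metadata. The helper name `tail_fraction` and the threshold of 5 images are assumptions chosen to mirror the "fewer than 5 images" observation.

```python
from collections import Counter

def tail_fraction(labels, max_images=5):
    """Fraction of classes that have fewer than max_images images."""
    counts = Counter(labels)
    rare = sum(1 for n in counts.values() if n < max_images)
    return rare / len(counts)

# Made-up labels: one popular landmark and three rarely photographed ones.
labels = (
    ["landmark_a"] * 1200
    + ["landmark_b"] * 3
    + ["landmark_c"] * 2
    + ["landmark_d"] * 4
)

counts = Counter(labels)
fraction = tail_fraction(labels)

print(counts.most_common(1))  # the single most frequent class
print(fraction)               # share of classes with fewer than 5 images
```

On this toy list, most classes are rare even though almost all images belong to one class, which is the imbalance a model trained on such data has to cope with.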

