The first step is usually to look for existing state-of-the-art datasets that fit our task, but we don’t always find what we need. At that point we face a painful, manual task: building the dataset ourselves. This article shows how to avoid most of that manual work.

Using Selenium [1] (an open-source browser automation tool), Instagram (indirectly one of the largest image databases in the world) and YOLO [2] (one of the most widely used deep learning algorithms for object detection), we can generate a dataset automatically; the only step that cannot be avoided is manual validation. As a simple example, we will generate a dataset with two classes: cat and dog.
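
To make the idea concrete, here is a minimal sketch of the sorting step only: given images that have already been downloaded, a detector decides which class folder each one belongs to. Everything named here is an assumption for illustration and is not taken from the article: the `assign_classes` helper, the `(label, confidence)` detection format, the 0.5 threshold, and the rule of keeping only images with exactly one target class. The `fake_detect` function stands in for a real YOLO model so that the example runs on its own.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed detection format: (class label, confidence score).
Detection = Tuple[str, float]


def assign_classes(
    image_paths: Iterable[str],
    detect: Callable[[str], List[Detection]],
    target_classes: Tuple[str, ...] = ("cat", "dog"),
    min_confidence: float = 0.5,
) -> Dict[str, List[str]]:
    """Group image paths by the single target class detected in each image."""
    dataset: Dict[str, List[str]] = {label: [] for label in target_classes}
    for path in image_paths:
        # Collect the target classes detected with enough confidence.
        found = {
            label
            for label, score in detect(path)
            if label in dataset and score >= min_confidence
        }
        # Assumption: keep only unambiguous images (exactly one target class).
        if len(found) == 1:
            dataset[found.pop()].append(path)
    return dataset


def fake_detect(path: str) -> List[Detection]:
    """Hypothetical stand-in for a YOLO model, returning canned detections."""
    canned = {
        "img_001.jpg": [("cat", 0.91)],
        "img_002.jpg": [("dog", 0.84), ("person", 0.77)],
        "img_003.jpg": [("cat", 0.88), ("dog", 0.80)],  # ambiguous, dropped
        "img_004.jpg": [("cat", 0.30)],  # below the threshold, dropped
    }
    return canned.get(path, [])


paths = ["img_001.jpg", "img_002.jpg", "img_003.jpg", "img_004.jpg", "img_005.jpg"]
dataset = assign_classes(paths, fake_detect)
print(dataset)  # {'cat': ['img_001.jpg'], 'dog': ['img_002.jpg']}
```

In a real pipeline, `fake_detect` would be replaced by a call into an actual YOLO model, and the resulting groups would still need the manual validation pass mentioned above.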

#selenium #web-scraping #dataset

Web Scraping using Selenium and YOLO to build Computer Vision Datasets