Semantic segmentation is the task of predicting the class of each pixel in an image. It is harder than object detection, which only requires predicting a bounding box around each object. It is slightly easier than instance segmentation, which requires not only predicting the class of each pixel but also distinguishing between multiple instances of the same class. The picture below shows the result that we are trying to get.
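
To make the difference between a per-pixel prediction and a bounding box concrete, here is a minimal sketch using NumPy. The score values, the 4x5 image size and the two-class layout (0 = background, 1 = hand) are made up for illustration and are not taken from any dataset.

```python
import numpy as np

# Toy per-class score maps with shape (classes, height, width).
# Assumption for illustration: class 0 = background, class 1 = hand.
scores = np.zeros((2, 4, 5))
scores[0] = 0.5              # constant background score everywhere
scores[1, 1:3, 1:4] = 1.0    # a 2x3 region where "hand" scores higher
scores[1, 1, 3] = 0.0        # one pixel inside that region is background

# Semantic segmentation output: one class index per pixel.
mask = scores.argmax(axis=0)

# An object detector would only report the enclosing box of that region.
rows, cols = np.nonzero(mask)
box = (rows.min(), cols.min(), rows.max(), cols.max())  # (top, left, bottom, right)

hand_pixels = int(mask.sum())
box_area = (box[2] - box[0] + 1) * (box[3] - box[1] + 1)
print(mask)
print("box:", box, "hand pixels:", hand_pixels, "box area:", box_area)
```

The mask records exactly which pixels belong to the hand, while the box also covers the background pixel in its corner. That extra per-pixel detail is why segmentation is the harder task.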


A sample of semantic hand segmentation. (images from HOF dataset[1])

Here we will get a quick and easy hand segmentation program up and running, using PyTorch and its predefined models.

We will not design our own neural network; instead we will use DeepLabv3 with a ResNet50 backbone from PyTorch’s model repository. We will then train the model on a combined dataset made up of the EGO Hands[2], GTEA[3] and Hand over Face[1] datasets. Together they provide roughly 28k images with their segmentation masks, about 2.1 GB of data. Finally, we will write some functions that use the model to segment hands in real time with OpenCV.

#pytorch #opencv #data-science #machine-learning #python

Semantic Hand Segmentation using PyTorch