Image segmentation has been a hot topic for a while now. Use cases involving segmentation have emerged in many different areas: machine vision, medical imaging, object detection, recognition tasks, traffic control systems, video surveillance, and a lot more. The idea behind these intelligent systems is to capture the distinct components that make up an image, and in doing so teach computer vision models to gain a better understanding of the scene and its context.
Original Photo by Melody Jacob on Unsplash on the left, segmented version on the right
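The two types of image segmentation commonly used are semantic segmentation, which assigns a class label to every pixel, and instance segmentation, which additionally tells apart individual objects of the same class.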
For this article, I will use the PyTorch implementation of Google's DeepLab V3 segmentation model to customize the background of an image. The goal is to segment the foreground, detach it from the rest of the image, and replace the remaining background with an entirely different picture. The model will be served through a Django REST API.
You can find the entire code for this project in my GitHub repo.
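To make the background replacement idea concrete, here is a minimal sketch of the compositing step only. It is not the code from the repo. It assumes the segmentation model has already produced a per-pixel class map (for example, by taking the argmax over the class dimension of the model output), and that the foreground is labelled with class index 15, which is "person" in the 21-class Pascal VOC label set. The function name `replace_background` and the toy arrays are hypothetical and used only for illustration.

```python
import numpy as np

# Assumption: index of "person" in the 21-class Pascal VOC label set.
PERSON_CLASS = 15


def replace_background(image, background, class_map, foreground_class=PERSON_CLASS):
    """Keep pixels labelled `foreground_class`; take all others from `background`.

    image, background: uint8 arrays of shape (H, W, 3)
    class_map: integer array of shape (H, W), one class index per pixel
    """
    if image.shape != background.shape:
        raise ValueError("image and background must have the same shape")
    if class_map.shape != image.shape[:2]:
        raise ValueError("class_map must match the image height and width")
    # Boolean foreground mask of shape (H, W).
    mask = class_map == foreground_class
    # Add a trailing axis so the mask broadcasts over the three colour channels.
    return np.where(mask[..., None], image, background)


# Toy 2x2 example: the left column is "person", the right column is background.
image = np.full((2, 2, 3), 200, dtype=np.uint8)
background = np.full((2, 2, 3), 10, dtype=np.uint8)
class_map = np.array([[15, 0], [15, 0]])
composite = replace_background(image, background, class_map)
```

In a real pipeline the new background would first be resized to the dimensions of the input image, and the hard boolean mask could be softened at the edges to avoid a cut-out look.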
Segmentation models use fully convolutional neural networks (**FCNN**) during an initial detection stage in which masks and boundaries are put in place. The inputs are then processed through a very deep network, where the accumulated convolutions and pooling layers significantly reduce the image's resolution and quality, so results are produced with a high loss of information. DeepLab models address this challenge with atrous (dilated) convolutions and Atrous Spatial Pyramid Pooling (ASPP). An atrous convolution spaces its kernel taps apart by a dilation rate, which enlarges the receptive field without any further downsampling, and ASPP applies several such convolutions at different rates in parallel to capture context at multiple scales.
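The effect of an atrous convolution is easiest to see in one dimension. The sketch below is a plain-Python illustration, not DeepLab's implementation: the helper `atrous_conv1d` is a hypothetical name, and it computes a "valid" cross-correlation (the convention used in deep learning) whose kernel taps are spaced `rate` samples apart. In PyTorch the same idea is exposed through the `dilation` argument of `torch.nn.Conv2d`.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """'Valid' 1-D atrous (dilated) convolution, in cross-correlation form.

    Kernel taps are applied `rate` samples apart, so the kernel spans
    (len(kernel) - 1) * rate + 1 input samples without adding any weights.
    """
    if rate < 1:
        raise ValueError("rate must be >= 1")
    span = (len(kernel) - 1) * rate + 1
    out = []
    for start in range(len(signal) - span + 1):
        # Tap i reads the sample located i * rate positions after `start`.
        out.append(sum(w * signal[start + i * rate] for i, w in enumerate(kernel)))
    return out


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]  # simple difference filter with three weights

dense = atrous_conv1d(signal, kernel, rate=1)   # each output sees 3 input samples
atrous = atrous_conv1d(signal, kernel, rate=2)  # each output sees 5 input samples
```

With the same three weights, the rate-2 version compares samples four positions apart instead of two, so each output covers a wider context. This is how DeepLab enlarges the receptive field without stacking more pooling layers. ASPP can be thought of as running several such filters with different rates side by side and combining their outputs.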
#deeplab #computer-vision #deep-learning #artificial-intelligence #semantic-segmentation