As I wrote in the previous article of this series, focal loss is a more focused version of cross entropy loss. In semantic segmentation, focal loss helps the model concentrate on pixels it has not yet learned to classify well, which makes training more effective and purposeful than with plain cross entropy loss. I recommend reading that article first if you haven't already.

Demystifying Focal Loss I: A More Focused Version of Cross Entropy Loss

In this article, I'd like to discuss a variant of focal loss that can be used as a distance-aware cross entropy loss in semantic segmentation, especially when the labels are sparse.

The issue with standard cross entropy loss

Cross entropy loss is typically used in semantic segmentation, in which each pixel of the image is labeled with a category number, and a model is trained to predict that number for each pixel.
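To make the per-pixel formulation concrete, here is a minimal sketch in plain Python. The function name `pixelwise_cross_entropy` and the tiny 2 x 2 example are assumptions made for illustration only; a real pipeline would use a deep learning framework's built-in loss instead.

```python
import math

def pixelwise_cross_entropy(probs, labels):
    """Mean cross entropy over all pixels (illustrative helper, not a library API).

    probs:  H x W x C nested lists; probs[i][j] is the predicted class
            distribution for pixel (i, j) and is assumed to sum to 1.
    labels: H x W nested lists of integer category numbers.
    """
    eps = 1e-12  # guards against log(0)
    total, count = 0.0, 0
    for prob_row, label_row in zip(probs, labels):
        for pixel_probs, label in zip(prob_row, label_row):
            # Each pixel contributes -log of the probability of its true class.
            total += -math.log(max(pixel_probs[label], eps))
            count += 1
    return total / count

# Hypothetical 2 x 2 image with 3 categories.
labels = [[0, 1], [2, 1]]
probs = [
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.2, 0.2, 0.6], [0.3, 0.4, 0.3]],
]
loss = pixelwise_cross_entropy(probs, labels)
print(f"mean per-pixel cross entropy: {loss:.4f}")
```

Note that every pixel is scored independently: the loss of a pixel depends only on its own label and its own prediction.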

This per-pixel prediction approach is used in almost all academic papers on semantic segmentation. However, it has an issue that can hurt convergence and is seldom discussed. The sparser the labels, the more significant the issue becomes.

Let's take binary classification as an example; the extension to multi-class classification is straightforward. Consider Figure 1, in which the blue pixel is labeled 1 and all other pixels are labeled 0. This label image is extremely sparse and heavily imbalanced between foreground and background, because only one pixel belongs to the foreground class.

Figure 1: a labeled image for binary classification
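The sparsity of such a label image can be quantified with a short sketch. The 9 x 9 size, the position of the foreground pixel, and the constant prediction of 0.01 are all assumptions chosen for illustration; they are not taken from the figure.

```python
import math

def binary_cross_entropy(p, y, eps=1e-12):
    """Cross entropy for one pixel; p is the predicted foreground probability."""
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

# Hypothetical 9 x 9 label image with a single foreground pixel at the center.
size = 9
labels = [[0] * size for _ in range(size)]
labels[4][4] = 1

num_pixels = size * size
foreground_fraction = sum(map(sum, labels)) / num_pixels  # 1 out of 81 pixels

# Hypothetical prediction: the model outputs probability 0.01 for every pixel,
# which means it misses the only foreground pixel.
pred = 0.01
losses = [binary_cross_entropy(pred, y) for row in labels for y in row]
mean_loss = sum(losses) / num_pixels

print(f"foreground fraction: {foreground_fraction:.4f}")
print(f"mean cross entropy:  {mean_loss:.4f}")
```

Under these assumptions the mean loss stays small even though the single foreground pixel is predicted wrongly, which is one way to see how imbalanced such a label image is.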

Now imagine that Figure 1 is fed to our model, and we get an output image in which a label is predicted for every pixel. Figure 2 shows this conceptually: from all the output pixels, we pick two for discussion. These two pixels are colored green, and the labeled blue pixel is kept in the figure for ease of explanation.

Understanding Focal Loss for Pixel-level Classification