Deep neural networks (DNNs) are easy-to-implement, versatile machine learning models that can achieve state-of-the-art performance in many domains (for example, computer vision, natural language processing, speech recognition, and recommendation systems). DNNs, however, are not perfect. You can read any number of articles, blog posts, and books discussing the various problems with supervised deep learning. In this article we’ll focus on a (relatively) narrow but major issue: the inability of a standard DNN to reliably signal when it is uncertain about a prediction. For a Rumsfeldian take on it: the inability of DNNs to know “known unknowns.”

As a simple example of this failure mode, consider training a DNN for a binary classification task. You might reasonably presume that the softmax (or sigmoid) output of a DNN could be used to measure how certain or uncertain it is in its prediction: a softmax output close to 0 or 1 would indicate certainty, and an output close to 0.5 would indicate uncertainty. In reality, the softmax outputs are rarely close to 0.5; more often than not they saturate near 0 or 1, regardless of whether the DNN is making a correct prediction. Unfortunately, this makes naive uncertainty estimates, such as the entropy of the softmax outputs, unreliable.
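To make this naive estimate concrete, here is a minimal PyTorch sketch (the function name and the logit values are illustrative, not from the original article) that computes the entropy of the softmax outputs. The problem described above is that, in practice, trained networks tend to produce the saturated, near-zero-entropy case whether or not the prediction is correct.

```python
import torch
import torch.nn.functional as F

def softmax_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Entropy (in nats) of the softmax distribution over the last dim."""
    log_probs = F.log_softmax(logits, dim=-1)  # numerically stable log-softmax
    probs = log_probs.exp()
    return -(probs * log_probs).sum(dim=-1)

# Hypothetical raw outputs from a binary classifier:
confident_logits = torch.tensor([[9.0, -9.0]])   # softmax ~ [1.0, 0.0]
uncertain_logits = torch.tensor([[0.1, -0.1]])   # softmax ~ [0.55, 0.45]

print(softmax_entropy(confident_logits))  # ~0.0 nats: looks "certain" (even if wrong)
print(softmax_entropy(uncertain_logits))  # ~0.69 nats, near log(2): "uncertain"
```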

To be fair, uncertainty estimates are not needed for every application of a DNN. If a social media company uses a DNN to detect faces in images so that its users can more easily tag their friends, and the DNN fails, then the failure of the method is nearly inconsequential. A user might be slightly inconvenienced, but in low-stakes environments like social media or advertising, uncertainty estimates aren’t vital to creating value from a DNN.

In high-stakes environments, however, like self-driving cars, health care, or military applications, a measure of how uncertain the DNN is in its prediction could be vital. Uncertainty measurements can reduce the risk of deploying a model because they can alert a user that a scenario is either inherently difficult to make predictions in or unlike anything the model has seen before.

In a self-driving car, it seems plausible that a DNN should be more uncertain about predictions at night (at least in the measurements coming from optical cameras) because of the lower signal-to-noise ratio. In health care, a DNN that diagnoses skin cancer should be more uncertain when shown a particularly blurry image, especially if the model had not seen such blurry examples in the training set. A DNN that segments satellite imagery should be more uncertain if an adversary changed how certain military installations are disguised. If the uncertainty inherent in these situations were relayed to the user, the information could be used to change the behavior of the system in a safer way.

In this article, we explore how to estimate two types of statistical uncertainty alongside a prediction in a DNN. We first discuss the definition of both types of uncertainty, and then we highlight one popular and easy-to-implement technique for estimating them. Finally, we show and implement some examples for both classification and regression that make use of these uncertainty estimates.

For those who are most interested in looking at code examples, here are two Jupyter notebooks: one with a toy regression example and the other with a toy classification example. There are also PyTorch-based code snippets in the “Examples and Applications” section below.

What do we mean by ‘uncertainty’?

Uncertainty is defined by the Cambridge Dictionary as “a situation in which something is not known.” There are several reasons why something may not be known, and, taking a statistical perspective, we will discuss two types of uncertainty: _aleatory_ (sometimes referred to as _aleatoric_) uncertainty and _epistemic_ uncertainty.
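For readers who want the statistical perspective made precise up front, one standard way to formalize the split, common in the Bayesian deep-learning literature though not spelled out in this section, is to decompose the total predictive uncertainty into an aleatory term and an epistemic term:

```latex
% H denotes Shannon entropy, I mutual information,
% \theta the network weights, and \mathcal{D} the training data.
\underbrace{\mathbb{H}\big[\,p(y \mid x, \mathcal{D})\,\big]}_{\text{total predictive uncertainty}}
  = \underbrace{\mathbb{E}_{p(\theta \mid \mathcal{D})}\,
      \mathbb{H}\big[\,p(y \mid x, \theta)\,\big]}_{\text{aleatory (noise in the data)}}
  + \underbrace{\mathbb{I}\big[\,y \,;\, \theta \mid x, \mathcal{D}\,\big]}_{\text{epistemic (uncertainty about the model)}}
```

Intuitively, the aleatory term is the uncertainty that remains even if we knew the weights exactly, while the epistemic term measures how much the predictions disagree across plausible weight settings.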

