ROC Curve and AUC — Explained

The ROC (receiver operating characteristic) curve and AUC (area under the curve) are performance measures that provide a comprehensive evaluation of classification models.

The ROC curve summarizes performance by combining the confusion matrices at all threshold values. **AUC** turns the ROC curve into a single number that describes the performance of a binary classifier. It is the area under the ROC curve and takes a value between 0 and 1. AUC indicates how well a model separates the positive and negative classes.
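To make the idea concrete, here is a minimal sketch in plain Python, not a definitive implementation. It assumes the usual definition of the ROC curve as the true positive rate, TP / (TP + FN), plotted against the false positive rate, FP / (FP + TN), at every threshold, and it approximates AUC with the trapezoidal rule. The function names `roc_points` and `auc`, as well as the sample labels and scores, are hypothetical and made up for illustration.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per candidate threshold.

    Assumes y_true holds 0/1 labels and contains at least one of each class.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    # Candidate thresholds: +inf (nothing predicted positive), then every
    # distinct score from highest to lowest.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for t in thresholds:
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the piecewise-linear curve, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels (1 = positive) and predicted probabilities.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

points = roc_points(y_true, scores)
roc_auc = auc(points)
print(points)
print(round(roc_auc, 3))
```

In this toy data only one positive observation is scored below a negative one, so the AUC comes out close to, but below, 1.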

Before going into detail, let’s first explain the confusion matrix and how different threshold values change its outcome.

A confusion matrix is not a metric to evaluate a model, but it provides insight into the predictions. It goes deeper than classification accuracy by showing the correct and incorrect (i.e. true or false) predictions for each class. For a binary classification task, the confusion matrix is a 2x2 matrix. If there are three classes, it is a 3x3 matrix, and so on.


Confusion matrix of a binary classification (Image by author)

Let’s assume class A is the positive class and class B is the negative class. The key terms of a confusion matrix are as follows:

True positive (TP): Predicting positive class as positive (ok)
False positive (FP): Predicting negative class as positive (not ok)
False negative (FN): Predicting positive class as negative (not ok)
True negative (TN): Predicting negative class as negative (ok)
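The four terms above can be counted directly from a list of actual and predicted labels. The following is a minimal sketch in plain Python; the helper name `confusion_counts` and the sample labels are hypothetical, and class A is treated as the positive class, as assumed above.

```python
def confusion_counts(y_true, y_pred, positive="A"):
    """Count TP, FP, FN and TN, treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # positive predicted as positive
            else:
                fp += 1  # negative predicted as positive
        else:
            if actual == positive:
                fn += 1  # positive predicted as negative
            else:
                tn += 1  # negative predicted as negative
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Hypothetical actual and predicted classes for eight observations.
y_true = ["A", "A", "A", "B", "B", "B", "B", "A"]
y_pred = ["A", "B", "A", "B", "A", "B", "B", "A"]

counts = confusion_counts(y_true, y_pred)
print(counts)
```

The four counts always sum to the number of observations, which is a quick sanity check on any confusion matrix.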

Algorithms like logistic regression return probabilities rather than discrete outputs. We set a threshold value on the probabilities to distinguish the positive and negative classes. Depending on the threshold value, the predicted class of some observations may change.


How threshold value can change the predicted class (Image by author)

As we can see from the image above, adjusting the threshold value changes the prediction and thus results in a different confusion matrix. When the elements in a confusion matrix change, **precision** and **recall** also change.
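The effect of the threshold can be reproduced with a few lines of plain Python. This is only a sketch under stated assumptions: the probabilities and labels are invented, the helper names `predict_with_threshold` and `confusion_counts` are hypothetical, and an observation is labeled positive when its probability is greater than or equal to the threshold.

```python
def predict_with_threshold(probabilities, threshold):
    """Label an observation positive (1) when its probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN and TN for 0/1 labels (1 = positive)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["TP" if actual == 1 else "FP"] += 1
        else:
            counts["FN" if actual == 1 else "TN"] += 1
    return counts


# Hypothetical true labels and predicted probabilities of the positive class.
y_true = [1, 0, 1, 1, 0, 0]
probabilities = [0.95, 0.65, 0.55, 0.45, 0.35, 0.10]

results = {}
for threshold in (0.4, 0.6):
    y_pred = predict_with_threshold(probabilities, threshold)
    results[threshold] = confusion_counts(y_true, y_pred)
    print(threshold, results[threshold])
```

Raising the threshold from 0.4 to 0.6 turns two true positives into false negatives in this toy data, while the false positive and true negative counts stay the same.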

Precision and recall take classification accuracy one step further and give us a more specific understanding of how the model performs.

The focus of precision is positive predictions. It indicates how many of the positive predictions are true.
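In terms of the confusion matrix, precision is TP / (TP + FP), the share of positive predictions that are actually positive. A minimal sketch follows; the function name `precision` and the counts passed to it are hypothetical, and returning 0.0 when there are no positive predictions is a convention assumed here rather than a universal rule.

```python
def precision(tp, fp):
    """Fraction of positive predictions that are truly positive."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        # Assumption: report 0.0 when the model predicts no positives at all.
        return 0.0
    return tp / predicted_positive


# Hypothetical counts: 3 true positives and 1 false positive.
p = precision(tp=3, fp=1)
print(p)
```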


towardsdatascience.com
