Hierarchical Performance Metrics and Where to Find Them

Hierarchical machine learning models are one top-notch trick. As discussed in previous posts, considering the natural taxonomy of the data when designing our models can be well worth our while. Instead of flattening out and ignoring those inner hierarchies, we’re able to use them, making our models smarter and more accurate.

“More accurate”, I say. Are they, though? How can we tell? We are people of science, after all, and we expect bold claims to be supported by the data. This is why we have performance metrics. Whether it’s precision, F1-score, or any other lovely metric we’ve got our eye on, if using hierarchy in our models improves their performance, the metrics should show it.

Problem is, if we use regular performance metrics — the ones designed for flat, one-level classification — we go back to ignoring that natural taxonomy of the data.
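To make that concrete, here is a minimal sketch of what "ignoring the taxonomy" looks like in practice. The mini-taxonomy in `PARENT` and the helper names (`ancestors`, `flat_accuracy`, `shared_ancestors`) are made up for illustration, not taken from any library. Counting shared ancestors is only a quick way to show what a flat score misses; it is not meant to be one of the metrics this post goes on to cover.

```python
# Hypothetical child -> parent map, assumed purely for illustration.
PARENT = {
    "Cat": "Pets",
    "Dog": "Pets",
    "Mythical": "Pets",
    "Siamese": "Cat",
    "Persian": "Cat",
    "Labrador": "Dog",
    "Pegasus": "Mythical",
}


def ancestors(label):
    """Return the label and all of its ancestors, excluding the root."""
    path = set()
    while label in PARENT:
        path.add(label)
        label = PARENT[label]
    return path


def flat_accuracy(y_true, y_pred):
    """Plain accuracy: a prediction is either exactly right or simply wrong."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def shared_ancestors(true_label, predicted_label):
    """Count how many taxonomy nodes the two labels have in common."""
    return len(ancestors(true_label) & ancestors(predicted_label))


y_true = ["Siamese", "Siamese"]
near_miss = ["Siamese", "Persian"]  # wrong breed, but still a cat
far_miss = ["Siamese", "Pegasus"]   # not even the right branch

# The flat metric cannot tell the two mistakes apart.
print(flat_accuracy(y_true, near_miss), flat_accuracy(y_true, far_miss))

# The taxonomy can: Persian shares the "Cat" node with Siamese, Pegasus shares nothing.
print(shared_ancestors("Siamese", "Persian"), shared_ancestors("Siamese", "Pegasus"))
```

Both sets of predictions get the same flat accuracy, even though calling a Siamese a Persian is a much smaller mistake than calling it a Pegasus. That gap is exactly what a hierarchy-aware metric should capture.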

If we do hierarchy, let’s do it all the way. If we’ve decided to celebrate our data’s taxonomy and build our model in its image, this needs to also be a part of measuring its performance.

How do we do this? The answer lies below.

Before We Dive In

This post is about measuring the performance of machine learning models designed for hierarchical classification. It kind of assumes you know what all those words mean. If you don’t, check out my previous posts on the topic. Especially the one introducing the subject. Really. You’re gonna want to know what hierarchical classification is before learning how to measure it. That’s kind of an obvious one.

Throughout this post, I’ll be giving examples based on this taxonomy of common house pets:

[Figure: The taxonomy of common house pets. My neighbor just adopted the cutest baby Pegasus.]

Oh So Many Metrics

So we’ve got a whole ensemble of hierarchically-structured local classifiers, ready to do our bidding. How do we evaluate them?

That is not a trivial problem, and the solution is not obvious. As we’ve seen in previous posts in this series, different projects require different treatment. The best metric could differ depending on the specific requirements and limitations of your project.

All in all, there are three main options to choose from. Let’s introduce them, shall we?

The contestants, in all their grace and glory:
