Introduction to Bagged Trees

Before diving into the specifics, it’s important that you have a foundational understanding of decision trees.

Bagged trees and decision trees share many similarities, from how each algorithm is evaluated to how the algorithms themselves work.

If you aren’t already familiar with decision trees, I’d recommend a quick refresher before continuing.

With that said, get ready to become a bagged tree expert! Bagged trees are famous for improving on the predictive capability of a single decision tree, and they are an incredibly useful algorithm for your machine learning tool belt.

What are Bagged Trees & What Makes Them So Effective?

Why use bagged trees

The main idea behind bagged trees is that rather than depending on a single decision tree, you depend on many decision trees and combine their predictions, which allows you to leverage the insight of many models.
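To make the idea concrete, here is a minimal sketch in plain Python. It assumes the standard meaning of bagging (bootstrap aggregating): each tree is trained on a random sample of the rows drawn with replacement, and the final prediction is a majority vote. To keep it short, each "tree" is a one-split decision stump standing in for a full decision tree. The function names (`fit_stump`, `fit_bagged_trees`, `predict_bagged`) and the toy data are illustrative assumptions, not taken from any library.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-split tree: pick the (feature, threshold) with the fewest training errors."""
    majority = Counter(y).most_common(1)[0][0]
    # Fallback used when no valid split exists: always predict the majority label.
    best = (len(y) + 1, 0, float("inf"), majority, majority)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            if not right:
                continue  # every row fell on the left, so this is not a split
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    return best[1:]  # (feature, threshold, left_label, right_label)


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def fit_bagged_trees(X, y, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return trees


def predict_bagged(trees, row):
    """Aggregate the individual predictions with a majority vote."""
    votes = Counter(predict_stump(tree, row) for tree in trees)
    return votes.most_common(1)[0][0]


# Toy one-feature data (hypothetical): small values are class 0, large values are class 1.
X = [[0.1], [0.2], [0.3], [0.4], [0.6], [0.7], [0.8], [0.9]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
trees = fit_bagged_trees(X, y)
print(predict_bagged(trees, [0.0]), predict_bagged(trees, [1.0]))
```

For a regression problem, the vote would typically be replaced by an average of the individual predictions.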

Bias-variance trade-off

When considering the performance of a model, we often consider what’s known as the bias-variance trade-off. Variance has to do with how sensitive our model is to small fluctuations in the training data and how much those fluctuations throw off its predictions; high variance results in over-fitting. Bias results in under-fitting: the model effectively makes incorrect assumptions about the relationships between variables.

You could say the issue with high variance is that while your model may be directionally correct on average, any single fit of it is not very accurate. A very biased model, on the other hand, may show little variation yet be directionally incorrect entirely.

The biggest issue with decision trees in general is that they have high variance. The problem this presents is that any minor change to the data can result in major changes to the model and its future predictions.
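As a rough illustration of that instability, the sketch below fits a one-split tree to a contrived six-row dataset and then refits it after dropping a single row. The dataset, the error-count splitting rule, and the helper names (`fit_stump`, `predict_stump`) are all assumptions chosen for illustration; real decision tree implementations use criteria such as Gini impurity, but the effect is similar in spirit.

```python
from collections import Counter


def fit_stump(X, y):
    """Fit a one-split tree: pick the (feature, threshold) with the fewest training errors."""
    majority = Counter(y).most_common(1)[0][0]
    best = (len(y) + 1, 0, float("inf"), majority, majority)  # fallback: no split
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            if not right:
                continue  # not a real split
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if errors < best[0]:  # ties keep the earlier candidate
                best = (errors, j, t, l_lab, r_lab)
    return best[1:]  # (feature, threshold, left_label, right_label)


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


# Contrived data (hypothetical): two features, binary label.
X = [[2, 1], [3, 2], [4, 7], [1, 4], [5, 5], [6, 6]]
y = [0, 0, 0, 1, 1, 1]

full_tree = fit_stump(X, y)
# Drop only the third row and refit.
reduced_tree = fit_stump(X[:2] + X[3:], y[:2] + y[3:])

print(full_tree)     # (0, 4, 0, 1): splits on feature 0 at threshold 4
print(reduced_tree)  # (1, 2, 0, 1): splits on feature 1 at threshold 2

query = [5, 1.5]
print(predict_stump(full_tree, query), predict_stump(reduced_tree, query))  # 1 0
```

Removing one row changes which feature the tree splits on, and the two trees disagree on the same new observation.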

The reason this comes into play here is that one of the main benefits of bagged trees is that they help reduce variance while holding bias roughly constant.
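One common way to see why averaging helps is the variance of a mean. As a sketch under simplifying assumptions, suppose we average $B$ trees whose predictions $\hat{f}_b(x)$ at a point $x$ each have variance $\sigma^2$ and pairwise correlation $\rho$ (these symbols are introduced here for illustration and do not appear in the original text):

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B}\hat{f}_b(x)\right)
  = \frac{1}{B^2}\left(B\sigma^2 + B(B-1)\rho\sigma^2\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2
```

The second term shrinks as $B$ grows, so adding trees lowers the variance of the averaged prediction, while the first term shows that the less correlated the trees are, the larger the reduction. Averaging leaves the expected prediction unchanged, which is why bias stays about the same.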

Why not use bagged trees

One of the main issues with bagged trees is that they are incredibly difficult to interpret. In the decision trees lesson, we learned that a major benefit of decision trees is that they are relatively easy to interpret. Bagged trees are the opposite in this regard, as their process adds complexity. I’ll explain that in more depth shortly.

