This article presents the code and the intuition behind the regression tree algorithm in Python. I find that reading through the code of an algorithm is a very good way to understand what is happening under the hood. I hope that readers will find this useful too.

A regression tree is used to model the relationship between a continuous variable Y and a set of features X:

Y = f(X)

The function f is a set of rules, defined on features and feature values, that does the “best” job of explaining the Y variable given the features X.
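
To make the idea of "a set of rules" concrete, here is a minimal hand-written sketch of what such a function f could look like once it has been learned. The feature names (`horsepower`, `weight`), the split values, and the predicted numbers are all made up for illustration; they do not come from any real dataset or fitted model.

```python
def f(x):
    """Toy rule set: map a dict of feature values to a prediction for Y.

    Every feature name, threshold, and returned value here is hypothetical.
    """
    if x["horsepower"] <= 100:       # rule on the first feature
        if x["weight"] <= 2500:      # rule on the second feature
            return 30.0              # prediction for this group of observations
        return 24.0
    return 16.0


# Each observation follows the rules until it reaches a prediction.
light_car = {"horsepower": 90, "weight": 2000}
heavy_car = {"horsepower": 150, "weight": 4000}
print(f(light_car), f(heavy_car))
```

Each `if` statement is one rule made of a feature and a feature value, and every path through the rules ends in a single number that serves as the prediction for Y.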

