The decision tree is a widely used supervised learning algorithm suitable for both classification and regression tasks. Decision trees also serve as the building blocks of prominent ensemble methods such as random forests, gradient-boosted decision trees (GBDT), and XGBoost.

A decision tree is built by iteratively asking questions that partition the data. For instance, the following figure represents a decision tree used as a model to predict customer churn.

[Figure: a decision tree for predicting customer churn]
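To make the idea concrete, here is a minimal sketch of fitting such a tree with scikit-learn. The feature names and data below are invented for illustration; only the API calls are real:

```python
# A minimal sketch: fit a decision tree on a tiny, hypothetical churn dataset.
from sklearn.tree import DecisionTreeClassifier

# Hypothetical features: [tenure_months, monthly_charges]
X = [[2, 80.0], [48, 25.0], [6, 95.0], [60, 30.0], [3, 70.0], [36, 40.0]]
y = [1, 0, 1, 0, 1, 0]  # 1 = churned, 0 = stayed

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Each internal node of the fitted tree encodes one learned "question",
# e.g. "is tenure_months <= some threshold?", splitting the data in two.
print(tree.predict([[4, 85.0]]))  # expected to predict churn (1) here
```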

Decision trees are prevalent in machine learning due to their success as well as their simplicity. Some of the features that make them so effective:

  • Easy to understand and interpret
  • Able to handle both numerical and categorical data
  • Require little or no preprocessing, such as normalization or dummy encoding (see the sketch after this list)
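A brief sketch of the preprocessing point: because a tree splits on per-feature thresholds, rescaling a feature does not change which partitions are optimal. (Note that scikit-learn's implementation accepts only numeric arrays, so categorical features would still need encoding there.)

```python
# Sketch: trees split on thresholds, so feature scale should not matter.
# Fitting on raw vs. standardized features yields the same tree structure.
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

raw = DecisionTreeClassifier(random_state=0).fit(X, y)
scaled = DecisionTreeClassifier(random_state=0).fit(
    StandardScaler().fit_transform(X), y
)

# Standardization is monotonic per feature, so the learned splits induce
# the same partitions; the node counts are expected to match (True).
print(raw.tree_.node_count == scaled.tree_.node_count)
```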

On the downside, decision trees are prone to overfitting. They can easily become over-complex, which prevents them from generalizing well to the structure of the dataset. In that case, the model is likely to end up **overfitting**, a serious issue in machine learning.
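The symptom is easy to reproduce. The sketch below, using scikit-learn's built-in breast cancer dataset, grows a tree with no depth limit; the exact scores depend on the split, but the gap between training and test accuracy is the typical signature of overfitting:

```python
# Sketch of the overfitting problem: an unconstrained tree memorizes the
# training set but generalizes worse to held-out data.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

print(tree.score(X_train, y_train))  # typically 1.0: training set memorized
print(tree.score(X_test, y_test))    # noticeably lower on unseen data
```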

To overcome this issue, we need to carefully adjust the hyperparameters of decision trees. In this post, we will try to gain a comprehensive understanding of these hyperparameters using tree visualizations.
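As a preview of that workflow, here is a hedged sketch: the hyperparameter values below (max_depth=3 and the rest) are arbitrary examples, not recommendations, but the parameters themselves and the plot_tree visualizer are standard scikit-learn:

```python
# Sketch of the workflow this post follows: constrain the tree with
# hyperparameters, then visualize the fitted result.
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, plot_tree

X, y = load_breast_cancer(return_X_y=True)

tree = DecisionTreeClassifier(
    max_depth=3,           # limit how many questions deep the tree can go
    min_samples_split=20,  # a node needs at least 20 samples to be split
    min_samples_leaf=5,    # every leaf must keep at least 5 samples
    random_state=0,
).fit(X, y)

plot_tree(tree, filled=True)  # draws the fitted tree, one box per node
plt.show()
```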

#data-science #artificial-intelligence #programming #machine-learning #deep-learning
