Random Forest: Hyperparameters and how to fine-tune them

How to optimise one of the most used Machine Learning models.

Random Forests are an awesome kind of Machine Learning model. They solve many of the problems of individual Decision Trees, and are always a strong candidate to be the most accurate of the models tried when building an application.

If you don’t know what Decision Trees or Random Forests are, don’t have an ounce of worry; I’ve got you covered with the following articles. Take a quick look and come back here.

In this quick article, we will explore some of the nitty-gritty optimisations of Random Forests, along with what each hyper-parameter is, and which ones are worth optimising.

Let’s go!

Hyper-parameter considerations, tips and tricks

The most important hyper-parameters of a Random Forest that can be tuned are:

  • The number of Decision Trees in the forest (in Scikit-learn this parameter is called n_estimators)
  • The criterion used to split each node (Gini or Entropy for a classification task; MSE or MAE for regression)
  • The maximum depth of the individual trees. The larger an individual tree, the more chance it has of overfitting the training data. However, since a Random Forest averages many individual trees, this is not such a big problem.
  • The minimum number of samples required to split an internal node of the trees. By tuning this parameter together with the previous one, we can regularise the individual trees if needed.
  • The maximum number of leaf nodes. In a Random Forest this is not so important, but in an individual Decision Tree it can greatly help reduce over-fitting and also increase the explainability of the tree by reducing the number of possible paths to leaf nodes. Learn how to use Decision Trees to build explainable ML models here.
  • Number of random features to include at each node for splitting.
  • The size of the bootstrapped dataset to train each Decision Tree with.
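As a sketch, the hyper-parameters listed above map onto Scikit-learn's RandomForestClassifier arguments roughly as follows (the values shown are illustrative defaults for the example, not recommendations):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# A toy dataset just to have something to fit on
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Each argument below corresponds to one of the hyper-parameters listed above
rf = RandomForestClassifier(
    n_estimators=100,      # number of Decision Trees in the forest
    criterion="gini",      # split criterion ("gini" or "entropy")
    max_depth=10,          # maximum depth of each individual tree
    min_samples_split=2,   # minimum samples to split an internal node
    max_leaf_nodes=None,   # maximum number of leaf nodes per tree
    max_features="sqrt",   # random features considered at each split
    max_samples=0.8,       # fraction of the dataset bootstrapped per tree
    random_state=42,
)
rf.fit(X, y)
print(rf.score(X, y))  # accuracy on the training set
```

Note that max_samples only takes effect when bootstrap=True, which is the default.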

Alright, now that we know where to look to optimise and tune our Random Forest, let’s see what changing some of these parameters does.
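A common way to experiment with several of these parameters at once is a randomised search over a parameter grid. Here is a minimal sketch using Scikit-learn's RandomizedSearchCV; the search ranges are illustrative assumptions, and in practice you would widen them and increase n_iter:

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Illustrative search space over the hyper-parameters discussed above
param_distributions = {
    "n_estimators": randint(50, 200),
    "max_depth": [None, 5, 10],
    "min_samples_split": randint(2, 10),
    "max_features": ["sqrt", "log2"],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    n_iter=10,       # number of random combinations to try
    cv=3,            # 3-fold cross-validation per combination
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

Each of the 10 sampled combinations is scored with cross-validation, and search.best_estimator_ holds the refitted model with the best-scoring settings.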
