AdaBoost is a boosting method that combines several weak learners into a single strong learner. One way for a new prediction model to correct its predecessor is to pay more attention to the training samples that the predecessor fitted poorly. Each new model therefore focuses more and more on the hard instances. This is the technique that AdaBoost uses. In this article, I will take you through the AdaBoost Algorithm in Machine Learning.
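
To make the re-weighting idea concrete, here is a minimal sketch of a single boosting round, assuming the classic discrete two-class AdaBoost update. The helper name adaboost_reweight and the toy labels are my own illustrative assumptions; they are not part of any library, and Scikit-Learn's internal implementation may differ in its details.

```python
import numpy as np

def adaboost_reweight(weights, y_true, y_pred, learning_rate=1.0):
    """One discrete AdaBoost weight update for a two-class problem (illustrative)."""
    weights = np.asarray(weights, dtype=float)
    missed = np.asarray(y_true) != np.asarray(y_pred)
    # Weighted error rate of the current weak learner
    err = weights[missed].sum() / weights.sum()
    # Clip so the logarithm stays finite for a perfect or a useless learner
    err = np.clip(err, 1e-10, 1 - 1e-10)
    # Weight of this learner's vote in the final ensemble
    alpha = learning_rate * np.log((1 - err) / err)
    # Boost the weights of the misclassified samples, then renormalise
    new_weights = weights * np.exp(alpha * missed)
    return new_weights / new_weights.sum(), alpha

# Toy example (hypothetical labels): only the last sample is misclassified
y_true = np.array([0, 0, 1, 1, 1])
y_pred = np.array([0, 0, 1, 1, 0])
w0 = np.full(5, 1 / 5)  # start from uniform sample weights
w1, alpha = adaboost_reweight(w0, y_true, y_pred)
print(w1, alpha)
```

After the update, the single misclassified sample carries a larger share of the total weight than each correctly classified one, so the next weak learner is pushed to get it right.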

Training a Base Classifier

To use the AdaBoost classification algorithm, we first need a base classification model. To explain this algorithm, I will use a Decision Tree as the base classification model. I will start by importing the common packages and setting up the environment:

import re
import sys
assert sys.version_info >= (3, 5)

# Scikit-Learn ≥0.20 is required
import sklearn
# Compare numeric (major, minor) parts; comparing version strings directly is unreliable
assert tuple(map(int, re.findall(r"\d+", sklearn.__version__)[:2])) >= (0, 20)

# Common imports
import numpy as np
import os

# To make this notebook's output stable across runs
np.random.seed(42)

# To plot pretty figures
# (%matplotlib inline is a Jupyter magic; remove it when running as a plain script)
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
mpl.rc('axes', labelsize=14)
mpl.rc('xtick', labelsize=12)
mpl.rc('ytick', labelsize=12)

Now I will generate the moons dataset and split it into training and test sets for the base classification model:

from sklearn.model_selection import train_test_split
from sklearn.datasets import make_moons

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
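
The split above only prepares the data. As a minimal sketch of actually training the base classifier, assuming a depth-1 Decision Tree (a "decision stump") as the weak learner, something like the following should work. The names stump_clf and stump_accuracy are my own and are only used for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same data and split as in the article
X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A depth-1 tree is a typical weak learner for boosting
stump_clf = DecisionTreeClassifier(max_depth=1, random_state=42)
stump_clf.fit(X_train, y_train)

# Mean accuracy of the single stump on the held-out test set
stump_accuracy = stump_clf.score(X_test, y_test)
print(f"Stump test accuracy: {stump_accuracy:.3f}")
```

A single stump can only draw one axis-aligned split, so its accuracy gives a useful baseline to compare the boosted ensemble against.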

AdaBoost Algorithm in Machine Learning

The AdaBoost Algorithm increases the relative weight of the misclassified training samples. It then trains another classification model using the updated sample weights, makes predictions on the training set again, and repeats the process. Let’s have a look at how we can implement this algorithm:

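from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier

# 200 decision stumps (depth-1 trees), each trained on re-weighted samples.
# Note: "SAMME.R" is deprecated in recent Scikit-Learn releases; if this
# argument raises an error, drop it to use the library default.
ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1), n_estimators=200,
    algorithm="SAMME.R", learning_rate=0.5, random_state=42)
ada_clf.fit(X_train, y_train)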

AdaBoostClassifier(algorithm='SAMME.R',
                   base_estimator=DecisionTreeClassifier(ccp_alpha=0.0,
                                                         class_weight=None,
                                                         criterion='gini',
                                                         max_depth=1,
                                                         max_features=None,
                                                         max_leaf_nodes=None,
                                                         min_impurity_decrease=0.0,
                                                         min_impurity_split=None,
                                                         min_samples_leaf=1,
                                                         min_samples_split=2,
                                                         min_weight_fraction_leaf=0.0,
                                                         presort='deprecated',
                                                         random_state=None,
                                                         splitter='best'),
                   learning_rate=0.5, n_estimators=200, random_state=42)
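
Once the ensemble is fitted, a natural next step is to check how it performs on the test set. The following is a minimal, self-contained sketch of that check. It assumes the same data and hyperparameters as above, but leaves the algorithm argument at the library default, because "SAMME.R" is not accepted by every Scikit-Learn release. The name ada_accuracy is my own.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same data and split as in the article
X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Assumption: `algorithm` is omitted so the library default is used
ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1), n_estimators=200,
    learning_rate=0.5, random_state=42)
ada_clf.fit(X_train, y_train)

# Predict on unseen data and measure the share of correct predictions
y_pred = ada_clf.predict(X_test)
ada_accuracy = accuracy_score(y_test, y_pred)
print(f"AdaBoost test accuracy: {ada_accuracy:.3f}")
```

Comparing this number with the accuracy of a single depth-1 tree on the same split shows how much the boosting rounds add on top of the weak learner.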

#by aman kharwal #algorithms
