Tuning a model's hyperparameters is a way to improve its performance. Let us look at an example that compares an untuned XGBoost model with a tuned XGBoost model based on their RMSE scores. The hyperparameters in XGBoost are described later.
Below is the code example for the XGBoost model with untuned parameters:
# Import the necessary libraries
import pandas as pd
import numpy as np
import xgboost as xgb
# Load the data
house = pd.read_csv("ames_housing_trimmed_pricessed.csv")
# Features are all columns except the last; the target is the last column
X, y = house.iloc[:, :-1], house.iloc[:, -1]
# Convert it into a DMatrix
house_dmatrix = xgb.DMatrix(data=X, label=y)
# Parameter configuration ("reg:squarederror" is the current name of the deprecated "reg:linear")
param_untuned = {"objective": "reg:squarederror"}
# 4-fold cross-validation with the default number of boosting rounds
cv_untuned_rmse = xgb.cv(dtrain=house_dmatrix, params=param_untuned, nfold=4,
                         metrics="rmse", as_pandas=True, seed=123)
# Report the mean test RMSE from the final boosting round
print("RMSE Untuned: %f" % cv_untuned_rmse["test-rmse-mean"].iloc[-1])
Output: 34624.229980
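For reference, RMSE is the square root of the mean squared difference between the actual and the predicted values, so it is expressed in the same units as the target (here, a house price). Below is a minimal sketch of that calculation; the `rmse` helper and the three price pairs are made up for illustration and are not taken from the housing data:

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical prices, chosen only to illustrate the formula
actual = [200000.0, 150000.0, 320000.0]
predicted = [210000.0, 140000.0, 300000.0]

example_rmse = rmse(actual, predicted)
print("Example RMSE: %f" % example_rmse)
```

A single large error (the 20,000 miss on the third house) pulls the RMSE above the average absolute error, which is why the metric is sensitive to outliers.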
Now let us look at the RMSE value when the parameters are tuned to some extent:
# Import the necessary libraries
import pandas as pd
import numpy as np
import xgboost as xgb
# Load the data
house = pd.read_csv("ames_housing_trimmed_pricessed.csv")
# Features are all columns except the last; the target is the last column
X, y = house.iloc[:, :-1], house.iloc[:, -1]
# Convert it into a DMatrix
house_dmatrix = xgb.DMatrix(data=X, label=y)
# Parameter configuration ("reg:squarederror" is the current name of the deprecated "reg:linear")
param_tuned = {"objective": "reg:squarederror", "colsample_bytree": 0.3,
               "learning_rate": 0.1, "max_depth": 5}
# 4-fold cross-validation with 200 boosting rounds instead of the default 10
cv_tuned_rmse = xgb.cv(dtrain=house_dmatrix, params=param_tuned, nfold=4,
                       num_boost_round=200, metrics="rmse", as_pandas=True, seed=123)
# Report the mean test RMSE from the final boosting round
print("RMSE Tuned: %f" % cv_tuned_rmse["test-rmse-mean"].iloc[-1])
Output: 29812.683594
It can be seen that the RMSE score drops by around 14% when the parameters are tuned.
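The size of the reduction follows directly from the two RMSE values reported above. A short sketch of the arithmetic, with variable names chosen here for illustration:

```python
# RMSE values copied from the two outputs above
rmse_untuned = 34624.229980
rmse_tuned = 29812.683594

# Relative reduction, expressed as a percentage of the untuned score
reduction_pct = (rmse_untuned - rmse_tuned) / rmse_untuned * 100
print("RMSE reduction: %.1f%%" % reduction_pct)
```

Note that the tuned run also uses more boosting rounds, so the improvement reflects the combined effect of all the changed settings rather than any single parameter.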