XGBoost is most commonly used for classification and regression problems, in which a set of features is used to predict an outcome of interest.

That said, XGBoost can also be used for time series forecasting. This is done by using lags of the time series of interest as separate features in the model. Let’s see how XGBRegressor can be used to help us predict hotel cancellations.
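
To make the idea of lagged features concrete, here is a minimal sketch of how a series can be reshaped into a feature matrix and a target vector. The helper name `make_lag_features`, the choice of three lags, and the sample numbers are all illustrative assumptions, not taken from the dataset or from the original analysis.

```python
import numpy as np


def make_lag_features(series, n_lags):
    """Return (X, y) where row t of X holds the n_lags values preceding y[t].

    Hypothetical helper for illustration only.
    """
    series = np.asarray(series, dtype=float)
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    # Column j holds the value (n_lags - j) steps before the target.
    X = np.column_stack(
        [series[j : len(series) - n_lags + j] for j in range(n_lags)]
    )
    y = series[n_lags:]
    return X, y


# Made-up weekly cancellation counts, used only to show the shapes involved.
weekly_cancellations = [41, 48, 87, 74, 101, 68, 96, 69]
X, y = make_lag_features(weekly_cancellations, n_lags=3)

print(X.shape, y.shape)  # one row of three lags per target week
print(X[0], y[0])        # the first three weeks are used to predict the fourth
```

Each row of `X` can then be treated like any other set of regression features, with `y` as the value to predict.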

Data Processing

The analysis below is based on the hotel booking demand datasets published by Antonio, Almeida and Nunes (2019).

The aim of building a time series forecasting model with XGBoost is to let the hotel in question predict its number of cancellations on a weekly basis.

The data is first split into training and validation partitions:

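```python
# df is assumed here to be a 2D NumPy array ordered by time;
# for a pandas DataFrame, slice with df.iloc instead.
train_size = int(len(df) * 0.8)   # first 80% of the rows
val_size = len(df) - train_size   # remaining 20%

# Split chronologically (no shuffling), so validation rows follow training rows.
train, val = df[:train_size, :], df[train_size:, :]
```
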
Because XGBoost is a [tree-based model](https://github.com/dmlc/xgboost/issues/357), whose splits depend on the ordering of feature values rather than on their scale, the features are not normalized with MinMaxScaler in this example.
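
As a rough illustration of why scaling matters little for trees, the sketch below finds the best single split on a feature before and after min-max scaling and checks that both splits divide the rows in the same way. This is a simplified, CART-style squared-error stump written for this example. It is not XGBoost's own split-finding code, and the function name `best_split_mask` and the synthetic data are assumptions.

```python
import numpy as np


def best_split_mask(x, y):
    """Return the boolean left/right partition of the best single split on x.

    Hypothetical helper: picks the threshold that minimises the summed
    squared error of the two sides, as a simple regression stump would.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best_sse, best_mask = np.inf, None
    values = np.unique(x)
    # Candidate thresholds are midpoints between consecutive distinct values.
    for threshold in (values[:-1] + values[1:]) / 2:
        left = x <= threshold
        sse = ((y[left] - y[left].mean()) ** 2).sum() + (
            (y[~left] - y[~left].mean()) ** 2
        ).sum()
        if sse < best_sse:
            best_sse, best_mask = sse, left
    return best_mask


# Synthetic feature and target, for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(0, 200, size=50)
y = np.where(x > 120, 90.0, 40.0) + rng.normal(0, 5, size=50)

# Min-max scaling done by hand, equivalent in effect to MinMaxScaler on one column.
x_scaled = (x - x.min()) / (x.max() - x.min())

same_partition = np.array_equal(best_split_mask(x, y), best_split_mask(x_scaled, y))
print(same_partition)
```

The threshold itself changes with the scale, but the rows falling on each side of it do not, so the fitted tree makes the same predictions either way.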
