Implementing an XGBoost Model in R

Using XGBoost to predict hotel cancellations.

In this example, an XGBoost model is built in R to predict whether customers will cancel their hotel booking. The analysis is based on data from Antonio, Almeida and Nunes (2019): Hotel booking demand datasets.

The H1 dataset is used for training and validation, while H2 is used for testing purposes.
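
As a rough illustration of holding out part of H1 for validation, the sketch below splits a toy data frame at random. The 80/20 proportion, the seed, and the names H1_toy, train and val are assumptions made for this example; the source does not state how H1 was divided.

```r
# Toy stand-in for H1; the real dataset has many more rows and columns.
set.seed(42)  # fixed seed so the split is reproducible
H1_toy <- data.frame(
  LeadTime   = c(10, 45, 3, 120, 60, 7, 200, 15, 30, 90),
  IsCanceled = c(0, 1, 0, 1, 1, 0, 1, 0, 0, 1)
)

# Assumed 80/20 split; the proportion is illustrative only.
n         <- nrow(H1_toy)
train_idx <- sample(seq_len(n), size = floor(0.8 * n))

train <- H1_toy[train_idx, ]   # rows used to fit the model
val   <- H1_toy[-train_idx, ]  # held-out rows used to check it

nrow(train)  # 8
nrow(val)    # 2
```

H2 is not touched at this stage, so it remains an untouched test set.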

Background

To predict which customers will cancel their booking (IsCanceled = 1 means a cancellation, and IsCanceled = 0 means the customer follows through with the booking), an XGBoost model is built in R with the following features:

  • Lead time
  • Country of origin
  • Market segment
  • Deposit type
  • Customer type
  • Required car parking spaces
  • Week of arrival

Data Manipulation

Some data manipulation is required to make the data suitable for analysis with the XGBoost model in R.

Firstly, the xgboost and Matrix libraries are loaded:

# library() stops with an error if a package is missing; require() only warns
library(xgboost)
library(Matrix)

A data frame of features is formed by converting each variable with as.numeric, with categorical variables first passed through factor so that each category maps to an integer code. The data frame is then converted into matrix format.

# Numeric features are cast directly
leadtime <- as.numeric(H1$LeadTime)
rcps <- as.numeric(H1$RequiredCarParkingSpaces)
week <- as.numeric(H1$ArrivalDateWeekNumber)

# Categorical features are integer-encoded via factor()
country <- as.numeric(factor(H1$Country))
marketsegment <- as.numeric(factor(H1$MarketSegment))
deposittype <- as.numeric(factor(H1$DepositType))
customertype <- as.numeric(factor(H1$CustomerType))

# Combine into a data frame, then convert to a numeric matrix
df <- data.frame(leadtime, country, marketsegment, deposittype, customertype, rcps, week)
df <- as.matrix(df)
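
One thing to watch with this encoding is that as.numeric(factor(x)) assigns codes from the sorted unique values of x, so the same category can receive a different code in another dataset. The sketch below uses made-up country codes (train_country and test_country are hypothetical stand-ins for H1$Country and H2$Country) to show the effect and one possible way to keep the codes aligned. It is not taken from the original analysis.

```r
# Hypothetical country codes standing in for H1$Country and H2$Country.
train_country <- c("PRT", "GBR", "PRT", "ESP")
test_country  <- c("PRT", "GBR", "USA")

# factor() sorts the unique values, so here ESP = 1, GBR = 2, PRT = 3.
train_levels <- levels(factor(train_country))
train_codes  <- as.numeric(factor(train_country, levels = train_levels))
train_codes  # 3 2 3 1

# Encoded on its own, the test set gets GBR = 1, PRT = 2, USA = 3,
# which no longer matches the training codes.
naive_codes <- as.numeric(factor(test_country))
naive_codes  # 2 1 3

# Reusing the training levels keeps the codes consistent;
# values not seen in training (USA) become NA.
test_codes <- as.numeric(factor(test_country, levels = train_levels))
test_codes  # 3 2 NA
```

If H2 is encoded for testing, reusing the levels derived from H1 in this way is one option for keeping each integer code tied to the same category.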
