About the project

The dataset contains records, from 2008 to 2015, of a telemarketing campaign run by a Portuguese bank’s marketing team to persuade customers to subscribe to term deposits. The outcome of each contact is recorded as ‘yes’ or ‘no’ in a binary categorical variable.

Until then, the strategy was to reach as many clients as possible, indiscriminately, and try to sell them the financial product over the phone. However, that approach not only consumed more resources but was also intrusive for many clients, who were disturbed by this type of contact.

To estimate the costs of the campaign, the marketing team reached the following conclusions:

  • For each customer identified as a good candidate, and therefore targeted, who does not subscribe to the deposit, the bank incurs a cost of 500 EUR.
  • For each customer identified as a bad candidate, and therefore excluded from the target, who would have subscribed to the product, the bank incurs a cost of 2,000 EUR.
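
To make the two cost rules concrete, here is a minimal sketch of how the total cost of a set of predictions could be computed. It assumes labels are encoded as 1 for ‘yes’ and 0 for ‘no’; the function name `total_cost` and the sample lists are illustrative and not taken from the project code.

```python
# Costs stated by the marketing team (EUR).
COST_FALSE_POSITIVE = 500    # targeted, but did not subscribe
COST_FALSE_NEGATIVE = 2000   # excluded from the target, but would have subscribed


def total_cost(y_true, y_pred):
    """Return the campaign cost in EUR.

    Assumes binary labels: 1 = 'yes' (subscribes), 0 = 'no'.
    """
    pairs = list(zip(y_true, y_pred))
    false_positives = sum(1 for actual, predicted in pairs
                          if actual == 0 and predicted == 1)
    false_negatives = sum(1 for actual, predicted in pairs
                          if actual == 1 and predicted == 0)
    return (false_positives * COST_FALSE_POSITIVE
            + false_negatives * COST_FALSE_NEGATIVE)


# Made-up example: one wrongly targeted client and one missed subscriber.
y_true = [1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1]
print(total_cost(y_true, y_pred))  # 1 * 500 + 1 * 2000 = 2500
```

Because a missed subscriber costs four times as much as a wasted call, a model judged by this cost will tend to favour targeting borderline clients rather than excluding them.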

Machine Learning problem and objectives

We’re facing a binary classification problem. The goal is to train the machine learning model that best predicts the optimal set of candidates to target, so that campaign costs are minimized and efficiency is maximized.
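
One common way to connect a classifier to this kind of objective is to choose the decision threshold that minimizes the cost, rather than using a fixed 0.5. The sketch below illustrates that idea under stated assumptions: the model outputs a score between 0 and 1 per client, labels are 1 for ‘yes’ and 0 for ‘no’, and the costs are 500 EUR per wrongly targeted client and 2,000 EUR per missed subscriber. The function names and sample scores are hypothetical, and this is not necessarily the approach the project itself takes.

```python
def cost_at_threshold(y_true, scores, threshold, fp_cost=500, fn_cost=2000):
    """Cost in EUR when every client with score >= threshold is targeted."""
    pairs = list(zip(y_true, scores))
    false_positives = sum(1 for actual, score in pairs
                          if actual == 0 and score >= threshold)
    false_negatives = sum(1 for actual, score in pairs
                          if actual == 1 and score < threshold)
    return false_positives * fp_cost + false_negatives * fn_cost


def best_threshold(y_true, scores, candidates):
    """Return the candidate threshold with the lowest cost (first one on ties)."""
    return min(candidates, key=lambda t: cost_at_threshold(y_true, scores, t))


# Made-up labels and model scores, for illustration only.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
scores = [0.91, 0.42, 0.12, 0.35, 0.22, 0.72, 0.62, 0.05]
candidates = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9

threshold = best_threshold(y_true, scores, candidates)
print(threshold, cost_at_threshold(y_true, scores, threshold))
```

In this toy example the cheapest threshold is well below 0.5, because excluding a client who would have subscribed is four times as expensive as calling one who would not.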

Project structure

The project is divided into three parts:

  1. EDA: Exploratory Data Analysis
  2. Data Wrangling: Cleaning and Feature Engineering
  3. Machine Learning: Predictive Modelling

In this article, I’ll focus only on the first part, the **Exploratory Data Analysis** (EDA).

Performance Metric

The metric used for evaluation is the **total cost**, since the objective is to minimize the cost of the campaign.

You will find the entire code of this project here.

The ‘bank_marketing_campaign.csv’ dataset can be downloaded here.


Machine Learning: costs prediction of a Marketing Campaign