This article is the second of three in a machine learning binary classification project. The goal is to train the model that best predicts the optimal number of candidates to target in a marketing campaign, so that costs are kept to a minimum and efficiency is maximized.
To determine the costs of the campaign, the marketing team has concluded:
The metric used for evaluation is the **total costs**, since the objective is to minimize the costs of the marketing campaign.
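As a rough illustration of how a total-costs metric can be computed from predictions, here is a minimal sketch. It assumes the costs come from two kinds of mistakes: targeting a client who does not subscribe, and skipping a client who would have subscribed. The function name `total_cost`, the parameters `cost_fp` and `cost_fn`, the toy labels, and the cost values are all hypothetical placeholders, not the marketing team's figures.

```python
def total_cost(y_true, y_pred, cost_fp, cost_fn):
    """Sum the misclassification costs over a set of predictions.

    cost_fp: assumed cost of targeting a client who does not subscribe.
    cost_fn: assumed cost of skipping a client who would have subscribed.
    """
    # False positives: predicted 1 (target) but the true label is 0.
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # False negatives: predicted 0 (skip) but the true label is 1.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return fp * cost_fp + fn * cost_fn


# Toy labels for illustration only (not taken from the dataset).
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 1, 1, 0, 0, 1]

# Placeholder costs; replace them with the campaign's real figures.
example_cost = total_cost(y_true, y_pred, cost_fp=1.0, cost_fn=4.0)
print(example_cost)
```

Under this framing, a lower value means a cheaper campaign, so models would be compared by this number rather than by accuracy.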
This article covers only the second section: Cleaning & Feature Selection.
In the first post, we conducted the Exploratory Data Analysis (EDA), which allowed us to look beyond the initial dataset. EDA can be very time-consuming and is rarely a one-time walk-through: we often find ourselves going back to earlier sections to change things and try different approaches. Still, the detailed analysis usually pays off and tells us a great deal about the data and the variables’ behavior.
Let’s step into the first section with a brief overview.
[0] Number of clients that haven’t subscribed the term deposit: 36548
[1] Number of clients that have subscribed the term deposit: 4640
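The two counts above show that the target is heavily imbalanced. The following sketch quantifies that using only the standard library. It rebuilds a label list from the two reported counts; in the actual project the labels would come from the dataset's target column, and the variable names here are assumptions.

```python
from collections import Counter

# Rebuild the labels from the two counts reported above.
# In the real project this would be the dataset's target column.
y = [0] * 36548 + [1] * 4640

counts = Counter(y)
total = sum(counts.values())

# Share of clients who subscribed, and negatives per positive.
positive_share = counts[1] / total
imbalance_ratio = counts[0] / counts[1]

print(f"[0] not subscribed: {counts[0]}")
print(f"[1] subscribed: {counts[1]}")
print(f"positive share: {positive_share:.2%}")
print(f"negatives per positive: {imbalance_ratio:.2f}")
```

With so few positive cases, a model that never targets anyone would still score high on accuracy, which is one reason a cost-based metric is a better fit here.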