Why Predict Customer Churn?

Acquiring new customers is much more expensive than retaining existing ones. Some studies suggest that acquiring a new customer costs six to seven times more than keeping an existing one.

According to BeyondPhilosophy.com:

“Loyal customers reduce costs associated with consumer education and marketing, especially when they become Net Promoters for your organization.”

Hence, it is important to proactively identify the customers most at risk of leaving and to take preventive measures by understanding their needs and providing a positive customer experience.

Methodology

The project is divided into 3 stages:

  1. Data Cleaning and Exploratory Data Analysis.
  2. Model Selection and Threshold Tuning.
  3. Result Interpretation.

Data Cleaning and Exploratory Data Analysis

The data is obtained from Kaggle, IBM Data Sets. The data set is moderately imbalanced, with a churn rate of 26.5%.

The data is first checked to confirm that every customer ID is unique. Blank values are replaced with 0, and columns are converted to a numerical type wherever applicable.
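The cleaning steps described above could be sketched in pandas roughly as follows. This is a minimal illustration, not the project's actual code: the function name `clean_churn_data`, the column names (`customerID`, `tenure`, `TotalCharges`, `Contract`), and the toy rows are all assumptions made for the example. It also assumes that blanks only need filling in columns that are otherwise numeric.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def clean_churn_data(df, id_col="customerID"):
    """Check ID uniqueness, fill blanks with 0, and convert numeric-looking columns.

    `id_col` is a hypothetical column name chosen for this sketch.
    """
    # Fail early if any customer appears more than once.
    if not df[id_col].is_unique:
        raise ValueError(f"Duplicate values found in {id_col!r}")

    cleaned = df.copy()
    for col in cleaned.columns.drop(id_col):
        series = cleaned[col]
        if is_numeric_dtype(series):
            continue  # already numeric, nothing to do
        # Treat empty or whitespace-only entries as "0".
        text = series.astype(str).str.strip()
        filled = text.mask(text == "", "0")
        # Keep the conversion only if every value parses as a number.
        converted = pd.to_numeric(filled, errors="coerce")
        if converted.notna().all():
            cleaned[col] = converted
    return cleaned


# Toy data for illustration only (not taken from the real data set).
raw = pd.DataFrame(
    {
        "customerID": ["A1", "B2", "C3"],
        "tenure": [1, 0, 5],
        "TotalCharges": ["29.85", " ", "100.5"],
        "Contract": ["Monthly", "Yearly", "Monthly"],
    }
)
cleaned = clean_churn_data(raw)
print(cleaned.dtypes)
```

Columns that contain genuine text, such as `Contract` in the toy frame, are left untouched so that they can be encoded separately later.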

EDA is carried out to understand the data. For example, a feature such as gender has little impact on churn and is therefore dropped.
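One simple way to judge whether a categorical feature matters is to compare the churn rate across its groups: if the rates are nearly identical, the feature carries little signal. The sketch below assumes a `Churn` column encoded as "Yes"/"No" and uses a hypothetical helper, `churn_rate_by`, on made-up rows. The numbers are illustrative only and are not results from the actual data set.

```python
import pandas as pd


def churn_rate_by(df, feature, target="Churn", positive="Yes"):
    """Return the fraction of churned customers within each group of `feature`.

    The column names and the "Yes"/"No" encoding are assumptions of this sketch.
    """
    churned = df[target].eq(positive)
    return churned.groupby(df[feature]).mean()


# Toy data for illustration only.
toy = pd.DataFrame(
    {
        "gender": ["F", "F", "M", "M", "F", "M", "F", "M"],
        "Contract": ["Monthly", "Monthly", "Monthly", "Yearly",
                     "Yearly", "Yearly", "Monthly", "Yearly"],
        "Churn": ["Yes", "No", "Yes", "No", "No", "No", "No", "No"],
    }
)

gender_rates = churn_rate_by(toy, "gender")
contract_rates = churn_rate_by(toy, "Contract")
print(gender_rates)
print(contract_rates)

# Drop a feature whose groups show no meaningful difference in churn rate.
features = toy.drop(columns=["gender"])
```

In this toy frame, both gender groups churn at the same rate while the contract groups differ sharply, which is the kind of pattern that would justify dropping the former and keeping the latter.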
