Introduction

As any HR professional will tell you, recruiting and training new employees is substantially more expensive than retaining existing talent. Experience is king, and those who depart frequently take with them precious knowledge of the ins and outs of the organisation.

Consider the following simulated dataset from elitedatascience.com containing the details of over 14,000 employees, some of whom have already left the company. Each record describes an employee's department, salary, happiness, workload and experience, and whether or not they have left the company. I used two industry-standard data science tools, Python and JupyterLab, to perform analytics, display visualisations and train some machine learning algorithms that predict employee churn.

The objective is to deliver to HR a high-performing trained model (.pkl file) that they can apply to their permanent employees to identify those at risk of leaving. This is an example of machine learning providing actionable business insights.
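To make the ".pkl file" deliverable concrete, here is a minimal sketch of saving and reloading a fitted scikit-learn model with Python's pickle module. The data is synthetic, the file name churn_model.pkl is hypothetical, and logistic regression is only a placeholder for whichever model ends up being delivered.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data, NOT the article's dataset: 200 rows, 3 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # assumed label: 1 = left, 0 = stayed

model = LogisticRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "churn_model.pkl")  # hypothetical file name

    # Save the fitted model to disk.
    with open(path, "wb") as fh:
        pickle.dump(model, fh)

    # Later, load it back and use it to score employees.
    with open(path, "rb") as fh:
        restored = pickle.load(fh)

# Estimated probability of leaving for the first five rows.
risk = restored.predict_proba(X[:5])[:, 1]
print(risk.round(3))
```

Whoever loads the file needs a compatible scikit-learn version installed, and pickle files should only be opened when they come from a trusted source.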

Here’s a snapshot of the dataset. There are 14,249 employees and 10 columns.

Here’s what we’ll do.

We’ll perform some exploratory data analysis, data cleaning and feature engineering, followed by the training and tuning of some standard classification algorithms: logistic regression, random forests and gradient-boosted trees.
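As a rough sketch of what comparing these three algorithms can look like in scikit-learn, the snippet below scores each one with 5-fold cross-validation. It runs on synthetic data from make_classification, not the employee dataset, and the hyperparameters are untuned defaults chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary-classification data standing in for the employee table.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)

# Candidate models; logistic regression gets a scaler because it is
# sensitive to feature scale, while the tree ensembles are not.
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient-boosted trees": GradientBoostingClassifier(random_state=0),
}

# Mean 5-fold cross-validated AUROC for each candidate.
cv_scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    for name, model in models.items()
}

for name, score in cv_scores.items():
    print(f"{name}: mean AUROC = {score:.3f}")
```

Tuning would typically wrap each candidate in a search such as GridSearchCV over a grid of hyperparameters; that step is left out here to keep the sketch short.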

We’ll assess the common performance metrics for binary classification, including cross-validation scores, accuracy, confusion matrices, precision, recall, F1, ROC curves and AUROC.
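For readers who want to see how these metrics relate to one another, here is a small pure-Python sketch that derives them from the confusion matrix counts. The labels and scores are made up for illustration, and the 0.5 decision threshold is an assumption; in practice scikit-learn's metrics module provides equivalent functions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), where 1 = left the company and 0 = stayed."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 computed from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def auroc(y_true, scores):
    """Chance that a random leaver is scored above a random stayer (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: true labels and predicted probabilities of leaving.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 decision threshold

metrics = classification_metrics(y_true, y_pred)
area = auroc(y_true, scores)
print(metrics, area)
```

On this toy input there are 3 true positives, 1 false positive, 1 false negative and 3 true negatives, so accuracy, precision, recall and F1 all come out to 0.75, while the AUROC is 15/16 = 0.9375. Unlike the other four, AUROC is computed from the raw scores and does not depend on the threshold.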

A list of ranked feature importances will be presented for the winning model.
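As an illustration of how such a ranking can be produced, the sketch below reads the feature_importances_ attribute of a fitted random forest and sorts it. It assumes the winning model is a tree ensemble, which this introduction does not state, and it uses synthetic data with generic placeholder feature names, so the ranking it prints says nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with placeholder column names (not the article's features).
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=3, n_redundant=0, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances are non-negative and sum to 1 across features.
ranked = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```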

Will your employee leave? A machine learning model