How Data Scientists Build Machine Learning Models in Real Life

The web is already flooded with data science and machine learning resources. Numerous blogs, websites, YouTube videos, and forums offer useful information on data science topics, and choosing the right material for any data science quest has become tedious.

When I started my journey in data science a few years back, I faced the same dilemma. One thing I observed is that most of these resources are incomplete: you have to go through several of them to get the full picture.

I also saw a lack of real-life perspective in the articles written on machine learning models. So I thought of writing a post on the overall picture of building a machine learning model for a real-life use case.

To perform any data science project, a data scientist needs to go through several steps. Broadly these steps can be presented as:

1. Formulation of the data science problem from the given business problem

2. Data source exploration and data collection

3. Exploration of the variables (EDA)

4. Model building

5. Model evaluation

6. Model deployment

Steps 1 and 2 depend on the context of the problem. Step 6 depends more on the business requirement and the available infrastructure. Steps 2, 3, 4, and 5 are the sole responsibility of the data scientist.

In this post, I shall discuss how to build a classification model end-to-end. I shall take you through the entire journey of a data scientist in any project that requires building a classification model. I shall try to organize this post in such a way that it can be readily adapted for similar situations.

I use a Random Forest model to describe the methods. Even if you use another classifier, the execution will be very similar.
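To make the model building and model evaluation steps concrete, here is a minimal sketch using scikit-learn. It is not the exact code from this post: the synthetic dataset from `make_classification`, the helper name `build_and_evaluate`, the 75/25 split, and the hyperparameters are all assumptions chosen for illustration. In a real project the data would come from steps 1 and 2.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def build_and_evaluate(random_state=42):
    """Train a Random Forest on synthetic data and score it on a held-out set."""
    # Assumption: synthetic data stands in for the data collected in steps 1-2.
    X, y = make_classification(
        n_samples=500,
        n_features=10,
        n_informative=5,
        random_state=random_state,
    )

    # Hold out 25% of the rows so evaluation uses data the model never saw.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )

    # Model building (step 4): fit the classifier on the training split.
    model = RandomForestClassifier(n_estimators=100, random_state=random_state)
    model.fit(X_train, y_train)

    # Model evaluation (step 5): accuracy on the held-out split.
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy


model, accuracy = build_and_evaluate()
print(f"Held-out accuracy: {accuracy:.3f}")
```

Swapping in another scikit-learn classifier only changes the line that constructs `model`; the split, fit, and evaluation calls stay the same.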

#data-science #real-life-experiences #machine-learning #predictive-modeling #classification

towardsdatascience.com
