Nowadays, learning from data to gain business insights is common in almost every industry. These insights include predictability, customer churn behavior, forecasting, and more. Machine learning is the key player in generating these insights.
Building a good ML model requires a lot of experiments: running multiple iterations of different algorithms over the data, creating new variables, adding more data, and so on. As the number of iterations grows, it becomes harder to keep track of these experiments.
In this article, I will describe a system for effectively version-controlling a machine learning project. I will also share some tools that make this system easy to implement.
ML models are trained on the modeling data, so when that data changes, the model parameters change as well. Whenever the modeling data changes, you should therefore create a new version of the model. You change the data when you do the following: