Using FDS, an open-source tool, to version control your machine learning project quickly and easily.

FDS (Fast Data Science) is an open-source tool that makes version control for machine learning fast and easy. It combines Git and DVC under one roof and takes care of code, data, and model versioning.

FDS will help you:

  • Avoid mistakes, by recommending where each file should be tracked, using a smart version control wizard 🧙‍♂️.
  • Automate repetitive tasks, by unifying commands (e.g. git status + dvc status = fds status).
  • Make version control faster, easier to use, and friendlier, by providing a human-centric UX. Want to save a new version and push it to your shared remote? Just run fds save and you’re good to go.
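To make the first point concrete, here is a minimal sketch of the kind of rule a tracking wizard could apply: large or binary files go to DVC, everything else goes to Git. The post does not describe FDS's actual rules, so the threshold, the suffix list, the function name recommend_tracker, and the example file names are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Assumed values for illustration only; FDS's real rules are not given in this post.
SIZE_THRESHOLD_BYTES = 10 * 1024 * 1024  # treat files of 10 MiB or more as "large"
BINARY_SUFFIXES = {".h5", ".npy", ".zip", ".jpeg", ".jpg", ".png"}


def recommend_tracker(path, size_bytes):
    """Suggest 'dvc' for large or binary files and 'git' for everything else."""
    suffix = PurePosixPath(path).suffix.lower()
    if size_bytes >= SIZE_THRESHOLD_BYTES or suffix in BINARY_SUFFIXES:
        return "dvc"
    return "git"


if __name__ == "__main__":
    # Hypothetical project files as (path, size in bytes) pairs.
    files = [
        ("train.py", 4_000),
        ("models/model.h5", 90_000_000),
        ("data/chest_xray.zip", 1_200_000_000),
    ]
    for path, size in files:
        print(f"{path}: track with {recommend_tracker(path, size)}")
```

A size-plus-extension rule is only one plausible heuristic. Its appeal is that source code stays in Git, where diffs are useful, while heavy artifacts such as images and model weights are handed to DVC.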

This blog is a step-by-step guide on how to version your machine learning project using FDS. We’ll track the “Pneumonia-Detection” project, in which we train a TensorFlow model to classify chest X-ray images as sick or healthy. The data set used in this project was taken from the “Chest X-Ray Images (Pneumonia)” Kaggle competition. By following the steps detailed below, you will gain hands-on experience using FDS.

Pneumonia detection data example, image by author

In this blog, we will cover how to perform the following actions using FDS:

  • Initialize and configure Git and DVC on our local machine.
  • Track the project files using both Git and DVC with a single command.
  • Push the files to Git and DVC remotes.
  • Automatically version, track, commit, and push all project files to the remote with a single command.
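To show what the single-command flow saves, here is a rough sketch of the manual Git and DVC sequence that one save-and-push step stands in for. It only builds the command lines and runs nothing. The function name manual_save_commands and the paths data, models/model.h5, and train.py are hypothetical, and this is not a description of how FDS works internally.

```python
import shlex
from pathlib import PurePosixPath


def manual_save_commands(dvc_paths, git_paths, message):
    """Return the Git and DVC commands that a single 'save' step would replace."""
    commands = []
    to_stage = []
    for raw in dvc_paths:
        path = PurePosixPath(raw)
        commands.append(["dvc", "add", path.as_posix()])
        # `dvc add` writes a small pointer file next to the target and lists
        # the target in a .gitignore in the same directory; both go into Git.
        to_stage.append(path.with_name(path.name + ".dvc").as_posix())
        gitignore = (path.parent / ".gitignore").as_posix()
        if gitignore not in to_stage:
            to_stage.append(gitignore)
    to_stage.extend(git_paths)
    commands.append(["git", "add", *to_stage])
    commands.append(["git", "commit", "-m", message])
    commands.append(["dvc", "push"])  # upload data and models to the DVC remote
    commands.append(["git", "push"])  # upload code and pointer files to the Git remote
    return commands


if __name__ == "__main__":
    # Hypothetical project layout: a data folder, a trained model, a training script.
    plan = manual_save_commands(
        dvc_paths=["data", "models/model.h5"],
        git_paths=["train.py"],
        message="Add data, model, and training script",
    )
    for command in plan:
        print(shlex.join(command))
```

Printing the plan rather than executing it keeps the sketch safe to run anywhere. Even for this small project the manual route takes six commands, and forgetting to stage a pointer file is an easy mistake to make.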


Faster Machine Learning Versioning and Tracking: Example using FDS