Jupyter notebooks are fantastic in many ways, but collaborating on them is not easy. In this article we’ll look at the tools you can use to make notebooks play nicely with modern version control systems like git.

Why is Jupyter version control so hard?

The software world has converged on git as its version control tool of choice. Git is designed primarily for human-readable text files. A Jupyter notebook, by contrast, is a rich JSON document with source code, markdown, HTML, and images all rolled into a single .ipynb file.

Git doesn’t handle rich documents like notebooks very well. For example, resolving a git merge conflict by hand in a long, nested JSON document is practically impossible, and a git diff of a base64-encoded image string is unreadable.
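
To make the problem concrete, here is a minimal sketch using only Python's standard library. It builds two versions of a tiny notebook-shaped JSON document in which the code is identical and only the stored output and execution count differ, then runs a line-based diff over the serialized JSON, which is roughly the view a plain text diff gives you. The helper names (make_notebook, line_diff) and the cell contents are made up for illustration.

```python
import difflib
import json


def make_notebook(output_text, execution_count):
    """Build a minimal nbformat-4 style notebook dict with one code cell."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {
                        "name": "stdout",
                        "output_type": "stream",
                        "text": [output_text],
                    }
                ],
                "source": ["print(total)"],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def line_diff(old_nb, new_nb):
    """Return only the added/removed lines of a line-based diff of the JSON."""
    old_lines = json.dumps(old_nb, indent=1, sort_keys=True).splitlines()
    new_lines = json.dumps(new_nb, indent=1, sort_keys=True).splitlines()
    diff = difflib.unified_diff(
        old_lines, new_lines, "before.ipynb", "after.ipynb", lineterm=""
    )
    # Keep content changes, drop the "---"/"+++" file headers and "@@" hunks.
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Same source code, re-executed: only the output and the counter change.
before = make_notebook("41\n", execution_count=1)
after = make_notebook("42\n", execution_count=2)

changed = line_diff(before, after)
source_unchanged = before["cells"][0]["source"] == after["cells"][0]["source"]

for line in changed:
    print(line)
print("source unchanged:", source_unchanged)
```

Even though nobody edited the code, the diff reports changed lines for the execution counter and the stored output. In a real notebook with plots, the changed line would be a long encoded image string instead of a two-character number.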

What’s required from notebook version control?

Here’s what we need from a modern version control system:

  • Ability to create checkpoints / commits
  • Ability to quickly check out any past notebook version
  • Ability to see what changed from one version to another (a.k.a. visual diff for notebooks)
  • Ability for multiple people to work on a single notebook, with easy merge conflict resolution
  • Ability to provide feedback & ask questions about a specific notebook cell

That’s our wishlist! This blog post introduces the important tools that can help you achieve it.

Disclaimer: I’m the author of two of the tools listed below (ReviewNB & GitPlus) but this is an unbiased review of all the useful tools in this space.

nbdime

nbdime is an open source library for diffing and merging notebooks locally. You can set it up to work with your local git client so that the git diff & git merge commands use nbdime for .ipynb files. With nbdime you can:

  • Run git diff to see how a notebook has changed before committing
  • Easily merge remote changes with your locally edited notebook
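
As a rough sketch of the setup step, the snippet below wraps nbdime's git integration command in a small Python helper. It assumes nbdime is already installed (for example with pip install nbdime) and on your PATH. The helper name enable_nbdime_git_integration and its dry_run flag are hypothetical; only the nbdime config-git command itself comes from nbdime's CLI. By default it only prints the command, so nothing in your git configuration changes until you pass dry_run=False.

```python
import shutil
import subprocess

# nbdime's CLI command that registers it as git's diff and merge driver
# for .ipynb files, for all repositories of the current user (--global).
ENABLE_GIT_INTEGRATION = ["nbdime", "config-git", "--enable", "--global"]


def enable_nbdime_git_integration(dry_run=True):
    """Print the setup command and, unless dry_run is set, execute it.

    Returns True only if the command was actually run and succeeded.
    (Hypothetical helper, written for this article.)
    """
    print("$", " ".join(ENABLE_GIT_INTEGRATION))
    if dry_run:
        return False
    if shutil.which("nbdime") is None:
        print("nbdime not found on PATH; install it first (pip install nbdime)")
        return False
    return subprocess.run(ENABLE_GIT_INTEGRATION).returncode == 0


# Dry run: shows the command without touching your git configuration.
ran = enable_nbdime_git_integration(dry_run=True)
```

You can of course type the printed command directly into a terminal instead. Once the integration is enabled, git diff and git merge use nbdime for .ipynb files, and nbdime's nbdiff-web command opens a rich diff in the browser.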

JupyterLab Git Extensions

The following JupyterLab extensions are useful for notebook version control. You can install them on your local JupyterLab.

  • jupyterlab-git can be used to browse GitHub repositories, look at visual diffs of changed files, and push your commits
  • GitPlus can be used to push commits and create pull requests on GitHub directly from the JupyterLab UI

