Jupyter notebooks are fantastic in many ways, but collaborating on them is not so easy. In this article we’ll look at the tools you can use to make notebooks play nicely with modern version control systems like git!
The software world has converged on git as its version control tool of choice. Git is designed to work primarily with human-readable text files. A Jupyter notebook, by contrast, is a rich JSON document with source code, markdown, HTML, and images all rolled into a single .ipynb file.
Git doesn’t handle rich documents like notebooks very well. For example, resolving a git merge by hand in a long, nested JSON document is practically impossible, and the git diff of a binary image string is horrible (shown below).
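To make the problem concrete, here is a minimal sketch in Python of what a line-based diff sees when a notebook changes. The notebook dictionaries are simplified, hand-built stand-ins for real .ipynb files, the "image" is just dummy bytes, and the helper names (`make_notebook`, `text_diff`) are made up for this illustration. Python's `difflib` stands in for git's line-based diff.

```python
import base64
import difflib
import json


def make_notebook(source_line: str, image_bytes: bytes) -> dict:
    """Build a simplified nbformat-4 style notebook: one code cell with one image output."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": 1,
                "metadata": {},
                "source": [source_line],
                "outputs": [
                    {
                        "output_type": "display_data",
                        "metadata": {},
                        # Images are stored inline as one long base64 string.
                        "data": {"image/png": base64.b64encode(image_bytes).decode("ascii")},
                    }
                ],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }


def text_diff(old: dict, new: dict) -> list[str]:
    """Diff two notebooks the way a line-based tool would: as serialized JSON text."""
    old_lines = json.dumps(old, indent=1, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=1, sort_keys=True).splitlines()
    return list(
        difflib.unified_diff(old_lines, new_lines, "before.ipynb", "after.ipynb", lineterm="")
    )


# Dummy bytes stand in for two different renderings of the same plot.
before = make_notebook("plt.plot(x, y)", bytes(range(256)) * 4)
after = make_notebook("plt.plot(x, y, color='red')", bytes(reversed(range(256))) * 4)

diff = text_diff(before, after)
# Keep only real additions/removals, not the "---"/"+++" file headers.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

for line in changed:
    # Truncate for display; the image lines are over a thousand characters long.
    print(line[:60] + (" ..." if len(line) > 60 else ""))
```

The one-line code change is easy to read, but it sits next to a replaced base64 blob that no human can review. Real notebooks carry many such outputs, plus execution counts and metadata that also change between runs.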
Here’s what we need from a modern version control system -
That’s our wishlist! This blog post introduces the important tools that can help you achieve it.
Disclaimer: I’m the author of two of the tools listed below (ReviewNB & GitPlus) but this is an unbiased review of all the useful tools in this space.
nbdime is an open source library for diffing and merging notebooks locally. You can set it up to work with your local git client so that the git diff and git merge commands use nbdime for .ipynb files. With nbdime you can run git diff to see how a notebook has changed before committing.
The following JupyterLab extensions are useful for notebook version control. You can install these on your local JupyterLab.
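Coming back to nbdime for a moment: to give a feel for how a notebook-aware diff differs from a plain text diff, here is a minimal sketch in Python. This is not nbdime's implementation or API. It is a toy that only compares cell sources and ignores outputs and metadata entirely, and the function names (`cell_sources`, `diff_cells`) and sample notebooks are invented for this illustration.

```python
import difflib


def cell_sources(notebook: dict) -> list[str]:
    """Return each cell's source as one string; outputs and metadata are ignored."""
    sources = []
    for cell in notebook.get("cells", []):
        src = cell.get("source", "")
        # The notebook format allows source to be a string or a list of strings.
        sources.append("".join(src) if isinstance(src, list) else src)
    return sources


def diff_cells(old: dict, new: dict) -> list[tuple[str, int, str]]:
    """Describe cell-level changes as (operation, cell index, source) tuples."""
    a, b = cell_sources(old), cell_sources(new)
    changes = []
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            changes += [("removed", i, a[i]) for i in range(i1, i2)]
        if tag in ("insert", "replace"):
            changes += [("added", j, b[j]) for j in range(j1, j2)]
    return changes


# Hypothetical notebooks: the plot was re-rendered (output changed)
# and one markdown cell was appended.
old_nb = {"cells": [
    {"cell_type": "code", "source": ["import pandas as pd"], "outputs": []},
    {"cell_type": "code", "source": ["df.plot()"],
     "outputs": [{"data": {"image/png": "AAAA"}}]},
]}
new_nb = {"cells": [
    {"cell_type": "code", "source": ["import pandas as pd"], "outputs": []},
    {"cell_type": "code", "source": ["df.plot()"],
     "outputs": [{"data": {"image/png": "BBBB"}}]},
    {"cell_type": "markdown", "source": ["## Results"]},
]}

for change in diff_cells(old_nb, new_nb):
    print(change)
```

Here the only reported change is the added markdown cell. The re-rendered image is skipped because outputs are never compared, which is the kind of noise a notebook-aware tool can separate from real source changes.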