Jupyter notebooks are fantastic in many ways but collaboration is not so easy with them. In this article we’ll look at all the tools you can leverage to make notebooks play nicely with modern version control systems like git!
The software world has converged on git as its version control tool of choice. Git is designed primarily for human-readable text files, whereas a Jupyter notebook is a rich JSON document with source code, markdown, HTML, and images all rolled into a single .ipynb file.
Git doesn’t handle rich documents like notebooks very well. For example, a git merge on a long, nested JSON document is nearly impossible to resolve by hand, and a git diff of a base64-encoded image string is unreadable.
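To make the problem concrete, here is a minimal sketch in Python that builds two versions of a tiny notebook-like document and diffs the serialized JSON line by line, which is roughly what a plain git diff sees. The helper names (make_notebook, line_diff) and the shortened image strings are made up for illustration; a real embedded image would be thousands of characters long.

```python
import difflib
import json


def make_notebook(source, execution_count, image_b64):
    """Build a minimal notebook-like dict: one code cell with one image output."""
    return {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {
                        "output_type": "display_data",
                        "data": {"image/png": image_b64},
                        "metadata": {},
                    }
                ],
                "source": [source],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def line_diff(before, after):
    """Return only the added/removed lines of a line-based diff of the JSON."""
    a = json.dumps(before, indent=1, sort_keys=True).splitlines()
    b = json.dumps(after, indent=1, sort_keys=True).splitlines()
    return [
        line
        for line in difflib.unified_diff(a, b, lineterm="", n=0)
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Placeholder image payloads (hypothetical, heavily shortened).
old = make_notebook("plot(x)", 1, "iVBORw0KGgoAAAANSUhEUg" + "A" * 40)
new = make_notebook("plot(x, color='red')", 2, "iVBORw0KGgoAAAANSUhEUg" + "B" * 40)

changed = line_diff(old, new)
for line in changed:
    print(line[:70])
```

Only one line of code was edited, yet the diff also reports the execution counter and the re-rendered image string. In a real notebook the image line alone can dwarf everything else in the diff.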
Here’s what we need from a modern version control system -
That’s our wishlist! This blogpost is going to introduce you to all the important tools that can help you achieve these.
Disclaimer: I’m the author of two of the tools listed below (ReviewNB & GitPlus) but this is an unbiased review of all the useful tools in this space.
nbdime is an open source library for diffing and merging notebooks locally. You can set it up to work with your local git client so that the git diff and git merge commands use nbdime for .ipynb files. With nbdime you can:
Run git diff to see how a notebook has changed before committing
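As a rough sketch of the setup step, the snippet below wraps nbdime's config-git command in a small Python helper. The helper name enable_nbdime_for_git and its dry_run flag are assumptions made for this example; the underlying command is the one nbdime documents for registering its diff and merge drivers with git. It assumes nbdime is already installed (for example with pip install nbdime).

```python
import shutil
import subprocess

# nbdime's command for registering its diff and merge drivers with git.
# --global writes the setting to the user-level git config.
NBDIME_ENABLE_CMD = ["nbdime", "config-git", "--enable", "--global"]


def enable_nbdime_for_git(dry_run=True):
    """Return the setup command; execute it only when dry_run is False."""
    if dry_run:
        return list(NBDIME_ENABLE_CMD)
    if shutil.which("nbdime") is None:
        raise RuntimeError("nbdime not found on PATH; install it first")
    subprocess.run(NBDIME_ENABLE_CMD, check=True)
    return list(NBDIME_ENABLE_CMD)


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is changed by default.
    print(" ".join(enable_nbdime_for_git(dry_run=True)))
```

Calling enable_nbdime_for_git(dry_run=False) applies the configuration, after which git diff and git merge should hand .ipynb files to nbdime.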
The following JupyterLab extensions are useful for notebook version control. You can install them on your local JupyterLab.
JupyterLab extension to push commits & create pull requests on GitHub. There’s no easy way to version control notebooks from the Jupyter UI. Of course, you can drop down to the command line & learn a bunch of git commands to version control your notebooks, but not everyone using Jupyter is proficient at git. Hence I built GitPlus, a JupyterLab extension that lets you commit notebooks & create GitHub pull requests directly from the JupyterLab UI.