Note: This article is written specifically for an audience using Astronomer’s distribution of Airflow, but it can be adapted to any Airflow instance set up via Docker.
The complexity of our DAGs has increased over the last year, and I’ve found it harder to iterate on and debug code within a DAG interactively in an easy, repeatable manner. So I did some research and put together the steps below, which will let you run a Jupyter Notebook that inherits all the native hooks, connections, and variables of an Airflow instance.
Having this interactive notebook lets you develop and share ideas in a form that closely resembles the final DAG, and explore them without ever triggering the DAG itself.
Here are the steps to get this set up in your own local environment:
Add the following to packages.txt in your Airflow project root directory. These OS packages are needed to build and run Jupyter within your Docker image:
build-base
python3-dev
zeromq-dev
Add the jupyter package to your requirements.txt, which also resides in the Airflow project root directory:
jupyter
Now we’ll need to either create or add to a file named docker-compose.override.yml within the Airflow project root directory. This file lets you append to or modify the default docker-compose file for the Astronomer Airflow image. We’re going to open Jupyter’s default port, 8888, on the webserver container.
version: '2'
services:
  webserver:
    ports:
      - "0.0.0.0:8888:8888"
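Once the containers are up, a quick way to confirm the port mapping took effect is to try a TCP connection from the host. This is a minimal sketch, not part of the original setup; the host and port simply mirror the mapping above.

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Should report True once the webserver container is running
# with the port mapping from docker-compose.override.yml.
print(port_open("127.0.0.1", 8888))
```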
Start your local instance with astro dev start (if it’s already running, run astro dev stop first).
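From here you can start Jupyter inside the webserver container (for example via docker exec) and browse to port 8888. Because the notebook process runs in the same container as Airflow, it sees the same environment. One concrete consequence, sketched below: Airflow resolves connections from AIRFLOW_CONN_<ID> environment variables in URI form, so a notebook in that container can read them directly. The connection id my_postgres and its URI here are hypothetical, purely for illustration; in a real notebook you would use Airflow’s own hooks instead of parsing by hand.

```python
import os
from urllib.parse import urlparse

# Hypothetical connection, set the way Airflow itself would expose it:
# an AIRFLOW_CONN_<CONN_ID> environment variable holding a URI.
os.environ.setdefault(
    "AIRFLOW_CONN_MY_POSTGRES", "postgres://user:secret@dbhost:5432/analytics"
)

def airflow_conn(conn_id: str):
    """Parse an Airflow-style connection URI from the environment."""
    return urlparse(os.environ[f"AIRFLOW_CONN_{conn_id.upper()}"])

conn = airflow_conn("my_postgres")
print(conn.hostname, conn.port)  # the host and port from the URI above
```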
#astronomer #etl #python #airflow #jupyter-notebook