Airflow is a powerful and flexible workflow automation and scheduling system, written in Python, with a rich set of integrations and tools available out of the box.

Although the vast amount of documentation and the large configuration files can make learning Airflow look daunting, it is easy to get a simple (or more complex) setup running quickly so developers can start writing code and learn how the product actually works. In this article I will demonstrate how to do this using Airflow with Docker, based on a public GitHub repo I authored here:

https://github.com/suburbanmtman/airflow-intro

Simplified Airflow Concepts

DAG — directed acyclic graph. Defines how a workflow runs and the order of its steps; visible in the dashboard on the web interface.

Worker — one or more systems responsible for running the code in the workflow task queues

Webserver — displays the UI for managing Airflow, handles user requests for running tasks, and receives updates on DAG runs from workers

Scheduler — determines if a Task needs to be run and triggers work to be processed by a Worker

Operator — a template for a single step in a DAG; it defines what actually gets run

Task — an instantiated Operator; a single unit of work that the scheduler queues for execution

Task Instance — the stored state of a specific run of a task. The sketch below ties these pieces together.
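
To make these terms concrete, here is a minimal sketch of a single-task DAG, assuming Airflow 2.x and the standard PythonOperator; the `hello_airflow` DAG id, `say_hello` task id, and callable are illustrative names, not code from the linked repo.

```python
# Minimal sketch of a DAG, assuming Airflow 2.x with the classic PythonOperator API.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def say_hello():
    # The callable a Worker executes for this task.
    print("Hello from Airflow!")


with DAG(
    dag_id="hello_airflow",           # appears in the dashboard on the web interface
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",       # the Scheduler uses this to decide when to trigger runs
    catchup=False,
) as dag:
    # An Operator describes one step of the workflow; each scheduled run of it
    # becomes a Task Instance with its own stored state.
    hello_task = PythonOperator(
        task_id="say_hello",
        python_callable=say_hello,
    )
```

When the Scheduler decides this DAG is due, it creates a Task Instance for `say_hello`, a Worker runs the callable, and the result shows up in the Webserver UI.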

