A. Introduction

Creating a data pipeline is one thing; bringing it into production is another. This is especially true for a modern data pipeline in which multiple services are used for advanced analytics, for example transforming unstructured data into structured data, training ML models, and embedding OCR. Integrating multiple services can be complicated, and deployment to production has to be controlled. This blog walks through an example project in three steps:

  • 1. Set up an Azure DevOps project for continuous deployment
  • 2. Deploy the Azure resources of the data pipeline using infrastructure as code
  • 3. Run and monitor the data pipeline

The code for the project can be found here; the steps of the modern data pipeline are depicted below.

1. High-level dataflow, image by author

The architecture of the project is discussed in the next chapter, followed by a tutorial on how to deploy and run the project.


How to bring your modern data pipeline to production