Data Observability, Part II: How to Build Your Own Data Quality Monitors Using SQL

In this article series, we walk through how you can create your own data observability monitors from scratch, mapping to five key pillars of data health. Part I can be found here.

Part II of this series was adapted from Barr Moses and Ryan Kearns’ O’Reilly training, Managing Data Downtime: Applying Observability to Your Data Pipelines, the industry’s first-ever course on data observability. The associated exercises are available here, and the adapted code shown in this article is available here.

As the world’s appetite for data increases, robust data pipelines are all the more imperative. When data breaks — whether from schema changes, null values, duplication, or otherwise — data engineers need to know.

#2021 feb tutorials #overviews #data engineering #data quality #data science #data science platform #sql

kdnuggets.com
