The concept of a data mesh provides new ways to address common problems around managing data at scale. Zhamak Dehghani has provided additional clarity around the four principles of a data mesh, with a corresponding logical architecture and organizational structure. Her article is intended as a follow-up to previous articles, presentations, and podcasts that introduced people to data mesh and domain-oriented data.

Dehghani emphasizes the “great divide” between operational data and analytical data. Traditionally, a pipeline of ETL (extract, transform, and load) jobs spans this divide, moving transactional data used for running the business into the data lakes and data warehouses used to provide insights about the business. Data mesh acknowledges the need for these two distinct viewpoints and use cases, but instead of organizing teams and architectures along technology boundaries, it unites them by focusing on domains.

With this topology, analytical data can scale in the way that microservices and self-contained databases have allowed transactional data to scale. To achieve the promise of scale, along with quality and integrity, Dehghani lays out four principles of a data mesh:

1. Domain-oriented decentralized data ownership and architecture

2. Data as a product

3. Self-serve data infrastructure as a platform

4. Federated computational governance
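
To make the four principles more concrete, here is a minimal sketch of how they might fit together in code. It is an illustration only, not Dehghani's logical architecture: the names `DataProduct`, `SelfServePlatform`, `GLOBAL_POLICIES`, and the two policy functions are all hypothetical, and a real platform would involve far more than an in-memory catalog.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor for an analytical data set treated as a product."""
    name: str
    domain: str   # owning domain team (principle 1: domain-oriented ownership)
    owner: str    # accountable product owner (principle 2: data as a product)
    schema: dict  # column name -> type, published for consumers


# Principle 4 (federated computational governance), sketched as global rules
# written as code and applied automatically to every product, whichever
# domain owns it. These two rules are invented for illustration.
def has_owner(product):
    return bool(product.owner)


def has_schema(product):
    return len(product.schema) > 0


GLOBAL_POLICIES = {"has_owner": has_owner, "has_schema": has_schema}


@dataclass
class SelfServePlatform:
    """Hypothetical shared platform (principle 3) that domains publish through."""
    catalog: dict = field(default_factory=dict)

    def publish(self, product):
        # Reject any product that fails a global policy before cataloging it.
        failed = [name for name, rule in GLOBAL_POLICIES.items() if not rule(product)]
        if failed:
            raise ValueError(f"{product.name} violates policies: {failed}")
        self.catalog[(product.domain, product.name)] = product

    def discover(self, domain):
        # Return the names of all products published by one domain.
        return sorted(p.name for (d, _), p in self.catalog.items() if d == domain)


platform = SelfServePlatform()
platform.publish(
    DataProduct("orders_daily", "sales", "sales-team", {"order_id": "string"})
)
print(platform.discover("sales"))
```

In this sketch each domain team builds and owns its own `DataProduct`, while the platform and the policy checks are shared, which is one way to read the split between decentralized ownership and federated governance.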


Data Mesh Principles and Logical Architecture Defined