Data Mesh Architecture Language
Learn the vocabulary of Data Mesh: domain ownership, data as a product, federated governance, data product design, and migration from centralised architectures.
- Data Mesh Principles Vocabulary
- Data Product Design Language
- Federated Governance Discussion Language
- Data Domain Ownership Language
- Data Mesh Migration Language
Frequently Asked Questions
What is a data product and how do teams describe it in English?
A data product is a self-contained, discoverable unit of data that a domain team owns and publishes for others to consume. Teams describe it using phrases like "the orders domain exposes a data product for downstream analytics" or "we treat our customer dataset as a first-class product with an SLA." Key attributes discussed are that a data product should be discoverable, addressable, understandable (self-describing), trustworthy, natively accessible, interoperable, valuable on its own, and secure. Together these are often called the "eight qualities of a data product."
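To make the vocabulary concrete, here is a minimal sketch of how a team might write down a data product descriptor. The class, its field names, and the `mesh://` address scheme are all hypothetical, chosen only to illustrate "owned," "addressable," and "self-describing"; they are not part of any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical descriptor; field names are illustrative, not a standard."""
    domain: str
    name: str
    owner: str
    description: str
    sla: str
    tags: tuple = ()

    @property
    def address(self) -> str:
        # A stable, unique address is what makes the product "addressable".
        # The mesh:// scheme is an assumption made for this example.
        return f"mesh://{self.domain}/{self.name}"


orders = DataProduct(
    domain="orders",
    name="completed-orders",
    owner="orders-team",
    description="One row per completed order, for downstream analytics.",
    sla="T+1 freshness",
    tags=("analytics", "daily"),
)
print(orders.address)  # mesh://orders/completed-orders
```

With a descriptor like this, the sentence "the orders domain exposes a data product for downstream analytics" maps directly onto the `domain`, `name`, and `description` fields.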
How do you explain domain ownership in a data mesh context?
Domain ownership means the team closest to the data, the one that generates and understands it, is responsible for its quality and availability. In conversations, engineers say things like "the payments team owns the transaction data product" or "we are shifting data stewardship from the central data team to the originating domain." The contrast is with a centralised data warehouse, where a single central team owns all the data.
What vocabulary describes the self-serve data platform in data mesh?
The self-serve data platform is the infrastructure layer that allows domain teams to build and manage data products without needing deep platform expertise. Key phrases include "the platform provides a data product template," "we offer infrastructure as a service for data teams," and "teams can provision pipelines via a self-serve portal." The platform is described as "domain-agnostic" because it serves all domains equally without embedding business logic.
How is federated computational governance discussed in data mesh?
Federated computational governance balances central policy with local domain autonomy. Teams say "we enforce global policies such as PII tagging and data retention at the platform level, while domains decide their own schema evolution strategy." The word "computational" signals that policies are implemented as code and checked automatically, not through manual audits. A governance council with domain representatives is often called a "data mesh guild" or "cross-domain governance team."
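The "policies as code" idea can be sketched in a few lines. This is an illustrative example only: the function name, the descriptor layout, and the two rules (every column carries a PII tag, retention is set) are assumptions based on the policies mentioned above, not the API of any governance tool.

```python
def check_global_policies(product: dict) -> list:
    """Return a list of policy violations for a hypothetical product descriptor."""
    violations = []
    # Global policy 1 (assumed): every column must carry an explicit PII tag.
    for column in product["columns"]:
        if "pii" not in column:
            violations.append(f"column '{column['name']}' has no PII tag")
    # Global policy 2 (assumed): a retention period must be declared.
    if product.get("retention_days") is None:
        violations.append("retention_days is not set")
    return violations


customer_profile = {
    "name": "customer-profile",
    "retention_days": None,
    "columns": [
        {"name": "email", "pii": True},
        {"name": "signup_date"},  # PII tag missing
    ],
}

violations = check_global_policies(customer_profile)
for violation in violations:
    print(violation)
# column 'signup_date' has no PII tag
# retention_days is not set
```

A check like this would typically run automatically when a domain publishes or changes a data product, which is what distinguishes computational governance from a manual audit.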
What language is used when migrating from a data warehouse to a data mesh?
Migration discussions use phrases like "we are decentralising our monolithic data warehouse," "the analytics team is transitioning from pull-based ETL to push-based data products," and "we are piloting the mesh approach with the logistics domain before broader rollout." The term "strangler fig pattern" is borrowed from legacy application modernisation, where it is widely used in microservices migrations, to describe gradually migrating pipelines while keeping the old warehouse running in parallel.
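The strangler fig idea can be illustrated with a small routing sketch: consumers ask one function where to read from, and domains are moved to the mesh one at a time. The set of migrated domains and both address formats are hypothetical, made up for this example.

```python
# Domains already piloted on the mesh (hypothetical; grows as the rollout proceeds).
MIGRATED_DOMAINS = {"logistics"}


def resolve_source(domain: str, dataset: str) -> str:
    """Route a read to the new data product if migrated, else to the old warehouse."""
    if domain in MIGRATED_DOMAINS:
        return f"mesh://{domain}/{dataset}"
    # Not yet migrated: the legacy warehouse keeps serving this dataset.
    return f"warehouse://legacy/{domain}_{dataset}"


print(resolve_source("logistics", "shipments"))  # mesh://logistics/shipments
print(resolve_source("payments", "transactions"))  # warehouse://legacy/payments_transactions
```

Adding a domain to `MIGRATED_DOMAINS` is the code equivalent of saying "we have cut the payments domain over to the mesh."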
How do engineers describe data contracts in English?
A data contract is a formal agreement between a data producer and its consumers defining the schema, semantics, SLA, and quality expectations of a data product. Developers say "the producer has broken the data contract by removing a required field" or "we version our data contracts to avoid breaking downstream consumers." Tools such as Great Expectations (for data quality checks) and dbt model contracts (for schema enforcement) are often mentioned when discussing implementation.
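As a rough illustration of "breaking the data contract," the sketch below compares two versions of a contract and reports changes that would break consumers. The contract layout and the `breaking_changes` function are hypothetical and are not the format used by Great Expectations or dbt.

```python
def breaking_changes(old: dict, new: dict) -> list:
    """List changes between two hypothetical contract versions that break consumers."""
    problems = []
    for name, spec in old["fields"].items():
        if name not in new["fields"]:
            # Dropping an optional field is tolerated here; dropping a required one is not.
            if spec["required"]:
                problems.append(f"required field '{name}' was removed")
        elif new["fields"][name]["type"] != spec["type"]:
            new_type = new["fields"][name]["type"]
            problems.append(f"field '{name}' changed type from {spec['type']} to {new_type}")
    return problems


contract_v1 = {"version": "1.0.0", "fields": {
    "order_id": {"type": "string", "required": True},
    "customer_id": {"type": "string", "required": True},
    "amount": {"type": "decimal", "required": True},
    "coupon": {"type": "string", "required": False},
}}
contract_v2 = {"version": "2.0.0", "fields": {
    "order_id": {"type": "string", "required": True},
    "amount": {"type": "float", "required": True},
}}

problems = breaking_changes(contract_v1, contract_v2)
for problem in problems:
    print(problem)
# required field 'customer_id' was removed
# field 'amount' changed type from decimal to float
```

A producer that runs such a check before publishing can bump the major version, as in `2.0.0` here, and warn downstream consumers instead of breaking them silently.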
What terms are used to describe data product discoverability?
Discoverability refers to how easily consumers can find and understand available data products. Teams discuss "data catalogues" (such as DataHub or Atlan), "metadata registries," and "data product marketplaces." Phrases include "the catalogue indexes all data products with their owners and SLAs" and "consumers can search by domain, data type, or freshness." A well-tagged data product is called "rich in metadata" or "self-describing."
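The phrase "consumers can search by domain, data type, or freshness" can be pictured with a toy in-memory catalogue. The entries, field names, and `search` function below are invented for illustration and do not reflect the DataHub or Atlan APIs.

```python
# A toy catalogue: each entry records the product's owner and freshness (hypothetical data).
CATALOGUE = [
    {"name": "completed-orders", "domain": "orders", "owner": "orders-team", "freshness_hours": 24},
    {"name": "order-events", "domain": "orders", "owner": "orders-team", "freshness_hours": 1},
    {"name": "customer-profile", "domain": "customers", "owner": "crm-team", "freshness_hours": 24},
]


def search(catalogue, domain=None, max_freshness_hours=None):
    """Return names of data products matching the optional filters."""
    results = []
    for entry in catalogue:
        if domain is not None and entry["domain"] != domain:
            continue
        if max_freshness_hours is not None and entry["freshness_hours"] > max_freshness_hours:
            continue
        results.append(entry["name"])
    return results


print(search(CATALOGUE, domain="orders"))  # ['completed-orders', 'order-events']
print(search(CATALOGUE, max_freshness_hours=1))  # ['order-events']
```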
How do teams talk about data mesh interoperability?
Interoperability means data products across domains can work together without tight coupling. Engineers use phrases like "we standardise on Parquet for output format to ensure cross-domain compatibility" or "all data products expose a REST or gRPC output port alongside a batch interface." The related idea is usually described with the word "polyglot": each data product may be built with different technologies, and may serve its data in several forms, as long as it adheres to the platform's standard access protocols.
What English phrases describe data mesh maturity levels?
Teams often assess maturity with the "crawl, walk, run" metaphor. They say things like "we are at the crawl phase: domains are identified but data products are not yet published" or "we have reached the walk phase with three live data products and automated quality checks." More mature organisations are described as "running a mesh," with dozens of interconnected data products and a self-serve platform that requires minimal central support.
How do data mesh engineers communicate about SLAs for data products?
Data product SLAs (Service Level Agreements) define commitments on freshness, availability, and quality. Engineers say "the orders data product guarantees T+1 freshness" or "we have a 99.5% uptime SLA for the customer profile dataset." When SLAs are breached, teams use language like "we are in SLA violation for the inventory feed" and "a postmortem is scheduled to address the latency spike." SLA definitions are often encoded in the data product's specification file.
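A freshness commitment such as "T+1" can be checked mechanically. The sketch below is a simplification that treats the commitment as a maximum age of 24 hours; the constant, the function name, and the timestamps are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Simplifying assumption: the freshness commitment is "no older than 24 hours".
FRESHNESS_SLA = timedelta(hours=24)


def sla_status(last_updated: datetime, now: datetime,
               max_age: timedelta = FRESHNESS_SLA) -> str:
    """Compare the age of the latest data against the freshness commitment."""
    age = now - last_updated
    return "within SLA" if age <= max_age else "SLA violation"


now = datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc)
on_time = sla_status(datetime(2024, 1, 1, 23, 0, tzinfo=timezone.utc), now)  # 10 hours old
late = sla_status(datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc), now)      # 31 hours old
print(on_time)  # within SLA
print(late)     # SLA violation
```

When a check like this fails, the team would say "we are in SLA violation for the inventory feed" and open an incident.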