Vocabulary for Data Mesh Architecture: Data Products, Domains and Governance
Learn the core data mesh vocabulary: data domain, data product, federated governance, self-serve data platform, and data ownership for modern data architecture discussions.
Data mesh, popularised by Zhamak Dehghani, is an architectural paradigm that decentralises data ownership from a central data team to the domain teams that produce the data. If you work in data engineering, analytics, or platform roles, fluency in this vocabulary is increasingly expected.
The Four Principles of Data Mesh
Data mesh rests on four principles. Knowing the vocabulary for each is essential:
- Domain-oriented decentralised data ownership
- Data as a product
- Self-serve data infrastructure as a platform
- Federated computational governance
Data Domain
A data domain is a bounded area of the business aligned to an organisational capability (e.g., “Customer,” “Orders,” “Inventory”). Each domain owns the data it produces.
Key phrases
- “The Orders domain owns the canonical order lifecycle data.”
- “We aligned each data domain to an existing engineering team.”
- “The domain is responsible for the quality and availability of its data products.”
- “We identified eight data domains during the decomposition exercise.”
Domain producer vs. consumer
- Domain producer — the team that generates and owns the data
- Domain consumer — any team that uses data published by another domain
“The Marketing team consumes data from the Customer domain — they are not the producer.”
Data Product
A data product is the primary unit of data mesh. It is a discoverable, addressable, trustworthy dataset that a domain team publishes and maintains as a product — not a by-product.
Qualities of a data product
Dehghani lists eight qualities:
- Discoverable — findable in a data catalogue
- Addressable — has a stable, unique identifier
- Trustworthy — has defined SLOs and quality checks
- Self-describing — includes schema, metadata, and documentation
- Interoperable — follows shared standards
- Natively accessible — reachable via standard interfaces
- Secure — access-controlled
- Valuable on its own — not just an extract for one downstream use case
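To make these qualities concrete, a data product's metadata could be sketched as a small Python structure. This is an illustration only: the class `DataProductDescriptor`, its fields, the `missing_qualities` helper, and the `mesh://` address format are all hypothetical and not part of any data mesh standard or tool. Only the qualities that can be read directly from metadata are checked; interoperability, native accessibility, and standalone value need human judgement or richer tooling.

```python
from dataclasses import dataclass, field


@dataclass
class DataProductDescriptor:
    """Hypothetical metadata record for one data product (illustrative only)."""
    name: str                                         # name shown in the catalogue
    domain: str                                       # owning domain, e.g. "Orders"
    address: str = ""                                 # stable, unique identifier
    schema: dict = field(default_factory=dict)        # field name -> type
    description: str = ""                             # human-readable documentation
    availability_slo: float = 0.0                     # e.g. 0.999 for 99.9%
    catalogued: bool = False                          # registered in the catalogue
    access_roles: list = field(default_factory=list)  # roles allowed to read


def missing_qualities(product: DataProductDescriptor) -> list:
    """Return the qualities this descriptor does not yet demonstrate."""
    checks = {
        "discoverable": product.catalogued,
        "addressable": bool(product.address),
        "trustworthy": product.availability_slo > 0,
        "self-describing": bool(product.schema) and bool(product.description),
        "secure": bool(product.access_roles),
    }
    return [quality for quality, ok in checks.items() if not ok]


# Example: a product with a schema and an SLO, but no docs or access roles yet.
orders = DataProductDescriptor(
    name="order-lifecycle",
    domain="Orders",
    address="mesh://orders/order-lifecycle",  # made-up address scheme
    schema={"order_id": "string", "status": "string"},
    availability_slo=0.999,
    catalogued=True,
)
print(missing_qualities(orders))  # ['self-describing', 'secure']
```

A check like this could run when a team registers a product, so that gaps are reported before consumers start to depend on it.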
In conversation
- “Each team publishes one or more data products from their domain.”
- “The data product exposes a stable schema and has an SLO of 99.9% availability.”
- “We treat our events table as a first-class data product, not just a log dump.”
- “Consumers subscribe to the data product through the self-serve platform.”
Federated Computational Governance
Federated governance means that governance policies (schema standards, quality rules, access controls) are set globally but enforced locally by each domain team — rather than by a central data team.
Key vocabulary
- Global policies — organisation-wide standards that all domains must follow
- Local enforcement — each domain implements the policies in its own pipeline
- Computational governance — policies are encoded in software and enforced automatically, not manually audited
- Data contract — a formal agreement between producer and consumer on schema, quality, and SLAs
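The idea of "global policies, local enforcement" can be sketched in a few lines of Python. The policy functions, the metadata keys (`owner`, `schema_version`, `sensitivity`), and the allowed tag values below are assumptions chosen for illustration; a real organisation would define its own.

```python
from typing import Callable, List, Optional

# A policy inspects a product's metadata and returns an error message,
# or None when the product complies.
Policy = Callable[[dict], Optional[str]]


def requires_owner(product: dict) -> Optional[str]:
    return None if product.get("owner") else "missing owner"


def requires_schema_version(product: dict) -> Optional[str]:
    return None if product.get("schema_version") else "missing schema_version"


def requires_sensitivity_tag(product: dict) -> Optional[str]:
    allowed = {"public", "internal", "confidential"}  # hypothetical tag set
    if product.get("sensitivity") in allowed:
        return None
    return "sensitivity must be one of " + ", ".join(sorted(allowed))


# Global policies: defined once for the whole organisation.
GLOBAL_POLICIES: List[Policy] = [
    requires_owner,
    requires_schema_version,
    requires_sensitivity_tag,
]


def enforce(product: dict, policies: List[Policy] = GLOBAL_POLICIES) -> List[str]:
    """Local enforcement: each domain pipeline calls this before publishing."""
    violations = []
    for policy in policies:
        message = policy(product)
        if message is not None:
            violations.append(message)
    return violations


draft = {"name": "order-lifecycle", "owner": "orders-team", "sensitivity": "secret"}
print(enforce(draft))
```

Here the draft product fails two policies: it has no schema version, and its sensitivity tag is not in the allowed set. Because the rules are code, every domain applies the same checks without a manual audit.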
In sentences
- “The governance team sets the data quality standards; each domain implements them.”
- “Our data contracts define the schema version, SLO, and breaking-change notification process.”
- “Governance is federated — there is no central team approving every data change.”
- “The platform enforces access policies automatically based on data sensitivity tags.”
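As a rough illustration of tag-based access enforcement, the sketch below maps sensitivity tags to the roles allowed to read data carrying that tag. The tag names, role names, and the `can_read` function are hypothetical; they stand in for whatever the platform's policy engine actually provides.

```python
# Hypothetical mapping from sensitivity tag to the roles allowed to read it.
READ_ROLES_BY_TAG = {
    "public": {"analyst", "engineer", "steward"},
    "internal": {"engineer", "steward"},
    "confidential": {"steward"},
}


def can_read(role: str, sensitivity_tag: str) -> bool:
    """Deny by default: an unknown tag grants access to no one."""
    return role in READ_ROLES_BY_TAG.get(sensitivity_tag, set())


print(can_read("analyst", "public"))        # True
print(can_read("analyst", "confidential"))  # False
print(can_read("steward", "untagged"))      # False
```

Denying access for unknown tags is a deliberate choice in this sketch: a product that has not been classified stays closed until someone tags it.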
Self-Serve Data Platform
The self-serve data platform is the infrastructure layer that enables domain teams to build and publish data products without needing a central data engineering team’s help for every task.
What it provides
- Data pipeline templates
- Schema registry
- Data catalogue
- Quality monitoring dashboards
- Access control interfaces
Key phrases
- “The platform team provides the tooling; domain teams use it autonomously.”
- “A domain team can onboard a new data product in under a day using the platform templates.”
- “The platform abstracts the complexity of storage, compute, and cataloguing.”
- “We measure platform adoption by the number of data products published per quarter.”
Data Contract Vocabulary
Data contracts are a rapidly evolving practice in data mesh. Key terms:
| Term | Meaning |
|---|---|
| Producer | The team publishing the data |
| Consumer | The team using the data |
| Schema version | The version of the data structure, typically following semantic versioning (SemVer) |
| Breaking change | A change that is incompatible with existing consumers |
| Backward compatible | A schema change that existing consumers can keep reading without modification |
| SLO | Service level objective: a measurable target for data freshness, quality, or availability |
| Data quality dimension | A measurable aspect of quality, e.g. completeness, accuracy, timeliness, uniqueness |
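The difference between a breaking and a backward-compatible change, and how it maps to a SemVer bump, can be sketched as follows. The rule used here is a simplifying assumption: removing a field or changing its type counts as breaking, while adding a field does not. Real schema registries apply more detailed compatibility rules, and the function names are hypothetical.

```python
def classify_change(old: dict, new: dict) -> str:
    """Classify a schema change; schemas map field name -> type name."""
    removed = old.keys() - new.keys()
    retyped = {name for name in old.keys() & new.keys() if old[name] != new[name]}
    if removed or retyped:
        return "breaking"
    if new.keys() - old.keys():
        return "backward compatible"
    return "unchanged"


def next_version(current: str, change: str) -> str:
    """Suggest the next SemVer string: major bump if breaking, minor if additive."""
    major, minor, _patch = (int(part) for part in current.split("."))
    if change == "breaking":
        return f"{major + 1}.0.0"
    if change == "backward compatible":
        return f"{major}.{minor + 1}.0"
    return current


v1 = {"order_id": "string", "total": "decimal"}
v2 = {"order_id": "string", "total": "decimal", "currency": "string"}  # field added
v3 = {"order_id": "string", "total": "float"}                          # type changed

print(classify_change(v1, v2), next_version("1.2.0", classify_change(v1, v2)))
print(classify_change(v1, v3), next_version("1.2.0", classify_change(v1, v3)))
```

The first comparison is backward compatible and suggests `1.3.0`; the second is breaking and suggests `2.0.0`.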
Using data contract language
- “We introduced a breaking change to the Orders schema — all consumers must update.”
- “The contract specifies data freshness of no more than 15 minutes.”
- “Consumers agreed to the contract before their pipeline was provisioned.”
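A freshness clause such as "no more than 15 minutes" is easy to check in code. The sketch below assumes the pipeline can supply the timestamp of the most recent record; the function name and the way that timestamp is obtained are hypothetical.

```python
from datetime import datetime, timedelta, timezone

FRESHNESS_LIMIT = timedelta(minutes=15)  # taken from the example contract phrase


def is_fresh(last_updated: datetime, now: datetime,
             limit: timedelta = FRESHNESS_LIMIT) -> bool:
    """Return True when the newest record is no older than the contract allows."""
    return now - last_updated <= limit


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_fresh(now - timedelta(minutes=10), now))  # True
print(is_fresh(now - timedelta(minutes=20), now))  # False
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check deterministic and easy to test.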
Comparing Data Mesh to Other Paradigms
In discussions, you may need to compare data mesh to existing approaches:
| Approach | Description | Data mesh contrast |
|---|---|---|
| Data warehouse | Centralised storage, managed by one team | Data mesh distributes ownership |
| Data lake | Centralised storage, schema-on-read | Data mesh adds product thinking and governance |
| Data lakehouse | Hybrid of warehouse and lake | Data mesh is an organisational pattern, not a storage technology |
| Data fabric | Metadata-driven, often centralised | Data mesh decentralises ownership to domain teams, whereas a fabric is often run centrally |
Key Takeaways
- Data domain — a business-aligned unit of data ownership
- Data product — a published, trustworthy dataset with SLOs, schema, and discoverability
- Federated governance — global standards, locally enforced by domain teams
- Self-serve platform — the tooling infrastructure that enables autonomous domain teams
- Data contract — a formal producer-consumer agreement on schema, quality, and SLAs
- Key verbs: publish, consume, own, enforce, expose, subscribe, onboard