5 exercises — Practice DataOps vocabulary in English: data quality SLAs, data contracts, data observability, lineage, schema drift, and data incidents.
Core DataOps vocabulary clusters
Data quality: data quality SLA, freshness, completeness, accuracy, consistency, validity, data quality score
Data contracts: data contract, schema registry, backward/forward compatibility, breaking change, consumer-driven contract
Observability: data observability, data lineage, column-level lineage, anomaly detection, data freshness, volume spike
Incidents: data incident, silent failure, data downtime, SLA breach, data quality alert, incident classification
Tools: Monte Carlo, Great Expectations, dbt tests, Apache Atlas, OpenLineage, Marquez, Soda Core
1 / 5
A data engineering lead introduces data contracts: "A data contract is a formal agreement between data producers and consumers about the structure and semantics of a dataset. It specifies the schema, data types, expected freshness, null rate, and SLA. If the producer makes a breaking change — like renaming a column or changing a type — they violate the contract. We version our contracts and require producers to get consumer sign-off before breaking changes." What is a data contract and what problem does it solve?
Data contract: a specification (usually YAML or JSON) that defines what a dataset looks like and how it will behave.
Elements: schema (column names, types, nullability), semantics (what each column means), freshness SLA (for example, updated every hour), quality rules (for example, null rate below 1%, no negative values), versioning policy.
Problems solved:
Silent breaking changes: without a contract, a producer renames a column and downstream pipelines break with nobody alerted. Contracts make the agreement explicit and violations detectable.
Schema drift: unintended schema changes (new columns, type changes, removed fields) that propagate downstream.
Data contract vocabulary:
Producer: the team or service that generates the data.
Consumer: the team or pipeline that reads the data.
Breaking change: a schema change that breaks existing consumers, such as renaming, removing, or changing the type of an existing field.
Non-breaking change: adding optional new fields with defaults.
Contract testing: validating that actual data matches the contract definition.
Tools: Soda Core, Great Expectations, dbt tests, Monte Carlo.
Formats: OpenDataMesh, Bitol's Open Data Contract Standard (ODCS), custom YAML.
In conversation: "We caught 3 breaking schema changes in staging because our contract tests ran against the new pipeline output. Without contracts, those would have reached production."
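Contract testing can be sketched in a few lines of plain Python. This is a minimal illustration, not the format or API of Soda Core, Great Expectations, or any contract standard: the `CONTRACT` dictionary, its keys, and the `validate` function are all hypothetical names chosen for the example.

```python
# Hypothetical contract for an "orders" dataset. The layout of this dict is
# an assumption for illustration, not a real contract format.
CONTRACT = {
    "version": "1.0.0",
    "columns": {
        "order_id": {"type": int, "nullable": False},
        "amount": {"type": float, "nullable": False},
        "coupon": {"type": str, "nullable": True},
    },
    "max_null_rate": {"coupon": 0.5},
}


def validate(rows, contract):
    """Return a list of human-readable contract violations for a batch."""
    violations = []
    for name, spec in contract["columns"].items():
        # A column absent from any row counts as a schema violation.
        if any(name not in row for row in rows):
            violations.append(f"missing column: {name}")
            continue
        values = [row[name] for row in rows]
        nulls = sum(v is None for v in values)
        if nulls and not spec["nullable"]:
            violations.append(f"null in non-nullable column: {name}")
        if any(v is not None and not isinstance(v, spec["type"]) for v in values):
            violations.append(f"wrong type in column: {name}")
        limit = contract.get("max_null_rate", {}).get(name)
        if limit is not None and rows and nulls / len(rows) > limit:
            violations.append(f"null rate above {limit:.0%}: {name}")
    return violations


good = [
    {"order_id": 1, "amount": 9.5, "coupon": None},
    {"order_id": 2, "amount": 3.0, "coupon": "X"},
]
# The producer renamed order_id to id: a breaking change.
renamed = [{"id": 1, "amount": 9.5, "coupon": "X"}]

print(validate(good, CONTRACT))     # []
print(validate(renamed, CONTRACT))  # ['missing column: order_id']
```

Run against staging output before a release, a check like this is what turns a silent rename into a visible failure.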
2 / 5
A data platform engineer presents data observability to the team: "Data observability is the ability to understand the health of your data at all times. The five pillars are: freshness (is the data current?), volume (is the expected amount of data arriving?), distribution (are the values within expected ranges?), schema (have the columns changed?), and lineage (what upstream data did this come from?). When we get an alert, lineage tells us which source caused the issue." What is data lineage and why is it critical for incident response?
Data lineage: a directed graph showing how data flows from sources through transformations to final consumers.
Types:
Table-level lineage: which tables feed which tables.
Column-level lineage: which source columns map to which output columns (through joins and transformations). More precise, but harder to compute.
Why it matters:
Impact analysis: if source table X changes, which dashboards are affected?
Root cause analysis: a dashboard shows wrong numbers; lineage traces the problem back to the upstream table with the data quality issue.
Compliance: for the GDPR right to erasure, find every place a user's data was used.
Data observability five pillars (Monte Carlo):
Freshness: when was the table last updated? Alert if stale.
Volume: is the row count within the expected range? A 90% drop likely means an upstream failure.
Distribution: are column values within expected ranges? Watch for null rate spikes and unexpected categories.
Schema: did columns change?
Lineage: upstream and downstream visibility.
Tools: Monte Carlo (commercial), OpenLineage (open standard), Marquez (open-source lineage service), Apache Atlas, DataHub.
In conversation: "The dashboard was showing wrong revenue figures. With column-level lineage, we traced it back to a currency conversion bug in the exchange_rates table in 10 minutes. Without lineage it would have taken hours."
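Impact analysis and root cause analysis are the same graph walk in opposite directions. The sketch below assumes a small, hypothetical table-level lineage graph (the table names and the `DOWNSTREAM` mapping are invented for illustration); real lineage tools build this graph from query logs or pipeline metadata.

```python
from collections import deque

# Hypothetical table-level lineage: each key feeds the tables in its list.
DOWNSTREAM = {
    "raw.exchange_rates": ["staging.orders_usd"],
    "raw.orders": ["staging.orders_usd"],
    "staging.orders_usd": ["marts.revenue"],
    "marts.revenue": ["dashboard.revenue"],
}


def invert(edges):
    """Flip the graph so each node points at its direct upstream sources."""
    upstream = {}
    for source, targets in edges.items():
        for target in targets:
            upstream.setdefault(target, []).append(source)
    return upstream


def reachable(start, edges):
    """Breadth-first walk returning every node reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def impact_of(table):
    """Impact analysis: everything downstream of a changed table."""
    return reachable(table, DOWNSTREAM)


def root_cause_candidates(asset):
    """Root cause analysis: everything upstream of a broken asset."""
    return reachable(asset, invert(DOWNSTREAM))


print(sorted(impact_of("raw.exchange_rates")))
# ['dashboard.revenue', 'marts.revenue', 'staging.orders_usd']
print(sorted(root_cause_candidates("dashboard.revenue")))
# ['marts.revenue', 'raw.exchange_rates', 'raw.orders', 'staging.orders_usd']
```

Column-level lineage uses the same idea with (table, column) pairs as nodes, which is why it is more precise and more expensive to compute.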
3 / 5
A data quality engineer explains SLA measurement: "We have a data quality SLA for our orders table: freshness within 1 hour, null rate on order_id under 0.01%, row count within ±20% of a 7-day rolling average. When any of these breach, we page the on-call data engineer and open a data incident. Last quarter we had 99.2% data uptime — 3 SLA breaches each lasting under 4 hours." What is data downtime in the context of data observability?
Data downtime: a term coined by Monte Carlo for the period during which data is unreliable (inaccurate, missing, stale, or duplicated) from the perspective of data consumers. Analogous to service downtime in SRE.
Data uptime = (1 - data downtime / total time) × 100%.
Data SLA vocabulary:
Data quality SLA: agreed thresholds for quality dimensions such as freshness, completeness, accuracy, and consistency.
Data incident: a breach of a data quality SLA that requires investigation and remediation.
Silent failure: a data quality issue that goes undetected because nothing fails visibly and no monitoring is in place to catch it. The most dangerous type.
SLA breach: a quality dimension falls outside the contracted threshold.
Data incident severity: P0 (business-critical dashboard wrong), P1 (important data late), P2 (minor quality issue).
MTTR for data: mean time to restore data quality after an incident.
Data freshness SLA: the maximum acceptable delay between source data creation and availability in the warehouse. Typical: hourly for operational data, daily for analytics.
In conversation: "Our stakeholders treat stale data like a service outage. We track data downtime the same way SRE tracks uptime because the business impact is comparable."
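The SLA from the scenario (freshness within 1 hour, null rate under 0.01%, row count within ±20% of a 7-day average) and the uptime formula can be expressed directly in code. This is a sketch under stated assumptions: the function names, parameters, and sample numbers are hypothetical, and a real monitor would read these metrics from the warehouse.

```python
from datetime import datetime, timedelta


def check_sla(last_loaded, now, null_rate, row_count, last_7_day_counts):
    """Return the names of breached SLA dimensions for one table snapshot."""
    breaches = []
    if now - last_loaded > timedelta(hours=1):
        breaches.append("freshness")
    if null_rate >= 0.0001:  # 0.01% expressed as a fraction
        breaches.append("null_rate")
    baseline = sum(last_7_day_counts) / len(last_7_day_counts)
    if abs(row_count - baseline) > 0.20 * baseline:
        breaches.append("volume")
    return breaches


def data_uptime(downtime_hours, total_hours):
    """Data uptime = (1 - data downtime / total time) x 100."""
    return (1 - sum(downtime_hours) / total_hours) * 100


# Hypothetical snapshot: loaded 90 minutes ago, 30% fewer rows than usual.
breaches = check_sla(
    last_loaded=datetime(2024, 1, 8, 10, 30),
    now=datetime(2024, 1, 8, 12, 0),
    null_rate=0.0,
    row_count=700,
    last_7_day_counts=[1000] * 7,
)
print(breaches)  # ['freshness', 'volume']

# Hypothetical month: two incidents (2 h and 1 h) in 30 days (720 h).
print(round(data_uptime([2.0, 1.0], 720), 2))  # 99.58
```

Any non-empty result from `check_sla` would be the trigger for paging the on-call engineer and opening a data incident.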
4 / 5
A DataOps engineer introduces automated data quality testing: "We run dbt tests on every pipeline run. We have schema tests — not_null, unique, accepted_values, relationships — and custom data tests written in SQL. Great Expectations lets us define expectations: 'column X should have values between 0 and 1000', 'null rate should be under 5%'. We run these in CI so bad data never reaches production — it breaks the pipeline and alerts the on-call." What is a data test in a DataOps pipeline and why is it important?
Data test: an automated assertion about a data quality property that runs as part of the data pipeline and fails the pipeline if the assertion is violated.
dbt test types (dbt now calls the built-in schema tests "generic tests" and custom SQL tests "singular tests"):
not_null: asserts no null values in a column.
unique: asserts all values in a column are unique.
accepted_values: asserts values are in a defined list.
relationships: asserts referential integrity (each foreign key value exists in the referenced table).
Custom tests: SQL queries that return failing rows; zero rows means the test passes.
Great Expectations vocabulary:
Expectation: a testable assertion, such as "expect_column_values_to_be_between" or "expect_column_mean_to_be_between".
Expectation suite: a collection of expectations for a dataset.
Checkpoint: a configured validation run that checks a data batch against an expectation suite and can trigger follow-up actions.
Data docs: HTML documentation of test results generated by Great Expectations.
Soda Core: another open-source data quality testing framework, with YAML-defined checks.
Data contract testing: validating that actual data matches the published data contract schema and rules.
Pipeline vocabulary:
Fail fast: break the pipeline immediately on quality issues rather than letting bad data propagate.
Quarantine table: rows that fail quality checks go to a quarantine table for investigation rather than being silently dropped.
In conversation: "Before DataOps, bad data would silently flow through and surface 3 days later in a stakeholder meeting. Now the pipeline breaks and we know within minutes."
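The "return failing rows, zero rows means pass" convention, together with fail fast and quarantine, can be illustrated without any framework. The functions below borrow the names of the dbt tests but are plain-Python stand-ins written for this example, not dbt or Great Expectations APIs; the `order_id` and `status` columns are hypothetical.

```python
class DataTestFailure(Exception):
    """Raised to stop the pipeline when a blocking test fails."""


def not_null(rows, column):
    """Return the rows where the column is null."""
    return [r for r in rows if r.get(column) is None]


def unique(rows, column):
    """Return the rows whose column value was already seen earlier."""
    seen, failing = set(), []
    for r in rows:
        value = r.get(column)
        if value in seen:
            failing.append(r)
        seen.add(value)
    return failing


def accepted_values(rows, column, allowed):
    """Return the rows whose column value is outside the allowed set."""
    return [r for r in rows if r.get(column) not in allowed]


def run_pipeline_step(rows):
    # Fail fast: key problems stop the run before anything is published.
    blocking = [
        ("not_null(order_id)", not_null(rows, "order_id")),
        ("unique(order_id)", unique(rows, "order_id")),
    ]
    for name, failing in blocking:
        if failing:
            raise DataTestFailure(f"{name}: {len(failing)} failing row(s)")
    # Quarantine: rows with an unexpected status are set aside, not dropped.
    quarantine = accepted_values(rows, "status", {"placed", "shipped", "returned"})
    clean = [r for r in rows if r not in quarantine]
    return clean, quarantine


rows = [
    {"order_id": 1, "status": "placed"},
    {"order_id": 2, "status": "teleported"},
]
clean, quarantine = run_pipeline_step(rows)
print(clean)       # [{'order_id': 1, 'status': 'placed'}]
print(quarantine)  # [{'order_id': 2, 'status': 'teleported'}]
```

Which checks block the run and which only quarantine rows is a design choice; here key integrity blocks, while an unexpected category is treated as recoverable.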
5 / 5
A data architect explains schema management in event streaming: "All our Kafka messages are serialised with Avro and validated against schemas in the Confluent Schema Registry. Producers register a schema before publishing. Consumers look up the schema by ID from the message header. We enforce backward compatibility — you can add optional fields, but you can't remove or rename existing fields. A breaking change requires a new schema version and a migration plan." What is schema drift and how does DataOps tooling detect it?
Schema drift: unintended or unexpected changes to a data schema that propagate through the pipeline.
Kinds of drift: additive (new columns, usually safe), subtractive (removed columns, breaks consumers), type change (for example integer to string, breaks consumers), rename (same data under a new name, breaks consumers).
Detection methods:
Schema registry enforcement: Confluent Schema Registry rejects a new schema version that violates the configured compatibility mode.
dbt tests and model contracts: dbt tests fail when an expected column is missing, and dbt model contracts (available in recent dbt versions) enforce declared column names and data types at build time.
Data observability schema monitoring: tools like Monte Carlo alert when the column count, types, or names change between pipeline runs.
Contract testing: validating actual data against the published data contract schema.
Schema evolution vocabulary:
Backward compatible: consumers using the new schema can still read data written with the old schema.
Forward compatible: consumers still using the old schema can read data written with the new schema.
Schema migration: a planned, versioned change to a schema, communicated to all consumers in advance.
Blue-green schema deployment: run the old and new schema in parallel, migrate consumers, then retire the old schema.
Dead letter queue (DLQ): in streaming pipelines, messages that fail schema validation are routed to a DLQ for investigation rather than blocking the pipeline.
In conversation: "The pipeline broke because someone added a NOT NULL column to the source table without telling us. That was schema drift, and there was no contract to catch it."
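Schema monitoring between pipeline runs boils down to diffing two schema snapshots and classifying each change. The sketch below assumes schemas are available as simple column-to-type mappings; `diff_schemas` and the sample columns are hypothetical, and the "breaking" flag follows the rule of thumb above (additive changes are usually safe, everything else breaks consumers).

```python
def diff_schemas(old, new):
    """Compare two {column: type_name} mappings and classify each change.

    Returns (kind, column, is_breaking) tuples. A rename cannot be told
    apart from a removal plus an addition, so it appears as both.
    """
    changes = []
    for column in sorted(old.keys() - new.keys()):
        changes.append(("subtractive", column, True))
    for column in sorted(new.keys() - old.keys()):
        changes.append(("additive", column, False))
    for column in sorted(old.keys() & new.keys()):
        if old[column] != new[column]:
            changes.append(("type_change", column, True))
    return changes


# Hypothetical snapshots from two consecutive pipeline runs.
yesterday = {"order_id": "int", "amount": "float", "cust": "string"}
today = {
    "order_id": "string",     # type change
    "amount": "float",
    "customer_id": "string",  # cust was renamed
    "channel": "string",      # new column
}

for change in diff_schemas(yesterday, today):
    print(change)
# ('subtractive', 'cust', True)
# ('additive', 'channel', False)
# ('additive', 'customer_id', False)
# ('type_change', 'order_id', True)
```

An alert would fire only when at least one change is flagged as breaking, which keeps harmless additive drift from paging anyone.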