Data engineers need precise vocabulary to discuss pipeline operations. This quiz focuses on the collocations used to talk about monitoring pipelines, triggering alerts, backfilling data, and validating schemas.
1 / 5
Fill in: 'The on-call engineer is responsible for ___ the ingestion pipeline across all data sources.'
We 'monitor a pipeline' — 'monitor' is the data engineering standard for continuously tracking the health and throughput of a data system. 'Watch' is informal; 'check' implies intermittent manual inspection; 'observe' is used in scientific contexts and lacks the tooling connotation of pipeline operations.
2 / 5
Fill in: 'Our alerting system is configured to ___ anomalies in row counts before they affect dashboards.'
We 'detect anomalies' — 'detect' is the technical standard for automated systems surfacing unexpected patterns in data. 'Find' is informal; 'spot' is casual and implies manual discovery; 'identify' is close but 'detect' is preferred when the actor is an automated system rather than a person.
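To make "an automated system detects an anomaly" concrete, here is a minimal sketch in Python. It assumes a simple z-score rule over recent daily row counts; the function name `detect_row_count_anomaly` and the threshold of 3.0 are illustrative choices, not taken from any particular tool.

```python
from statistics import mean, stdev


def detect_row_count_anomaly(history, current, z_threshold=3.0):
    """Return True when `current` is unusually far from the recent row counts.

    `history` is a list of past row counts. The z-score rule and the default
    threshold are assumptions made for this illustration.
    """
    if len(history) < 2:
        # Not enough data to estimate a spread, so nothing is flagged.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past counts were identical: any change counts as an anomaly.
        return current != mu
    return abs(current - mu) / sigma > z_threshold
```

The actor here is the function, not a person reading a dashboard, which is why 'detect' fits better than 'spot'.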
3 / 5
Fill in: 'The monitoring tool will ___ a PagerDuty alert if the pipeline breaches its four-hour SLA.'
We 'trigger an alert' — 'trigger' is the canonical collocation for an automated condition causing a notification to be sent. 'Send an alert' is also common and focuses on the transmission; 'fire an alert' is informal and used loosely; 'raise an alert' is more common in British English incident management but less standard in data tooling.
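The idea of a condition triggering an alert can be sketched as follows. This is an illustration only: `check_sla` and the `notify` callback are hypothetical names, and the callback stands in for whatever notification client a team actually uses; no real PagerDuty call is shown.

```python
from datetime import datetime, timedelta

# Four-hour threshold taken from the quiz sentence.
SLA = timedelta(hours=4)


def check_sla(last_success, now, notify):
    """Trigger an alert through `notify` when the last successful run is older than the SLA.

    `notify` is a hypothetical callable that accepts a message string.
    Returns True if an alert was triggered, False otherwise.
    """
    delay = now - last_success
    if delay > SLA:
        notify(f"Pipeline SLA breached: last successful run was {delay} ago")
        return True
    return False
```

The condition (`delay > SLA`) is what triggers the alert; the `notify` call is what sends it, which mirrors the distinction between the two collocations.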
4 / 5
Fill in: 'After fixing the bug, we had to ___ three days of missing transaction data from the source system.'
We 'backfill data' — 'backfill' is the specific data engineering term for populating a pipeline with historical records that were missed or dropped. 'Reload' and 'reprocess' are close but imply re-running an existing process, not specifically filling a gap; 'recover' is too broad and associated with disaster recovery.
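The difference between backfilling and reprocessing can be shown in a short sketch: a backfill loads only the dates that are missing. The names `missing_dates`, `backfill`, and the `load_partition` callback are hypothetical and assume a pipeline partitioned by day.

```python
from datetime import date, timedelta


def missing_dates(start, end, loaded):
    """List every date from `start` to `end` (inclusive) that is not in `loaded`."""
    days = (end - start).days + 1
    candidates = (start + timedelta(days=i) for i in range(days))
    return [day for day in candidates if day not in loaded]


def backfill(start, end, loaded, load_partition):
    """Load only the missing daily partitions and return the dates that were filled.

    `load_partition` is a hypothetical callable that ingests one day of data.
    """
    filled = []
    for day in missing_dates(start, end, loaded):
        load_partition(day)
        filled.append(day)
    return filled
```

Dates that were already loaded are skipped, so the gap is filled without re-running the whole range.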
5 / 5
Fill in: 'Every time a new data source is added, we must ___ the schema against the agreed data contract.'
We 'validate a schema' — 'validate' is the precise data engineering term for confirming that a schema conforms to defined rules or contracts. 'Test' implies running automated test cases; 'verify' is close but more common in quality assurance; 'check' is informal and does not convey the formal contract-based comparison implied here.
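A contract-based comparison of this kind might look like the sketch below. It assumes both the schema and the contract are plain dictionaries mapping column names to type names; `validate_schema` is a hypothetical helper, not the API of any specific data contract tool.

```python
def validate_schema(schema, contract):
    """Compare `schema` with `contract` and return a list of violations.

    Both arguments map column names to type names (an assumption made for
    this illustration). An empty list means the schema conforms.
    """
    errors = []
    for column, expected in contract.items():
        if column not in schema:
            errors.append(f"missing column: {column}")
        elif schema[column] != expected:
            errors.append(
                f"type mismatch for {column}: expected {expected}, got {schema[column]}"
            )
    return errors
```

Returning the full list of violations, rather than stopping at the first one, reflects the formal, rule-by-rule comparison that 'validate' implies.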