5 collocation exercises on data warehouse modelling and pipelines.
1 / 5
A data engineer will ___ a table to fit the schema.
You model a table — designing its columns, types, keys and relationships to fit a logical schema. Data modelling is a core warehouse discipline, and the verb model collocates with table, schema and data. Shape off, sculpt up and frame down are not used. Good modelling, whether star schema or normalised, determines how efficiently queries run and how easily analysts can understand the data.
2 / 5
In a star schema, you ___ a fact table.
You build a fact table — the central table holding measurable events, surrounded by dimension tables. The natural verb is build (you also build a dimension or build a pipeline). Erect up, assemble out and forge on are not idiomatic. Fact tables typically store numeric metrics and foreign keys to dimensions, forming the backbone of analytical queries in a dimensional warehouse design.
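As a rough illustration of what building a fact table can look like, the sketch below uses Python's built-in sqlite3 module. The table and column names (fact_sales, dim_product and so on) are hypothetical and chosen only for this example; a real warehouse would use its own engine and naming.

```python
import sqlite3

# Hypothetical table and column names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        category   TEXT NOT NULL
    );
    -- The fact table: one row per sale, a numeric metric (amount)
    -- and a foreign key pointing at the dimension.
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
        sale_date  TEXT NOT NULL,
        amount     REAL NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?)",
    [(1, "books"), (2, "games")],
)
conn.executemany(
    "INSERT INTO fact_sales (product_id, sale_date, amount) VALUES (?, ?, ?)",
    [(1, "2024-01-01", 10.0), (1, "2024-01-02", 5.0), (2, "2024-01-02", 20.0)],
)

# A typical analytical query: join the fact to a dimension and aggregate.
totals = dict(conn.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
""").fetchall())
print(totals)
```

The final query joins the fact table to its dimension and sums the metric per category, which is the usual shape of a query against a star schema.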
3 / 5
For large tables, you ___ the data by date.
You partition the data: the table is physically split (often by date) so queries scan only the relevant partitions. The verb partition collocates with table and data, and partitioning can substantially improve performance and cost on warehouses like BigQuery or Snowflake. Slice off, segment up and chunk out are not the standard terms. Choosing a good partition key, usually a date or another frequently filtered column with low to moderate cardinality, is a key design decision.
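The idea of scanning only the relevant partition can be sketched in plain Python. This is a toy model under stated assumptions, not how any particular warehouse implements partitioning; the rows and the helper total_for are made up for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event rows: (event_date, amount).
rows = [
    (date(2024, 1, 1), 10.0),
    (date(2024, 1, 1), 2.5),
    (date(2024, 1, 2), 7.0),
    (date(2024, 1, 3), 4.0),
]

# Partition the data by date: one bucket of amounts per day.
partitions = defaultdict(list)
for event_date, amount in rows:
    partitions[event_date].append(amount)


def total_for(day):
    """Sum amounts for one day, reading only that day's partition."""
    return sum(partitions.get(day, []))


print(total_for(date(2024, 1, 1)))
```

A query for a single day reads one bucket instead of every row, which is the effect partition pruning aims for on a much larger scale.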
4 / 5
To pre-compute a complex query result, you ___ a view.
You materialise a view — storing the query result physically so it can be read quickly instead of recomputed each time. A materialised view trades storage and refresh cost for fast reads. The verb is materialise (US: materialize). Solidify up, harden off and concrete out are not used. Materialised views are ideal for expensive aggregations that many dashboards query repeatedly.
5 / 5
To run a pipeline regularly, you ___ a job.
You schedule a job — configuring it to run automatically at set times or intervals, often via a tool like Airflow, dbt Cloud or cron. The collocation schedule a job (or schedule a run) is standard. Time off, clock up and set out are wrong here. Scheduled jobs keep warehouse tables fresh by re-running transformations as new source data arrives on a predictable cadence.