5 exercises — practise answering Data Warehouse Architect interview questions in professional technical English.
0 / 5 completed
1 / 5
The interviewer asks: "Compare the Kimball and Inmon approaches to data warehouse design, and explain when you would choose each." Which answer best demonstrates Data Warehouse Architect expertise?
Option B is strongest because it accurately describes Inmon's 3NF EDW with derived data marts, Kimball's star schema with the Bus Architecture and conformed dimensions, decision criteria for choosing each, and the hybrid CIF pattern. Option A is technically accurate but provides no detail on how each approach works, what the tradeoffs are, or when to choose each. Option C states a correct observation (Kimball's widespread adoption) but provides no architectural explanation or decision framework. Option D correctly associates Inmon with EDW and Kimball with data marts but presents them as solving different problems rather than competing philosophies, which misses the key tension. Data warehouse architect interview best practice: always mention conformed dimensions as the mechanism that makes the Kimball Bus Architecture work — it is the concept that separates superficial knowledge from genuine design experience.
2 / 5
The interviewer asks: "Explain Slowly Changing Dimensions Types 1, 2, and 3 with a concrete business example, and describe when each is appropriate." Which answer best demonstrates Data Warehouse Architect expertise?
Option B is strongest because it provides a concrete Customer dimension example for all three types, explains the surrogate key mechanism for Type 2, describes effective/expiry dates and is_current flags, explains how fact table joins preserve historical accuracy, names appropriate use cases for each type, and mentions Type 4 and Type 6 as advanced patterns. Option A gives correct one-line definitions but no example, no surrogate key explanation, no join mechanics, and no decision guidance. Option C correctly recommends Type 2 as a default but provides no explanation of the mechanism or when Types 1 and 3 are preferred. Option D makes a true but trivial observation about why SCDs matter without explaining any of the types. Data warehouse architect interview best practice: explain the surrogate key and historical join mechanism for Type 2 — this is the detail that proves you have implemented SCD Type 2 rather than just read about it.
3 / 5
The interviewer asks: "Compare Delta Lake, Apache Iceberg, and Apache Hudi as data lakehouse table formats. What problems do they solve and how do they differ?" Which answer best demonstrates Data Warehouse Architect expertise?
Option B is strongest because it explains the common problem all three solve, then differentiates them on metadata architecture (Delta log JSON vs Iceberg manifest tree vs Hudi upsert model), partition evolution, Z-ordering, CDC optimisation, and engine compatibility — with concrete provenance (Databricks, Netflix, Uber) and a justified personal recommendation. Option A correctly identifies the shared features but provides no differentiation, which is precisely what the question asks for. Option C identifies one key difference (Databricks vs Apache provenance and Spark coupling) but provides no detail on metadata architecture, partition evolution, or CDC capabilities. Option D names a project choice without explaining the architectural reasoning or comparison. Data warehouse architect interview best practice: Iceberg's manifest tree metadata architecture and hidden partitioning are the most technically differentiating features — mentioning these signals that you have engaged with the format's design, not just installed it.
4 / 5
The interviewer asks: "Describe the relationship between fact tables and dimension tables in dimensional modelling, and explain what conformed dimensions are." Which answer best demonstrates Data Warehouse Architect expertise?
Option B is strongest because it explains grain selection as the key modelling decision, describes fact table measures as additive, explains dimension denormalisation rationale, highlights the Date dimension as always pre-populated, defines conformed dimensions with a concrete Sales+Returns example, names the Bus Architecture, and describes the local dimension anti-pattern. Option A gives a correct but superficial description that any analyst might give — it shows no architectural understanding. Option C correctly describes the visual structure of a star schema but provides no explanation of grain, measures, denormalisation rationale, or conformed dimensions. Option D gives a correct but incomplete definition of conformed dimensions without explaining why they matter for cross-domain analysis or the Bus Architecture. Data warehouse architect interview best practice: always explain grain selection first when discussing fact tables — it is the foundational decision that determines everything else, and experienced architects recognise its central importance.
5 / 5
The interviewer asks: "What query optimisation techniques would you apply to a slow analytical query on a large data warehouse table?" Which answer best demonstrates Data Warehouse Architect expertise?
Option B is strongest because it covers execution plan analysis with engine-specific tools (Redshift EXPLAIN, BigQuery Query Execution Graph, Snowflake Query Profile), partitioning, projection pruning, clustering/sort keys by engine, materialised views, partition pruning pitfalls with function wrapping, join order optimisation in distributed engines, and data skew with salting. Option A suggests indexing (less relevant in most columnar warehouses) and "proper joins" with no specific optimisation techniques. Option C identifies materialised views as one technique but misses the full diagnostic and multi-layer optimisation approach. Option D correctly identifies columnar storage as beneficial but describes a storage format choice rather than query optimisation techniques — by the time a query is slow, the storage format is already set. Data warehouse architect interview best practice: always mention execution plan analysis as the first step — optimising without understanding the plan is guessing, and interviewers want to see a systematic diagnostic approach.