Intermediate Interview Prep #data-engineering #pipelines #etl #sql

Data Engineer Interview Questions

Five exercises for practising clear, well-structured English answers to data engineering interview questions on ETL vs ELT, data lineage, pipeline reliability, partitioning, and real-time event processing.

How to structure data engineering interview answers
  • ETL/ELT questions: define both → state the decision criterion (where the transformation compute runs) → mention that ELT allows re-transformation from raw data → name tools (dbt for ELT, Informatica for ETL)
  • Lineage questions: give the three values (trust, impact analysis, compliance) → distinguish column-level from table-level lineage → cite dbt and data catalog tools
  • Reliability questions: organise by layer (infrastructure, data quality, orchestration, observability) → idempotency is always relevant
  • Partitioning questions: define partition pruning → quantify the benefit → name clustering → give the anti-pattern (partitioning on a high-cardinality column)
  • Real-time questions: name all three layers → give tool decision criteria → address exactly-once vs at-least-once delivery and watermarks
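
Two of the ideas above, idempotency and partition pruning, are easier to explain in an interview with a concrete picture in mind. The following is a minimal sketch, not a real warehouse API: the in-memory dictionary stands in for a table partitioned by event date, and the names `load_partition` and `query_amount` are hypothetical helpers invented for illustration.

```python
# Hypothetical in-memory "table" partitioned by event date: {date: [rows]}.
# A real warehouse would store partitions as files or storage blocks.

def load_partition(table, event_date, rows):
    """Idempotent load: overwrite the whole partition instead of appending.

    Re-running the same batch after a failure leaves the table unchanged,
    so retries never create duplicate rows.
    """
    table[event_date] = list(rows)


def query_amount(table, event_date):
    """Partition pruning: scan only the partition that matches the filter.

    Returns the summed amount and the number of rows actually scanned.
    """
    scanned = table.get(event_date, [])
    return sum(row["amount"] for row in scanned), len(scanned)


table = {}
batch = [{"order_id": 1, "amount": 10}, {"order_id": 2, "amount": 5}]

load_partition(table, "2024-01-01", batch)
load_partition(table, "2024-01-01", batch)  # simulated retry of the same batch
load_partition(table, "2024-01-02", [{"order_id": 3, "amount": 7}])

total, rows_scanned = query_amount(table, "2024-01-01")
total_rows = sum(len(rows) for rows in table.values())
print(total, rows_scanned, total_rows)
```

In this sketch the retried load does not duplicate rows (the table holds three rows, not five), and the query touches two of those three rows because the other partition is skipped. An append-based load would have double-counted the first batch, which is the failure mode to name when the interviewer asks about reliability.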
Exercise 1 of 5
The interviewer asks: "What is the difference between ETL and ELT, and when would you choose one over the other?"
Which answer demonstrates the clearest data engineering thinking?