Lakehouse: blends a data lake's cheap, open-format object storage with warehouse capabilities like ACID transactions, schema enforcement, and SQL, avoiding the need for separate lake and warehouse systems.
2 / 5
What role does an open table format like Delta Lake or Apache Iceberg play?
Open table format: a metadata layer over data files (typically Parquet) that adds ACID transactions, time travel, and schema evolution to data in cheap object storage, turning raw files into reliable tables.
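A minimal sketch of the "metadata layer" idea, assuming a simplified in-memory commit log. `ToyTableLog` and the file names are hypothetical and do not reflect the actual Delta Lake or Iceberg on-disk layout. Each commit records which data files were added or removed, and the current table is whatever file set results from replaying the log.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTableLog:
    """Hypothetical in-memory stand-in for a table format's commit log."""
    commits: list = field(default_factory=list)  # each commit: list of (action, path)

    def commit(self, actions):
        # All-or-nothing: validate every action before appending anything.
        for action, _path in actions:
            if action not in ("add", "remove"):
                raise ValueError(f"unknown action: {action}")
        self.commits.append(list(actions))
        return len(self.commits) - 1  # version number of this commit

    def live_files(self, version=None):
        # Replay commits up to `version` to find the files that make up the table.
        last = len(self.commits) - 1 if version is None else version
        files = set()
        for actions in self.commits[: last + 1]:
            for action, path in actions:
                if action == "add":
                    files.add(path)
                else:
                    files.discard(path)
        return files


log = ToyTableLog()
log.commit([("add", "part-000.parquet")])
log.commit([("add", "part-001.parquet")])
# A compaction swaps two small files for one in a single commit.
log.commit([("remove", "part-000.parquet"),
            ("remove", "part-001.parquet"),
            ("add", "part-002.parquet")])
print(sorted(log.live_files()))
```

Readers that replay the log see either the old file set or the new one, never a half-finished compaction. That is the property the real formats provide with considerably more machinery.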
3 / 5
What is schema-on-read versus schema-on-write?
Schema-on-read: data lakes store raw files and apply a schema only at query time, which keeps ingestion flexible. Warehouses use schema-on-write, validating structure at ingest. Lakehouses can blend both: raw files can land unvalidated, while tables managed by an open table format can enforce a schema on write.
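The contrast can be sketched in plain Python. This is an illustration under stated assumptions, not any engine's API: `SCHEMA`, `write_validated`, and `read_with_schema` are hypothetical names, and skipping rows that cannot be coerced is just one possible read-time policy.

```python
import json

SCHEMA = {"user_id": int, "amount": float}  # hypothetical table schema


def conforms(record):
    # True only if the record has exactly the schema's fields with matching types.
    return (set(record) == set(SCHEMA)
            and all(isinstance(record[k], t) for k, t in SCHEMA.items()))


def write_validated(table, record):
    # Schema-on-write: bad records are rejected at ingest.
    if not conforms(record):
        raise ValueError(f"record does not match schema: {record}")
    table.append(record)


def read_with_schema(raw_lines):
    # Schema-on-read: raw text is stored as-is; structure is applied per query.
    for line in raw_lines:
        rec = json.loads(line)
        try:
            yield {k: t(rec[k]) for k, t in SCHEMA.items()}
        except (KeyError, ValueError, TypeError):
            continue  # assumption: rows that cannot be coerced are skipped


raw = ['{"user_id": 1, "amount": "9.5"}', '{"user_id": "x"}']
rows = list(read_with_schema(raw))

table = []
write_validated(table, {"user_id": 1, "amount": 9.5})
```

The same malformed input fails loudly under schema-on-write and is silently dropped under this schema-on-read policy, which shows the trade-off between early validation and ingest flexibility.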
4 / 5
What does time travel mean in a lakehouse table?
Time travel: open table formats record each committed write as a new table version (snapshot), so you can query the table as of an earlier version or timestamp for as long as that snapshot is retained. This aids auditing, reproducing reports, and recovering from bad writes.
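A toy illustration of "as of" lookups, assuming snapshots are kept as a list of (commit timestamp, rows) pairs in commit order. The data and function names are hypothetical. Real formats store file metadata per snapshot instead of full copies of the rows.

```python
from bisect import bisect_right

# (commit_timestamp, rows) in commit order; the list index is the version number.
snapshots = [
    (100, [("a", 1)]),
    (200, [("a", 1), ("b", 2)]),
    (300, [("a", 1), ("b", -999)]),  # a bad write
]


def as_of_version(version):
    # Return the table contents at an exact version number.
    return snapshots[version][1]


def as_of_timestamp(ts):
    # Return the latest snapshot committed at or before `ts`.
    times = [t for t, _rows in snapshots]
    idx = bisect_right(times, ts) - 1
    if idx < 0:
        raise LookupError("no snapshot at or before that timestamp")
    return snapshots[idx][1]


before_bad_write = as_of_timestamp(250)
```

Here `as_of_timestamp(250)` resolves to version 1, the state just before the bad write. That is the lookup a recovery or audit query relies on.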
5 / 5
What is the medallion architecture (bronze/silver/gold)?
Medallion architecture: organizes a lakehouse into bronze (raw ingested), silver (cleaned and conformed), and gold (business-level aggregates) layers, making pipelines clearer and data quality progressively higher.
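The three layers can be sketched as two small transformations. This is a minimal example with made-up order records, and the cleaning rules (drop rows with unparseable amounts, normalize region codes) are assumptions chosen for illustration.

```python
from collections import defaultdict

# Bronze: raw records exactly as ingested, strings and all.
bronze = [
    {"order_id": "1", "region": " eu ", "amount": "10.0"},
    {"order_id": "2", "region": "US", "amount": "oops"},
    {"order_id": "3", "region": "eu", "amount": "5.5"},
]


def to_silver(rows):
    # Silver: typed, cleaned, and conformed records.
    clean = []
    for r in rows:
        try:
            amount = float(r["amount"])
        except ValueError:
            continue  # assumption: rows with unparseable amounts are dropped
        clean.append({
            "order_id": int(r["order_id"]),
            "region": r["region"].strip().upper(),
            "amount": amount,
        })
    return clean


def to_gold(rows):
    # Gold: business-level aggregate, here total revenue per region.
    totals = defaultdict(float)
    for r in rows:
        totals[r["region"]] += r["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
```

Keeping bronze untouched means silver and gold can be rebuilt from scratch if a cleaning rule changes.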