Practice data warehouse vocabulary: star schema, fact tables, dimension tables, grain, slowly changing dimensions (SCD), and surrogate keys.
0 / 5 completed
1 / 5
What is a 'star schema' in data warehousing?
A star schema has one central fact table (containing measurable events like transactions) connected to multiple dimension tables (like Date, Customer, Product). It's denormalized for query performance — fewer JOINs needed for analytics.
2 / 5
What is a 'fact table' in a data warehouse?
A fact table stores the measurable events of a business process — each row is typically one transaction, event, or occurrence. It contains foreign keys linking to dimension tables and numeric measures. Example: an Orders fact table with columns for date_key, customer_key, product_key, quantity, revenue.
3 / 5
What does 'the grain of this table is one row per transaction' mean?
'Grain' is the most important concept when designing a fact table — it defines exactly what one row represents. Stating the grain explicitly (e.g., 'one row per order line item') prevents ambiguity and ensures consistent aggregation in queries.
4 / 5
What is a Slowly Changing Dimension (SCD)?
An SCD handles dimension attributes that change over time. SCD Type 1 overwrites (no history). SCD Type 2 adds a new row with a new effective date (full history preserved). Type 2 is most common — it lets you analyze 'what was the customer's address at the time of the order.'
5 / 5
Why do data warehouses use 'surrogate keys' instead of natural keys from source systems?
Surrogate keys (typically auto-incremented integers) provide stability — if a source system changes its natural key format, the warehouse is unaffected. They also enable SCD Type 2 (multiple rows per entity with different keys), and integer JOINs are faster than string JOINs.