5 exercises — window functions, LAG/LEAD, CTEs, EXPLAIN ANALYZE, and materialized views — advanced SQL vocabulary for backend engineers and DBAs.
0 / 5 completed
1 / 5
"We use window functions to rank users by revenue without collapsing rows." Which SQL clause defines the window (partition and ordering) for a window function?
A window function performs a calculation across a set of rows related to the current row — unlike GROUP BY it does not collapse rows. The OVER() clause defines the window: PARTITION BY splits rows into groups (like GROUP BY but without collapsing), ORDER BY sets the row order within each partition. Examples: ROW_NUMBER() assigns a unique sequential integer; RANK() leaves gaps after ties; DENSE_RANK() does not leave gaps.
2 / 5
A colleague writes: LAG(salary, 1) OVER (ORDER BY hire_date). What does LAG() return for each row?
LAG(expr, offset, default) returns the value of expr from offset rows before the current row in the window — useful for comparing a value to the previous period. LEAD(expr, offset, default) does the opposite: it looks ahead by offset rows. Both return NULL (or the supplied default) when there is no such row. Classic use case: calculating month-over-month or day-over-day deltas without a self-join.
3 / 5
"We restructured the multi-join query into a _____, using a WITH clause to name an intermediate result set that we could reference twice in the main query."
A Common Table Expression (CTE) is defined with WITH cte_name AS (SELECT …) before the main query. It improves readability and can be referenced multiple times. A recursive CTE has two parts joined by UNION ALL: the anchor member (base case, no self-reference) and the recursive member (references the CTE itself). Recursive CTEs are used for hierarchical data like org charts or bill-of-materials trees.
4 / 5
A senior engineer says: "Run EXPLAIN ANALYZE on that query — check the actual rows vs estimated rows." What does the "actual rows" figure tell you?
EXPLAIN shows the query plan with estimated row counts and costs. EXPLAIN ANALYZE actually executes the query and adds actual rows (real), actual time (ms per loop), loops (how many times the node ran), plus total planning time and execution time at the bottom. A large discrepancy between estimated and actual rows usually means stale statistics — fix with ANALYZE table_name. The output also shows which index (if any) was used at each scan node.
5 / 5
"We schedule REFRESH MATERIALIZED VIEW CONCURRENTLY every hour." Why use CONCURRENTLY?
A materialized view stores the query result on disk — unlike a regular view which re-runs the query each time. REFRESH MATERIALIZED VIEW view_name re-computes and replaces the data but takes an exclusive lock, blocking reads during the refresh. REFRESH MATERIALIZED VIEW CONCURRENTLY performs a non-blocking refresh: it computes the new data set in the background and swaps it in without locking out readers. Requirement: the materialized view must have a unique index.