5 exercises on advanced PostgreSQL internals — concurrency, storage, and query planning.
0 / 5 completed
1 / 5
What is MVCC in PostgreSQL?
MVCC (Multi-Version Concurrency Control) lets PostgreSQL serve concurrent transactions without readers blocking writers or vice versa. Each row update creates a new version (tuple) rather than overwriting in place, and every transaction sees a consistent snapshot of the data as of its start. Visibility is tracked via transaction IDs (xmin/xmax). The cost is accumulated dead tuples from old versions, which must later be reclaimed. MVCC is what gives Postgres its strong isolation guarantees with high concurrency.
2 / 5
What does VACUUM do in PostgreSQL?
Because MVCC leaves behind dead tuples from updates and deletes, VACUUM reclaims that space so it can be reused, preventing unbounded table bloat. Plain VACUUM marks space reusable in place, while VACUUM FULL rewrites the table to shrink it on disk (taking an exclusive lock). ANALYZE, often run alongside, refreshes the statistics the query planner relies on. The background autovacuum daemon runs these automatically, and it also prevents transaction-ID wraparound, a critical maintenance concern.
3 / 5
What is the WAL in PostgreSQL?
The WAL (Write-Ahead Log) implements durability: before any change is written to the actual data files, it is first appended to a sequential log on disk. If the server crashes, Postgres replays the WAL to recover committed transactions and reach a consistent state. This write-ahead approach also turns random data-file writes into fast sequential log writes. The WAL underpins point-in-time recovery, streaming replication (replicas consume the WAL stream), and logical replication.
4 / 5
When would you use a GIN index instead of a B-tree in PostgreSQL?
PostgreSQL offers several index types for different access patterns. The default B-tree handles equality and range comparisons on scalar values. A GIN (Generalized Inverted Index) excels when a single row contains many indexable elements — full-text search vectors, JSONB documents, and arrays — mapping each element to the rows containing it. A GiST index supports geometric and nearest-neighbour queries (e.g. PostGIS). Choosing the right index type for the query workload is key to performance.
5 / 5
What does EXPLAIN ANALYZE show in PostgreSQL?
EXPLAIN displays the query plan the planner has chosen — the sequence of scans, joins, and sorts with estimated costs and row counts. Adding ANALYZE actually executes the query and reports the real timings and row counts at each node, letting you compare estimates against reality. Large gaps between estimated and actual rows often point to stale statistics or missing indexes. Reading these plans is the primary skill for diagnosing slow queries — spotting sequential scans, nested-loop blowups, and bad join orders.