🗄️ Database & SQL Language
7 exercise sets. Master the English vocabulary backend developers, DBAs, and data engineers use in schema reviews, query optimization discussions, and database incident communication.
SQL Query Description
Describing SQL operations in English: joins, aggregations, subqueries, filters, and window functions.
Schema Design Vocabulary
Normalization, denormalization, relationships, constraints, and data modelling vocabulary.
Database Migration Language
Migration vocabulary: up/down migrations, zero-downtime, backfill, breaking vs. non-breaking schema changes.
Query Optimization Discussion
EXPLAIN output, sequential scan, index scan, N+1 problem, query plan vocabulary for reviews and discussions.
Indexing Strategy Language
B-tree, partial, composite, covering index types — when to use each and how to discuss trade-offs.
Database Incident Language
Replication lag, deadlock, connection pool exhaustion, failover — communicating database incidents.
Transactions & Concurrency
ACID, isolation levels, optimistic vs. pessimistic locking, phantom reads — advanced database vocabulary.
Frequently Asked Questions
How do engineers describe the different types of SQL JOINs in technical discussions?
An INNER JOIN returns only rows where both tables have a matching key, while a LEFT JOIN returns all rows from the left table plus matching rows from the right, with NULLs in the right-hand columns where no match exists. Engineers also use FULL OUTER JOIN to return all rows from both tables regardless of matches, and CROSS JOIN to produce a Cartesian product. Each type is chosen based on the data completeness requirements of the query.
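The difference is easiest to see on a tiny dataset. The following is a minimal sketch using Python's built-in sqlite3 module; the customers and orders tables and their contents are hypothetical, chosen only so that one customer has no orders.

```python
import sqlite3

# Hypothetical schema: two customers, only one of whom has placed an order.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0);
""")

# INNER JOIN: only customers that have a matching order.
inner_rows = conn.execute("""
    SELECT c.name, o.total
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# LEFT JOIN: every customer; order columns are NULL (None) when there is no match.
left_rows = conn.execute("""
    SELECT c.name, o.total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# CROSS JOIN: Cartesian product, 2 customers x 1 order = 2 rows.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM customers CROSS JOIN orders"
).fetchone()[0]

print(inner_rows)   # [('Ada', 25.0)]
print(left_rows)    # [('Ada', 25.0), ('Grace', None)]
print(cross_count)  # 2
```

In a review, the LEFT JOIN result would be described as "all customers, with order totals where they exist".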
What does ACID stand for and how is it explained in database design conversations?
ACID stands for Atomicity (all operations in a transaction succeed or none do), Consistency (the database moves from one valid state to another), Isolation (concurrent transactions do not interfere with each other), and Durability (committed transactions survive system failures). Engineers invoke ACID compliance when arguing for relational databases over NoSQL alternatives in use cases requiring strong correctness guarantees.
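Atomicity is the property that is simplest to demonstrate in code. The sketch below assumes a hypothetical accounts table with a CHECK constraint that forbids negative balances, and a transfer helper written for this example; it uses Python's built-in sqlite3 module, whose connection context manager commits on success and rolls back on an exception.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: balances may never go negative.
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False

ok = transfer(conn, 1, 2, 500)
balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
print(ok, balances)  # False [(1, 100), (2, 50)]
```

The credit to account 2 succeeds at first, but because the debit fails, the whole transaction is rolled back and neither balance changes.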
What vocabulary is used to discuss database query optimisation?
Query optimisation discussions centre on the query execution plan, which engineers examine using the EXPLAIN or EXPLAIN ANALYZE command. Key terms include sequential scan (reading the entire table), index scan (using an index to locate rows efficiently), nested loop join, and hash join. Engineers say a query "hits an index" when the planner can avoid a full table scan.
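To see a plan change from a full scan to an index lookup, the sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. This is a stand-in for EXPLAIN ANALYZE, which SQLite does not offer, and the users table, the idx_users_email index, and the plan helper are hypothetical names invented for the example. The exact plan wording differs between SQLite versions, so no literal output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

def plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

query = "SELECT * FROM users WHERE email = ?"
args = ("user42@example.com",)

# Without an index the planner has to scan the whole table.
plan_before = plan(conn, query, args)

# After adding an index on the filtered column, the query "hits an index".
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(conn, query, args)

print("before:", plan_before)
print("after: ", plan_after)
```

On PostgreSQL the same comparison would be read from the "Seq Scan" and "Index Scan" nodes of the EXPLAIN output.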
How do engineers communicate about database normalisation and denormalisation trade-offs?
Normalisation organises data to eliminate redundancy by splitting related data into separate tables linked by foreign keys, while denormalisation intentionally introduces redundancy to improve read performance. Engineers say a schema is in "third normal form" (3NF) when it eliminates transitive dependencies, and argue for denormalisation with phrases like "we're willing to accept write anomalies in exchange for faster aggregation queries".
What is a "phantom read" and how is it discussed in transaction isolation contexts?
A phantom read occurs when a transaction re-executes a query and finds rows that were inserted and committed by another concurrent transaction after the first read. In the SQL standard it is the anomaly prevented by the SERIALIZABLE isolation level but permitted at REPEATABLE READ, although some engines, such as PostgreSQL, already prevent phantoms at REPEATABLE READ. Engineers reference phantom reads when choosing isolation levels, trading consistency guarantees against concurrency throughput.
How do engineers describe zero-downtime database migrations?
A zero-downtime migration is a schema change that can be applied to a live database without taking the application offline. Common techniques include the expand-contract pattern (add the new column, backfill data, deploy code that uses it, then remove the old column in a separate migration), and using nullable columns or default values to avoid locking issues during large table alterations.
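The expand and backfill steps can be sketched as follows, using Python's sqlite3 module. The users table, the display_name column, the backfill helper, and the batch size are all hypothetical, and SQLite's locking behaviour differs from that of a production server database, so this only illustrates the shape of the pattern: add a nullable column, then copy data in small batches so that no single statement touches the whole table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users (full_name) VALUES (?)",
    [(f"User {i}",) for i in range(1, 26)],
)
conn.commit()

# Expand: add the new column as nullable so existing rows need no rewrite.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
conn.commit()

def backfill(conn, batch_size=10):
    """Copy full_name into display_name in small batches; return batch count."""
    batches = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                """
                UPDATE users SET display_name = full_name
                WHERE id IN (
                    SELECT id FROM users
                    WHERE display_name IS NULL
                    ORDER BY id LIMIT ?
                )
                """,
                (batch_size,),
            )
        if cur.rowcount == 0:
            break
        batches += 1
    return batches

batches = backfill(conn)
remaining = conn.execute(
    "SELECT COUNT(*) FROM users WHERE display_name IS NULL"
).fetchone()[0]
print(batches, remaining)  # 3 0

# Contract (not run here): once all code reads display_name, a later,
# separate migration would drop full_name.
```

Keeping each batch in its own short transaction is the design choice that limits how long any lock is held.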
What does "connection pool exhaustion" mean in a database incident?
Connection pool exhaustion occurs when all database connections in the pool are in use and new requests must wait, causing application latency or timeouts. Engineers describe it in incidents by saying "we hit the PG connection ceiling" or "the pool was saturated". The fix typically involves tuning the pool size, adding a connection pooler like PgBouncer, or reducing query latency to release connections faster.
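The mechanics of a saturated pool can be reproduced in a few lines. The sketch below is a toy: TinyPool is a hypothetical class written for this example, not the API of any real pooling library, and it hands out in-memory SQLite connections from a bounded queue. When every connection is checked out, the next caller waits and then times out.

```python
import queue
import sqlite3
from contextlib import contextmanager

class TinyPool:
    """Toy fixed-size pool (hypothetical; for illustration only)."""

    def __init__(self, size):
        self._free = queue.Queue(maxsize=size)
        for _ in range(size):
            self._free.put(sqlite3.connect(":memory:"))

    @contextmanager
    def acquire(self, timeout):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._free.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._free.put(conn)  # always hand the connection back

pool = TinyPool(size=2)

# Hold both connections, then ask for a third: the pool is saturated.
with pool.acquire(timeout=0.1) as a, pool.acquire(timeout=0.1) as b:
    try:
        with pool.acquire(timeout=0.1):
            exhausted = False
    except queue.Empty:
        exhausted = True

# Once the holders release their connections, requests succeed again.
with pool.acquire(timeout=0.1) as c:
    recovered = c.execute("SELECT 1").fetchone()[0] == 1

print(exhausted, recovered)  # True True
```

This is why reducing query latency helps: connections return to the pool sooner, so fewer callers reach the timeout.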
How is replication lag explained in database architecture discussions?
Replication lag is the delay between a write being committed on the primary database server and that change becoming visible on a replica. Engineers say a system "reads from replicas with eventual consistency" when lag is acceptable, and describe use cases that require "reading from primary" when zero lag is necessary — for example, immediately after a write that the same user will read back.
What is the N+1 query problem and how do engineers identify it?
The N+1 problem occurs when an application executes one query to fetch a list of N records and then an additional query for each record to fetch related data, resulting in N+1 total database calls. Engineers identify it by looking for "sequential queries in a loop" in query logs or ORM profiling tools, and resolve it by using a JOIN or eager loading to fetch all related data in one or two queries instead of one per record.
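The query counts are easy to verify by hand. The sketch below uses Python's sqlite3 module with hypothetical authors and posts tables and a small run helper that counts every statement sent to the database; it compares the loop-per-record approach with a single LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO posts VALUES (1, 1, 'Engines'), (2, 1, 'Notes'), (3, 2, 'Compilers');
""")

counter = {"queries": 0}

def run(sql, params=()):
    """Execute a query and count it, standing in for a query log."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()

def titles_n_plus_one():
    # 1 query for the list, then 1 more query per author.
    result = {}
    for author_id, name in run("SELECT id, name FROM authors ORDER BY id"):
        rows = run(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        result[name] = [title for (title,) in rows]
    return result

def titles_joined():
    # 1 query in total; LEFT JOIN keeps authors that have no posts.
    result = {}
    rows = run("""
        SELECT a.name, p.title
        FROM authors AS a
        LEFT JOIN posts AS p ON p.author_id = a.id
        ORDER BY a.id, p.id
    """)
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:
            result[name].append(title)
    return result

counter["queries"] = 0
slow = titles_n_plus_one()
n_plus_one_queries = counter["queries"]

counter["queries"] = 0
fast = titles_joined()
joined_queries = counter["queries"]

print(n_plus_one_queries, joined_queries)  # 4 1
```

Both functions return the same data; only the number of database calls differs, which is what shows up in a query log as "sequential queries in a loop".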
How do engineers discuss optimistic vs. pessimistic locking strategies?
Pessimistic locking acquires a lock when reading data to prevent any concurrent modification, using constructs like SELECT ... FOR UPDATE in SQL. Optimistic locking assumes conflicts are rare and checks a version number or timestamp at write time, failing the transaction if another write has occurred since the read. Engineers choose optimistic locking for read-heavy workloads and pessimistic locking when conflict probability is high and the cost of retrying is significant.
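The version check at write time can be sketched as below, using Python's sqlite3 module. The items table, its version column, and the read_item and write_stock helpers are hypothetical names for this example. Two readers load the same row; the first write succeeds and bumps the version, so the second write, which still carries the old version, matches no rows and is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, stock INTEGER, "
    "version INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO items (id, stock) VALUES (1, 10)")
conn.commit()

def read_item(conn, item_id):
    """Return (stock, version) as seen at read time."""
    return conn.execute(
        "SELECT stock, version FROM items WHERE id = ?", (item_id,)
    ).fetchone()

def write_stock(conn, item_id, new_stock, expected_version):
    """Write only if nobody else has updated the row since it was read."""
    with conn:
        cur = conn.execute(
            "UPDATE items SET stock = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_stock, item_id, expected_version),
        )
    return cur.rowcount == 1  # 0 rows updated means the version was stale

# Two clients read the same state.
stock_a, version_a = read_item(conn, 1)
stock_b, version_b = read_item(conn, 1)

first_write = write_stock(conn, 1, stock_a - 1, version_a)
second_write = write_stock(conn, 1, stock_b - 3, version_b)
final = read_item(conn, 1)

print(first_write, second_write, final)  # True False (9, 1)
```

A caller whose write is rejected would normally re-read the row and retry, which is the retry cost that makes this approach a poor fit when conflicts are frequent.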