INNER JOIN
Returns only rows that match in BOTH tables. The default — JOIN with no qualifier means INNER JOIN.
SELECT u.name, o.total
FROM users u
JOIN orders o ON o.user_id = u.id; 💡 Users with no orders disappear from the result.
37 of the SQL words that come up in code review and interviews — what each one means, a query example, and the gotcha you'll wish someone had told you.
Last reviewed:
INNER JOINReturns only rows that match in BOTH tables. The default — JOIN with no qualifier means INNER JOIN.
SELECT u.name, o.total
FROM users u
JOIN orders o ON o.user_id = u.id; 💡 Users with no orders disappear from the result.
LEFT (OUTER) JOINKeeps every row from the left table; right-table columns are NULL where there is no match.
SELECT u.name, o.total
FROM users u
LEFT JOIN orders o ON o.user_id = u.id; 💡 Filter o.id IS NULL to find users with no orders (an anti-join).
RIGHT (OUTER) JOINMirror of LEFT JOIN — keeps every row from the right table. Rare in practice; people just flip the tables.
SELECT u.name, o.total
FROM users u
RIGHT JOIN orders o ON o.user_id = u.id; FULL OUTER JOINKeeps unmatched rows from BOTH sides, filling missing columns with NULL.
SELECT u.name, o.total
FROM users u
FULL OUTER JOIN orders o ON o.user_id = u.id; 💡 MySQL lacks it natively — emulate with LEFT JOIN UNION RIGHT JOIN.
CROSS JOINThe Cartesian product — every row of A paired with every row of B. No ON clause.
SELECT s.size, c.color
FROM sizes s
CROSS JOIN colors c; 💡 A missing join condition silently becomes a CROSS JOIN — the classic accidental row explosion.
SELF JOINA table joined to itself, using aliases, to relate rows within the same table (e.g. employee → manager).
SELECT e.name, m.name AS manager
FROM employees e
JOIN employees m ON e.manager_id = m.id; GROUP BYCollapses rows that share the same key into one row per group, so aggregate functions run per group.
SELECT user_id, COUNT(*) AS orders
FROM orders
GROUP BY user_id; 💡 Every non-aggregated column in SELECT must appear in GROUP BY (in strict SQL).
HAVINGFilters AFTER grouping, on aggregate results. WHERE filters rows before grouping; HAVING filters groups.
SELECT user_id, COUNT(*) AS n
FROM orders
GROUP BY user_id
HAVING COUNT(*) > 5; COUNT / SUM / AVG / MIN / MAXThe standard aggregate functions. COUNT(*) counts rows; COUNT(col) ignores NULLs; the rest reduce a column to one value.
SELECT COUNT(*), SUM(total), AVG(total), MIN(total), MAX(total)
FROM orders; 💡 AVG and SUM skip NULLs, which can surprise you.
DISTINCTRemoves duplicate rows from the result, or counts unique values with COUNT(DISTINCT col).
SELECT COUNT(DISTINCT country) FROM users; OVERTurns an aggregate or window function into a window — it computes per row WITHOUT collapsing rows like GROUP BY does.
SELECT name, salary,
AVG(salary) OVER () AS company_avg
FROM employees; PARTITION BYSplits rows into groups for a window function — like GROUP BY, but the rows are kept.
SELECT dept, name, salary,
AVG(salary) OVER (PARTITION BY dept) AS dept_avg
FROM employees; ROW_NUMBERAssigns a unique sequential number per row within each partition, in the ORDER BY order.
SELECT name, dept,
ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn
FROM employees; 💡 Filter rn = 1 in an outer query to get the top row per group.
RANK / DENSE_RANKRank rows by an ORDER BY. RANK leaves gaps after ties (1,1,3); DENSE_RANK does not (1,1,2).
SELECT name, score,
RANK() OVER (ORDER BY score DESC) AS r,
DENSE_RANK() OVER (ORDER BY score DESC) AS dr
FROM players; LAG / LEADRead a value from a previous (LAG) or following (LEAD) row in the window — perfect for diffs over time.
SELECT day, revenue,
revenue - LAG(revenue) OVER (ORDER BY day) AS delta
FROM daily_sales; running totalA cumulative sum computed with a window frame — SUM(...) OVER with rows from the start up to the current row.
SELECT day, amount,
SUM(amount) OVER (ORDER BY day
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running
FROM ledger; ACIDThe four guarantees a transactional database gives: Atomicity, Consistency, Isolation, Durability.
-- Atomic: both or neither
BEGIN;
UPDATE accounts SET bal = bal - 100 WHERE id = 1;
UPDATE accounts SET bal = bal + 100 WHERE id = 2;
COMMIT; BEGIN / COMMIT / ROLLBACKStart a transaction, then make all its changes permanent (COMMIT) or throw them all away (ROLLBACK).
BEGIN;
DELETE FROM logs WHERE created_at < now() - interval '90 days';
ROLLBACK; -- changed my mind READ UNCOMMITTEDWeakest isolation level — a transaction can read another’s uncommitted changes (dirty reads allowed).
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; 💡 Postgres treats this the same as READ COMMITTED — it never allows dirty reads.
READ COMMITTEDEach statement only sees data committed before it started. The default in PostgreSQL and Oracle.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED; 💡 No dirty reads, but the same query can return different rows twice (non-repeatable read).
REPEATABLE READRows you read once stay the same for the whole transaction. MySQL InnoDB’s default.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; 💡 Prevents dirty and non-repeatable reads; phantom rows can still appear in standard SQL.
SERIALIZABLEStrongest level — transactions behave as if run one after another. Prevents phantoms at the cost of more aborts/locks.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; dirty readReading data another transaction has written but NOT committed — it might be rolled back, so you read a value that never existed.
-- T1 updates row, T2 reads it, T1 rolls back
-- T2 saw a phantom value phantom readRe-running a range query returns new rows that another committed transaction inserted in between.
SELECT COUNT(*) FROM orders WHERE total > 100;
-- run twice: count grows because a row was inserted B-tree indexThe default, general-purpose index. Balanced tree that supports equality and range queries and ORDER BY.
CREATE INDEX idx_orders_created
ON orders (created_at); 💡 Great for =, <, >, BETWEEN, and sorting. Useless for arbitrary LIKE '%x%'.
hash indexMaps a hashed key to a location — fast for exact equality only, no ranges or sorting.
CREATE INDEX idx_sessions_token
ON sessions USING hash (token); GIN indexGeneralised Inverted iNdex (Postgres) — for columns holding many values per row: arrays, JSONB, full-text.
CREATE INDEX idx_docs_body
ON docs USING gin (to_tsvector('english', body)); GiST indexGeneralised Search Tree (Postgres) — for geometric, range, and nearest-neighbour queries (PostGIS, ranges).
CREATE INDEX idx_shapes_geom
ON shapes USING gist (geom); composite indexAn index on multiple columns. Order matters — it can serve queries that use a leftmost prefix of the columns.
CREATE INDEX idx_orders_user_date
ON orders (user_id, created_at); 💡 This index helps WHERE user_id = ? and WHERE user_id = ? AND created_at > ?, but not created_at alone.
covering indexAn index that contains every column a query needs, so the database answers from the index alone — an index-only scan.
CREATE INDEX idx_cover
ON orders (user_id) INCLUDE (total, status); EXPLAIN / EXPLAIN ANALYZEShows the query plan (estimated). EXPLAIN ANALYZE actually runs the query and reports real timing and row counts.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE user_id = 42; 💡 Look for "Seq Scan" on big tables — usually a missing index.
CTE (WITH)A named, inline result set defined with WITH that you can reference like a table. Improves readability and enables recursion.
WITH recent AS (
SELECT * FROM orders WHERE created_at > now() - interval '7 days'
)
SELECT user_id, COUNT(*) FROM recent GROUP BY user_id; subqueryA query nested inside another — in SELECT, FROM, or WHERE. A correlated subquery references the outer row.
SELECT name FROM users u
WHERE (SELECT COUNT(*) FROM orders o WHERE o.user_id = u.id) > 0; UNION vs UNION ALLBoth stack result sets. UNION removes duplicates (and pays for a sort/hash); UNION ALL keeps everything and is faster.
SELECT id FROM active
UNION ALL
SELECT id FROM archived; 💡 Use UNION ALL unless you actually need de-duplication.
EXISTSA boolean test — true if a subquery returns any row. Often faster than IN for large or correlated checks.
SELECT name FROM users u
WHERE EXISTS (
SELECT 1 FROM orders o WHERE o.user_id = u.id
); COALESCEReturns the first non-NULL argument. The standard way to supply a default for NULL values.
SELECT COALESCE(nickname, name, 'Anonymous') AS display
FROM users; CASESQL’s if/else expression — return different values per row based on conditions. Handy for bucketing and pivoting.
SELECT name,
CASE WHEN age < 18 THEN 'minor'
WHEN age < 65 THEN 'adult'
ELSE 'senior' END AS bracket
FROM people;