SQL Performance Tuning Vocabulary Explained

50 essential SQL performance tuning terms for DBAs and developers: query plans, indexes, statistics, lock types, and optimiser vocabulary — with plain English explanations.

SQL performance tuning requires a specialised vocabulary spanning query execution, indexing, statistics, locking, and the query optimiser. This vocabulary appears in documentation, incident reports, database monitoring tools, and performance reviews. This guide covers the 50 terms you need to discuss query performance fluently as a DBA or senior developer.


Query Execution Basics

Query Plan / Execution Plan

A query plan (or execution plan) is the set of steps the database engine chooses to execute a SQL query — which indexes to use, in what order to join tables, what algorithms to apply.

The query plan is the starting point for all performance analysis.

“Before we start tuning, let’s look at the execution plan — EXPLAIN ANALYZE will show us exactly where the cost is.”

Query Planner / Query Optimiser

The query optimiser is the component of the database engine that generates the query plan by evaluating possible execution strategies and choosing the cheapest one based on statistics.

Cost

In query plans, cost is the query optimiser’s estimate of the computational resources required to execute an operation — not wall-clock time. Lower cost = more efficient plan.

“The estimated cost of this table scan is 45,000 — that’s why the query is slow.”

Actual vs Estimated Rows

When you run EXPLAIN ANALYZE, you see estimated rows (what the planner predicted) and actual rows (what was processed). A large discrepancy indicates stale statistics.

“Estimated rows: 12. Actual rows: 847,000. The planner massively underestimated — we need to update statistics.”


Scan Types

Sequential Scan (Full Table Scan)

A sequential scan reads every row in a table from beginning to end. Appropriate when a large percentage of rows satisfy the WHERE clause.

“This query triggers a sequential scan on a 200M-row table — it’s taking 45 seconds. We need an index.”

Index Scan

An index scan traverses the index tree to find matching rows, then fetches the actual rows from the table. Efficient for selective queries (few matching rows).

Index-Only Scan

An index-only scan satisfies the query entirely from the index without accessing the table — possible only when the index covers all required columns (covering index).

“The index-only scan is perfect here — the query only needs the columns in the index, so we never touch the table.”

Bitmap Heap Scan

A bitmap heap scan (PostgreSQL) first builds a bitmap from the index of matching row locations, then fetches the actual rows in physical order. Efficient for moderately selective queries.

Loose Index Scan

A loose index scan (MySQL) reads one row per index key boundary — very efficient for SELECT DISTINCT or GROUP BY on an indexed column.


Joins

Nested Loop Join

A nested loop join iterates over the outer table and, for each row, searches the inner table for matches. Efficient when the inner side is small and indexed.

“Nested loop join is fine here — the inner table is small and the join column is indexed.”

Hash Join

A hash join builds a hash table from the smaller table, then probes it for each row of the larger table. Efficient for large, unsorted tables without suitable indexes.

Merge Join (Sort Merge Join)

A merge join requires both input sets to be sorted on the join key. Efficient when data is already sorted (e.g., an ordered index scan feeds directly into the join).

Cross Join / Cartesian Product

A cross join combines every row of one table with every row of another — N × M rows. Almost always a bug when seen in an unintended query plan.

“This query is doing an unintended cross join — 1,000 × 50,000 = 50 million rows being processed. We’re missing a JOIN condition.”


Indexes

B-Tree Index

A B-tree index is the default index type in most databases — a balanced tree supporting equality, range, ORDER BY, and MIN/MAX operations efficiently.

Composite Index

A composite index (multi-column index) covers multiple columns. Column order matters: (a, b) supports queries on a alone or a AND b, but not b alone.

“The composite index on (customer_id, created_at) supports our most common queries — filter by customer, then sort by date.”

Covering Index

A covering index includes all columns required by a query — enabling index-only scans and eliminating table fetches.

Partial Index

A partial index indexes only a subset of rows that satisfy a WHERE clause — smaller and more efficient for specific query patterns.

“We created a partial index on (user_id) WHERE status = 'pending' — only pending orders need fast lookup.”

Unused Index

An unused index consumes write overhead (every INSERT/UPDATE/DELETE maintains it) without providing query benefit. Regular audits should remove them.

Index Bloat

Index bloat is the accumulation of dead pages in an index due to updates and deletes — increases index size and scan time. Regular VACUUM or reorganisation fixes it.


Statistics

Table Statistics

Statistics are metadata the query optimiser uses to estimate row counts, value distributions, and selectivity. Commands: ANALYZE (PostgreSQL), UPDATE STATISTICS (SQL Server), ANALYZE TABLE (MySQL).

Selectivity

Selectivity is the fraction of rows a predicate matches. High selectivity (0.01 = 1% of rows) → index scan efficient. Low selectivity (0.9 = 90% of rows) → sequential scan may be better.

Cardinality

Cardinality is the number of distinct values in a column. A boolean has cardinality 2; a UUID has high cardinality (millions). High cardinality columns are better candidates for indexes.

Statistics Staleness

When statistics are stale (not updated after significant data changes), the optimiser makes bad decisions based on outdated estimates.

“The statistics haven’t been updated since we loaded 30 million rows last night. ANALYZE needs to run before the morning reports.”


Locking

Lock

A lock controls concurrent access to a database resource. Without locks, concurrent writes could corrupt data.

Shared Lock (Read Lock)

A shared lock allows concurrent reads but prevents writes. Multiple sessions can hold shared locks simultaneously.

Exclusive Lock (Write Lock)

An exclusive lock allows writes and prevents all other access — no concurrent reads or writes while it’s held.

Deadlock

A deadlock occurs when two sessions each hold a lock the other needs — both are blocked indefinitely. The database detects and resolves it by killing one transaction.

“We have a recurring deadlock between the order processing and inventory update transactions — they’re acquiring locks in opposite order.”

Lock Contention

Lock contention is the blocking of one session while waiting for another session’s lock to be released. High contention is a major source of performance degradation.

“The replica lag incident was caused by lock contention — a single analytical query held a table lock for 2 hours.”

Lock Wait Timeout

A lock wait timeout is the maximum time a session waits for a lock before giving up with an error. Configurable per session or globally.

MVCC (Multi-Version Concurrency Control)

MVCC is a concurrency control mechanism where reads never block writes and writes never block reads — each transaction sees a snapshot of data as of its start time. Used by PostgreSQL, Oracle, MySQL (InnoDB), and others.


Query Optimisation Techniques

Query Rewrite

Query rewriting changes the logical structure of a SQL query to enable a more efficient execution plan — e.g., replacing a correlated subquery with a JOIN.

Push Down Predicate

Predicate pushdown moves filter conditions (WHERE clauses) as close to the data source as possible — reducing rows processed early and passing less data through the pipeline.

Partition Pruning

Partition pruning allows the optimiser to skip entire partitions that cannot satisfy the WHERE clause — critical for performance on partitioned tables.

“With monthly partitioning and partition pruning, a query for last week’s orders only scans 2 of 36 partitions.”

Materialised View

A materialised view is a precomputed, stored result of a query — refreshed periodically or on demand. Dramatically speeds up complex aggregations.

“The executive dashboard query was taking 90 seconds — we replaced it with a materialised view that pre-aggregates daily. Now it returns in 200ms.”

Query Cache

A query cache stores the results of identical queries — subsequent executions return the cached result. Less common in modern databases due to invalidation complexity.


Useful Phrases

Diagnosing performance:

  • “Let’s run EXPLAIN ANALYZE on this query — I want to see actual vs estimated rows.”
  • “This sequential scan on a 200M-row table is our bottleneck — we need to add an index on customer_id.”
  • “The statistics for this table are stale — that’s why the planner chose a nested loop instead of a hash join.”

Explaining findings in incident reports:

  • “The root cause was lock contention — a long-running query held an exclusive lock and blocked replication for 2 hours.”
  • “After adding the covering index, P99 latency dropped from 4.2s to 180ms.”

Practice

Build your DBA vocabulary with the Database Administration exercise set and the DBA learning path.