Question 1

What does EXPLAIN ANALYZE actually tell you about a slow query?

Accepted Answer

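EXPLAIN ANALYZE executes the query and returns the execution plan annotated with actual row counts and timings alongside the planner's estimates. Key things to look for are sequential scans on large tables, large gaps between estimated and actual row counts, and nodes that account for most of the actual time. In PostgreSQL, the actual time field reports milliseconds per node, averaged per loop and including its child nodes, so multiply by the loops value to pinpoint which join, sort, or scan is the bottleneck. Because the statement really runs, wrap data-modifying queries in a transaction and roll it back.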
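As a rough illustration of reading a plan this way, the sketch below walks a hand-written plan fragment and flags nodes whose estimated and actual row counts differ by a large factor. The key names follow PostgreSQL's EXPLAIN (ANALYZE, FORMAT JSON) output; the plan itself, the table names, and the 10x threshold are made up for the example.

```python
# Hypothetical plan fragment shaped like a node from PostgreSQL's
# EXPLAIN (ANALYZE, FORMAT JSON) output. All numbers are illustrative.
SAMPLE_PLAN = {
    "Node Type": "Nested Loop",
    "Plan Rows": 1000,
    "Actual Rows": 950,
    "Actual Total Time": 480.0,
    "Actual Loops": 1,
    "Plans": [
        {
            "Node Type": "Seq Scan",
            "Relation Name": "orders",
            "Plan Rows": 500,
            "Actual Rows": 250000,
            "Actual Total Time": 410.0,
            "Actual Loops": 1,
        },
        {
            "Node Type": "Index Scan",
            "Relation Name": "customers",
            "Plan Rows": 1,
            "Actual Rows": 1,
            "Actual Total Time": 0.02,
            "Actual Loops": 950,
        },
    ],
}


def walk(node):
    """Yield every node in the plan tree, parents before children."""
    yield node
    for child in node.get("Plans", []):
        yield from walk(child)


def flag_misestimates(plan, ratio_threshold=10.0):
    """Return (node type, estimated rows, actual rows, total ms) for suspect nodes."""
    flagged = []
    for node in walk(plan):
        estimated = max(node["Plan Rows"], 1)  # avoid dividing by zero
        actual = max(node["Actual Rows"], 1)
        ratio = max(estimated / actual, actual / estimated)
        # Actual Total Time is reported per loop, so scale by the loop count.
        total_ms = node["Actual Total Time"] * node["Actual Loops"]
        if ratio >= ratio_threshold:
            flagged.append(
                (node["Node Type"], node["Plan Rows"], node["Actual Rows"], total_ms)
            )
    return flagged


flagged = flag_misestimates(SAMPLE_PLAN)
for node_type, est, act, ms in flagged:
    print(f"{node_type}: estimated {est} rows, actual {act} rows, about {ms:.1f} ms")
```

In this made-up plan only the sequential scan is flagged, which is the kind of node that usually points to stale statistics or a missing index.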

Question 2

How do you explain the N+1 query problem to a non-technical stakeholder?

Accepted Answer

A clear way to phrase it: instead of asking the database one question that returns all the data needed, the code asks one question per record — so for 500 records the application fires 501 database round-trips instead of 1. The fix is typically eager loading or a single JOIN query.
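The difference can be shown with a small sketch using Python's built-in sqlite3 module. The authors and books tables and the run helper are hypothetical, chosen only to reproduce the 500-record case described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
""")
conn.executemany(
    "INSERT INTO authors (id, name) VALUES (?, ?)",
    [(i, f"author-{i}") for i in range(1, 501)],
)
conn.executemany(
    "INSERT INTO books (author_id, title) VALUES (?, ?)",
    [(i, f"book-{i}") for i in range(1, 501)],
)

query_count = 0


def run(sql, params=()):
    """Execute a query and count it as one database round-trip."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# N+1 pattern: one query for the list, then one more query per record.
query_count = 0
authors = run("SELECT id, name FROM authors")
for author_id, _name in authors:
    run("SELECT title FROM books WHERE author_id = ?", (author_id,))
n_plus_one_queries = query_count

# Single JOIN: the same data in one round-trip.
query_count = 0
rows = run(
    "SELECT a.name, b.title FROM authors a JOIN books b ON b.author_id = a.id"
)
join_queries = query_count

print(n_plus_one_queries, join_queries)  # prints "501 1"
```

With an in-memory database the two versions feel equally fast; the cost of the extra 500 round-trips shows up once each one crosses a network.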

Question 3

What is the difference between a clustered and a non-clustered index?

Accepted Answer

A clustered index stores the table rows themselves in index key order, so a table can have only one. A non-clustered index is a separate structure that stores the key values and pointers back to the actual rows, so reading columns that are not in the index requires an extra lookup step (a key or bookmark lookup).

Question 4

What vocabulary is used when discussing connection pooling?

Accepted Answer

Common terms include: pool size, min/max connections, connection timeout, idle timeout, and pool exhaustion. For example: we tuned the pool size from 10 to 25 after observing frequent pool exhaustion during peak traffic.
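To make the vocabulary concrete, here is a toy pool built on Python's queue module. It is only a sketch: the ConnectionPool and PoolExhausted names are invented for the example, the "connections" are plain strings, and a production pool would also handle idle timeouts and health checks.

```python
import queue


class PoolExhausted(Exception):
    """Raised when no connection frees up within the connection timeout."""


class ConnectionPool:
    def __init__(self, max_size, connection_timeout):
        self.connection_timeout = connection_timeout  # seconds to wait in acquire()
        self._idle = queue.Queue(maxsize=max_size)
        for i in range(max_size):
            self._idle.put(f"conn-{i}")  # stand-in for a real connection object

    def acquire(self):
        """Borrow a connection, waiting up to connection_timeout seconds."""
        try:
            return self._idle.get(timeout=self.connection_timeout)
        except queue.Empty:
            raise PoolExhausted("no idle connection within timeout") from None

    def release(self, conn):
        """Return a borrowed connection to the pool."""
        self._idle.put(conn)


pool = ConnectionPool(max_size=2, connection_timeout=0.05)
first = pool.acquire()
second = pool.acquire()

# Both connections are checked out, so a third caller hits pool exhaustion.
try:
    pool.acquire()
    exhausted = False
except PoolExhausted:
    exhausted = True

# Releasing one connection makes it available again.
pool.release(first)
third = pool.acquire()
print(exhausted, third)
```

Raising the pool size makes exhaustion less likely, but each extra connection also costs memory on the database server, which is why the number is tuned and not simply maximized.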

Question 5

How do you describe index bloat in a code review or incident debrief?

Accepted Answer

Index bloat refers to wasted space inside an index structure, caused by frequent updates and deletes that leave dead entries behind faster than they are cleaned up. You might say: the index has grown well beyond its expected size because of heavy update and delete churn, and we need to run REINDEX CONCURRENTLY (PostgreSQL 12 and later) to rebuild it and reclaim space without blocking writes.

Question 6

What does cardinality mean in a database index context?

Accepted Answer

Cardinality refers to the number of distinct values in a column. High-cardinality columns make good index candidates because each value matches only a few rows. For low-cardinality columns, such as a boolean flag, a partial index that covers only the rare value is often more useful than a full index.
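One way to measure this is to compare distinct values against total rows. The sketch below uses sqlite3 with a hypothetical users table in which 10 of 1,000 rows are inactive, then builds a partial index that covers only those rare rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, is_active INTEGER)"
)
conn.executemany(
    "INSERT INTO users (email, is_active) VALUES (?, ?)",
    [(f"user{i}@example.com", 0 if i % 100 == 0 else 1) for i in range(1000)],
)


def cardinality(column):
    """Return (distinct values, total rows) for a column of the users table."""
    # The column name is interpolated into the SQL, so pass trusted names only.
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM users"
    ).fetchone()
    return distinct, total


email_cardinality = cardinality("email")      # every value is unique
flag_cardinality = cardinality("is_active")   # only two distinct values

# Partial index: index only the rare inactive rows instead of the whole column.
conn.execute("CREATE INDEX idx_users_inactive ON users (email) WHERE is_active = 0")
inactive_rows = conn.execute(
    "SELECT COUNT(*) FROM users WHERE is_active = 0"
).fetchone()[0]

print(email_cardinality, flag_cardinality, inactive_rows)
```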

Question 7

What English phrases are used to discuss query plan regressions?

Accepted Answer

A query plan regression occurs when the optimizer chooses a worse execution plan after a statistics update, data growth, or version upgrade. Useful phrases include: the planner switched from an index scan to a sequential scan after the table grew, or we pinned the plan using a query hint to prevent regression.

Question 8

How do you communicate replication lag to a product team?

Accepted Answer

Translate into impact: replication lag means there is a delay between a write on the primary database and that write becoming visible on the read replica. For features that write and immediately read, users may see stale data. Business-friendly vocabulary includes propagation delay, eventual consistency, and staleness window.

Question 9

What is the meaning of covering index and when would you recommend one?

Accepted Answer

A covering index includes all columns a query needs, so the engine can answer the query from the index alone and never visits the base table rows. Recommend one when profiling shows frequent key lookups. Example: adding a covering index on (customer_id) INCLUDE (email, status) to eliminate the key lookup that accounts for 60% of the query cost.
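The effect can be observed in a query plan. The sketch below uses SQLite through Python's sqlite3 module; SQLite has no INCLUDE clause, so a composite index on (customer_id, email, status) stands in for it here. The orders table, the index names, and the plan helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "email TEXT, status TEXT, notes TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, email, status, notes) VALUES (?, ?, ?, ?)",
    [(i % 50, f"c{i % 50}@example.com", "open", "n/a") for i in range(1000)],
)

QUERY = "SELECT email, status FROM orders WHERE customer_id = ?"


def plan(sql, params):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


# Index on the filter column only: the engine still has to visit the table
# rows to fetch email and status.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
before_plan = plan(QUERY, (7,))

# Replace it with an index that also carries the selected columns.
conn.execute("DROP INDEX idx_orders_customer")
conn.execute(
    "CREATE INDEX idx_orders_covering ON orders (customer_id, email, status)"
)
after_plan = plan(QUERY, (7,))

print(before_plan)
print(after_plan)
```

The second plan should mention a covering index, meaning the table rows are no longer read for this query. The trade-off is a wider index that costs more to store and to maintain on writes.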

Question 10

How should you frame a database capacity planning discussion?

Accepted Answer

Structure the conversation around storage growth rate, IOPS headroom, and connection headroom. A useful framing: at our current data growth rate of 8 GB per month, we will exhaust the current allocation in approximately 14 months, so we should review partitioning and archival strategies in Q3.
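The runway figure in that framing is simple arithmetic: remaining headroom divided by growth rate. In the sketch below the 112 GB of headroom is an assumed value, picked only so that it matches the 14-month example; the function name is likewise hypothetical.

```python
def months_of_runway(headroom_gb, growth_gb_per_month):
    """Return how many months remain before storage headroom is used up."""
    if growth_gb_per_month <= 0:
        raise ValueError("growth rate must be positive")
    return headroom_gb / growth_gb_per_month


# Assumed inputs: 112 GB of free allocation, growing by 8 GB per month.
runway = months_of_runway(headroom_gb=112, growth_gb_per_month=8)
print(f"approximately {runway:.0f} months of runway")
```

This assumes linear growth. If the growth rate is itself increasing, the real runway is shorter, which is worth stating explicitly in the discussion.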

Database Optimization Language Exercises

Frequently Asked Questions