Advanced Interview Prep #dbre #database #reliability #postgres

Database Reliability Engineer Interview Questions

5 exercises: practice structured English answers for DBRE interviews, covering database SLOs, replication incidents, capacity planning, zero-downtime migration, and disaster recovery.

How to structure DBRE interview answers
  • Database SLOs: define SLIs (replication lag, query latency, connection availability) → set SLO targets and measurement windows → derive the error budget
  • Replication lag: root causes (write amplification, long transactions, network) → diagnosis commands → mitigation steps
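  • Schema migration: online DDL tools (gh-ost and pt-osc are MySQL tools; on Postgres, prefer lock-aware DDL such as CREATE INDEX CONCURRENTLY) → lock-free vs. locking changes → test on a replica first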
  • Capacity planning: growth rate (rows, bytes, IOPS) → projection window → storage + compute + connection headroom
  • RTO/RPO: define target → PITR capability → failover time → backup validation
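
The SLO and capacity-planning bullets both come down to simple arithmetic, which is worth being able to do aloud in an interview. The sketch below is illustrative only: `SloWindow`, `error_budget_remaining`, and `days_until_full` are hypothetical names, the SLI is assumed to be a count of good and bad probes, and growth is assumed to be linear with a 20% headroom default.

```python
from dataclasses import dataclass


@dataclass
class SloWindow:
    """One measurement window for a probe-based SLI (hypothetical model)."""
    target: float       # e.g. 0.999 means 99.9% of probes must be good
    total_events: int   # SLI probes taken in the window
    bad_events: int     # probes that violated the SLI threshold


def error_budget_remaining(w: SloWindow) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_bad = (1.0 - w.target) * w.total_events
    if allowed_bad == 0:
        # A 100% target (or an empty window) leaves no budget to spend.
        return 0.0 if w.bad_events == 0 else float("-inf")
    return 1.0 - w.bad_events / allowed_bad


def days_until_full(used_gb: float, capacity_gb: float,
                    growth_gb_per_day: float, headroom: float = 0.2) -> float:
    """Days until usage reaches capacity minus reserved headroom.

    Assumes linear growth, which is a simplification.
    """
    usable_gb = capacity_gb * (1.0 - headroom)
    if growth_gb_per_day <= 0:
        return float("inf")
    return max(0.0, (usable_gb - used_gb) / growth_gb_per_day)


if __name__ == "__main__":
    window = SloWindow(target=0.999, total_events=100_000, bad_events=25)
    print(f"budget remaining: {error_budget_remaining(window):.2f}")
    print(f"days until full: {days_until_full(600, 1000, 5):.0f}")
```

With a 99.9% target over 100,000 probes, about 100 bad probes are allowed, so 25 bad probes leave roughly three quarters of the budget. A 1,000 GB volume holding 600 GB and growing 5 GB per day reaches the 80% headroom line in about 40 days.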
1 / 5
The interviewer asks: "How would you define and measure an SLO for a Postgres primary/replica setup?"
Which answer is most complete?