Advanced Listening #architecture #system-design #trade-offs

Architecture Decision Discussions

Read 3 architecture review transcripts — database technology selection, sharding analysis, and event-driven trade-offs — then answer comprehension questions about the reasoning and decisions.

How to follow an architecture discussion in English
  • Problem first: speakers usually state the problem before proposing solutions — identify it early
  • Trade-off signals: "the trade-off I'm accepting", "the risk is", "the downside" — mark the key tension
  • Numbers matter: specific figures (800 queries/day, 90GB, p95 120ms) anchor abstract arguments — note them
  • Rejection reasoning: when an option is rejected, the speaker will explain why — this is the core of the argument
📄 Transcript
[Architecture decision discussion — internal engineering sync. Team lead presenting a proposed system change.]
Lead: "Okay, I want to walk through the proposal to move our search functionality from Elasticsearch to a Postgres full-text search setup. I know this sounds counterintuitive, so let me explain the reasoning.
Our current Elasticsearch cluster handles roughly 800 queries per day at peak. We're not a search-first product: search is one feature among many, and it's used in maybe 12% of sessions. The operational overhead of running a separate Elasticsearch cluster (the separate nodes, the schema management, the index tuning, the upgrade path) is disproportionate to how much search matters to our users.
Postgres full-text search using tsvector and tsquery, combined with pg_trgm for fuzzy matching, covers our query patterns. We've benchmarked it: for our dataset size — 2.4 million records — and our query patterns — primarily prefix and phrase matching — Postgres performs within acceptable latency bounds, under 120ms p95.
The trade-off I'm accepting: Elasticsearch has significantly richer relevance ranking, synonym handling, and multi-language tokenisation. If we ever need those, and nothing in our product roadmap currently requires them, we'd need to reintroduce a dedicated search solution. That's a deliberate architectural bet: we'd rather have operational simplicity now and pay the migration cost later if we need to."
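[Background note: the "fuzzy matching" the lead attributes to pg_trgm is based on comparing three-character fragments (trigrams) of two strings. The sketch below is a rough pure-Python approximation of that idea for plain ASCII text, not the extension's actual implementation. The function names are illustrative, and the example strings are not from the team's dataset.]

```python
import re


def trigrams(text: str) -> set[str]:
    """Approximate pg_trgm's trigram extraction for plain ASCII text.

    Assumption: words are runs of letters and digits, lowercased, and each
    word is padded with two leading spaces and one trailing space, as the
    pg_trgm documentation describes.
    """
    grams: set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a: str, b: str) -> float:
    """Share of trigrams the two strings have in common (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    # A misspelling still shares most of its leading trigrams with the target.
    print(similarity("postgres", "postgers"))
```

[With this approximation, a transposed-letter typo such as "postgers" still scores above 0.3 against "postgres", which is pg_trgm's default similarity threshold. That is why trigram matching tolerates small spelling mistakes.]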
Engineer: "What about the index rebuild time if we need to re-index the full dataset?"
Lead: "Good question. pg_trgm index build on 2.4M records in our test environment took about 11 minutes with the table locked for the last 2. We'd use CREATE INDEX CONCURRENTLY in production, which doesn't block writes to the table and takes roughly 18 minutes. That's acceptable for us."
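[Background note: the production index build the lead describes might look roughly like the statement this sketch generates. The table name `records`, the column name `title`, and the helper function itself are hypothetical, since the transcript does not give the schema. The helper only builds the SQL text and does not connect to a database.]

```python
def trigram_index_sql(table: str, column: str, concurrently: bool = True) -> str:
    """Return a CREATE INDEX statement for a GIN trigram index.

    Assumptions: `table` and `column` are trusted identifiers (no quoting is
    done here), and the pg_trgm extension is already installed so that the
    gin_trgm_ops operator class exists.
    """
    index_name = f"{table}_{column}_trgm_idx"
    mode = "CONCURRENTLY " if concurrently else ""
    return (
        f"CREATE INDEX {mode}{index_name} "
        f"ON {table} USING gin ({column} gin_trgm_ops)"
    )


if __name__ == "__main__":
    # CREATE INDEX CONCURRENTLY cannot run inside a transaction block, so a
    # client would need to execute this statement in autocommit mode.
    print(trigram_index_sql("records", "title"))
```

[The longer build time the lead quotes for the concurrent build is the usual cost of this option: it avoids blocking writes but does more work than a plain build.]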
What is the lead's core architectural argument, and what specific trade-off do they explicitly accept?