5 exercises — Practise communicating performance findings: presenting results, diagnosing metrics, translating for stakeholders, and articulating trade-offs.
1 / 5
You have completed a performance optimisation and need to present results to your team. Which opening sentence is most effective?
Performance results must be specific, measured, and contextual.
A strong performance result statement includes:
• Metric name: p99 latency (not just "speed")
• Before and after values: 1,240 ms → 187 ms
• Magnitude of improvement: 85% reduction
• Scope: specific endpoint (/checkout)
• Test conditions: under 500 concurrent users
Why "p99" matters:
• Average (mean) hides outliers — a fast average can mask terrible tail latency
• p99 (99th percentile) = "99% of requests completed in X ms or less"
• p95, p99, p99.9 are standard metrics in SRE and performance engineering
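The gap between the mean and the tail can be illustrated with a minimal sketch. The latency samples below are invented for illustration, and `percentile` is a hypothetical helper using the nearest-rank method (one of several common percentile definitions), not a function from any monitoring tool.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical data: 98 fast requests and 2 very slow ones (milliseconds).
latencies_ms = [100] * 98 + [5000] * 2

avg = mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)

print(f"mean={avg} ms, p50={p50} ms, p99={p99} ms")
```

With this sample the mean stays under 200 ms while the p99 sits at 5,000 ms, which is the kind of tail a mean-only report would hide.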
Key vocabulary:
• p99 latency — the 99th percentile response time; 1% of requests take longer
• baseline — the "before" measurement used for comparison
• concurrent users — users making requests simultaneously during a load test
• percentage reduction — (old - new) / old × 100; the standard way to express improvement
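As a quick check of the figures used in this exercise, the percentage-reduction formula can be applied to the before and after values. `percentage_reduction` is a hypothetical helper name; the numbers come from the example statement above.

```python
def percentage_reduction(old, new):
    """Return (old - new) / old * 100, the improvement relative to the baseline."""
    if old <= 0:
        raise ValueError("baseline must be positive")
    return (old - new) / old * 100


baseline_ms = 1240
optimised_ms = 187
reduction = percentage_reduction(baseline_ms, optimised_ms)

# Compose a result statement with metric, values, magnitude, scope, and conditions.
statement = (
    f"p99 latency on /checkout fell from {baseline_ms:,} ms to {optimised_ms} ms "
    f"({reduction:.0f}% reduction) under 500 concurrent users."
)
print(statement)
```

The exact value is about 84.9%, so reporting it as "85% reduction" is a fair rounding.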
2 / 5
A stakeholder asks: "The new feature is live, but our monitoring shows a 40% increase in CPU usage. Should we be worried?"
Which response demonstrates the best performance engineering communication?
Performance diagnosis always starts with correlation analysis before drawing conclusions.
The right framework for evaluating a metric increase:
1. Is the increase proportional to traffic?
• CPU up 40% + traffic up 40% = linear scaling = expected
• CPU up 40% + traffic flat = efficiency regression = investigate
• CPU up 40% + traffic up 10% = super-linear growth = potential issue
2. Are other metrics affected?
• CPU spike + error rate up → likely a real problem
• CPU spike + latency unchanged + errors unchanged → may be acceptable
3. Is the absolute value concerning?
• 40% more than what baseline? A 40% increase from 10% utilisation → 14% total; from 70% → 98%, approaching saturation
• "Headroom" — how close are we to capacity limits?
Key vocabulary:
• correlation — the relationship between two metrics (does metric A move with metric B?)
• proportional scaling — CPU/memory growing linearly with load (expected behaviour)
• super-linear growth — resource usage growing faster than the load driving it (a warning sign)
• headroom — the remaining capacity before a resource becomes a bottleneck
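The first and third checks in the framework above can be sketched as two small functions. This is an illustrative sketch only: the function names and the 5-point tolerance are assumptions chosen for the example, not thresholds from any standard.

```python
def classify_cpu_increase(cpu_growth_pct, traffic_growth_pct, tolerance_pct=5.0):
    """Compare CPU growth with traffic growth (both as percentage increases)."""
    traffic_is_flat = abs(traffic_growth_pct) <= tolerance_pct
    if traffic_is_flat and cpu_growth_pct > tolerance_pct:
        return "efficiency regression: investigate"
    if cpu_growth_pct <= traffic_growth_pct + tolerance_pct:
        return "proportional scaling: expected"
    return "super-linear growth: potential issue"


def utilisation_after(baseline_pct, growth_pct):
    """Absolute utilisation after a relative increase, e.g. 10% growing by 40% gives 14%."""
    return baseline_pct * (100 + growth_pct) / 100


for cpu, traffic in [(40, 40), (40, 0), (40, 10)]:
    print(f"CPU +{cpu}%, traffic +{traffic}%: {classify_cpu_increase(cpu, traffic)}")

for baseline in (10, 70):
    after = utilisation_after(baseline, 40)
    print(f"{baseline}% baseline -> {after:.0f}% utilisation, {100 - after:.0f}% headroom")
```

The second check (error rate and latency) is left out here because it depends on which metrics the monitoring system exposes.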
3 / 5
After a load test, you are presenting findings to leadership. Which way of communicating the results is most effective for a non-technical audience?
Non-technical stakeholders need business context, not technical metrics. Translate performance findings into user experience and business impact.
Translation formula for leadership presentations:
• p99 latency 2,340ms → "checkout takes 2.3 seconds" (user experience)
• "10× traffic spike" → "Black Friday scenario" (business context)
• "connection pool saturation" → "we need to scale the database layer" (actionable)
• Technical root cause → "to fix this, we need X" (investment request)
The STAR format for performance findings:
• Situation: current state and test scenario
• Target: the performance goal (e.g., "under 1 second for 95% of users")
• Assessment: did we meet the target? Where did we fail?
• Recommendation: what investment is needed to meet the target?
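One way to keep a leadership summary in this four-part shape is to fill in a small template. The sketch below is hypothetical: `StarSummary` and `ms_to_plain_seconds` are invented names, and the example wording is illustrative rather than a real test result.

```python
from dataclasses import dataclass


def ms_to_plain_seconds(latency_ms):
    """Turn a millisecond metric into a phrase a non-technical reader can picture."""
    return f"{latency_ms / 1000:.1f} seconds"


@dataclass
class StarSummary:
    situation: str
    target: str
    assessment: str
    recommendation: str

    def render(self):
        """Return the four parts as labelled lines, in presentation order."""
        return "\n".join([
            f"Situation: {self.situation}",
            f"Target: {self.target}",
            f"Assessment: {self.assessment}",
            f"Recommendation: {self.recommendation}",
        ])


# Illustrative values only, based on the translation examples above.
summary = StarSummary(
    situation="We simulated a Black Friday scenario with 10x normal traffic.",
    target="Checkout completes in under 1 second for 95% of users.",
    assessment=f"Target missed: checkout takes {ms_to_plain_seconds(2340)} under load.",
    recommendation="To fix this, we need to scale the database layer.",
)
print(summary.render())
```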
Key vocabulary for leadership communication:
• "user experience" — use this instead of "latency" when talking to non-technical stakeholders
• "under load" — simpler than "under concurrent traffic conditions"
• "scale" — acceptable technical term that non-technical audiences understand
• "target" or "SLA" — clearer than "SLO" for non-engineering audiences
4 / 5
During a performance review, a colleague asks: "Why is the memory usage trending upward over 72 hours but the CPU is flat?"
Which explanation is most accurate?
Resource usage that climbs steadily over time (while the application is otherwise stable) is the classic signature of a resource leak.
Memory leak diagnostic pattern:
• CPU flat → the application isn't doing more work
• Memory growing → memory isn't being released after use
• This combination = strong signal of a memory leak or unbounded cache
Common causes in web applications:
• Event listeners added but never removed when a component is destroyed
• Closures holding references to large objects that can't be garbage collected
• Caches without eviction — storing items forever without a max-size policy (e.g., LRU eviction)
• Database connections opened but not returned to the pool
• WebSocket/stream objects not cleaned up on disconnect
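To illustrate the "caches without eviction" cause, here is a minimal sketch of a size-bounded cache with LRU eviction. `LRUCache` is a hypothetical class written for this example; in real Python code, `functools.lru_cache` covers the common case of caching function results.

```python
from collections import OrderedDict


class LRUCache:
    """A cache that never holds more than max_size items."""

    def __init__(self, max_size):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self.max_size = max_size
        self._items = OrderedDict()  # oldest-used first, newest-used last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.max_size:
            self._items.popitem(last=False)  # evict the least recently used item

    def __len__(self):
        return len(self._items)


cache = LRUCache(max_size=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used
cache.put("c", 3)  # exceeds max_size, so "b" is evicted
print(len(cache), cache.get("b"))
```

Without the size check in `put`, the dictionary would grow for as long as new keys arrive, which is the memory-up, CPU-flat pattern described above.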
How to investigate:
• Take heap dumps at T+0 and T+72h; compare retained objects
• Look for objects that are growing in count
• Use a memory profiler (Java: JProfiler; Node: --inspect + Chrome DevTools; Python: memory_profiler)
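The snapshot-and-compare approach can be tried at small scale with Python's standard-library `tracemalloc` module (used here instead of the third-party memory_profiler so the sketch has no dependencies). The leaky `handle_request` function and `leaky_registry` list are invented for illustration.

```python
import tracemalloc

leaky_registry = []  # hypothetical leak: items are appended and never removed


def handle_request(payload):
    """Simulate a handler that keeps a reference to every payload it sees."""
    leaky_registry.append(payload * 100)


tracemalloc.start()
before = tracemalloc.take_snapshot()  # the "T+0" snapshot

for i in range(1000):
    handle_request(f"request-{i}")

after = tracemalloc.take_snapshot()   # the later snapshot

# Rank source lines by how much their allocated memory changed between snapshots.
top_growth = after.compare_to(before, "lineno")[:3]
for stat in top_growth:
    print(stat)

tracemalloc.stop()
```

In a real investigation the two snapshots would be hours apart, and the lines at the top of the diff point to where the retained objects are allocated.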
Key vocabulary:
• memory leak — a bug where memory is allocated but never freed, causing gradual consumption
• heap — the region of memory where dynamically allocated objects live
• garbage collection (GC) — the automatic process of reclaiming memory from objects no longer in use
• eviction policy — the rule that determines when cached items are removed (e.g., LRU, TTL)
5 / 5
You are writing a performance optimisation proposal. Which sentence best describes the trade-offs of a proposed caching layer?
Effective performance proposals quantify both the benefit AND the trade-offs — this is how engineers build credibility with technical and non-technical reviewers.
The anatomy of a strong performance trade-off statement:
• What: Redis cache in front of the database
• Benefit (quantified): read latency 45ms → 2ms
• Scope: frequently accessed product data (hot path)
• Cost (technical): eventual consistency, 60s TTL
• Cost (financial): ~$180/month for the Redis instance
• Conclusion: the trade-off is justified for hot-path reads
Why acknowledging trade-offs is important:
• Caching introduces cache invalidation complexity
• TTL creates a window where users see stale data
• Simultaneous cache misses (cold start, TTL expiry) can cause "thundering herd" problems if not handled
• Memory cost and operational overhead of another service
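Two of the costs above, the stale-data window and the thundering herd, can be seen in a small in-process sketch. This is not a Redis client: `TTLCache` and `slow_loader` are hypothetical names, and the per-key lock is just one possible mitigation, shown here under the assumption of a single process with multiple threads.

```python
import threading
import time


class TTLCache:
    """In-memory TTL cache where only one thread reloads a missing key at a time."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}    # key -> (value, expires_at)
        self._key_locks = {}  # key -> lock guarding reloads of that key
        self._guard = threading.Lock()

    def _lock_for(self, key):
        with self._guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def _fresh_entry(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry
        return None

    def get(self, key):
        entry = self._fresh_entry(key)
        if entry is not None:
            return entry[0]  # may be up to ttl_seconds out of date
        with self._lock_for(key):
            entry = self._fresh_entry(key)  # another thread may have reloaded it
            if entry is not None:
                return entry[0]
            value = self._loader(key)
            self._entries[key] = (value, self._clock() + self._ttl)
            return value


calls = []


def slow_loader(key):
    """Stand-in for a database read; records each call so the herd can be counted."""
    calls.append(key)
    time.sleep(0.05)
    return f"product:{key}"


cache = TTLCache(slow_loader, ttl_seconds=60.0)
threads = [threading.Thread(target=cache.get, args=("sku-1",)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"20 concurrent misses, {len(calls)} backend call(s)")
```

The lock dictionary in this sketch grows with the number of distinct keys, so a production version would need to bound or clean it up, which is itself an example of the operational overhead a cache adds.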
Language patterns for trade-off discussions:
• "at the cost of…" — introduces the downside
• "the trade-off is justified because…" — signals that you've weighed both sides
• "for the hot path / for read-heavy workloads" — scoping the benefit to where it applies
Key vocabulary:
• eventual consistency — a model where reads may return stale data for a short window after a write
• TTL (Time-to-Live) — how long a cached item is considered valid before being discarded
• hot path — the most frequently executed code path or most-accessed data
• thundering herd — a situation where many cache misses simultaneously overwhelm the backend