Writing a Technical Case Study
3 exercises — problem framing with root cause analysis, solution sections that document design debates, and results that include what didn't go as expected. The structure behind case studies that engineers actually learn from.
0 / 3 completed
1 / 3
You're writing a technical case study about re-architecting a payment processing system. Which Problem section most effectively sets up the case study narrative?
Technical case study problem sections: quantify the symptoms, diagnose the root cause structurally, and explain why you chose the approach you chose over the alternatives you considered.
Option B models this structure precisely:
① Specific numbers as symptoms — "8.4 seconds p99 vs 2-second SLA", "4.2% fraud detection timeouts", "45-minute manual recovery" — symptoms are what the business experienced. Numbers make them falsifiable and comparable to the reader's situation. "Performance issues" (Option A) is not a symptom; it's a category.
② Root cause framed as a structural insight, not a blame — "the root cause was not load; it was coupling" — this is the intellectual pivot that makes case studies worth reading. The insight that email notification delays caused fraud detection timeouts is specific and non-obvious. Readers building synchronous call chains will immediately check their own systems.
③ Options considered before the chosen path — "horizontal scaling (4× cost), vendor replacement (18 months), async re-architecture" — showing rejected options is as valuable as showing the chosen one. Every reader evaluating the same problem has the same three options; knowing why you ruled out scaling and vendor replacement tells them whether their ruling should match yours.
Option C ("slow, unreliable, not built for scale") describes the emotional experience of the problem, not the engineering diagnosis. Option D contextualises the problem in "typical growing pains" — which is normalising rather than explaining.
Option B models this structure precisely:
① Specific numbers as symptoms — "8.4 seconds p99 vs 2-second SLA", "4.2% fraud detection timeouts", "45-minute manual recovery" — symptoms are what the business experienced. Numbers make them falsifiable and comparable to the reader's situation. "Performance issues" (Option A) is not a symptom; it's a category.
② Root cause framed as a structural insight, not a blame — "the root cause was not load; it was coupling" — this is the intellectual pivot that makes case studies worth reading. The insight that email notification delays caused fraud detection timeouts is specific and non-obvious. Readers building synchronous call chains will immediately check their own systems.
③ Options considered before the chosen path — "horizontal scaling (4× cost), vendor replacement (18 months), async re-architecture" — showing rejected options is as valuable as showing the chosen one. Every reader evaluating the same problem has the same three options; knowing why you ruled out scaling and vendor replacement tells them whether their ruling should match yours.
Option C ("slow, unreliable, not built for scale") describes the emotional experience of the problem, not the engineering diagnosis. Option D contextualises the problem in "typical growing pains" — which is normalising rather than explaining.