5 exercises — choose the best-structured answer to FinOps and Cloud Cost Engineering interview questions. Focus on rightsizing, tagging, Reserved Instances, and unit economics.
Risk-stratify rightsizing: stateful vs stateless changes carry very different risk
Total spend is a vanity metric: unit economics reveals efficiency, not just cost
Self-service scales, bottlenecks don't: teams must own their cost optimisation
1 / 5
The interviewer asks: "What is rightsizing and how do you approach it without impacting performance?" Which answer is the most operationally complete?
Option B is the strongest: names the specific utilisation thresholds and time window (30-day, CPU < 20%, memory < 40%), introduces risk categorisation (stateless vs stateful), gives a concrete safe approach for stateful services (staging + load testing at peak throughput), adds the critical non-obvious check for memory pressure via swap/OOM events (CPU alone misleads), and ends with the governing principle (minimum cost at required SLA, not minimum cost). Option A is the naive definition. Option C names real tools but relies on them uncritically — advisors miss stateful risk and memory pressure. Option D describes the right time window but has no risk stratification or memory pressure insight.
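The risk-stratified approach described for Option B can be sketched roughly as follows. The thresholds (30-day window, CPU < 20%, memory < 40%) come from the answer itself; the `InstanceStats` fields, the action labels, and the assumption that stateless services can be downsized directly are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class InstanceStats:
    """Hypothetical 30-day summary for one instance (field names are assumptions)."""
    name: str
    avg_cpu_pct: float       # 30-day average CPU utilisation
    avg_mem_pct: float       # 30-day average memory utilisation
    oom_or_swap_events: int  # memory-pressure events seen in the window
    stateful: bool           # databases, queues, caches with local state, etc.


def rightsizing_action(stats: InstanceStats) -> str:
    """Return an illustrative action label for one instance."""
    # Memory pressure overrides low averages: CPU alone can mislead.
    if stats.oom_or_swap_events > 0:
        return "keep"
    # Only instances below both thresholds are candidates.
    if stats.avg_cpu_pct >= 20 or stats.avg_mem_pct >= 40:
        return "keep"
    # Stateful services go through staging and peak-throughput load tests first.
    if stats.stateful:
        return "test-in-staging"
    # Assumption: stateless services are low-risk enough to downsize directly.
    return "downsize"
```

In practice the action labels would feed a review queue rather than trigger changes automatically, since the goal is minimum cost at the required SLA.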
2 / 5
The interviewer asks: "How do you build a cloud tagging strategy that actually works?" Choose the answer that shows the most implementation maturity.
Option B is the strongest: establishes the core principle (enforcement not documentation), names three concrete implementation layers with specific technologies (SCPs/Azure Policy, Lambda/Azure Function, billing hierarchy), identifies a non-obvious failure mode (tag drift on replacement/scaling events), and adds an operational insight about routing violations to team Slack channels rather than central FinOps — the accountability mechanism matters as much as the detection. Option A is the typical starting point but stops at documentation. Option C is actually quite good but misses tag drift and the accountability routing insight. Option D gives the right framing (taxonomy → enforcement → audit) but no implementation specifics.
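The detection and accountability-routing ideas from Option B might look something like the sketch below. The required tag keys and the channel naming scheme are hypothetical, and the function only builds a message; it does not call Slack or any cloud API.

```python
from typing import Dict, List, Optional

# Hypothetical required tag keys; a real taxonomy would be defined by the organisation.
REQUIRED_TAGS = ("team", "environment", "cost-centre")


def missing_tags(tags: Dict[str, str]) -> List[str]:
    """Return required tag keys that are absent or blank, in REQUIRED_TAGS order."""
    return [key for key in REQUIRED_TAGS if not tags.get(key, "").strip()]


def route_violation(resource_id: str, tags: Dict[str, str]) -> Optional[str]:
    """Build a violation message addressed to the owning team's channel.

    Returns None when the resource is compliant. The channel names are
    assumptions; resources with no team tag fall back to a central channel.
    """
    missing = missing_tags(tags)
    if not missing:
        return None
    team = tags.get("team", "").strip()
    channel = f"#{team}-cost" if team else "#finops-untagged"
    return f"{channel}: {resource_id} missing tags {', '.join(missing)}"
```

Running a check like this on replacement and scaling events, not only at creation time, is one way to catch the tag drift the answer mentions.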
3 / 5
The interviewer asks: "When would you choose Reserved Instances over Savings Plans, and vice versa?" Which answer demonstrates the sharpest commercial understanding?
Option B is the strongest: frames the decision around two explicit variables (stability and flexibility), gives a specific RI use case with a named example (PostgreSQL RDS), quantifies the RI discount (up to 72%), explains the failure mode (wrong family/region wastes the discount), correctly identifies Compute Savings Plans as the most flexible variant, introduces the layered coverage model (base load RIs + mid-tier Savings Plans + on-demand spikes), and ends with a critical risk management principle (unused RI commitment cannot be refunded — be conservative). Option A is correct but advocates Savings Plans for everything, missing the higher RI discount for stable workloads. Option C is the "mix" answer without the layered framework. Option D describes a real-world practice correctly but without the selection reasoning.
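The layered coverage model can be illustrated with a small sketch over hourly usage. The choice of the minimum as the RI base and a percentile as the Savings Plan ceiling is an assumption made here to keep commitments conservative, and usage is expressed in instance-equivalents for simplicity (real Savings Plans are committed in dollars per hour).

```python
from typing import Dict, Sequence


def coverage_layers(hourly_usage: Sequence[float], sp_percentile: float = 0.5) -> Dict[str, float]:
    """Split usage into three illustrative layers.

    reserved       -> the always-on base load (minimum observed usage)
    savings_plan   -> usage between the base and the chosen percentile
    on_demand_peak -> the largest spike above the committed layers
    """
    if not hourly_usage:
        raise ValueError("hourly_usage must not be empty")
    ordered = sorted(hourly_usage)
    base = ordered[0]
    # Nearest-rank style index; sp_percentile is a hypothetical tuning knob.
    mid_level = ordered[int(sp_percentile * (len(ordered) - 1))]
    peak = ordered[-1]
    return {
        "reserved": base,
        "savings_plan": mid_level - base,
        "on_demand_peak": peak - mid_level,
    }
```

For example, `coverage_layers([10, 10, 12, 14, 20])` commits 10 units to RIs, 2 to Savings Plans, and leaves spikes of up to 8 units on demand. Lowering `sp_percentile` reduces the risk of paying for unused commitment.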
4 / 5
The interviewer asks: "How do you build cost visibility across multiple cloud accounts without becoming a bottleneck?" Choose the most scalable answer.
Option B is the strongest: establishes the design principle (self-service to avoid a bottleneck), names the specific data source (the AWS Cost and Usage Report, CUR), describes the pipeline architecture (normalisation → data warehouse), specifies what each team dashboard shows (current vs forecast, trend, top drivers, anomaly alerts), distinguishes ML-based anomaly detection from threshold alerts, and articulates the operating model clearly (FinOps sets standards, teams own optimisation). The explicit distinction between FinOps as standards-setter and teams as optimisation owners is what prevents bottlenecks. Option A is a minimal correct answer. Option C names real tools (CloudHealth, Apptio) but doesn't explain the accountability model. Option D describes account separation correctly, but its monthly reviews with each team lead become a bottleneck at scale.
5 / 5
The interviewer asks: "What is unit economics in cloud cost management and why does it matter more than total spend?" Which answer shows the most business fluency?
Option B is the strongest: defines unit economics as a ratio (cost per unit of business value), gives three business-model-specific examples (platform, SaaS, data pipeline), makes the key insight that total spend is a vanity metric for a growing company (doubling in size doubles spend, which is healthy if unit cost is flat), shows how unit economics identifies the root cause (which service's cost grows faster than user count), contrasts the wrong optimisation (cut total spend) with the right one (find the inefficiency), and adds embedding unit cost in OKRs as an operational integration point. Option A is the correct definition but has no insight. Option C restates the question as the answer. Option D describes tracking unit metrics correctly but misses the insight that total spend is misleading in a growing business.
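A small worked sketch of the ratio and the root-cause check described for Option B. The service names and figures in the usage note are made up for illustration, and "units" stands for whatever business measure applies (active users, requests, GB processed).

```python
from typing import Dict, List


def unit_cost(total_cost: float, units: float) -> float:
    """Cost per unit of business value, e.g. dollars per active user."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_cost / units


def services_outpacing_growth(
    prev_costs: Dict[str, float],
    curr_costs: Dict[str, float],
    prev_units: float,
    curr_units: float,
) -> List[str]:
    """Return services whose cost grew faster than the business unit count."""
    if prev_units <= 0:
        raise ValueError("prev_units must be positive")
    unit_growth = curr_units / prev_units
    return sorted(
        name
        for name, cost in curr_costs.items()
        # Services with no previous cost are skipped: growth is undefined.
        if prev_costs.get(name, 0) > 0 and cost / prev_costs[name] > unit_growth
    )
```

With hypothetical figures, if users double from 10,000 to 20,000 while compute goes from 1,000 to 2,000 and storage from 500 to 1,500, total spend more than doubles, but only `storage` is flagged: compute scaled in line with users, so the inefficiency to investigate is in storage.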