ML Infrastructure Engineer Interview Questions

Five exercises: for each common ML infrastructure interview question, choose the best-structured answer. Topics covered are feature pipelines, GPU efficiency, model serving, and drift monitoring.

Structure for ML infrastructure design answers

Separate training from serving: training pipelines are throughput-oriented batch jobs, while serving paths are latency-bound, so their pipelines and SLAs differ
Name components precisely: feature store, model registry, serving runtime, monitoring
Cover operational dimensions: GPU utilisation, scaling policy, latency SLO, cost
Address data and model drift: they are distinct and need separate detection strategies
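
To make the last point above concrete, here is a minimal sketch of why the two kinds of drift need separate detectors. Data drift is a shift in the input feature distribution and can be checked without labels; model drift is a drop in predictive quality and needs ground-truth labels, which often arrive late. The function names, the 10-bin Population Stability Index, and the thresholds (0.2 for PSI, a 0.05 accuracy drop) are illustrative assumptions, not values taken from this exercise set.

```python
import math
from typing import Sequence


def psi(reference: Sequence[float], live: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index for one numeric feature (data drift, no labels needed)."""
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 if every reference value is identical.
    width = (hi - lo) / bins or 1.0

    def fractions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into the edge bins
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_frac, live_frac = fractions(reference), fractions(live)
    return sum((l - r) * math.log(l / r) for r, l in zip(ref_frac, live_frac))


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def model_drift(baseline_acc: float, y_true: Sequence[int], y_pred: Sequence[int],
                tolerance: float = 0.05) -> bool:
    """Model drift check: needs labels, compares live accuracy to a training-time baseline."""
    return baseline_acc - accuracy(y_true, y_pred) > tolerance


# Hypothetical data: the live feature has shifted towards higher values.
reference = [i / 100 for i in range(100)]
live = [0.5 + i / 200 for i in range(100)]
data_drift_detected = psi(reference, live) > 0.2  # 0.2 is an assumed rule-of-thumb threshold
```

In an answer, the useful distinction is operational: the PSI check can run on every batch of incoming features, while the accuracy check can only run once labels are available.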

1 / 5

The interviewer asks: "Design an online feature pipeline for a real-time personalisation system serving 50,000 requests per second."
Choose the answer that covers the critical design dimensions.

2 / 5

The interviewer asks: "Our GPU cluster is at 45% utilisation. How would you investigate this and what would you do about it?"
Choose the most systematic diagnostic answer.

3 / 5

The interviewer asks: "Compare batch inference, online inference, and streaming inference — when would you use each?"
Which answer covers the key design considerations for each pattern?

4 / 5

The interviewer asks: "How do you detect and respond to model drift in production?"
Choose the most complete monitoring and response strategy.

5 / 5

The interviewer asks: "How would you reduce the cost of a training job currently taking 8 hours on 32 A100 GPUs?"
Which answer gives the most practical cost-reduction strategy?
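
A useful first step for this question is to express the baseline in GPU-hours, since that is the quantity any cost reduction has to shrink. The sketch below does that arithmetic for the job in the question. The price per GPU-hour and the 16-GPU alternative are hypothetical numbers chosen only to illustrate the calculation.

```python
def gpu_hours(wall_clock_hours: float, num_gpus: int) -> float:
    """Total accelerator time consumed by a job."""
    return wall_clock_hours * num_gpus


def job_cost(wall_clock_hours: float, num_gpus: int, price_per_gpu_hour: float) -> float:
    """Cost of a job at a flat per-GPU-hour price (price is an assumed input)."""
    return gpu_hours(wall_clock_hours, num_gpus) * price_per_gpu_hour


def relative_saving(baseline_gpu_hours: float, candidate_gpu_hours: float) -> float:
    """Fraction of GPU-hours saved by the candidate configuration."""
    return 1 - candidate_gpu_hours / baseline_gpu_hours


# Baseline from the question: 8 hours on 32 GPUs.
baseline = gpu_hours(8, 32)  # 256 GPU-hours

# Hypothetical alternative: half the GPUs, with the job taking 14 hours
# instead of the 16 that perfectly linear scaling would predict.
candidate = gpu_hours(14, 16)  # 224 GPU-hours

saving = relative_saving(baseline, candidate)  # 0.125, i.e. 12.5% fewer GPU-hours
```

The point of the comparison is that wall-clock time and cost can move in opposite directions: a job that scales sub-linearly may be cheaper on fewer GPUs even though it finishes later.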