Learn the vocabulary of a feature computed differently between a training pipeline and a live serving pipeline.
0 / 5 completed
1 / 5
At standup, a dev mentions a model performing noticeably worse in production than its offline test metrics suggested, traced to a feature being computed differently by the live inference pipeline than it was during training. What is this problem called?
Training-serving skew is a model performing noticeably worse in production than its offline test metrics suggested, traced to a feature being computed differently by the live inference pipeline than it was during training. Model drift describes a model's performance degrading over time as the real-world data distribution genuinely shifts, which is a related but distinct problem from a computation mismatch between two pipelines. This pipeline-to-pipeline inconsistency is exactly what training-serving skew names and diagnoses.
2 / 5
During a design review, the team wants the exact same code and logic to compute a feature for both the training pipeline and the live serving pipeline, rather than maintaining two separate implementations. Which capability supports this?
A shared feature store computes a feature identically for both the training pipeline and the live serving pipeline, using the exact same code and logic rather than two separately maintained implementations that can quietly drift apart. Maintaining two entirely separate implementations leaves the door open for a subtle inconsistency to creep in unnoticed. This shared computation logic is the core architectural fix for eliminating training-serving skew at its source.
3 / 5
In a code review, a dev notices a bug where the training pipeline computes a feature named 'avg_purchase' as an average over the last 30 days, while the online serving pipeline computes the same-named feature as an average over the last 7 days. What does this represent?
A feature-definition inconsistency between the training and serving pipelines, like the same-named feature meaning a 30-day average in one pipeline and a 7-day average in the other, is a textbook cause of training-serving skew, since the model was trained on one distribution of values but sees a different one live. A feature computed identically by both pipelines carries no such inconsistency, which is exactly the goal a shared feature store is meant to achieve. Spotting this kind of naming-versus-definition mismatch in a code review is one of the more effective ways to catch training-serving skew before it reaches production.
4 / 5
An incident report shows a model's live accuracy dropped sharply right after launch despite strong offline test metrics, and the root cause was traced to the serving pipeline computing a key feature differently than the training pipeline had computed the same-named feature. What practice would prevent this?
Unifying feature computation behind a shared feature store or pipeline ensures training and serving always compute the same feature identically, eliminating exactly the kind of inconsistency that caused the accuracy drop in this incident. Continuing to maintain separate implementations for training and serving leaves that same risk open to recur with any future feature. This unification is the standard architectural fix once training-serving skew has been identified as the root cause of a production performance gap.
5 / 5
During a PR review, a teammate asks why the team unifies feature computation behind a shared feature store instead of maintaining separate, independently optimized feature code for the training pipeline and the serving pipeline. What is the reasoning?
Separate, independently maintained feature code can quietly drift into computing the same-named feature differently over time, as each pipeline is updated on its own schedule without the other necessarily being kept in sync. A shared feature store guarantees both pipelines compute the feature identically, since they're both calling the exact same underlying logic. The tradeoff is the added upfront engineering investment of building and maintaining a shared feature store, rather than letting each pipeline's feature code evolve independently.