Practice the vocabulary of feature stores: serving the same features consistently to model training and live inference.
1 / 5
At standup, a machine learning engineer mentions a centralized system that stores and serves the same computed features consistently to both model training and live production inference. What is this system called?
A feature store centralizes the computation, storage, and serving of features so the same feature values and computation logic are available to both model training and live production inference. Computing features independently for training and for inference risks a subtle mismatch, known as training-serving skew, in which the two pipelines produce slightly different values for what is supposed to be the same feature. A feature store exists specifically to eliminate that inconsistency by serving from one shared, authoritative source.
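As a rough illustration of the "one shared source" idea, here is a minimal in-memory sketch. The class name `TinyFeatureStore`, the feature name `avg_amount`, and the entity ID `user_1` are all hypothetical and do not come from any real feature store product; a production system would add persistence, timestamps, and access control.

```python
from typing import Callable, Dict, List


class TinyFeatureStore:
    """Toy store: one definition per feature, one set of stored values."""

    def __init__(self) -> None:
        # Feature name -> the single function that computes it.
        self._definitions: Dict[str, Callable[[List[float]], float]] = {}
        # Feature name -> {entity ID -> computed value}.
        self._values: Dict[str, Dict[str, float]] = {}

    def register(self, name: str, fn: Callable[[List[float]], float]) -> None:
        self._definitions[name] = fn
        self._values[name] = {}

    def materialize(self, name: str, entity_id: str, raw: List[float]) -> None:
        # Compute once with the registered definition and store the result.
        self._values[name][entity_id] = self._definitions[name](raw)

    def get(self, name: str, entity_id: str) -> float:
        # Training and inference both read through this same method.
        return self._values[name][entity_id]


store = TinyFeatureStore()
store.register("avg_amount", lambda amounts: sum(amounts) / len(amounts))
store.materialize("avg_amount", "user_1", [20.0, 40.0])

training_value = store.get("avg_amount", "user_1")  # read by the training job
serving_value = store.get("avg_amount", "user_1")   # read by live inference
```

Because both reads go through the same stored value, there is no second implementation that could drift.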
2 / 5
During a design review, the team wants a feature's value to be available with very low latency at the exact moment a live prediction request comes in. Which capability supports this?
Online feature serving with low-latency lookup makes a precomputed or freshly computed feature value available almost instantly at the moment a live prediction request needs it, which is essential for a real-time inference use case like fraud detection. Retrieving a feature value through a batch job is far too slow for that kind of time-sensitive request. This low-latency serving path is typically paired with a separate, higher-throughput offline path used for bulk model training.
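The split between the two paths can be sketched with plain Python containers. This is only an illustration under simplifying assumptions: the offline store is a list of historical rows, the online store is a dictionary holding the latest value per entity, and the names `offline_rows`, `online_store`, `get_online`, and `get_training_rows` are hypothetical.

```python
from typing import Dict, List, Tuple

# Offline path: full history as (entity_id, timestamp, value) rows,
# read in bulk when building a training set.
offline_rows: List[Tuple[str, int, float]] = [
    ("user_1", 100, 12.0),
    ("user_1", 200, 18.0),
    ("user_2", 150, 7.5),
]

# Online path: only the latest value per entity, keyed for direct lookup.
online_store: Dict[str, float] = {}
latest_ts: Dict[str, int] = {}
for entity_id, ts, value in offline_rows:
    if ts >= latest_ts.get(entity_id, -1):
        latest_ts[entity_id] = ts
        online_store[entity_id] = value


def get_online(entity_id: str) -> float:
    """Single key lookup, suitable for a per-request read."""
    return online_store[entity_id]


def get_training_rows() -> List[Tuple[str, int, float]]:
    """Bulk read of the full history for training."""
    return list(offline_rows)
```

In a real deployment the online store would usually be a key-value database and the offline store a data warehouse or file-based table, but the access patterns are the same: point lookups on one side, bulk scans on the other.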
3 / 5
In a code review, a dev notices the feature store tracks exactly which version of a feature's computation logic produced a specific stored value. What does this represent?
Feature versioning and lineage tracking record exactly which version of a feature's computation logic produced a specific stored value, making it possible to trace a model's behavior back to the precise feature definition it was trained or served with. Overwriting a value with no record of its origin makes debugging a model's unexpected behavior far harder, since the underlying feature computation might have silently changed. This traceability is especially important when a feature's computation logic evolves over time.
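One way to picture this is to store the feature name and definition version alongside every computed value. The sketch below is an assumption-laden toy: the `StoredValue` record, the `definitions` registry, and the two versions of `avg_amount` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class StoredValue:
    feature: str
    version: int      # which definition version produced this value
    entity_id: str
    value: float


def avg_all(amounts: List[float]) -> float:
    # Version 1: mean of every amount, refunds included.
    return sum(amounts) / len(amounts)


def avg_purchases(amounts: List[float]) -> float:
    # Version 2: mean of positive amounts only (refunds excluded).
    purchases = [a for a in amounts if a > 0]
    return sum(purchases) / len(purchases) if purchases else 0.0


# Registry keyed by (feature name, version); old versions are kept, not overwritten.
definitions: Dict[Tuple[str, int], Callable[[List[float]], float]] = {
    ("avg_amount", 1): avg_all,
    ("avg_amount", 2): avg_purchases,
}


def compute(feature: str, version: int, entity_id: str, raw: List[float]) -> StoredValue:
    fn = definitions[(feature, version)]
    return StoredValue(feature, version, entity_id, fn(raw))


raw = [20.0, 40.0, -30.0]
v1 = compute("avg_amount", 1, "user_1", raw)
v2 = compute("avg_amount", 2, "user_1", raw)
```

Here `v1.value` and `v2.value` differ for the same raw data, and the `version` field on each record says which definition is responsible.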
4 / 5
An incident report shows a production model's predictions degraded because the online feature serving path used slightly different computation logic than the offline path used during training. What practice would prevent this?
Ensuring both the online and offline feature computation paths share the same underlying feature definition directly prevents the training-serving skew that degrades a model's real-world prediction quality. Maintaining separate, independently written logic for each path is exactly what creates the risk of a subtle, hard-to-detect mismatch. This shared-definition guarantee is one of a feature store's core value propositions, not an incidental detail.
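The failure mode and its fix can be shown side by side. In this hypothetical sketch, `offline_avg` and `online_avg_independent` stand in for two separately written implementations of the same feature, and `avg_purchase_amount` stands in for a single shared definition; none of these names come from the incident described above.

```python
from typing import List


def offline_avg(amounts: List[float]) -> float:
    # Training-side logic: excludes refunds (negative amounts).
    purchases = [a for a in amounts if a > 0]
    return sum(purchases) / len(purchases) if purchases else 0.0


def online_avg_independent(amounts: List[float]) -> float:
    # Separately written serving-side logic: forgets to exclude refunds.
    return sum(amounts) / len(amounts) if amounts else 0.0


def avg_purchase_amount(amounts: List[float]) -> float:
    # Shared definition that both paths are meant to call.
    purchases = [a for a in amounts if a > 0]
    return sum(purchases) / len(purchases) if purchases else 0.0


history = [20.0, 40.0, -10.0]

# Independent implementations disagree on the same input.
skew_before = abs(offline_avg(history) - online_avg_independent(history))

# With one shared definition, both paths produce the same value.
offline_value = avg_purchase_amount(history)
online_value = avg_purchase_amount(history)
skew_after = abs(offline_value - online_value)
```

A parity check like the `skew_before` comparison, run on sampled production inputs, is one plausible way to catch this kind of divergence before it reaches a model.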
5 / 5
During a PR review, a teammate asks why the team adopted a feature store instead of having each model's training pipeline compute its own features independently. What is the reasoning?
When each model's training pipeline computes its own features independently, that logic can quietly diverge from whatever the live inference path actually uses, degrading the model's real-world accuracy in a way that is often hard to detect. A feature store centralizes that computation so both paths draw from the same consistent, authoritative source. The tradeoff is the upfront investment in building and maintaining the feature store's infrastructure across an organization's models.