ML Platform Vocabulary: Feature Stores, Model Registries, and MLOps Pipelines

Learn the advanced English vocabulary MLOps and ML platform engineers use when discussing feature stores, model registries, experiment tracking, and serving infrastructure.

Machine learning has its own dialect — and it evolves fast. If you join an ML platform team speaking only generic software engineering English, conversations about drift thresholds and shadow deployments will feel like a foreign language. This guide closes that gap, covering the vocabulary that separates ML infrastructure engineers from the rest.

Feature Stores and Data for Training

The feature store is a centralised repository for storing and serving ML features — the engineered inputs to a model. A feature store has two components: the offline store (historical features for training, typically backed by a data warehouse or object storage) and the online store (low-latency features for real-time inference, backed by key-value stores like Redis or DynamoDB).

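Feature serving is the process of retrieving features at prediction time, usually from the online store. When training sets are built from the offline store, the matching concern is point-in-time correctness. Engineers say: “Point-in-time correctness is critical: the offline store must return features as they existed at label time, not today’s values.”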
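To make "as they existed at label time" concrete, here is a minimal sketch of a point-in-time lookup. The feature history, the dates, and the function name feature_as_of are all made up for illustration; a real feature store performs this join for you, at scale and across many entities.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical history of one feature for one entity: (timestamp, value), sorted by time.
feature_history = [
    (datetime(2024, 1, 1), 10.0),
    (datetime(2024, 1, 8), 12.5),
    (datetime(2024, 1, 15), 9.0),
]

def feature_as_of(history, label_time):
    """Return the latest feature value recorded at or before label_time."""
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, label_time)
    if idx == 0:
        return None  # no feature value existed yet at label time
    return history[idx - 1][1]

# The label was observed on 10 January, so the 15 January value must not leak in.
value_at_label_time = feature_as_of(feature_history, datetime(2024, 1, 10))
```

Using the latest value (9.0) instead of the value valid on 10 January would leak future information into training, which is the mistake the phrase "point-in-time correctness" warns against.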

Related terms: feature engineering (transforming raw data into model inputs), feature pipeline (the job that computes and writes features), and feature freshness (how recent the features are — stale features degrade model quality).
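Feature freshness is often expressed as a maximum allowed age. The sketch below assumes a hypothetical one-hour limit and hypothetical timestamps; the function name is_stale is illustrative and not taken from any particular tool.

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_computed_at, max_age, now=None):
    """Return True if the feature value is older than the allowed max_age."""
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - last_computed_at) > max_age

# Fixed "now" so the example is reproducible.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
recent_is_stale = is_stale(datetime(2024, 1, 2, 11, 30, tzinfo=timezone.utc), timedelta(hours=1), now)
old_is_stale = is_stale(datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc), timedelta(hours=1), now)
```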

Model Registries and Experiment Tracking

An experiment in ML is a logged trial of training a model with a specific configuration. Each trial is called a run. Tools like MLflow, Weights & Biases, and Neptune track runs automatically, recording metrics (accuracy, F1, AUC), parameters (learning rate, batch size, number of layers), and artifacts (trained model files, evaluation plots, preprocessors).
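The relationship between an experiment, a run, and what a run records can be sketched with a small in-memory class. This is not the API of MLflow or any other tracker; the class, the experiment name, and all values are invented for illustration.

```python
from dataclasses import dataclass, field
import uuid

@dataclass
class Run:
    """One logged training trial within an experiment (illustrative only)."""
    experiment: str
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    params: dict = field(default_factory=dict)     # set before training
    metrics: dict = field(default_factory=dict)    # measured after or during training
    artifacts: list = field(default_factory=list)  # paths to files produced by the run

    def log_param(self, key, value):
        self.params[key] = value

    def log_metric(self, key, value):
        self.metrics[key] = value

    def log_artifact(self, path):
        self.artifacts.append(path)

# Made-up values, only to show which kind of information goes where.
run = Run(experiment="churn-model")
run.log_param("learning_rate", 0.01)
run.log_param("batch_size", 64)
run.log_metric("f1", 0.87)
run.log_artifact("model.pkl")
```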

The model registry is the system of record for trained models. It stores versioned model artifacts and tracks their lifecycle stages, typically Staging, Production, and Archived. You will hear: “Promote the challenger model from Staging to Production once the shadow evaluation passes the acceptance threshold.”
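The lifecycle stages can be pictured as a small state change on a versioned record. The sketch below is a toy registry, not a real tool's interface: the class names, the model name "ranker", and the rule that promoting to Production archives the previous Production version are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum

class Stage(Enum):
    NONE = "None"
    STAGING = "Staging"
    PRODUCTION = "Production"
    ARCHIVED = "Archived"

@dataclass
class ModelVersion:
    name: str
    version: int
    stage: Stage = Stage.NONE

class Registry:
    def __init__(self):
        self.versions = []

    def register(self, name):
        # Version numbers count up per model name, starting at 1.
        version = 1 + sum(1 for v in self.versions if v.name == name)
        model_version = ModelVersion(name, version)
        self.versions.append(model_version)
        return model_version

    def promote(self, name, version, stage):
        target = next(v for v in self.versions if v.name == name and v.version == version)
        if stage is Stage.PRODUCTION:
            # Assumption: only one Production version per model; archive the old one.
            for v in self.versions:
                if v.name == name and v.stage is Stage.PRODUCTION:
                    v.stage = Stage.ARCHIVED
        target.stage = stage
        return target

registry = Registry()
v1 = registry.register("ranker")
registry.promote("ranker", 1, Stage.PRODUCTION)
v2 = registry.register("ranker")
registry.promote("ranker", 2, Stage.STAGING)
registry.promote("ranker", 2, Stage.PRODUCTION)
```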

Hyperparameter tuning (also called hyperparameter optimisation, HPO) is the automated search for the best model configuration. Engineers distinguish hyperparameters (set before training, like learning rate) from model parameters (learned during training, like neural network weights).
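As a rough illustration of what "automated search" means, here is a tiny grid search. The objective function is a toy stand-in for a real train-and-validate step, and the search space values are arbitrary; production HPO tools use smarter strategies than trying every combination.

```python
import itertools

def validation_loss(learning_rate, batch_size):
    # Toy stand-in for training a model and measuring validation loss.
    return (learning_rate - 0.01) ** 2 + abs(batch_size - 64) / 1000

# Hyperparameters are chosen before training; this is the space to search.
search_space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [32, 64, 128],
}

def grid_search(space, objective):
    """Try every combination and keep the configuration with the lowest score."""
    keys = list(space)
    best_config, best_score = None, float("inf")
    for values in itertools.product(*(space[k] for k in keys)):
        config = dict(zip(keys, values))
        score = objective(**config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score

best_config, best_score = grid_search(search_space, validation_loss)
```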

Pipelines and Serving Infrastructure

An ML pipeline is an orchestrated sequence of steps — data ingestion, feature computation, training, evaluation, and registration — that runs end-to-end. Tools include Kubeflow Pipelines, Metaflow, and Vertex AI Pipelines. A typical standup update: “The nightly training pipeline failed at the evaluation step — the new validation dataset had schema drift.”
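The standup sentence above maps onto a very simple structure: named steps run in order, and a failure is reported with the name of the step that raised it. The sketch below uses plain functions rather than any orchestrator's API; the step bodies, column names, and schema are hypothetical.

```python
EXPECTED_COLUMNS = {"user_id", "age", "label"}  # hypothetical validation schema

def ingest(ctx):
    ctx["validation_rows"] = [{"user_id": 1, "age": 30, "label": 1}]
    return ctx

def train(ctx):
    ctx["model"] = "model-placeholder"  # stands in for a real training job
    return ctx

def evaluate(ctx):
    # Fail loudly if the validation data no longer matches the expected schema.
    for row in ctx["validation_rows"]:
        if set(row) != EXPECTED_COLUMNS:
            raise ValueError(f"schema drift: got columns {sorted(row)}")
    ctx["evaluated"] = True
    return ctx

def register(ctx):
    ctx["registered"] = True
    return ctx

def run_pipeline(steps, ctx):
    """Run steps in order; report which step failed."""
    for name, step in steps:
        try:
            ctx = step(ctx)
        except Exception as exc:
            raise RuntimeError(f"pipeline failed at step '{name}': {exc}") from exc
    return ctx

steps = [("ingest", ingest), ("train", train), ("evaluate", evaluate), ("register", register)]
result = run_pipeline(steps, {})
```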

Batch inference processes large datasets offline and writes predictions to a store. Real-time inference (also called online inference) returns predictions synchronously, typically within milliseconds. The choice drives the serving infrastructure: batch jobs need a scheduler and bulk storage, while real-time inference needs an always-on, low-latency endpoint.

Model drift occurs when a deployed model’s performance degrades because the real-world distribution has shifted. There are two types: concept drift (the relationship between features and labels changes) and data drift (the input feature distribution changes). A retraining trigger is the rule or signal that kicks off a new training run — for example, when a drift metric crosses a threshold.
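One common way to turn data drift into a number is the population stability index (PSI), which compares the binned distribution of a feature at training time with its distribution in production. The sketch below is one possible drift metric among many; the bin fractions are invented, and the 0.2 threshold is a widely quoted rule of thumb rather than a value from this article.

```python
import math

def psi(expected_fractions, actual_fractions, eps=1e-6):
    """Population stability index over matching histogram bins."""
    total = 0.0
    for e, a in zip(expected_fractions, actual_fractions):
        e, a = max(e, eps), max(a, eps)  # avoid log(0) for empty bins
        total += (a - e) * math.log(a / e)
    return total

def should_retrain(drift_score, threshold=0.2):
    """Retraining trigger: fire when the drift metric crosses the threshold."""
    return drift_score > threshold

training_bins = [0.5, 0.5]    # hypothetical share of values per bin at training time
production_bins = [0.9, 0.1]  # hypothetical share of values per bin in production

drift_score = psi(training_bins, production_bins)
trigger = should_retrain(drift_score)
```

Note that PSI measures data drift only. Detecting concept drift generally requires fresh labels, because it is the feature-to-label relationship that has changed.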

Deployment Strategies

A/B testing models means routing a percentage of traffic to a new model candidate and comparing metrics against the baseline. A shadow deployment (also called shadow mode) sends a copy of real traffic to a new model but never returns its predictions to users. The predictions are usually logged for offline comparison, so you can observe latency, error rates, and output differences without affecting users. Engineers say: “Run the new recommendation model in shadow for a week before we cut over.”
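A shadow deployment can be sketched as a wrapper that always answers with the live model and calls the new model on the side. This is a simplified, synchronous illustration with made-up names; in this sketch the shadow output is kept in a log for later comparison, which is an assumption, and real systems usually make the shadow call asynchronously so it cannot slow down the user-facing response.

```python
def serve_with_shadow(request, primary_model, shadow_model, shadow_log):
    """Return the primary prediction; record the shadow prediction without serving it."""
    response = primary_model(request)
    entry = {"request": request, "primary": response}
    try:
        entry["shadow"] = shadow_model(request)
    except Exception as exc:
        # A failing shadow model must never affect the user-facing response.
        entry["shadow_error"] = repr(exc)
    shadow_log.append(entry)
    return response

def current_model(x):
    return x * 2  # toy stand-in for the live model

def new_model(x):
    return x * 3  # toy stand-in for the candidate model

def broken_model(x):
    raise ValueError("candidate model crashed")

log = []
served = serve_with_shadow(10, current_model, new_model, log)
served_despite_error = serve_with_shadow(10, current_model, broken_model, log)
```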

A champion/challenger setup keeps the current best model (champion) in production while routing a small traffic slice to a new candidate (challenger) for comparison.
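Routing "a small traffic slice" is often done deterministically, so the same user always sees the same model. The sketch below hashes a user ID into a bucket between 0 and 1; the 5% share, the function name, and the ID format are assumptions for illustration.

```python
import hashlib

def route(user_id, challenger_share=0.05):
    """Deterministically assign a user to 'champion' or 'challenger'."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform-ish value in [0, 1)
    return "challenger" if bucket < challenger_share else "champion"

# Roughly 5% of a large set of users should land on the challenger.
assignments = [route(f"user-{i}") for i in range(10_000)]
challenger_fraction = assignments.count("challenger") / len(assignments)
```

Hashing instead of calling a random number generator on every request keeps each user's experience consistent and makes the comparison between the two models easier to interpret.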

Next Steps

Pick one MLOps tool your team uses — MLflow, SageMaker, Vertex AI — and read its official documentation for 20 minutes using the vocabulary from this article as a lens. Every time you encounter a term you cannot define in English, write it down and find a real example sentence from engineering blogs or GitHub issues. Active vocabulary only lands through deliberate exposure.