Model Versioning

Overview

Model Versioning in enterprise MLOps is the process of tracking machine learning models not just as files, but as versioned software artifacts with complete lineage and lifecycle management. Unlike simple code versioning, model versioning must capture the code, data, configuration, and environment that produced a specific model artifact.

In production systems, this is typically handled by a Model Registry: a central repository that tracks model versions, manages their deployment stages (e.g., Staging, Production), and supports reproducibility.

Key Ideas / Intuition

1. The Model Registry

A Model Registry is like “Git for binaries + Metadata” or a “Package Manager for ML”.

2. The “Recipe” (Lineage)

A versioned model is not just a .pkl or .onnx file. It is a tuple: (Code, Data, Configuration, Environment). If any of these change, you have a new model version. Enterprise systems automate the tracking of this lineage so you can answer: “Which dataset trained the model running in production right now?”
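One way to make the "recipe" idea concrete is to hash the ingredients together, so that any change to code, data, configuration, or environment yields a different fingerprint. The sketch below is a minimal illustration, not how any particular registry does it; the function name and all example values (commit, hashes, image tag) are hypothetical.

```python
import hashlib
import json


def lineage_fingerprint(code_commit, data_hash, config, environment):
    """Return a stable SHA-256 fingerprint for one training 'recipe'.

    All four inputs are illustrative: a Git commit SHA, a content hash of
    the dataset snapshot, a dict of hyperparameters, and an environment
    identifier such as a container image tag.
    """
    recipe = {
        "code": code_commit,
        "data": data_hash,
        "config": config,
        "environment": environment,
    }
    # sort_keys makes the serialization deterministic across dict orderings.
    payload = json.dumps(recipe, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical values: same recipe except for the dataset snapshot.
base = lineage_fingerprint("a1b2c3d", "data-2024-01", {"lr": 0.01}, "train-image:1.0")
retrained = lineage_fingerprint("a1b2c3d", "data-2024-02", {"lr": 0.01}, "train-image:1.0")
```

Here `base` and `retrained` differ only in the dataset, which is enough to produce two distinct fingerprints and therefore two distinct model versions.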

3. Lifecycle & Promotion

Models are not static; they move through stages.

  • Registration: A training run finishes, and the artifact is saved as Version 1.
  • Staging: Version 1 is promoted to “Staging” for integration testing.
  • Production: After passing tests, Version 1 is promoted to “Production” (replacing Version 0).
  • Archived: Version 0 is retired but kept for rollback capability.

Immutable Artifacts: You never rebuild a model for production. You build once (during training), register it, and that exact same binary is promoted through environments.
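The lifecycle above can be sketched with a small in-memory registry. This is a toy stand-in under stated assumptions, not the API of any real product: `ToyRegistry`, `ModelVersion`, and the stage names are illustrative, and version numbers start at 1 here. Note that promotion only changes the stage label; the artifact digest recorded at registration is never touched.

```python
import hashlib
from dataclasses import dataclass

STAGES = ("None", "Staging", "Production", "Archived")


@dataclass
class ModelVersion:
    name: str
    version: int
    artifact_sha256: str  # set once at registration; promotion never changes it
    stage: str = "None"


class ToyRegistry:
    """In-memory stand-in for a real registry service (illustrative only)."""

    def __init__(self):
        self._models = {}  # model name -> list of ModelVersion, oldest first

    def register(self, name, artifact_bytes):
        versions = self._models.setdefault(name, [])
        digest = hashlib.sha256(artifact_bytes).hexdigest()
        mv = ModelVersion(name, len(versions) + 1, digest)
        versions.append(mv)
        return mv

    def get(self, name, version):
        for mv in self._models.get(name, []):
            if mv.version == version:
                return mv
        raise KeyError(f"{name} v{version} is not registered")

    def promote(self, name, version, stage):
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        target = self.get(name, version)
        if stage == "Production":
            # Retire the current Production version but keep it for rollback.
            for mv in self._models[name]:
                if mv.stage == "Production" and mv is not target:
                    mv.stage = "Archived"
        target.stage = stage
        return target


registry = ToyRegistry()
v1 = registry.register("FraudDetection", b"weights-from-run-1")
registry.promote("FraudDetection", 1, "Staging")
registry.promote("FraudDetection", 1, "Production")
v2 = registry.register("FraudDetection", b"weights-from-run-2")
registry.promote("FraudDetection", 2, "Production")
```

After the last call, version 2 is in Production and version 1 is Archived, still registered with its original digest and available for rollback.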

4. Decoupling Deployment from Training

By using a registry, inference services don’t need to hardcode paths like s3://bucket/model_v2.pt. Instead, they query the registry:

“Give me the latest model marked as ‘Production’ for ‘FraudDetection’.” This allows you to update the model in the background (by promoting a new version) without changing the serving code.
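A minimal sketch of that lookup, assuming the registry can be represented as a plain dictionary. The `resolve` function, the record layout, and the v1/v3 paths are hypothetical; a real serving stack would call its registry's client instead.

```python
# Hypothetical registry contents: model name -> list of version records.
REGISTRY = {
    "FraudDetection": [
        {"version": 1, "stage": "Archived", "uri": "s3://bucket/model_v1.pt"},
        {"version": 2, "stage": "Production", "uri": "s3://bucket/model_v2.pt"},
        {"version": 3, "stage": "Staging", "uri": "s3://bucket/model_v3.pt"},
    ]
}


def resolve(name, stage, registry=REGISTRY):
    """Return the newest version record of `name` currently in `stage`."""
    candidates = [r for r in registry.get(name, []) if r["stage"] == stage]
    if not candidates:
        raise LookupError(f"no version of {name!r} is in stage {stage!r}")
    return max(candidates, key=lambda r: r["version"])


# The serving code asks by name and stage, never by path.
current = resolve("FraudDetection", "Production")
```

If version 3 is later promoted and version 2 archived, the same `resolve("FraudDetection", "Production")` call returns the new record, so the serving code itself does not change.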

Practical Application

Tooling

  • MLflow Model Registry: A widely used open-source registry, also available as a managed offering on Databricks.
  • AWS SageMaker Model Registry: Deeply integrated with the AWS ecosystem.
  • Weights & Biases (WandB): Excellent for experiment tracking that transitions into a registry.
  • Azure ML: Provides a model registry integrated with the Azure ecosystem, similar in concept to SageMaker's.
  • BentoML: Focuses on serving but includes a model store.

Best Practices

  • One Version, One ID: Every registered model must have a unique immutable ID.
  • Automated Registration: CI/CD pipelines should automatically register models that pass a certain metric threshold during training.
  • Gated Promotion: Moving a model from “Staging” to “Production” should require a manual approval or a rigorous automated test suite (e.g., a canary evaluation).
  • Semantic Versioning: Registries commonly assign integer versions (v1, v2), so you should also use tags or descriptions to carry semantic meaning (e.g., v1.2.0-bert-large).
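The automated-registration and gated-promotion practices above can be expressed as two small checks that a CI/CD pipeline might call. This is a sketch under assumptions: the function names, metric names, and threshold values are all hypothetical and would come from your own pipeline.

```python
from typing import Optional


def should_register(metrics, thresholds):
    """True only if every required metric is present and meets its minimum."""
    return all(
        name in metrics and metrics[name] >= minimum
        for name, minimum in thresholds.items()
    )


def can_promote_to_production(tests_passed: bool, approved_by: Optional[str]) -> bool:
    """Gate: allow promotion on a passing test suite or an explicit approval."""
    return tests_passed or bool(approved_by)


# Hypothetical thresholds and training results.
thresholds = {"auc": 0.90, "recall": 0.80}
good_run = {"auc": 0.93, "recall": 0.81}
weak_run = {"auc": 0.93, "recall": 0.62}

register_good = should_register(good_run, thresholds)
register_weak = should_register(weak_run, thresholds)
```

Treating a missing metric as a failure is a deliberate choice in this sketch: a run that never reported recall should not slip through the gate.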

Trade-offs

  • Complexity: Adds a new infrastructure component (the Registry server).
  • Storage: Storing every model version can be expensive if models are huge (LLMs). Retention policies are needed.
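As an illustration of a retention policy, the sketch below keeps the most recent versions plus anything in a protected stage, and reports the rest as deletion candidates. The function name, record layout, and defaults are assumptions for illustration; real policies often also consider age, size, or compliance rules.

```python
def versions_to_delete(versions, keep_latest=3,
                       protected_stages=("Staging", "Production")):
    """Return version numbers that a simple retention policy would remove.

    Keeps the `keep_latest` newest versions (for rollback) and every
    version whose stage is protected, regardless of age.
    """
    ordered = sorted(versions, key=lambda v: v["version"], reverse=True)
    keep = {v["version"] for v in ordered[:keep_latest]}
    return sorted(
        v["version"]
        for v in ordered
        if v["version"] not in keep and v["stage"] not in protected_stages
    )


# Hypothetical version history for one registered model.
history = [
    {"version": 1, "stage": "Archived"},
    {"version": 2, "stage": "Production"},
    {"version": 3, "stage": "Archived"},
    {"version": 4, "stage": "Archived"},
    {"version": 5, "stage": "Staging"},
    {"version": 6, "stage": "None"},
]
stale = versions_to_delete(history, keep_latest=2)
```

With `keep_latest=2`, versions 5 and 6 are kept as the newest, version 2 is kept because it is in Production, and versions 1, 3, and 4 are the deletion candidates.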

Comparisons

Feature          | Git                            | Model Registry
Primary Unit     | Source Code (Text)             | Model Artifacts (Binary)
Versioning Logic | Diffs / Merges                 | Immutable Snapshots / Linear History
Metadata         | Commit Messages, Author        | Metrics (Accuracy), Hyperparams, Data Lineage
Lifecycle        | Branches (Dev/Main)            | Stages (None, Staging, Prod, Archived)
Use Case         | Developing the training script | Managing the deployable asset


