What are wandb artifacts

Quick answer

wandb artifacts are versioned data objects in Weights & Biases (W&B) that hold datasets, model checkpoints, and other files. They give you a structured way to store, share, and reuse those files across experiments, which keeps machine learning workflows reproducible and makes collaboration easier.

How it works

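wandb artifacts act as version-controlled containers for datasets, model checkpoints, and other files used in ML workflows. Much as Git tracks versions of code, artifacts track versions of data and models, so teams can reproduce experiments reliably. Each artifact has a name, a type (for example dataset or model), optional metadata, and a version history (v0, v1, and so on), with aliases such as latest pointing at a specific version. A run can log an artifact as an output or declare one as an input, and W&B records those links as lineage, which makes it easy to trace what went into and came out of each training or evaluation run.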
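
To make the versioning idea concrete, here is a minimal toy sketch in plain Python. It is not W&B's implementation and makes no wandb calls; ToyArtifactStore, log, and resolve are hypothetical names invented for illustration. It assumes a new version is recorded only when the content differs from the latest version, and that the latest alias points at the newest version.

```python
import hashlib


class ToyArtifactStore:
    """Toy in-memory model of artifact versioning (illustration only, not W&B internals)."""

    def __init__(self) -> None:
        # Maps an artifact name to the ordered list of content digests (v0, v1, ...).
        self.versions = {}

    def log(self, name: str, content: bytes) -> str:
        """Record content under name; add a version only if the content changed."""
        digest = hashlib.sha256(content).hexdigest()
        history = self.versions.setdefault(name, [])
        if not history or history[-1] != digest:
            history.append(digest)
        return f"{name}:v{len(history) - 1}"

    def resolve(self, ref: str) -> str:
        """Return the digest for a reference such as 'name:v0' or 'name:latest'."""
        name, _, alias = ref.partition(":")
        history = self.versions[name]
        if alias in ("", "latest"):
            return history[-1]
        return history[int(alias.lstrip("v"))]


store = ToyArtifactStore()
first = store.log("training-data", b"a,b\n1,2\n")
same = store.log("training-data", b"a,b\n1,2\n")        # unchanged content
second = store.log("training-data", b"a,b\n1,2\n3,4\n")  # changed content
print(first, same, second)  # training-data:v0 training-data:v0 training-data:v1
```

Logging identical content twice leaves the version at v0, while changed content produces v1. A real artifact store additionally tracks files, metadata, and the runs that produced or consumed each version.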

Concrete example

Below is a Python example demonstrating how to create and log a wandb artifact for a dataset and then use it in a training run.

python

import wandb

# Run 1: create the dataset artifact and log it
run = wandb.init(project="my-ml-project")

# Create an artifact for a dataset
artifact = wandb.Artifact(name="training-data", type="dataset")

# Add a local file to the artifact (the file must already exist)
artifact.add_file("data/train.csv")

# Log the artifact to W&B as an output of this run
run.log_artifact(artifact)

# Finish the run so the upload completes before another run starts
run.finish()

# Run 2: later, use the artifact as an input to another run
run = wandb.init(project="my-ml-project")
artifact = run.use_artifact("training-data:latest")

# Download the artifact's files and get the local directory path
artifact_dir = artifact.download()

print(f"Dataset downloaded to: {artifact_dir}")
run.finish()

output (the exact path depends on your wandb version and operating system)

Dataset downloaded to: ./artifacts/training-data:v0
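
The same pattern extends to model checkpoints. The sketch below assumes a checkpoint file already exists on disk and that the training-data artifact from the example above has been logged. The names checkpoint_metadata, log_checkpoint, and model-checkpoint, the metadata fields, and the best alias are illustrative choices, not anything prescribed by W&B. Declaring the dataset with use_artifact before logging the model is what links the two in the run's lineage.

```python
def checkpoint_metadata(epoch: int, val_loss: float) -> dict:
    """Build the metadata attached to the checkpoint (hypothetical fields)."""
    return {"epoch": epoch, "val_loss": round(val_loss, 4)}


def log_checkpoint(path: str, epoch: int, val_loss: float,
                   dataset_ref: str = "training-data:latest") -> None:
    """Log a model checkpoint and record the dataset it was trained on."""
    import wandb  # imported lazily so the helper above works without wandb installed

    with wandb.init(project="my-ml-project", job_type="train") as run:
        # Declaring the dataset as an input records it in this run's lineage.
        run.use_artifact(dataset_ref)

        model = wandb.Artifact(
            name="model-checkpoint",  # assumed artifact name
            type="model",
            metadata=checkpoint_metadata(epoch, val_loss),
        )
        model.add_file(path)

        # The "best" alias is an assumed convention for the preferred version.
        run.log_artifact(model, aliases=["best"])


# Example call (needs a W&B login and an existing checkpoint file):
# log_checkpoint("checkpoints/model.pt", epoch=3, val_loss=0.123456)
```

Keeping the metadata in a small pure function makes it easy to check what will be attached before anything is uploaded.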

When to use it

Use wandb artifacts when you need to track and version datasets, model checkpoints, or any other files that your ML experiments depend on. They matter most for reproducibility, collaboration, and auditability in projects where data or models change over time. Skip artifacts for ephemeral or temporary files that do not affect experiment results, such as caches or scratch outputs.

Key terms

Term	Definition
Artifact	A versioned data object in W&B representing datasets, models, or files.
Versioning	Tracking changes and history of an artifact over time.
Lineage	The relationship and provenance between artifacts and runs.
Run	An execution instance of an experiment logged in W&B.
Metadata	Descriptive information attached to artifacts for context.

Key Takeaways

wandb artifacts enable version control for datasets and models, improving reproducibility.
Artifacts link data and models to experiment runs, providing clear lineage and audit trails.
Use artifacts to share and reuse data across teams and projects efficiently.

Verified 2026-04