How to set up DSPy with local models
Quick answer
To set up DSPy with local models, install dspy and serve the model through a local inference backend such as Ollama. Then instantiate dspy.LM with the local model identifier and server address, and configure DSPy to use it for inference; no API key is required.

Prerequisites

- Python 3.8+
- pip install -U dspy
- A local model downloaded and served (for example via Ollama)
Setup
Install dspy, then make sure a local model is actually being served. With Ollama, for example, pull a model and keep the Ollama server running before you start (the model name below is an example; any pulled model works).

```
pip install -U dspy
ollama pull llama3.2
```

Step by step
The example below configures dspy.LM against a locally served model and runs a simple prediction. DSPy routes requests through LiteLLM, so any local OpenAI-compatible endpoint (Ollama, vLLM, SGLang, llama.cpp server) works the same way.
```python
import dspy

# Point dspy.LM at a locally served model. With an Ollama server running
# on its default port, the ollama_chat provider works out of the box.
lm = dspy.LM(
    "ollama_chat/llama3.2",             # provider/model identifier
    api_base="http://localhost:11434",  # local server address
    api_key="",                         # local servers need no API key
)
dspy.configure(lm=lm)

# Define a signature: typed input and output fields for the task
class QA(dspy.Signature):
    question: str = dspy.InputField()
    answer: str = dspy.OutputField()

# Create a predictor and run inference
qa = dspy.Predict(QA)
result = qa(question="What is DSPy?")
print(result.answer)
```

Output

```
DSPy is a declarative Python library for AI programming that simplifies calling language models and structuring prompts.
```
Common variations

- Use a different local model by changing the model identifier to another checkpoint or pulled model.
- Point api_base at a different OpenAI-compatible server (for example vLLM or SGLang) instead of Ollama.
- Use async calls by awaiting the predictor's async entry point (qa.acall(question="...")) inside an async function; availability depends on your DSPy version.
Troubleshooting

- If you see a "model not found" error, verify the model identifier or local path and that the model files are fully downloaded.
- For CUDA errors, ensure your GPU drivers and the CUDA build of your inference stack match.
- If inference is slow, check that the model is running on the intended device and consider a smaller model or shorter context.
Key Takeaways

- Install dspy and serve a model locally to enable local model integration.
- Point dspy.LM at the local endpoint for offline inference with no API key.
- Keep the model on an appropriate device (CPU or GPU) for best performance.
- Define structured signatures with dspy.Signature for clean prompt management.
- Troubleshoot common errors by verifying model identifiers, paths, and environment setup.