How to set up DSPy with local models
Quick answer
To set up DSPy with local models, install dspy and serve the model through a local inference backend such as Ollama. Then instantiate dspy.LM with the local model identifier and server address, and configure DSPy to use it for inference; no API key is required.

Prerequisites

- Python 3.8+
- pip install -U dspy
- A local model downloaded and served (for example via Ollama)
Setup
Install dspy, then make sure a local model is actually being served. With Ollama, for example, pull a model and keep the Ollama server running before you start (the model name below is an example; any pulled model works).

```
pip install -U dspy
ollama pull llama3.2
```

Step by step
The example below configures dspy.LM against a locally served model and runs a simple prediction. DSPy routes requests through LiteLLM, so any local OpenAI-compatible endpoint (Ollama, vLLM, SGLang, llama.cpp server) works the same way.
```python
import dspy

# Point dspy.LM at a locally served model. With an Ollama server running
# on its default port, the ollama_chat provider works out of the box.
lm = dspy.LM(
    "ollama_chat/llama3.2",             # provider/model identifier
    api_base="http://localhost:11434",  # local server address
    api_key="",                         # local servers need no API key
)
dspy.configure(lm=lm)

# Define a signature: typed input and output fields for the task
class QA(dspy.Signature):
    question: str = dspy.InputField()
    answer: str = dspy.OutputField()

# Create a predictor and run inference
qa = dspy.Predict(QA)
result = qa(question="What is DSPy?")
print(result.answer)
```

Output

```
DSPy is a declarative Python library for AI programming that simplifies calling language models and structuring prompts.
```
Common variations

- Use a different local model by changing the model identifier to another checkpoint or pulled model.
- Point api_base at a different OpenAI-compatible server (for example vLLM or SGLang) instead of Ollama.
- Use async calls by awaiting the predictor's async entry point (qa.acall(question="...")) inside an async function; availability depends on your DSPy version.
Troubleshooting

- If you see a "model not found" error, verify the model identifier or local path and that the model files are fully downloaded.
- For CUDA errors, ensure your GPU drivers and the CUDA build of your inference stack match.
- If inference is slow, check that the model is running on the intended device and consider a smaller model or shorter context.
Key Takeaways

- Install dspy and serve a model locally to enable local model integration.
- Point dspy.LM at the local endpoint for offline inference with no API key.
- Keep the model on an appropriate device (CPU or GPU) for best performance.
- Define structured signatures with dspy.Signature for clean prompt management.
- Troubleshoot common errors by verifying model identifiers, paths, and environment setup.