How to evaluate DSPy programs
Quick answer
To evaluate DSPy programs, define a dspy.Signature class representing inputs and outputs, then create a dspy.Predict instance and call it with input arguments. The call returns structured output fields accessible as attributes. Use dspy.configure with an LM instance to connect to an LLM provider such as OpenAI.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0"
- pip install dspy
Setup
Install the required packages and set your OpenAI API key as an environment variable.
pip install openai dspy
Step by step
Define a dspy.Signature class with input and output fields, configure the dspy client with an OpenAI LLM, then create a dspy.Predict instance to evaluate the program by calling it with inputs.
import os
import dspy
# Configure DSPy with an OpenAI model. dspy.LM takes a model string and an
# API key rather than an OpenAI client object.
lm = dspy.LM("openai/gpt-4o-mini", api_key=os.environ["OPENAI_API_KEY"])
dspy.configure(lm=lm)
# Define a DSPy signature for a simple QA task
class QA(dspy.Signature):
    question: str = dspy.InputField()
    answer: str = dspy.OutputField()
# Create a Predict instance
qa = dspy.Predict(QA)
# Evaluate the DSPy program
result = qa(question="What is Retrieval-Augmented Generation?")
# Access the output
print("Answer:", result.answer)
Output
Answer: Retrieval-Augmented Generation (RAG) is a technique that combines retrieval of relevant documents with generative language models to produce more accurate and context-aware responses.
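A single call shows that the program runs, but it does not say how often the answers are right. One way to go further is to score the program over a few labelled examples. The sketch below is a minimal illustration and not part of DSPy: exact_match, score_program, fake_program, and the three-item devset are all hypothetical names and data made up for this example, and the stand-in program replaces a real LLM call so the block runs offline.

```python
from typing import Callable, Dict, List


def exact_match(expected: str, predicted: str) -> bool:
    """Hypothetical metric: equality ignoring case and surrounding whitespace."""
    return expected.strip().lower() == predicted.strip().lower()


def score_program(program: Callable[[str], str], devset: List[Dict[str, str]]) -> float:
    """Return the fraction of examples whose predicted answer matches the label."""
    if not devset:
        return 0.0
    hits = sum(exact_match(ex["answer"], program(ex["question"])) for ex in devset)
    return hits / len(devset)


# Stand-in for a real program. With the QA predictor from the walkthrough,
# this could be: lambda q: qa(question=q).answer
def fake_program(question: str) -> str:
    canned = {"What is 2 + 2?": "4", "Capital of France?": "paris"}
    return canned.get(question, "unknown")


# Made-up labelled examples, for illustration only.
devset = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "Capital of France?", "answer": "Paris"},
    {"question": "Largest planet?", "answer": "Jupiter"},
]

accuracy = score_program(fake_program, devset)
print(f"accuracy: {accuracy:.2f}")  # prints "accuracy: 0.67"
```

Exact match is a strict metric for free-form answers like the RAG definition above; for such outputs a looser, task-specific metric is usually a better fit.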
Common variations
- Use a different model by changing the dspy.LM model string (the example above uses "openai/gpt-4o-mini").
- Use dspy.ChainOfThought for step-by-step reasoning.
- Run asynchronously by wrapping the program for async use, for example with dspy.asyncify, and awaiting the call.
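For the asynchronous variation, one generic approach is to push each blocking call onto a thread pool and await the results together. The sketch below uses only the standard library; run_in_thread, blocking_answer, and answer_all are hypothetical names, and blocking_answer is a stand-in for a real call such as qa(question=...).answer. It is an illustration of the pattern under those assumptions, not DSPy's own async mechanism.

```python
import asyncio
from functools import partial
from typing import Callable, List


async def run_in_thread(fn: Callable[..., str], **kwargs) -> str:
    """Run a blocking callable in the default thread pool and await its result."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, partial(fn, **kwargs))


# Stand-in for a blocking predictor call (assumption: no real LLM is contacted).
def blocking_answer(question: str) -> str:
    return f"answer to: {question}"


async def answer_all(questions: List[str]) -> List[str]:
    # asyncio.gather returns results in the same order as the inputs.
    tasks = [run_in_thread(blocking_answer, question=q) for q in questions]
    return list(await asyncio.gather(*tasks))


answers = asyncio.run(answer_all(["What is RAG?", "What is DSPy?"]))
print(answers)
```

The thread-pool route keeps the program itself unchanged, which is convenient when only the calling code is asynchronous.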
Troubleshooting
- If you get authentication errors, verify your OPENAI_API_KEY environment variable is set correctly.
- If outputs are empty or incorrect, check your dspy.Signature fields and ensure the model supports the requested task.
- For rate limits, consider retrying with exponential backoff or upgrading your API plan.
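The retry advice for rate limits can be sketched as a small helper that doubles the wait after each failure and adds a little random jitter. call_with_backoff and its parameters are hypothetical names for this example, and the flaky function only simulates a rate-limit error. In real code, retry_on would be narrowed to the specific exception your client raises for rate limits, which this sketch does not assume.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on failure with exponentially growing, jittered delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt: base, 2*base, 4*base, ... plus jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise AssertionError("unreachable")


# Simulated call that fails twice, then succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited (simulated)")
    return "ok"


delays = []  # record delays instead of actually sleeping
result = call_with_backoff(flaky, base_delay=0.01, sleep=delays.append)
print(result, calls["n"], len(delays))  # prints "ok 3 2"
```

With the predictor from the walkthrough, the wrapped call could look like call_with_backoff(lambda: qa(question="...")).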
Key Takeaways
- Use dspy.Predict with a defined dspy.Signature to evaluate DSPy programs.
- Configure dspy with an LLM instance like dspy.LM("openai/gpt-4o-mini") for OpenAI integration.
- Access outputs as attributes on the returned result object after calling the predict instance.
- Adjust models and use chain-of-thought variants for more complex reasoning tasks.
- Always set your API key securely via environment variables to avoid authentication issues.
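To make the last point concrete, a program can check for the key at startup and fail with a clear message instead of a later authentication error. The helper names require_env and mask are hypothetical, and the demo uses an in-memory mapping with a fake key so nothing secret is printed; this is a sketch of the idea, not a required pattern.

```python
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the variable's value, or raise a clear error if it is missing or blank."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the program")
    return value


def mask(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the key is safe to log."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


# Demo with a fake key in a plain dict; real code would call
# require_env("OPENAI_API_KEY") and read from os.environ.
demo_env = {"OPENAI_API_KEY": "sk-example-1234"}
key = require_env("OPENAI_API_KEY", env=demo_env)
print(mask(key))  # prints "***********1234"
```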