How to evaluate DSPy programs

Quick answer

To run and evaluate a DSPy program, define a dspy.Signature class that declares the input and output fields, wrap it in a dspy.Predict module, and call that module with the inputs as keyword arguments. The call returns a prediction object whose output fields are available as attributes. Call dspy.configure with a dspy.LM instance first so DSPy knows which model, such as an OpenAI model, to send requests to.

PREREQUISITES

Python 3.10+ (recent DSPy releases no longer support Python 3.8)
OpenAI API key (free tier works)
pip install "openai>=1.0"
pip install dspy

Setup

Install the required packages and set your OpenAI API key as an environment variable.

bash

pip install "openai>=1.0" dspy
export OPENAI_API_KEY="your-api-key"  # placeholder; substitute your own key
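
Before running anything that calls the model, it can help to confirm the key is actually visible to Python. The following is a minimal sketch; `require_api_key` is a hypothetical helper name, not part of DSPy or the OpenAI SDK, and the demo uses a stand-in mapping so it runs without a real key.

```python
import os


def require_api_key(name="OPENAI_API_KEY", env=None):
    """Return the key stored under `name`, or raise a clear error if it is missing."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the script")
    return key


# Demo against a stand-in mapping (hypothetical placeholder value).
demo_key = require_api_key(env={"OPENAI_API_KEY": "sk-placeholder"})
print("key found:", bool(demo_key))
```

In a real script you would call `require_api_key()` with no arguments so it reads from `os.environ`.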

Step by step

Configure DSPy with an OpenAI model through dspy.LM, define a dspy.Signature class with input and output fields, then create a dspy.Predict instance and call it with the inputs to run the program.

python

import os
import dspy

# Configure DSPy with an OpenAI model. dspy.LM routes calls through LiteLLM,
# so pass the API key directly; no separate OpenAI client object is needed.
lm = dspy.LM("openai/gpt-4o-mini", api_key=os.environ["OPENAI_API_KEY"])
dspy.configure(lm=lm)

# Define a DSPy signature for a simple QA task
class QA(dspy.Signature):
    question: str = dspy.InputField()
    answer: str = dspy.OutputField()

# Create a Predict instance
qa = dspy.Predict(QA)

# Evaluate the DSPy program
result = qa(question="What is Retrieval-Augmented Generation?")

# Access the output
print("Answer:", result.answer)

output

Answer: Retrieval-Augmented Generation (RAG) is a technique that combines retrieval of relevant documents with generative language models to produce more accurate and context-aware responses.
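
Running one question shows that the program works, but evaluating it usually means scoring it over a small labeled set. The sketch below illustrates that idea under stated assumptions: `stub_qa` is a hypothetical stand-in for the `qa` predictor so the code runs without an API key, the three question/answer pairs are made up for illustration, and exact match is only one possible metric. DSPy ships its own dspy.Evaluate utility for this purpose; the sketch does not use it.

```python
from types import SimpleNamespace


def exact_match(gold: str, predicted: str) -> bool:
    """Case-insensitive, whitespace-insensitive string comparison."""
    return gold.strip().lower() == predicted.strip().lower()


def evaluate(program, devset):
    """Return the fraction of (question, gold) pairs the program answers correctly."""
    hits = sum(exact_match(gold, program(question=q).answer) for q, gold in devset)
    return hits / len(devset)


def stub_qa(question: str):
    """Hypothetical stand-in for `qa`: returns an object with an `.answer` attribute."""
    canned = {
        "What is 2 + 2?": "4",
        "What is the capital of France?": "paris",
        "What is the largest planet?": "Saturn",  # deliberately wrong
    }
    return SimpleNamespace(answer=canned.get(question, ""))


# Illustrative dev set (made-up examples).
devset = [
    ("What is 2 + 2?", "4"),
    ("What is the capital of France?", "Paris"),
    ("What is the largest planet?", "Jupiter"),
]

score = evaluate(stub_qa, devset)
print(f"Exact match: {score:.2f}")
```

Because `evaluate` only relies on calling the program with `question=` and reading `.answer`, the real `qa` predictor could be passed in place of `stub_qa`.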

Common variations

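Use a different model by changing the dspy.LM model string, e.g., "openai/gpt-4o" instead of "openai/gpt-4o-mini".
Use dspy.ChainOfThought(QA) in place of dspy.Predict(QA) to have the model reason step by step before answering; the reasoning comes back as an extra output field alongside answer.
Run asynchronously by wrapping the program with dspy.asyncify and awaiting the result, rather than managing an async OpenAI client yourself.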
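
To illustrate the asynchronous variation without requiring a model, here is a standard-library sketch that runs a blocking callable for several questions concurrently. `blocking_qa` is a hypothetical stand-in for the synchronous `qa` predictor, and using `asyncio.to_thread` is an assumption made for illustration; with a real DSPy program, the library's own dspy.asyncify wrapper is likely the better fit.

```python
import asyncio
from types import SimpleNamespace


def blocking_qa(question: str):
    """Hypothetical stand-in for a synchronous predictor call."""
    return SimpleNamespace(answer=f"stub answer to: {question}")


async def ask_many(program, questions):
    """Run the blocking program once per question in worker threads."""
    tasks = [asyncio.to_thread(program, question=q) for q in questions]
    # gather returns results in the same order as the submitted tasks
    return await asyncio.gather(*tasks)


results = asyncio.run(ask_many(blocking_qa, ["What is RAG?", "What is DSPy?"]))
answers = [r.answer for r in results]
print(answers)
```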

Troubleshooting

If you get authentication errors, verify your OPENAI_API_KEY environment variable is set correctly.
If outputs are empty or incorrect, check your dspy.Signature fields and ensure the model supports the requested task.
For rate limits, consider retrying with exponential backoff or upgrading your API plan.
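
The retry advice above can be sketched as a small helper. This is a minimal illustration, not a DSPy or OpenAI API: `call_with_backoff` and `flaky` are hypothetical names, and the demo records the delays instead of sleeping so it finishes instantly.

```python
import random
import time


def call_with_backoff(fn, *, max_attempts=5, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call fn(), retrying on failure with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * 2 ** attempt            # 1x, 2x, 4x, ...
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries


# Demo with a hypothetical flaky function that fails twice, then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


waits = []  # record the delays instead of actually sleeping
result = call_with_backoff(flaky, sleep=waits.append)
print(result, len(waits))
```

In practice you might wrap the predictor call, e.g. `call_with_backoff(lambda: qa(question="..."))`, and narrow `retry_on` to the specific rate-limit exception your stack raises.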

Key Takeaways

Use dspy.Predict with a defined dspy.Signature to evaluate DSPy programs.
Configure dspy with an LLM instance like dspy.LM("openai/gpt-4o-mini") for OpenAI integration.
Access outputs as attributes on the returned result object after calling the predict instance.
Adjust models and use chain-of-thought variants for more complex reasoning tasks.
Always set your API key securely via environment variables to avoid authentication issues.

Verified 2026-04 · gpt-4o-mini
