
How to build a QA pipeline with DSPy

Quick answer
Use DSPy to define a structured signature for your QA task and connect it to an LLM such as openai/gpt-4o-mini, then invoke the pipeline with a question to get back a clean, typed answer. Signatures give you structured, typed inputs and outputs and make it easy to swap in different OpenAI models.

Prerequisites

  • Python 3.8+
  • OpenAI API key
  • pip install dspy openai>=1.0

Setup

Install the dspy and openai packages, then set your OpenAI API key as an environment variable:

```bash
pip install dspy openai
export OPENAI_API_KEY="sk-..."
```

Step by step

Define a dspy.Signature class for the QA task, configure the LLM, create a dspy.Predict instance, and call it with a question.

```python
import os
import dspy

# Configure the LLM with your OpenAI API key and model
lm = dspy.LM("openai/gpt-4o-mini", api_key=os.environ["OPENAI_API_KEY"])
dspy.configure(lm=lm)

# Define the QA signature with input and output fields;
# the docstring doubles as the task instruction for the model
class QA(dspy.Signature):
    """Answer the question concisely."""

    question: str = dspy.InputField()
    answer: str = dspy.OutputField()

# Create a prediction pipeline
qa = dspy.Predict(QA)

# Run the pipeline with a question
result = qa(question="What is Retrieval-Augmented Generation (RAG)?")
print("Answer:", result.answer)
```
```text
Answer: Retrieval-Augmented Generation (RAG) is a technique that combines retrieval of relevant documents with generative language models to produce accurate and context-aware answers.
```

Common variations

  • Use dspy.ChainOfThought for step-by-step reasoning in answers.
  • Switch to a different OpenAI model by changing "openai/gpt-4o-mini" to another supported model.
  • Integrate with Anthropic models by passing an anthropic/... model string to dspy.LM.
```python
import os
import dspy

# Example: using ChainOfThought for step-by-step reasoning
lm = dspy.LM("openai/gpt-4o-mini", api_key=os.environ["OPENAI_API_KEY"])
dspy.configure(lm=lm)

class QA(dspy.Signature):
    question: str = dspy.InputField()
    answer: str = dspy.OutputField()

# ChainOfThought also exposes the model's intermediate reasoning
# as result.reasoning, alongside the signature's output fields
qa_cot = dspy.ChainOfThought(QA)
result = qa_cot(question="Explain how a neural network learns.")
print("Answer with reasoning:", result.answer)
```
```text
Answer with reasoning: A neural network learns by adjusting its weights through backpropagation, minimizing the error between predicted and actual outputs using gradient descent.
```
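As the last variation notes, switching providers is mostly a matter of passing a different model string to dspy.LM. A minimal sketch, assuming you have an ANTHROPIC_API_KEY environment variable set and that claude-3-5-haiku-20241022 is still a current model name (check Anthropic's model list):

```python
import os
import dspy

# Configure an Anthropic model instead of an OpenAI one; signatures,
# Predict, and ChainOfThought are unchanged -- only the LM differs.
lm = dspy.LM(
    "anthropic/claude-3-5-haiku-20241022",  # assumed model name
    api_key=os.environ["ANTHROPIC_API_KEY"],
)
dspy.configure(lm=lm)
```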

Troubleshooting

  • If you get an authentication error, verify your OPENAI_API_KEY environment variable is set correctly.
  • If responses are empty or incorrect, check your model name and ensure it supports the task.
  • For slow responses, try a smaller, faster model or tune dspy.LM parameters such as max_tokens.
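The first bullet is easy to rule out before involving the model at all. A quick stdlib-only check that the key is actually visible to your process:

```python
import os

# Read the key the same way the pipeline does; empty string if unset
key = os.environ.get("OPENAI_API_KEY", "")
if key:
    print("OPENAI_API_KEY is set (%d characters)." % len(key))
else:
    print("OPENAI_API_KEY is missing; export it before running the pipeline.")
```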

Key takeaways

  • Define structured input/output with dspy.Signature for clean QA pipelines.
  • Use dspy.Predict to easily invoke LLMs with typed inputs and outputs.
  • Switch models or add chain-of-thought reasoning with minimal code changes.
  • Always set your API key securely via environment variables.
  • Check model compatibility and API key if you encounter errors.
Verified 2026-04 · openai/gpt-4o-mini