How to stream LangGraph output
Quick answer
To stream LangGraph output, define an async node that calls the OpenAI SDK's streaming chat completions API (stream=True), iterate over the response chunks to process tokens in real time, and run the compiled graph asynchronously.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install openai langgraph
Setup
Install the required packages and set your OpenAI API key as an environment variable.
- Install packages: pip install openai langgraph
- Set the environment variable in your shell: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows)
pip install openai langgraph

output

Collecting openai
Collecting langgraph
Successfully installed openai-1.x.x langgraph-0.x.x
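Before moving on, you can confirm the key is actually visible to Python with a quick stdlib check (a hypothetical helper, not part of the graph code below):

```python
import os

# Quick sanity check: confirms the key you exported is visible to Python.
def api_key_status() -> str:
    if os.environ.get("OPENAI_API_KEY"):
        return "API key found"
    return "API key missing"

print(api_key_status())
```

If it prints "API key missing", re-export the variable in the same shell session you run Python from.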
Step by step
This example builds a simple LangGraph with a single node that streams output from an OpenAI chat completion using the SDK's async streaming API, then runs the compiled graph asynchronously.
import os
import asyncio
from typing import TypedDict

from openai import AsyncOpenAI
from langgraph.graph import StateGraph, END

# Define the state schema (LangGraph expects annotated fields)
class State(TypedDict):
    messages: list

# Initialize the async OpenAI client -- the sync client's stream is a plain
# iterator, while AsyncOpenAI returns one we can `async for` over
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Define an async node function that streams output
async def stream_node(state: State) -> State:
    messages = state.get("messages", [])
    # Add user message
    messages.append({"role": "user", "content": "Say hello with streaming."})
    # Create a streaming chat completion (note the await)
    stream = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        stream=True,
    )
    collected = []
    print("Streaming response:")
    async for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)
        collected.append(delta)
    print()  # newline after streaming
    # Append assistant message to state
    messages.append({"role": "assistant", "content": "".join(collected)})
    return {"messages": messages}

# Build the graph
graph = StateGraph(State)
graph.add_node("stream_node", stream_node)
graph.set_entry_point("stream_node")
graph.add_edge("stream_node", END)

# Compile the graph
app = graph.compile()

# Run the graph asynchronously (ainvoke is the async counterpart of invoke)
async def main():
    result = await app.ainvoke({"messages": []})
    print("Final state messages:", result["messages"])

if __name__ == "__main__":
    asyncio.run(main())

output
Streaming response:
Hello! How can I assist you today?
Final state messages: [{'role': 'user', 'content': 'Say hello with streaming.'}, {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}]
(Exact wording varies from run to run; the response prints token by token as it streams.)

Common variations
You can adapt streaming LangGraph output in these ways:
- Sync code: Use asyncio.run() to run async nodes from a sync context.
- Different models: Change model="gpt-4o-mini" to any supported OpenAI chat model.
- Custom nodes: Stream from multiple nodes or chain streaming outputs.
- FastAPI integration: Yield streamed chunks in a FastAPI StreamingResponse for web apps.
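The FastAPI variation hinges on exposing the chunks as an async generator, which is the shape StreamingResponse consumes. A minimal sketch, with a simulated stream standing in for the OpenAI response so it runs without an API key:

```python
import asyncio

# Simulated model stream -- stands in for the OpenAI streaming response so
# this sketch runs without an API key or network access.
async def fake_model_stream():
    for token in ["Hello", ", ", "world", "!"]:
        await asyncio.sleep(0)  # yield control, as a real network stream would
        yield token

# An async generator of text chunks is exactly what FastAPI accepts:
# StreamingResponse(token_stream(), media_type="text/plain")
async def token_stream():
    async for delta in fake_model_stream():
        yield delta

async def main():
    chunks = [c async for c in token_stream()]
    print("".join(chunks))  # prints "Hello, world!"

asyncio.run(main())
```

In a real endpoint you would replace fake_model_stream with the OpenAI stream from the node above and return the generator directly.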
Troubleshooting
- If streaming hangs, verify your OPENAI_API_KEY is set and valid.
- Ensure your Python environment supports asyncio and you run the event loop properly.
- Check network connectivity to OpenAI endpoints.
- Use print() statements inside the async loop to debug streaming chunks.
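For the last point, printing the repr of each delta makes empty strings and None values (common at the start and end of a stream) easy to spot. A small sketch with simulated chunk contents:

```python
# Hypothetical debug pattern: log the raw repr of each delta so empty or
# None chunks are visible, and normalize None to "" before accumulating.
def debug_delta(delta):
    print(f"chunk delta: {delta!r}")
    return delta or ""

parts = [debug_delta(d) for d in ["Hel", None, "lo", ""]]
print("".join(parts))  # prints "Hello" after the per-chunk lines
```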
Key Takeaways
- Use the OpenAI SDK's streaming chat completions inside LangGraph async nodes for real-time output.
- Compile the graph and run it asynchronously (ainvoke) so streaming nodes can await each chunk.
- Streaming enables token-by-token processing, ideal for responsive UI or logging.
- Always set your OpenAI API key via environment variables for secure access.
- Adapt streaming logic for different models or frameworks like FastAPI for web streaming.
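The token-by-token benefit can be made concrete by measuring time to first token with a simulated stream (no API key needed); a real UI would render each delta as soon as it arrives:

```python
import asyncio
import time

# Simulated stream: each token trickles in with a small delay.
async def fake_stream():
    for token in ["Hello", " there", "!"]:
        await asyncio.sleep(0.01)
        yield token

async def main():
    start = time.monotonic()
    first_token_at = None
    collected = []
    async for delta in fake_stream():
        if first_token_at is None:
            first_token_at = time.monotonic() - start  # time to first token
        collected.append(delta)
    total = time.monotonic() - start
    print("".join(collected))  # prints "Hello there!"
    # With streaming, the first token is usable long before `total` elapses.
    print(f"first token after {first_token_at:.3f}s, full reply after {total:.3f}s")

asyncio.run(main())
```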