How to stream LangGraph output
Quick answer
To stream LangGraph output, define an async node that calls the OpenAI SDK's streaming chat completions API (stream=True), iterate over the response chunks to process tokens in real time, and run the compiled graph asynchronously.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install openai langgraph
Setup
Install the required packages and set your OpenAI API key as an environment variable.
- Install packages: pip install openai langgraph
- Set the environment variable in your shell: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows)
pip install openai langgraph

output

Collecting openai
Collecting langgraph
Successfully installed openai-1.x.x langgraph-0.x.x
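Before moving on, you can confirm the key is actually visible to Python with a quick stdlib check (a hypothetical helper, not part of the graph code below):

```python
import os

# Quick sanity check: confirms the key you exported is visible to Python.
def api_key_status() -> str:
    if os.environ.get("OPENAI_API_KEY"):
        return "API key found"
    return "API key missing"

print(api_key_status())
```

If it prints "API key missing", re-export the variable in the same shell session you run Python from.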
Step by step
This example builds a simple LangGraph with a single node that streams output from an OpenAI chat completion using the SDK's async streaming API, then runs the compiled graph asynchronously.
import os
import asyncio
from typing import TypedDict

from openai import AsyncOpenAI
from langgraph.graph import StateGraph, END

# Define the state schema (LangGraph expects annotated fields)
class State(TypedDict):
    messages: list

# Initialize the async OpenAI client -- the sync client's stream is a plain
# iterator, while AsyncOpenAI returns one we can `async for` over
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Define an async node function that streams output
async def stream_node(state: State) -> State:
    messages = state.get("messages", [])
    # Add user message
    messages.append({"role": "user", "content": "Say hello with streaming."})
    # Create a streaming chat completion (note the await)
    stream = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        stream=True,
    )
    collected = []
    print("Streaming response:")
    async for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)
        collected.append(delta)
    print()  # newline after streaming
    # Append assistant message to state
    messages.append({"role": "assistant", "content": "".join(collected)})
    return {"messages": messages}

# Build the graph
graph = StateGraph(State)
graph.add_node("stream_node", stream_node)
graph.set_entry_point("stream_node")
graph.add_edge("stream_node", END)

# Compile the graph
app = graph.compile()

# Run the graph asynchronously (ainvoke is the async counterpart of invoke)
async def main():
    result = await app.ainvoke({"messages": []})
    print("Final state messages:", result["messages"])

if __name__ == "__main__":
    asyncio.run(main())

output
Streaming response:
Hello! How can I assist you today?
Final state messages: [{'role': 'user', 'content': 'Say hello with streaming.'}, {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}]
(Exact wording varies from run to run; the response prints token by token as it streams.)

Common variations
You can adapt streaming LangGraph output in these ways:
- Sync code: Use asyncio.run() to run async nodes from a sync context.
- Different models: Change model="gpt-4o-mini" to any supported OpenAI chat model.
- Custom nodes: Stream from multiple nodes or chain streaming outputs.
- FastAPI integration: Yield streamed chunks in a FastAPI StreamingResponse for web apps.
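The FastAPI variation hinges on exposing the chunks as an async generator, which is the shape StreamingResponse consumes. A minimal sketch, with a simulated stream standing in for the OpenAI response so it runs without an API key:

```python
import asyncio

# Simulated model stream -- stands in for the OpenAI streaming response so
# this sketch runs without an API key or network access.
async def fake_model_stream():
    for token in ["Hello", ", ", "world", "!"]:
        await asyncio.sleep(0)  # yield control, as a real network stream would
        yield token

# An async generator of text chunks is exactly what FastAPI accepts:
# StreamingResponse(token_stream(), media_type="text/plain")
async def token_stream():
    async for delta in fake_model_stream():
        yield delta

async def main():
    chunks = [c async for c in token_stream()]
    print("".join(chunks))  # prints "Hello, world!"

asyncio.run(main())
```

In a real endpoint you would replace fake_model_stream with the OpenAI stream from the node above and return the generator directly.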
Troubleshooting
- If streaming hangs, verify your OPENAI_API_KEY is set and valid.
- Ensure your Python environment supports asyncio and you run the event loop properly.
- Check network connectivity to OpenAI endpoints.
- Use print() statements inside the async loop to debug streaming chunks.
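For the last point, printing the repr of each delta makes empty strings and None values (common at the start and end of a stream) easy to spot. A small sketch with simulated chunk contents:

```python
# Hypothetical debug pattern: log the raw repr of each delta so empty or
# None chunks are visible, and normalize None to "" before accumulating.
def debug_delta(delta):
    print(f"chunk delta: {delta!r}")
    return delta or ""

parts = [debug_delta(d) for d in ["Hel", None, "lo", ""]]
print("".join(parts))  # prints "Hello" after the per-chunk lines
```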
Key Takeaways
- Use the OpenAI SDK's streaming chat completions inside LangGraph async nodes for real-time output.
- Compile the graph and run it asynchronously (ainvoke) so streaming nodes can await each chunk.
- Streaming enables token-by-token processing, ideal for responsive UI or logging.
- Always set your OpenAI API key via environment variables for secure access.
- Adapt streaming logic for different models or frameworks like FastAPI for web streaming.
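The token-by-token benefit can be made concrete by measuring time to first token with a simulated stream (no API key needed); a real UI would render each delta as soon as it arrives:

```python
import asyncio
import time

# Simulated stream: each token trickles in with a small delay.
async def fake_stream():
    for token in ["Hello", " there", "!"]:
        await asyncio.sleep(0.01)
        yield token

async def main():
    start = time.monotonic()
    first_token_at = None
    collected = []
    async for delta in fake_stream():
        if first_token_at is None:
            first_token_at = time.monotonic() - start  # time to first token
        collected.append(delta)
    total = time.monotonic() - start
    print("".join(collected))  # prints "Hello there!"
    # With streaming, the first token is usable long before `total` elapses.
    print(f"first token after {first_token_at:.3f}s, full reply after {total:.3f}s")

asyncio.run(main())
```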