How-to · Intermediate · 3 min read

How to stream LangGraph outputs

Quick answer
Build your graph with StateGraph from langgraph, compile it, and iterate over app.stream() (or app.astream() in async code) instead of calling app.invoke() once. Each iteration yields a chunk (the full state or a per-node update, depending on stream_mode), so you can handle partial results as the graph's nodes execute.

PREREQUISITES

  • Python 3.9+
  • pip install langgraph
  • OpenAI API key set in an environment variable (only needed if your nodes call an LLM)
  • Basic async Python knowledge

Setup

Install the langgraph package and, if your nodes will call an LLM, set your OpenAI API key in the environment. This example assumes you have Python 3.9 or newer.

  • Install LangGraph: pip install langgraph
  • Set API key: export OPENAI_API_KEY=your_api_key (Linux/macOS) or set OPENAI_API_KEY=your_api_key (Windows)
bash
pip install langgraph
output
Collecting langgraph
  Downloading langgraph-0.1.0-py3-none-any.whl (15 kB)
Installing collected packages: langgraph
Successfully installed langgraph-0.1.0

Step by step

This example defines a simple StateGraph with two async nodes, each returning an update to the shared state. After compiling, iterate over app.astream() with stream_mode="values" to receive the full state after each node finishes, so results arrive incrementally instead of all at once.

python
import asyncio
from typing import TypedDict

from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    messages: list[str]

async def step_one(state: State) -> State:
    # Each node returns an update that LangGraph merges into the state
    return {"messages": state["messages"] + ["output from step_one"]}

async def step_two(state: State) -> State:
    return {"messages": state["messages"] + ["output from step_two"]}

# Create the graph and wire the nodes in sequence
graph = StateGraph(State)
graph.add_node("step_one", step_one)
graph.add_node("step_two", step_two)
graph.add_edge(START, "step_one")
graph.add_edge("step_one", "step_two")
graph.add_edge("step_two", END)

# Compile the graph into a runnable app
app = graph.compile()

async def main():
    # stream_mode="values" yields the full state after each step
    async for state in app.astream({"messages": []}, stream_mode="values"):
        print("Streamed messages:", state["messages"])

if __name__ == "__main__":
    asyncio.run(main())
output
Streamed messages: []
Streamed messages: ['output from step_one']
Streamed messages: ['output from step_one', 'output from step_two']

Common variations

Use app.invoke() for a single final result, or the synchronous app.stream() when you are not in async code. The stream_mode argument controls what each chunk contains: "values" streams the full state after each step, while "updates" streams only the update each node returned. Async streaming with app.astream() is preferred for real-time UI updates.

  • Use app.invoke() for batch results.
  • Integrate with OpenAI or Anthropic by setting environment variables.
  • Handle exceptions inside async nodes for robust streaming.

Troubleshooting

  • If streaming does not yield partial results, make sure you are iterating over app.stream() or app.astream() rather than calling app.invoke(), and check which stream_mode you passed.
  • Check your Python version: current LangGraph releases require 3.9 or newer.
  • If a node calls an LLM, verify your OpenAI API key is set correctly in os.environ["OPENAI_API_KEY"].
  • Use logging inside nodes to debug streaming behavior.

Key Takeaways

  • LangGraph nodes return state updates; the runtime streams them for you.
  • Compile your graph and iterate over stream() or astream() for real-time results.
  • Set your OpenAI API key in an environment variable if your nodes call an LLM.
  • Streaming enables responsive UI updates and incremental processing.
  • Debug streaming by verifying async yields and environment setup.
Verified 2026-04 · gpt-4o, claude-3-5-sonnet-20241022