How-to · Intermediate · 3 min read

How to stream tokens from LangGraph

Quick answer
To stream tokens from LangGraph, call the compiled graph's stream() or astream() method rather than invoke(), which only returns the final state. Pass a stream_mode that matches what you want to receive: "messages" for LLM tokens emitted inside nodes, "custom" for arbitrary data a node writes itself, or "updates"/"values" for node-level state. Iterating over the result yields partial outputs as they arrive.

PREREQUISITES

  • Python 3.9+
  • OpenAI API key (free tier works)
  • pip install langgraph openai

Setup

Install the langgraph package and make sure your OpenAI API key is set in your environment. The asyncio module used for asynchronous streaming ships with the Python standard library, so it needs no separate install.
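To set the key for the current shell session (OPENAI_API_KEY is the variable name the OpenAI SDK reads by default; the value shown is a placeholder):

```shell
export OPENAI_API_KEY="sk-..."
```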

Run:

bash
pip install langgraph openai
output
Collecting langgraph
Collecting openai
Installing collected packages: langgraph, openai
Successfully installed langgraph-0.1.0 openai-1.0.0

Step by step

This example defines a simple LangGraph node, compiles the graph, and streams tokens from it with astream(..., stream_mode="custom"). The node emits tokens through LangGraph's custom stream writer (get_stream_writer requires a recent langgraph release); a real application would forward LLM tokens the same way.

python
import asyncio
from typing import TypedDict

from langgraph.config import get_stream_writer
from langgraph.graph import StateGraph, START, END

# Define the state schema
class State(TypedDict):
    messages: list[str]

# A node that emits tokens as it produces them
async def my_node(state: State) -> State:
    writer = get_stream_writer()  # sends data to stream_mode="custom" consumers
    # Simulate streaming tokens from an LLM
    tokens = ["Hello", ", ", "this", " is", " LangGraph", " streaming."]
    for token in tokens:
        await asyncio.sleep(0.2)  # simulate model latency
        writer(token)
    return {"messages": state["messages"] + ["".join(tokens)]}

# Create the graph and add the node
graph = StateGraph(State)
graph.add_node("my_node", my_node)
graph.add_edge(START, "my_node")
graph.add_edge("my_node", END)

# Compile the graph
app = graph.compile()

async def main():
    # astream with stream_mode="custom" yields whatever the writer emits
    async for token in app.astream({"messages": ["Start"]}, stream_mode="custom"):
        print("Streamed token:", token)

if __name__ == "__main__":
    asyncio.run(main())
output
Streamed token: Hello
Streamed token: , 
Streamed token: this
Streamed token:  is
Streamed token:  LangGraph
Streamed token:  streaming.

Common variations

LangGraph streaming integrates well with async frameworks such as FastAPI; from synchronous code, use the compiled graph's stream() method instead of astream().

  • Use different LLM providers inside nodes for token streaming.
  • Combine with OpenAI SDK for real-time token generation.
  • Use StateGraph with typed dicts for structured state management.
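When the consuming code is synchronous but the stream is async, a thin wrapper can drain the async iterator into a list. A minimal standard-library sketch; fake_token_stream below is a stand-in for app.astream(..., stream_mode="custom"):

```python
import asyncio
from typing import AsyncIterator


async def fake_token_stream() -> AsyncIterator[str]:
    # Stand-in for app.astream(..., stream_mode="custom")
    for token in ["Hello", ", ", "world"]:
        await asyncio.sleep(0)  # yield control, as a real stream would
        yield token


def collect_tokens(stream: AsyncIterator[str]) -> list[str]:
    """Drain an async token stream from synchronous code."""
    async def _drain() -> list[str]:
        return [token async for token in stream]
    return asyncio.run(_drain())


tokens = collect_tokens(fake_token_stream())
print("".join(tokens))  # Hello, world
```

Note that asyncio.run starts a fresh event loop, so this wrapper cannot be called from code that is already running inside one (for example, inside a notebook cell with an active loop).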

Troubleshooting

If no tokens arrive, check that you are iterating over stream()/astream() rather than calling invoke(), and that the stream_mode you pass matches how the data is emitted (for example, "custom" output only appears when a node actually writes to the stream writer). Also verify your environment supports asyncio event loops; notebooks may already be running one.
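A quick sanity check is to confirm a node is actually a coroutine function (standard library only; my_node and sync_node here are hypothetical stand-ins for your own nodes):

```python
import asyncio

# Stand-in for a correctly declared async node
async def my_node(state: dict) -> dict:
    return state

# Stand-in for a plain function: valid as a node, but it cannot await
def sync_node(state: dict) -> dict:
    return state

# Only coroutine functions can await inside the graph
print("my_node is async:", asyncio.iscoroutinefunction(my_node))    # True
print("sync_node is async:", asyncio.iscoroutinefunction(sync_node))  # False
```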

For ModuleNotFoundError: No module named 'langgraph', confirm the package is installed in the active environment (pip show langgraph) and that the import path matches the examples above.

Key Takeaways

  • Stream from the compiled graph with stream()/astream() and an appropriate stream_mode; invoke() returns only the final state.
  • Emit token-level data from inside nodes through LangGraph's stream writer, or use stream_mode="messages" with an integrated LLM.
  • Integrate LangGraph streaming with async frameworks for real-time applications.
Verified 2026-04