How-to · Intermediate · 4 min read

How to stream LLM to React frontend

Quick answer
Use the OpenAI SDK's chat.completions.create method with stream=True in a Python backend (e.g., FastAPI) to stream tokens. Then, forward these tokens via Server-Sent Events (SSE) to your React frontend, which listens and appends the streamed content in real time.
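On the wire, each SSE event is a `data:` payload line terminated by a blank line, and the browser's EventSource fires one `onmessage` per event. A minimal sketch of that framing (the `sse_frame` helper is illustrative, not part of any SDK):

```python
# Each SSE event: a "data:" payload line followed by a blank line.
def sse_frame(token: str) -> str:
    return f"data: {token}\n\n"

# The backend emits one frame per streamed token...
frames = "".join(sse_frame(t) for t in ["Hello", ",", " world"])

# ...and EventSource hands each payload back to the browser as e.data.
events = [f[len("data: "):] for f in frames.split("\n\n") if f]
print(events)  # ['Hello', ',', ' world']
```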

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0" fastapi uvicorn (quote the version pin so the shell does not treat > as a redirect)
  • React 18+ environment

Set up the backend streaming server

Install the required Python packages and set your OpenAI API key as an environment variable. Then use FastAPI to create an SSE endpoint that streams tokens from the OpenAI chat.completions.create method with stream=True.
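For example, export the key in your shell before launching the server with `uvicorn main:app --reload` (the value below is a placeholder; substitute your real key):

```shell
# Make the key available to the backend process; never hardcode or commit it
export OPENAI_API_KEY="sk-your-key-here"
```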

bash
pip install openai fastapi uvicorn
output
Collecting openai
Collecting fastapi
Collecting uvicorn
Successfully installed openai fastapi uvicorn

Step-by-step: backend and React frontend

This example shows a minimal FastAPI backend streaming OpenAI chat completions and a React frontend consuming the stream via EventSource.

python
import os
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from openai import OpenAI

app = FastAPI()
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

def event_generator(messages):
    # With stream=True the sync client returns a regular iterator of chunks;
    # StreamingResponse runs a sync generator in a threadpool automatically.
    stream = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        stream=True,
    )
    for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:
            yield f"data: {content}\n\n"

@app.get("/stream")
async def stream():
    messages = [{"role": "user", "content": "Explain streaming LLM to React frontend."}]
    return StreamingResponse(event_generator(messages), media_type="text/event-stream")

The React frontend (App.js) consumes the stream with EventSource:

jsx
// App.js — appends each SSE payload to the displayed text
import React, { useEffect, useState } from 'react';

function App() {
  const [text, setText] = useState('');
  useEffect(() => {
    const eventSource = new EventSource('http://localhost:8000/stream');
    eventSource.onmessage = e => {
      setText(prev => prev + e.data);
    };
    return () => eventSource.close();
  }, []);
  return <div><h1>Streaming LLM Output</h1><pre>{text}</pre></div>;
}

export default App;

Starting the backend with uvicorn main:app prints:
output
INFO:     Started server process [12345]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)

Common variations

  • Use async client calls for better concurrency.
  • Switch model to gpt-4o-mini or claude-sonnet-4-5 depending on provider.
  • Use WebSocket instead of SSE for bidirectional streaming.

Troubleshooting streaming issues

  • If the stream hangs, check your API key and network connectivity.
  • Ensure CORS is configured on FastAPI to allow your React frontend origin.
  • Use browser devtools to verify EventSource connection and data flow.

Key Takeaways

  • Use OpenAI SDK's stream=True to get token-by-token output from LLMs.
  • Forward streamed tokens via SSE from Python backend to React frontend for real-time display.
  • React's EventSource API is simple and effective for consuming SSE streams.
  • Configure CORS and environment variables properly to avoid common connection issues.
Verified 2026-04 · gpt-4o, gpt-4o-mini, claude-sonnet-4-5