How to use function calling with streaming
Quick answer
Use the OpenAI Python SDK's tools parameter to define functions and enable stream=True in chat.completions.create to receive streaming responses. Parse the streamed chunks to handle partial function calls and arguments in real time.

Prerequisites

- Python 3.8+
- OpenAI API key (free tier works)
- pip install openai>=1.0
Setup
Install the latest OpenAI Python SDK and set your API key as an environment variable.
- Run pip install openai --upgrade to install or upgrade.
- Set your API key in the environment: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows).
pip install openai --upgrade output

Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (50 kB)
Installing collected packages: openai
Successfully installed openai-1.x.x
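Before making live calls, it can help to confirm the key is actually visible to Python. The helper below is a hypothetical local sanity check, not part of the SDK; it accepts any mapping so it can be exercised without a real key.

```python
import os

def masked_api_key(env=os.environ) -> str:
    """Return the configured key with the middle hidden, or raise if unset.

    Accepts any mapping so it can be checked without touching os.environ.
    """
    key = env.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it first.")
    return key[:3] + "..." + key[-4:]

# Sanity-check against a sample mapping (no real key needed):
print(masked_api_key({"OPENAI_API_KEY": "sk-example123"}))  # sk-...e123
```

Call masked_api_key() with no argument to check your actual environment.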
Step by step
This example demonstrates defining a function tool, calling it via the tools parameter, and streaming the response while parsing partial function call data.
import os
import json
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Define a function tool
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City and state, e.g. San Francisco, CA"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        }
    }
}]
messages = [{"role": "user", "content": "What's the weather in New York?"}]
# Create streaming chat completion with function calling
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    tools=tools,
    stream=True
)
print("Streaming response:")
partial_tool_call = None
for chunk in stream:
    delta = chunk.choices[0].delta
    content = delta.content or ""
    print(content, end="", flush=True)
    # With the tools parameter, tool call data arrives in delta.tool_calls,
    # not the legacy delta.function_call field
    if delta.tool_calls:
        tool_call = delta.tool_calls[0]
        if partial_tool_call is None:
            partial_tool_call = {"name": "", "arguments": ""}
        if tool_call.function.name:
            partial_tool_call["name"] += tool_call.function.name
        if tool_call.function.arguments:
            partial_tool_call["arguments"] += tool_call.function.arguments
print("\n\nFull function call data:")
if partial_tool_call:
    print(json.dumps(partial_tool_call, indent=2))
else:
    print("No function call detected.")

output
Streaming response:
The current weather in New York is sunny with 75°F.
Full function call data:
{
"name": "get_current_weather",
"arguments": "{\"location\": \"New York\", \"unit\": \"fahrenheit\"}"
}

Common variations
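A common next step once the stream ends is to parse the accumulated arguments and dispatch them to a local implementation. The sketch below simulates the accumulated argument fragments rather than calling the API, and get_current_weather here is a hypothetical local stub.

```python
import json

# Hypothetical local implementation backing the get_current_weather tool.
def get_current_weather(location: str, unit: str = "fahrenheit") -> str:
    return f"Weather for {location} in {unit} (stubbed)"

TOOL_REGISTRY = {"get_current_weather": get_current_weather}

# Argument fragments as a stream might deliver them, one delta at a time.
fragments = ['{"loc', 'ation": "New', ' York", "unit": "fahrenheit"}']

call = {"name": "get_current_weather", "arguments": ""}
for piece in fragments:
    call["arguments"] += piece  # accumulate before parsing

args = json.loads(call["arguments"])          # only parse the complete string
result = TOOL_REGISTRY[call["name"]](**args)  # dispatch to the local function
print(result)  # Weather for New York in fahrenheit (stubbed)
```

In a real loop you would append the tool result to messages as a {"role": "tool", ...} entry and make a follow-up request so the model can answer in natural language.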
You can stream asynchronously with AsyncOpenAI and async for inside an async function. Models such as gpt-4o support function calling in the same way. Adjust the tools parameter to define multiple functions or more complex schemas.
import os
import json
import asyncio
from openai import AsyncOpenAI

# Use the async client so the stream can be consumed with async for
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def async_stream_function_call():
    tools = [{
        "type": "function",
        "function": {
            "name": "get_time",
            "description": "Get current time for a timezone",
            "parameters": {
                "type": "object",
                "properties": {
                    "timezone": {"type": "string"}
                },
                "required": ["timezone"]
            }
        }
    }]
    messages = [{"role": "user", "content": "What time is it in Tokyo?"}]
    stream = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,
        stream=True
    )
    partial_call = None
    print("Async streaming response:")
    async for chunk in stream:
        delta = chunk.choices[0].delta
        content = delta.content or ""
        print(content, end="", flush=True)
        # Tool call deltas arrive in delta.tool_calls when using tools
        if delta.tool_calls:
            tool_call = delta.tool_calls[0]
            if partial_call is None:
                partial_call = {"name": "", "arguments": ""}
            if tool_call.function.name:
                partial_call["name"] += tool_call.function.name
            if tool_call.function.arguments:
                partial_call["arguments"] += tool_call.function.arguments
    print("\n\nFull function call data:")
    if partial_call:
        print(json.dumps(partial_call, indent=2))
    else:
        print("No function call detected.")

asyncio.run(async_stream_function_call())

output
Async streaming response:
The current time in Tokyo is 3:45 PM.
Full function call data:
{
"name": "get_time",
"arguments": "{\"timezone\": \"Asia/Tokyo\"}"
}

Troubleshooting
- If you see no function call data, ensure your tools parameter is correctly formatted and the model supports function calling.
- If streaming hangs, check your network connection and API key validity.
- For partial JSON arguments, accumulate chunks before parsing to avoid JSON decode errors.
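That last point is easy to demonstrate offline: parsing a partial argument string raises json.JSONDecodeError, while the fully accumulated string parses cleanly. The fragments below are illustrative.

```python
import json

# Two fragments of one JSON object, as a stream might deliver them.
chunks = ['{"location": "New', ' York"}']

buffer = ""
parsed = None
for chunk in chunks:
    buffer += chunk
    try:
        parsed = json.loads(buffer)   # succeeds only once the JSON is complete
    except json.JSONDecodeError:
        parsed = None                 # still partial; keep accumulating
print(parsed)  # {'location': 'New York'}
```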
Key Takeaways

- Use the tools parameter to define functions for calling in chat completions.
- Enable stream=True to receive partial responses and function call data in real time.
- Accumulate streamed tool_calls chunks to reconstruct full arguments before parsing.
- Async streaming with AsyncOpenAI and async for allows non-blocking function call handling.
- Validate your function schema and API key if function calls do not appear as expected.
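One caveat the examples above gloss over: a single response can contain several tool calls, and each streamed delta carries an index field saying which call it extends. A minimal sketch of per-index accumulation, using SimpleNamespace objects to simulate the SDK's delta shapes:

```python
import json
from types import SimpleNamespace as NS

# Simulated tool-call deltas such as a stream might yield (shapes only mimic
# the SDK's ChoiceDeltaToolCall: an index plus partial name/arguments).
deltas = [
    NS(index=0, function=NS(name="get_current_weather", arguments="")),
    NS(index=0, function=NS(name=None, arguments='{"location": "New York"}')),
    NS(index=1, function=NS(name="get_time", arguments="")),
    NS(index=1, function=NS(name=None, arguments='{"timezone": "Asia/Tokyo"}')),
]

calls = {}  # index -> accumulated {"name", "arguments"}
for tc in deltas:
    slot = calls.setdefault(tc.index, {"name": "", "arguments": ""})
    if tc.function.name:
        slot["name"] += tc.function.name
    if tc.function.arguments:
        slot["arguments"] += tc.function.arguments

for i, call in sorted(calls.items()):
    print(i, call["name"], json.loads(call["arguments"]))
```

Accumulating into a single dict, as the walkthrough does, silently merges separate calls; keying by index keeps them distinct.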