How to use function calling with streaming
Quick answer
Use the OpenAI Python SDK's tools parameter to define functions and enable stream=True in chat.completions.create to receive streaming responses. Parse the streamed chunks to handle partial function calls and arguments in real time.

Prerequisites

- Python 3.8+
- OpenAI API key (free tier works)
- pip install openai>=1.0
Setup
Install the latest OpenAI Python SDK and set your API key as an environment variable.
- Run pip install openai --upgrade to install or upgrade.
- Set your API key in the environment: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows).
pip install openai --upgrade output

Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (50 kB)
Installing collected packages: openai
Successfully installed openai-1.x.x
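Before making live calls, it can help to confirm the key is actually visible to Python. The helper below is a hypothetical local sanity check, not part of the SDK; it accepts any mapping so it can be exercised without a real key.

```python
import os

def masked_api_key(env=os.environ) -> str:
    """Return the configured key with the middle hidden, or raise if unset.

    Accepts any mapping so it can be checked without touching os.environ.
    """
    key = env.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it first.")
    return key[:3] + "..." + key[-4:]

# Sanity-check against a sample mapping (no real key needed):
print(masked_api_key({"OPENAI_API_KEY": "sk-example123"}))  # sk-...e123
```

Call masked_api_key() with no argument to check your actual environment.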
Step by step
This example demonstrates defining a function tool, calling it via the tools parameter, and streaming the response while parsing partial function call data.
import os
import json
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Define a function tool
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City and state, e.g. San Francisco, CA"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        }
    }
}]
messages = [{"role": "user", "content": "What's the weather in New York?"}]
# Create streaming chat completion with function calling
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    tools=tools,
    stream=True
)
print("Streaming response:")
partial_tool_call = None
for chunk in stream:
    delta = chunk.choices[0].delta
    content = delta.content or ""
    print(content, end="", flush=True)
    # With the tools parameter, tool call data arrives in delta.tool_calls,
    # not the legacy delta.function_call field
    if delta.tool_calls:
        tool_call = delta.tool_calls[0]
        if partial_tool_call is None:
            partial_tool_call = {"name": "", "arguments": ""}
        if tool_call.function.name:
            partial_tool_call["name"] += tool_call.function.name
        if tool_call.function.arguments:
            partial_tool_call["arguments"] += tool_call.function.arguments
print("\n\nFull function call data:")
if partial_tool_call:
    print(json.dumps(partial_tool_call, indent=2))
else:
    print("No function call detected.")

output
Streaming response:
The current weather in New York is sunny with 75°F.
Full function call data:
{
"name": "get_current_weather",
"arguments": "{\"location\": \"New York\", \"unit\": \"fahrenheit\"}"
}

Common variations
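A common next step once the stream ends is to parse the accumulated arguments and dispatch them to a local implementation. The sketch below simulates the accumulated argument fragments rather than calling the API, and get_current_weather here is a hypothetical local stub.

```python
import json

# Hypothetical local implementation backing the get_current_weather tool.
def get_current_weather(location: str, unit: str = "fahrenheit") -> str:
    return f"Weather for {location} in {unit} (stubbed)"

TOOL_REGISTRY = {"get_current_weather": get_current_weather}

# Argument fragments as a stream might deliver them, one delta at a time.
fragments = ['{"loc', 'ation": "New', ' York", "unit": "fahrenheit"}']

call = {"name": "get_current_weather", "arguments": ""}
for piece in fragments:
    call["arguments"] += piece  # accumulate before parsing

args = json.loads(call["arguments"])          # only parse the complete string
result = TOOL_REGISTRY[call["name"]](**args)  # dispatch to the local function
print(result)  # Weather for New York in fahrenheit (stubbed)
```

In a real loop you would append the tool result to messages as a {"role": "tool", ...} entry and make a follow-up request so the model can answer in natural language.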
You can stream asynchronously with AsyncOpenAI and async for inside an async function. Models such as gpt-4o support function calling in the same way. Adjust the tools parameter to define multiple functions or more complex schemas.
import os
import json
import asyncio
from openai import AsyncOpenAI

# Use the async client so the stream can be consumed with async for
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def async_stream_function_call():
    tools = [{
        "type": "function",
        "function": {
            "name": "get_time",
            "description": "Get current time for a timezone",
            "parameters": {
                "type": "object",
                "properties": {
                    "timezone": {"type": "string"}
                },
                "required": ["timezone"]
            }
        }
    }]
    messages = [{"role": "user", "content": "What time is it in Tokyo?"}]
    stream = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,
        stream=True
    )
    partial_call = None
    print("Async streaming response:")
    async for chunk in stream:
        delta = chunk.choices[0].delta
        content = delta.content or ""
        print(content, end="", flush=True)
        # Tool call deltas arrive in delta.tool_calls when using tools
        if delta.tool_calls:
            tool_call = delta.tool_calls[0]
            if partial_call is None:
                partial_call = {"name": "", "arguments": ""}
            if tool_call.function.name:
                partial_call["name"] += tool_call.function.name
            if tool_call.function.arguments:
                partial_call["arguments"] += tool_call.function.arguments
    print("\n\nFull function call data:")
    if partial_call:
        print(json.dumps(partial_call, indent=2))
    else:
        print("No function call detected.")

asyncio.run(async_stream_function_call())

output
Async streaming response:
The current time in Tokyo is 3:45 PM.
Full function call data:
{
"name": "get_time",
"arguments": "{\"timezone\": \"Asia/Tokyo\"}"
}

Troubleshooting
- If you see no function call data, ensure your tools parameter is correctly formatted and the model supports function calling.
- If streaming hangs, check your network connection and API key validity.
- For partial JSON arguments, accumulate chunks before parsing to avoid JSON decode errors.
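That last point is easy to demonstrate offline: parsing a partial argument string raises json.JSONDecodeError, while the fully accumulated string parses cleanly. The fragments below are illustrative.

```python
import json

# Two fragments of one JSON object, as a stream might deliver them.
chunks = ['{"location": "New', ' York"}']

buffer = ""
parsed = None
for chunk in chunks:
    buffer += chunk
    try:
        parsed = json.loads(buffer)   # succeeds only once the JSON is complete
    except json.JSONDecodeError:
        parsed = None                 # still partial; keep accumulating
print(parsed)  # {'location': 'New York'}
```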
Key Takeaways

- Use the tools parameter to define functions for calling in chat completions.
- Enable stream=True to receive partial responses and function call data in real time.
- Accumulate streamed tool_calls chunks to reconstruct full arguments before parsing.
- Async streaming with AsyncOpenAI and async for allows non-blocking function call handling.
- Validate your function schema and API key if function calls do not appear as expected.
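One caveat the examples above gloss over: a single response can contain several tool calls, and each streamed delta carries an index field saying which call it extends. A minimal sketch of per-index accumulation, using SimpleNamespace objects to simulate the SDK's delta shapes:

```python
import json
from types import SimpleNamespace as NS

# Simulated tool-call deltas such as a stream might yield (shapes only mimic
# the SDK's ChoiceDeltaToolCall: an index plus partial name/arguments).
deltas = [
    NS(index=0, function=NS(name="get_current_weather", arguments="")),
    NS(index=0, function=NS(name=None, arguments='{"location": "New York"}')),
    NS(index=1, function=NS(name="get_time", arguments="")),
    NS(index=1, function=NS(name=None, arguments='{"timezone": "Asia/Tokyo"}')),
]

calls = {}  # index -> accumulated {"name", "arguments"}
for tc in deltas:
    slot = calls.setdefault(tc.index, {"name": "", "arguments": ""})
    if tc.function.name:
        slot["name"] += tc.function.name
    if tc.function.arguments:
        slot["arguments"] += tc.function.arguments

for i, call in sorted(calls.items()):
    print(i, call["name"], json.loads(call["arguments"]))
```

Accumulating into a single dict, as the walkthrough does, silently merges separate calls; keying by index keeps them distinct.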