How to batch embeddings with OpenAI
Quick answer

Use the `client.embeddings.create` method with a list of input texts to batch embeddings in one API call. This reduces latency and cost by processing multiple texts in a single request with models like `text-embedding-3-small`.

Prerequisites

- Python 3.8+
- OpenAI API key (free tier works)
- `pip install "openai>=1.0"`
Setup
Install the official OpenAI Python SDK and set your API key as an environment variable for secure authentication.
```bash
pip install "openai>=1.0"
```

Step by step

Batch multiple texts by passing a list of strings to the `input` parameter of `client.embeddings.create`. This example uses the `text-embedding-3-small` model.
```python
import os

from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

texts = [
    "OpenAI provides powerful AI models.",
    "Batching embeddings improves efficiency.",
    "Use Python for easy API integration.",
]

response = client.embeddings.create(
    model="text-embedding-3-small",
    input=texts,
)

for i, embedding in enumerate(response.data):
    print(f"Text {i+1} embedding vector length: {len(embedding.embedding)}")
```

Output:

```
Text 1 embedding vector length: 1536
Text 2 embedding vector length: 1536
Text 3 embedding vector length: 1536
```
Common variations

- Use async calls with `asyncio` and the `AsyncOpenAI` client for concurrency (the >=1.0 SDK has no `acreate` method; you `await` the same `embeddings.create` on the async client).
- Switch the model to `text-embedding-3-large` for higher-quality embeddings.
- Batch large datasets by splitting the input into chunks to stay under request size limits.
```python
import asyncio
import os

from openai import AsyncOpenAI

# The async client exposes the same API surface; calls are awaited
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def batch_embeddings_async(texts):
    response = await client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
    )
    return response

texts = ["Example 1", "Example 2", "Example 3"]

async def main():
    response = await batch_embeddings_async(texts)
    for i, embedding in enumerate(response.data):
        print(f"Async Text {i+1} embedding length: {len(embedding.embedding)}")

asyncio.run(main())
```

Output:

```
Async Text 1 embedding length: 1536
Async Text 2 embedding length: 1536
Async Text 3 embedding length: 1536
```
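The chunking variation above can be sketched as follows. The `chunked` helper is a hypothetical utility, not part of the OpenAI SDK, and the batch size of 100 is an arbitrary example rather than an API limit:

```python
def chunked(items, size):
    """Yield successive fixed-size chunks from a list of texts."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Example: 250 texts split into batches of at most 100
texts = [f"document {i}" for i in range(250)]
print([len(batch) for batch in chunked(texts, 100)])  # → [100, 100, 50]
```

Each chunk would then be passed as `input` to a separate `client.embeddings.create` call, and the resulting vectors concatenated in order.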
Troubleshooting

- If you get a `RateLimitError`, reduce the batch size or add retry logic with exponential backoff.
- For a `BadRequestError` (named `InvalidRequestError` in pre-1.0 SDKs), make sure your input list is not empty and does not exceed the model's token limit.
- If authentication fails, check the `OPENAI_API_KEY` environment variable.
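A minimal retry-with-backoff sketch for the first point. The `with_backoff` helper is generic and hypothetical (not an SDK feature); in practice you would pass `transient=(openai.RateLimitError,)` and wrap the embeddings call in a lambda:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0, transient=(Exception,)):
    """Retry `call` on transient errors, doubling the delay each attempt."""
    for attempt in range(max_retries):
        try:
            return call()
        except transient:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            # Delay grows as base_delay * 2^attempt, plus jitter to avoid
            # many clients retrying in lockstep
            time.sleep(base_delay * (2 ** attempt + random.random()))
```

Usage would look like `with_backoff(lambda: client.embeddings.create(model="text-embedding-3-small", input=texts), transient=(openai.RateLimitError,))`.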
Key Takeaways

- Batch embeddings by passing a list of texts to `client.embeddings.create` to optimize API usage.
- Use the async client for concurrency when embedding large datasets.
- Split large input lists to avoid request size limits and rate limit errors.