How to batch embeddings with OpenAI
Quick answer

Use the `client.embeddings.create` method with a list of input texts to batch embeddings in one API call. This reduces latency and cost by processing multiple texts in a single request with models like `text-embedding-3-small`.

Prerequisites

- Python 3.8+
- OpenAI API key (free tier works)
- `pip install "openai>=1.0"`
Setup
Install the official OpenAI Python SDK and set your API key as an environment variable for secure authentication.
```bash
pip install "openai>=1.0"
```

Step by step

Batch multiple texts by passing a list of strings to the `input` parameter of `client.embeddings.create`. This example uses the `text-embedding-3-small` model.
```python
import os

from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

texts = [
    "OpenAI provides powerful AI models.",
    "Batching embeddings improves efficiency.",
    "Use Python for easy API integration.",
]

response = client.embeddings.create(
    model="text-embedding-3-small",
    input=texts,
)

for i, embedding in enumerate(response.data):
    print(f"Text {i+1} embedding vector length: {len(embedding.embedding)}")
```

Output:

```
Text 1 embedding vector length: 1536
Text 2 embedding vector length: 1536
Text 3 embedding vector length: 1536
```
Common variations

- Use async calls with `asyncio` and the `AsyncOpenAI` client for concurrency (the >=1.0 SDK has no `acreate` method; you `await` the same `embeddings.create` on the async client).
- Switch the model to `text-embedding-3-large` for higher-quality embeddings.
- Batch large datasets by splitting the input into chunks to stay under request size limits.
```python
import asyncio
import os

from openai import AsyncOpenAI

# The async client exposes the same API surface; calls are awaited
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def batch_embeddings_async(texts):
    response = await client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
    )
    return response

texts = ["Example 1", "Example 2", "Example 3"]

async def main():
    response = await batch_embeddings_async(texts)
    for i, embedding in enumerate(response.data):
        print(f"Async Text {i+1} embedding length: {len(embedding.embedding)}")

asyncio.run(main())
```

Output:

```
Async Text 1 embedding length: 1536
Async Text 2 embedding length: 1536
Async Text 3 embedding length: 1536
```
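The chunking variation above can be sketched as follows. The `chunked` helper is a hypothetical utility, not part of the OpenAI SDK, and the batch size of 100 is an arbitrary example rather than an API limit:

```python
def chunked(items, size):
    """Yield successive fixed-size chunks from a list of texts."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Example: 250 texts split into batches of at most 100
texts = [f"document {i}" for i in range(250)]
print([len(batch) for batch in chunked(texts, 100)])  # → [100, 100, 50]
```

Each chunk would then be passed as `input` to a separate `client.embeddings.create` call, and the resulting vectors concatenated in order.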
Troubleshooting

- If you get a `RateLimitError`, reduce the batch size or add retry logic with exponential backoff.
- For a `BadRequestError` (named `InvalidRequestError` in pre-1.0 SDKs), make sure your input list is not empty and does not exceed the model's token limit.
- If authentication fails, check the `OPENAI_API_KEY` environment variable.
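A minimal retry-with-backoff sketch for the first point. The `with_backoff` helper is generic and hypothetical (not an SDK feature); in practice you would pass `transient=(openai.RateLimitError,)` and wrap the embeddings call in a lambda:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0, transient=(Exception,)):
    """Retry `call` on transient errors, doubling the delay each attempt."""
    for attempt in range(max_retries):
        try:
            return call()
        except transient:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            # Delay grows as base_delay * 2^attempt, plus jitter to avoid
            # many clients retrying in lockstep
            time.sleep(base_delay * (2 ** attempt + random.random()))
```

Usage would look like `with_backoff(lambda: client.embeddings.create(model="text-embedding-3-small", input=texts), transient=(openai.RateLimitError,))`.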
Key Takeaways

- Batch embeddings by passing a list of texts to `client.embeddings.create` to optimize API usage.
- Use the async client for concurrency when embedding large datasets.
- Split large input lists to avoid request size limits and rate limit errors.