How to create embeddings with the OpenAI API
Quick answer
Use the OpenAI SDK's client.embeddings.create method with a model such as text-embedding-3-large and your input text to generate embeddings. Read your API key from os.environ and take the vector from the response's data[0].embedding.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0"
Setup
Install the official OpenAI Python SDK and set your API key as an environment variable.
pip install "openai>=1.0"
Step by step
This example shows how to create embeddings for a sample text using the text-embedding-3-large model.
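import os

from openai import OpenAI

# Read the API key from the environment rather than hard-coding it.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Request an embedding for a single input string.
response = client.embeddings.create(
    model="text-embedding-3-large",
    input="OpenAI provides powerful embedding models.",
)

# The vector for the first (and only) input is in data[0].embedding.
embedding_vector = response.data[0].embedding
print("Embedding vector length:", len(embedding_vector))
print("First 5 values:", embedding_vector[:5])
Output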
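Embedding vector length: 3072
First 5 values: [0.012345, -0.023456, 0.034567, -0.045678, 0.056789]
The five values shown are placeholders; actual values depend on the input. text-embedding-3-large returns 3072-dimensional vectors by default.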
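The example above reads the key with os.environ["OPENAI_API_KEY"], which raises a bare KeyError when the variable is unset. A minimal sketch of a friendlier check follows; the helper name read_api_key and its error message are hypothetical, not part of the OpenAI SDK.

```python
import os


def read_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key from `env`, or raise a descriptive error if it is missing or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the script")
    return key


# Demonstrated with plain dicts so no real key is needed.
print(read_api_key({"OPENAI_API_KEY": "sk-example"}))
try:
    read_api_key({})
except RuntimeError as exc:
    print("Error:", exc)
```

The returned value could then be passed as OpenAI(api_key=read_api_key()).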
Common variations
- Use a different embedding model, such as text-embedding-3-small, for smaller vectors (1536 dimensions by default).
- Pass a list of strings to input to embed multiple texts in one request.
- Use the AsyncOpenAI client with asyncio for concurrent calls.
import asyncio
import os

from openai import AsyncOpenAI


async def create_embeddings():
    # AsyncOpenAI exposes the same methods as OpenAI, but they are awaitable.
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = await client.embeddings.create(
        model="text-embedding-3-large",
        input=["First text", "Second text"],
    )
    for i, item in enumerate(response.data):
        print(f"Embedding {i} length:", len(item.embedding))


asyncio.run(create_embeddings())
Output
Embedding 0 length: 3072
Embedding 1 length: 3072
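When a list is passed to input, each item in response.data carries an index field identifying the input it belongs to. A minimal sketch of pairing texts with their vectors follows; the helper name pair_texts_with_vectors is hypothetical, and fake_data is a stand-in for response.data so the sketch runs without an API call.

```python
from types import SimpleNamespace


def pair_texts_with_vectors(texts, data):
    """Return (text, vector) tuples, matching each text to its item by `index`."""
    if len(texts) != len(data):
        raise ValueError("expected one embedding per input text")
    # Sort by index so the pairing does not rely on the order of `data`.
    ordered = sorted(data, key=lambda item: item.index)
    return [(text, item.embedding) for text, item in zip(texts, ordered)]


# Stand-in objects with the same `index` and `embedding` attributes.
fake_data = [
    SimpleNamespace(index=1, embedding=[0.3, 0.4]),
    SimpleNamespace(index=0, embedding=[0.1, 0.2]),
]
pairs = pair_texts_with_vectors(["First text", "Second text"], fake_data)
print(pairs[0])
```

With a real response, the call would be pair_texts_with_vectors(texts, response.data).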
Troubleshooting
- If you get an authentication error, verify that your OPENAI_API_KEY environment variable is set correctly.
- For rate limit errors, retry with exponential backoff.
- If embeddings are empty or malformed, check that your input is a non-empty string or list of strings.
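The last two points can be sketched as small helpers. This is one possible approach, not the SDK's own mechanism: validate_embedding_input, retry_with_backoff, and FakeRateLimitError are hypothetical names, and the fake exception stands in for the SDK's rate limit error so the sketch runs offline.

```python
import random
import time


def validate_embedding_input(value):
    """Return True for a non-empty string or a non-empty list of non-empty strings."""
    if isinstance(value, str):
        return bool(value.strip())
    if isinstance(value, list) and value:
        return all(isinstance(v, str) and bool(v.strip()) for v in value)
    return False


def retry_with_backoff(call, retry_on=(Exception,), max_attempts=5,
                       base_delay=1.0, sleep=time.sleep):
    """Run call(); on a retryable error wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)


class FakeRateLimitError(Exception):
    """Stand-in for a rate limit exception (assumption for this demo)."""


attempts = {"count": 0}


def flaky_call():
    # Fails twice, then succeeds, to exercise the retry loop.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise FakeRateLimitError("rate limited")
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = retry_with_backoff(flaky_call, retry_on=(FakeRateLimitError,), sleep=delays.append)
print(result, len(delays))
print(validate_embedding_input(["First text", ""]))
```

In real use, call would wrap the client.embeddings.create request and retry_on would name the SDK's rate limit exception (openai.RateLimitError in the v1 SDK).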
Key takeaways
- Use client.embeddings.create with the text-embedding-3-large model to generate embeddings.
- Always provide your API key securely via os.environ.
- Batch multiple inputs by passing a list to input for efficient embedding.
- Async embedding calls improve throughput in concurrent applications.
- Handle common errors by verifying API keys and input formats.