How to generate embeddings with Python
Direct answer
Use the OpenAI SDK's embeddings.create method with a model like text-embedding-3-small to generate vector embeddings from text in Python.
Setup
Install
pip install openai
Env vars
OPENAI_API_KEY
Imports
import os
from openai import OpenAI
Examples
in: Hello world
out: [0.0123, -0.0456, 0.0789, ...] # list of floats representing the embedding vector
in: How to generate embeddings with Python?
out: [0.0345, -0.0678, 0.0234, ...] # embedding vector for the question text
in: "" (empty string)
out: API error # empty or non-string input is rejected by the endpoint rather than embedded
Integration steps
- Install the OpenAI Python SDK and set the OPENAI_API_KEY environment variable.
- Import the OpenAI client and initialize it with the API key from os.environ.
- Call the embeddings.create method with the desired model and input text.
- Extract the embedding vector from the response's data[0].embedding field.
- Use or store the embedding vector for downstream tasks like search or classification.
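Once extracted, embedding vectors are most often compared with cosine similarity for search and ranking. A minimal pure-Python sketch (no external dependencies; the toy vectors stand in for real embedding outputs):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embedding outputs
query = [0.1, 0.2, 0.3]
doc = [0.1, 0.2, 0.3]
print(round(cosine_similarity(query, doc), 4))  # identical vectors → 1.0
```

In practice you would compute similarity between a query embedding and each stored document embedding, then sort documents by score.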
Full code
import os
from openai import OpenAI
# Initialize OpenAI client with API key from environment
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Text to embed
text_to_embed = "How to generate embeddings with Python?"
# Create embeddings
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=text_to_embed
)
# Extract embedding vector
embedding_vector = response.data[0].embedding
print(f"Embedding vector for input: {embedding_vector[:5]}... (truncated)")
API trace
Request
{"model": "text-embedding-3-small", "input": "How to generate embeddings with Python?"}
Response
{"data": [{"embedding": [0.02345, -0.01234, 0.04567, ...]}], "usage": {"prompt_tokens": 8, "total_tokens": 8}}
Extract
response.data[0].embedding
Variants
Batch Embeddings Generation ›
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
texts = ["Hello world", "Generate embeddings with Python", "OpenAI embeddings"]
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=texts
)
embeddings = [item.embedding for item in response.data]
for i, emb in enumerate(embeddings):
    print(f"Embedding {i} (truncated): {emb[:5]}...")
Async Embeddings Generation ›
import os
import asyncio
from openai import AsyncOpenAI  # the sync client has no embeddings.acreate; use the async client

async def generate_embedding(text: str):
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    response = await client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

async def main():
    embedding = await generate_embedding("Async embeddings with Python")
    print(f"Async embedding (truncated): {embedding[:5]}...")

asyncio.run(main())
Alternative Model for Larger Embeddings ›
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
response = client.embeddings.create(
    model="text-embedding-3-large",
    input="Generate higher dimensional embeddings"
)
embedding = response.data[0].embedding
print(f"Large embedding vector (truncated): {embedding[:5]}...")
Performance
Latency: ~300-600ms per request for a single input on text-embedding-3-small
Cost: ~$0.02 per 1M tokens (~$0.00002 per 1K tokens) for text-embedding-3-small
Rate limits: vary by account tier; check the rate limits page in your OpenAI dashboard for the embeddings endpoint
- Minimize input length to reduce token usage and cost.
- Batch inputs to amortize overhead per request.
- Use smaller embedding models for less critical tasks.
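The batching tip above can be sketched with a hypothetical `batch_texts` helper. The API caps how many inputs one request may carry (check the current limit in the docs); the batch size of 100 here is purely illustrative:

```python
def batch_texts(texts: list[str], batch_size: int = 100) -> list[list[str]]:
    """Split a list of texts into batches, so each batch becomes one embeddings.create call."""
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

docs = [f"doc {i}" for i in range(250)]
batches = batch_texts(docs, batch_size=100)
print([len(b) for b in batches])  # [100, 100, 50]
```

Each batch would then be passed as the `input` list to a single `embeddings.create` call, turning 250 requests into 3.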
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| Single input embedding | ~300-600ms | ~$0.00002 per 1K tokens | Quick embedding of one text |
| Batch embedding | ~500-1000ms | same token rate, amortized per request | Efficient multi-text embedding |
| Async embedding | Varies, concurrent | Same as sync | Embedding in async apps |
| Large model embedding | ~700-1200ms | ~$0.00013 per 1K tokens | High-quality embeddings |
Quick tip
Batch multiple texts into one embeddings.create call where possible: token cost is unchanged, but per-request overhead and total latency drop sharply.
Common mistake
Passing an empty string or non-string input to embeddings.create often results in errors or meaningless embeddings.
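To guard against that mistake, inputs can be validated before calling the API. A minimal sketch with a hypothetical `clean_inputs` helper that drops non-strings and whitespace-only entries:

```python
def clean_inputs(texts) -> list[str]:
    """Keep only non-empty strings; strip surrounding whitespace."""
    return [t.strip() for t in texts if isinstance(t, str) and t.strip()]

raw = ["Hello world", "", "   ", None, 42, "OpenAI embeddings"]
print(clean_inputs(raw))  # ['Hello world', 'OpenAI embeddings']
```

Running this filter before `embeddings.create` avoids request-rejection errors from empty or non-string items in a batch.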