How to use text-embedding-3-large in python
Direct answer
Use the OpenAI Python SDK's client.embeddings.create method with model="text-embedding-3-large" and your input text to generate embeddings.

Setup

Install
pip install openai

Env vars
OPENAI_API_KEY

Imports
import os
from openai import OpenAI

Examples
in  The quick brown fox jumps over the lazy dog.
out [0.0123, -0.0456, 0.0789, ...] # vector of floats representing the embedding
in  OpenAI provides powerful AI models for developers.
out [0.0345, -0.0234, 0.0567, ...] # embedding vector output
in  (empty string)
out error # the API rejects empty input with a validation error rather than returning a vector
Integration steps
- Install the OpenAI Python SDK and set your OPENAI_API_KEY environment variable.
- Import OpenAI from the openai package and initialize the client with your API key.
- Call client.embeddings.create with model="text-embedding-3-large" and your input text as the input parameter.
- Receive the response containing the embedding vector in response.data[0].embedding.
- Use or store the embedding vector as needed for your application.
Full code
import os
from openai import OpenAI
# Initialize the OpenAI client with your API key from environment
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Text to embed
text_to_embed = "The quick brown fox jumps over the lazy dog."
# Create embedding
response = client.embeddings.create(
    model="text-embedding-3-large",
    input=text_to_embed
)
# Extract the embedding vector
embedding_vector = response.data[0].embedding
print("Embedding vector length:", len(embedding_vector))
print("First 5 values:", embedding_vector[:5]) output
Embedding vector length: 1536 First 5 values: [0.012345, -0.045678, 0.078912, 0.003456, -0.009876]
API trace
Request
{"model": "text-embedding-3-large", "input": "The quick brown fox jumps over the lazy dog."} Response
{"data": [{"embedding": [0.012345, -0.045678, 0.078912, ...]}], "usage": {"prompt_tokens": 10, "total_tokens": 10}} Extract
response.data[0].embeddingVariants
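By default, text-embedding-3-large returns 3072-dimensional vectors. The embeddings endpoint also accepts an optional dimensions parameter that shortens the returned vector, which helps when a vector store caps dimensionality. A minimal sketch; the choice of 1024 below is only illustrative:

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Request a truncated 1024-dimensional embedding instead of the default 3072
response = client.embeddings.create(
    model="text-embedding-3-large",
    input="The quick brown fox jumps over the lazy dog.",
    dimensions=1024  # optional; supported by the text-embedding-3 models
)

print(len(response.data[0].embedding))  # 1024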
Variants

Batch Embedding Request
Use when you want to embed multiple texts in a single API call to reduce latency and cost.
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
texts = [
    "The quick brown fox jumps over the lazy dog.",
    "OpenAI provides powerful AI models for developers."
]
response = client.embeddings.create(
    model="text-embedding-3-large",
    input=texts
)
embeddings = [item.embedding for item in response.data]
for i, emb in enumerate(embeddings):
print(f"Embedding {i} length: {len(emb)}") Async Embedding Call ›
Async Embedding Call

Use in asynchronous Python applications to avoid blocking while waiting for the embedding response.
import os
import asyncio
from openai import AsyncOpenAI

# Use the async client so the embeddings call can be awaited without blocking the event loop
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def get_embedding(text: str):
    response = await client.embeddings.create(
        model="text-embedding-3-large",
        input=text
    )
    return response.data[0].embedding

async def main():
    embedding = await get_embedding("Async embedding example text.")
    print("Embedding length:", len(embedding))

asyncio.run(main())
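To embed several texts concurrently with this pattern, asyncio.gather can fan out calls to the get_embedding coroutine defined above; for long lists a single batched request is usually cheaper. A small sketch that reuses the client and function from the example:

# Reuses client and get_embedding from the async example above
async def embed_many(texts: list[str]) -> list[list[float]]:
    return await asyncio.gather(*(get_embedding(t) for t in texts))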
Alternative Model: text-embedding-3-small

Use when you need faster embeddings with a smaller vector size and lower cost, trading off some accuracy.
import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="A shorter, faster embedding model example."
)
embedding = response.data[0].embedding
print("Embedding length:", len(embedding)) Performance
Latency~300-600ms per embedding request for single input on text-embedding-3-large
Cost~$0.0004 per 1,000 tokens embedded with text-embedding-3-large
Rate limitsDefault tier: 60 RPM (requests per minute) and 60,000 TPM (tokens per minute)
- Preprocess and truncate input text to relevant content to reduce token usage.
- Batch multiple inputs in one request to amortize overhead.
- Cache embeddings for repeated texts to avoid redundant calls (a minimal caching sketch follows this list).
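One way to implement the caching point above, as a rough sketch: get_embedding_cached and the in-memory dict are illustrative names, and a real application might persist the cache to disk or a vector store.

import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Illustrative in-memory cache keyed by the exact input text
_embedding_cache: dict[str, list[float]] = {}

def get_embedding_cached(text: str) -> list[float]:
    if text not in _embedding_cache:
        response = client.embeddings.create(
            model="text-embedding-3-large",
            input=text
        )
        _embedding_cache[text] = response.data[0].embedding
    return _embedding_cache[text]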
| Approach | Latency | Cost | Best for |
|---|---|---|---|
| Single input embedding | ~300-600 ms | ~$0.13 per 1M tokens | Embedding single texts with high accuracy |
| Batch embedding | ~500-900 ms per request | ~$0.13 per 1M tokens (shared across inputs) | Embedding multiple texts efficiently |
| text-embedding-3-small model | ~200-400 ms | ~$0.02 per 1M tokens | Faster, cheaper embeddings with smaller (1536-dimension) vectors |
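If requests start failing with 429 rate-limit errors, the v1 Python SDK can retry transient failures for you; a minimal sketch using the client's max_retries option:

import os
from openai import OpenAI

# Ask the SDK to retry failed requests (e.g., 429 rate-limit responses) a few times
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"], max_retries=5)

response = client.embeddings.create(
    model="text-embedding-3-large",
    input="Retry-aware embedding request."
)
print(len(response.data[0].embedding))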
Quick tip
Batch multiple texts into a single embeddings.create call where possible to improve throughput and reduce per-request overhead.
Common mistake
Passing an empty string or a non-string value as the input parameter without validation; the API rejects such requests with a validation error.
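A small guard against that mistake; validate_and_embed is just an illustrative helper name:

def validate_and_embed(client, text):
    # Reject inputs the embeddings endpoint would refuse anyway
    if not isinstance(text, str) or not text.strip():
        raise ValueError("input must be a non-empty string")
    response = client.embeddings.create(
        model="text-embedding-3-large",
        input=text
    )
    return response.data[0].embedding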