How to use E5 embeddings
Quick answer
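Generate embeddings by calling an embedding model such as text-embedding-3-large through the OpenAI API, which converts text into dense vectors for semantic search or similarity tasks. Send your text to the client.embeddings.create method and read the vector from response.data[0].embedding. Note that text-embedding-3-large is an OpenAI model, not a member of the open-source E5 family; the general workflow shown here (text in, vector out, compare vectors) applies to either.
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install "openai>=1.0"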
Setup
Install the openai Python package and set your OpenAI API key as an environment variable.
- Install package: pip install openai
- Set environment variable: export OPENAI_API_KEY='your_api_key' (Linux/macOS) or setx OPENAI_API_KEY "your_api_key" (Windows)
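Before making any API call, it can help to confirm that the environment variable from the setup step is actually visible to Python. The following is a minimal sketch; the helper name api_key_is_configured is hypothetical and not part of the OpenAI SDK.

```python
import os


def api_key_is_configured(env=None):
    """Return True if OPENAI_API_KEY is present and non-empty.

    `env` is any mapping of variable names to values; it defaults to
    os.environ. (Hypothetical helper, not part of the OpenAI SDK.)
    """
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


# Report the status for the current process without printing the key itself
print("OPENAI_API_KEY configured:", api_key_is_configured())
```

Note that on Windows, setx only affects terminals opened afterwards, so a check like this may report False in the window where setx was run.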
Step by step
This example generates an embedding for a sample text using the OpenAI Python SDK with the text-embedding-3-large model.
import os
from openai import OpenAI

# Read the API key from the environment rather than hard-coding it
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

text = "OpenAI develops powerful AI models for natural language processing."

# Request one embedding for the input string
response = client.embeddings.create(
    model="text-embedding-3-large",
    input=text,
)

# The response holds one entry per input; take the first vector
embedding_vector = response.data[0].embedding
print(f"Embedding vector length: {len(embedding_vector)}")
print(f"First 5 values: {embedding_vector[:5]}")
Output (values are illustrative):
Embedding vector length: 3072
First 5 values: [0.0123, -0.0345, 0.0567, -0.0789, 0.0234]
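Once you have embedding vectors, semantic search usually comes down to comparing them. Below is a minimal sketch of cosine similarity and ranking in plain Python. The function names and the toy 3-dimensional vectors are illustrative assumptions; real vectors would come from response.data[i].embedding.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs, most similar document first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (assumption for illustration)
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0]]
for index, score in rank_by_similarity(query, docs):
    print(f"doc {index}: {score:.3f}")
```

For large collections, a vectorized library or a vector database would normally replace these pure-Python loops; the arithmetic stays the same.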
Common variations
You can generate embeddings for multiple texts by passing a list of strings to input; the response contains one entry per string, in the same order. You can also switch to a smaller model such as text-embedding-3-small for faster or cheaper embeddings. Async calls are possible with asyncio and the SDK's AsyncOpenAI client.
import asyncio
import os
from openai import AsyncOpenAI

async def generate_embeddings():
    # AsyncOpenAI exposes the same methods as OpenAI, but they are awaitable
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    texts = ["Hello world", "OpenAI embeddings"]
    response = await client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
    )
    # response.data holds one item per input text
    for i, item in enumerate(response.data):
        print(f"Text {i} embedding length: {len(item.embedding)}")

asyncio.run(generate_embeddings())
Output:
Text 0 embedding length: 1536
Text 1 embedding length: 1536
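When the list of texts is large, it is common to split it into smaller batches and send one request per batch. The sketch below shows one way to do that while keeping results in input order. The names chunk_texts and embed_in_batches, the embed_fn callable, and the batch size are all assumptions for illustration; in practice embed_fn would wrap a call to client.embeddings.create.

```python
def chunk_texts(texts, batch_size):
    """Yield consecutive slices of `texts`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def embed_in_batches(texts, embed_fn, batch_size=100):
    """Embed `texts` batch by batch and return vectors in input order.

    `embed_fn` is a hypothetical callable that takes a list of strings and
    returns one vector per string, in the same order.
    """
    vectors = []
    for batch in chunk_texts(texts, batch_size):
        vectors.extend(embed_fn(batch))
    return vectors


# Stand-in embedder for demonstration: one-dimensional "vectors" of text length
def fake_embed(batch):
    return [[float(len(text))] for text in batch]


sample = ["Hello world", "OpenAI embeddings", "batching"]
print(embed_in_batches(sample, fake_embed, batch_size=2))
```

Passing the embedding call in as a function keeps the batching logic testable without network access.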
Troubleshooting
- If you get an authentication error, verify your OPENAI_API_KEY environment variable is set correctly.
- If the embedding vector is empty or missing, check that you are using a valid model name like text-embedding-3-large.
- For rate limit errors, implement exponential backoff retries or reduce request frequency.
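The exponential backoff mentioned above can be sketched as a small generic retry helper. This is an illustrative sketch, not the SDK's own retry mechanism: the name call_with_backoff and its parameters are assumptions. With openai>=1.0 you would typically pass retry_on=(openai.RateLimitError,) and wrap the embeddings call in a lambda.

```python
import random
import time


def call_with_backoff(fn, retry_on=(Exception,), max_attempts=5,
                      base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on the given exceptions with exponential backoff.

    The wait before retry k (0-based) is base_delay * 2**k plus random
    jitter in [0, base_delay]. The last failure is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)


# Demonstration with a stand-in function that fails twice, then succeeds
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


print(call_with_backoff(flaky_request, retry_on=(RuntimeError,), base_delay=0.01))
```

The jitter spreads out retries from concurrent clients so they do not all hit the API again at the same moment.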
Key Takeaways
- Use the OpenAI SDK's embeddings.create method with text-embedding-3-large to generate embeddings.
- Pass single or multiple texts as input to get vector embeddings for semantic tasks.
- Set your API key securely via environment variables to avoid authentication issues.
- Async embedding generation can improve throughput for batch processing when several requests run concurrently.
- Check model names and handle rate limits to ensure smooth embedding generation.