How to create embeddings for documents in Python
Direct answer
Use the OpenAIEmbeddings class from langchain_openai or the client.embeddings.create method from the OpenAI SDK to convert documents into vector embeddings in Python.
Setup
Install
pip install openai langchain-openai langchain-community faiss-cpu
Env vars
OPENAI_API_KEY (plus VOYAGE_API_KEY if you use the Voyage AI variant below)
Imports
import os
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import TextLoader
Examples
In: A short text document: 'Python is a popular programming language.'
Out: A vector embedding array representing the semantic content of the text.
In: Multiple documents loaded from a folder with TextLoader
Out: A FAISS index containing embeddings for all documents, ready for similarity search.
In: Empty or very short document text
Out: Empty strings are rejected by the OpenAI API with a validation error; very short text still returns a full-dimensional vector, just with little semantic signal.
Integration steps
- Install the required packages and set your OPENAI_API_KEY environment variable (plus VOYAGE_API_KEY if you use the Voyage AI variant)
- Load your documents using a loader like TextLoader or read raw text
- Initialize the OpenAIEmbeddings client from langchain_openai
- Generate embeddings by passing document texts to the embedding client
- Optionally, store embeddings in a vector store like FAISS for efficient retrieval
Full code
import os
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import TextLoader
# Load documents from a local text file
loader = TextLoader("./documents/sample.txt")
docs = loader.load()
# Initialize embeddings client (model matches the API trace below)
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
# Create embeddings for each document
texts = [doc.page_content for doc in docs]
vectors = embeddings.embed_documents(texts)
# Build a FAISS vector store from the precomputed vectors;
# FAISS.from_texts(texts, embeddings) would re-embed and double the API cost
index = FAISS.from_embeddings(list(zip(texts, vectors)), embeddings)
print(f"Created embeddings for {len(texts)} documents.")
print(f"Sample embedding vector (first document): {vectors[0][:5]}...")
Output
Created embeddings for 3 documents.
Sample embedding vector (first document): [0.0123, -0.0456, 0.0789, -0.0345, 0.0567]...
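Once the documents are in FAISS, retrieval ranks them by vector similarity. The metric itself is easy to exercise offline; this sketch uses toy 3-dimensional vectors (real OpenAI embeddings have 1,536 or 3,072 dimensions) and plain cosine similarity:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector magnitudes
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": the closer the score is to 1, the more similar
doc_vec = [0.12, -0.04, 0.56]
query_vec = [0.10, -0.05, 0.55]
print(round(cosine_similarity(doc_vec, query_vec), 4))
```

Vectors pointing in the same direction score 1.0 and orthogonal vectors score 0.0, which is why near-duplicate documents cluster at the top of a similarity search.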
API trace
Request
{"model": "text-embedding-3-large", "input": ["document text 1", "document text 2"]}
Response
{"data": [{"embedding": [0.01, -0.02, ...]}, {"embedding": [0.03, 0.04, ...]}], "usage": {"total_tokens": 50}}
Extract
response.data[0].embedding for the first document embedding vector
Variants
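The extract step can be exercised offline against the response shape shown in the trace. The literal values below are stand-ins copied from the trace, not real model output; the live SDK returns typed objects with the same field names:

```python
# A response dict mirroring the JSON shape in the trace above
response = {
    "data": [
        {"embedding": [0.01, -0.02]},
        {"embedding": [0.03, 0.04]},
    ],
    "usage": {"total_tokens": 50},
}

# One vector per input document, in input order
vectors = [item["embedding"] for item in response["data"]]
print(len(vectors), vectors[0])  # prints: 2 [0.01, -0.02]
```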
Batched embeddings generation ›
Use when embedding large document sets. There is no streaming API for embeddings; instead, OpenAIEmbeddings splits the input into per-request batches via its chunk_size parameter.
import os
from langchain_openai import OpenAIEmbeddings
# chunk_size caps how many texts go into a single API request
embeddings = OpenAIEmbeddings(chunk_size=100)
texts = ["Doc 1 text", "Doc 2 text"]
vectors = embeddings.embed_documents(texts)
print(f"First embedding vector: {vectors[0][:5]}...")
Async embeddings generation ›
Use in asynchronous Python applications to embed documents concurrently.
import os
import asyncio
from langchain_openai import OpenAIEmbeddings
async def main():
embeddings = OpenAIEmbeddings()
texts = ["Async doc 1", "Async doc 2"]
vectors = await embeddings.aembed_documents(texts)
print(f"Async embeddings: {vectors[0][:5]}...")
asyncio.run(main())
Using Voyage AI embeddings (Anthropic's recommended provider) ›
Use if you want embeddings alongside Claude: Anthropic does not offer a first-party embeddings API and recommends Voyage AI instead (the model name below is per Voyage's docs at the time of writing).
import os
import voyageai
client = voyageai.Client(api_key=os.environ["VOYAGE_API_KEY"])
response = client.embed(
    ["Document text to embed"],
    model="voyage-3",
)
print(f"Embedding vector: {response.embeddings[0][:5]}...")
Performance
Latency: roughly 500ms per 512 tokens for embedding generation with OpenAI
Cost: ~$0.13 per 1M tokens (~$0.00013 per 1K) with OpenAI text-embedding-3-large; check current pricing
Rate limits: vary by usage tier; check the rate-limit page for your OpenAI account before sizing batch jobs
- Split large documents into smaller chunks before embedding
- Remove unnecessary boilerplate or metadata from text
- Batch multiple texts in a single API call to optimize usage
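The batching advice above can be sketched as a small helper; the batch size of 100 is an illustrative choice, not an API limit:

```python
def batched(items, size):
    # Yield successive chunks of at most `size` items each
    for i in range(0, len(items), size):
        yield items[i:i + size]

# 250 hypothetical document texts -> three API calls instead of 250
texts = [f"doc {n}" for n in range(250)]
batch_sizes = [len(batch) for batch in batched(texts, 100)]
print(batch_sizes)  # prints: [100, 100, 50]
```

Each chunk would then be passed as the `input` list of a single embeddings request.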
| Approach | Latency | Cost/call | Best for |
|---|---|---|---|
| OpenAIEmbeddings (batch) | ~500ms | ~$0.13 per 1M tokens | General purpose, easy integration |
| OpenAIEmbeddings with chunk_size | Same per request | Same as batch | Large datasets, bounded request size |
| Voyage AI embeddings | Comparable | See Voyage AI pricing | Pairing embeddings with Claude |
Quick tip
Batch multiple documents in one embedding API call to reduce latency and cost.
Common mistake
Passing raw documents without splitting them first can exceed the embedding model's token limit (8,191 tokens for OpenAI's text-embedding models) and cause API errors.
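A minimal word-based splitter illustrates the fix. Production code should split on tokens rather than words (e.g. with LangChain's RecursiveCharacterTextSplitter or tiktoken), so the word count here is only a rough proxy for the real token limit:

```python
def split_text(text, max_words=200):
    # Naive splitter: break on word boundaries so no chunk
    # exceeds max_words (a stand-in for a real token limit)
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

chunks = split_text("word " * 450, max_words=200)
print([len(c.split()) for c in chunks])  # prints: [200, 200, 50]
```

Each resulting chunk can then be embedded independently and stored with a reference back to its source document.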