ValueError
ValueError: embedding dimension mismatch
Stack trace
ValueError: embedding dimension mismatch: expected 1536 but got 1024
  File "app.py", line 45, in create_vector_store
    vector_store.add_texts(texts, embeddings)
  File "langchain/vectorstores/base.py", line 123, in add_texts
    raise ValueError(f"embedding dimension mismatch: expected {self.dim} but got {len(embedding)}")
Why it happens
OpenAI embedding models produce fixed-size vectors whose length depends on the model and request settings: text-embedding-3-small returns 1536 dimensions by default and text-embedding-3-large returns 3072, and both accept a dimensions parameter that shortens the output (for example, to 1024). If your vector store expects a different dimension than the embedding output, the store rejects the vectors with this error. This often happens when you switch embedding models or change the dimensions setting without rebuilding the vector store, or when you mix embeddings from different models.
Detection
Check the dimension attribute of your vector store and compare it with the length of the embedding vectors returned by the OpenAI embedding model before adding them. Log or assert vector sizes to catch mismatches early.
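A minimal sketch of such a check, assuming embeddings arrive as plain lists of floats and that you already know the dimension your store expects. The helper name assert_embedding_dims is hypothetical and not part of any library.

```python
from typing import Sequence


def assert_embedding_dims(embeddings: Sequence[Sequence[float]], expected_dim: int) -> None:
    """Raise ValueError naming the first vector whose length differs from expected_dim."""
    for index, vector in enumerate(embeddings):
        if len(vector) != expected_dim:
            raise ValueError(
                f"embedding {index} has {len(vector)} dimensions, "
                f"but the vector store expects {expected_dim}"
            )


# Stand-in vectors; in practice these would come from client.embeddings.create(...)
good = [[0.0] * 1536, [0.1] * 1536]
assert_embedding_dims(good, expected_dim=1536)  # passes silently
```

Calling this right before add_texts turns a failure deep inside the store into an error that names the offending vector.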
Causes & fixes
Cause: Using an embedding model that outputs vectors of a different dimension than the vector store expects.
Fix: Ensure the embedding model matches the vector store's expected dimension, or recreate the vector store with the correct embedding dimension.
Cause: Mixing embeddings generated by different OpenAI embedding models in the same vector store.
Fix: Use one embedding model for every vector in the store so that all dimensions are uniform.
Cause: Loading a pre-existing vector store that was created with an embedding model of a different dimension.
Fix: Rebuild or reindex the vector store using embeddings from the current model to align dimensions.
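One way to apply the first fix is to derive the store's dimension from the model configuration instead of hardcoding it in two places. The sketch below assumes the default output sizes OpenAI documents for these models at the time of writing. The DEFAULT_DIMENSIONS table and the dimension_for helper are illustrative names, not part of the OpenAI SDK.

```python
from typing import Optional

# Default output sizes as documented by OpenAI at the time of writing (assumption:
# verify against the current model documentation before relying on them).
DEFAULT_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def dimension_for(model: str, dimensions: Optional[int] = None) -> int:
    """Return the vector size a request with these settings is expected to produce."""
    if dimensions is not None:
        # An explicit dimensions override (text-embedding-3 models) wins over the default.
        return dimensions
    try:
        return DEFAULT_DIMENSIONS[model]
    except KeyError:
        raise ValueError(f"unknown embedding model: {model!r}") from None


store_dim = dimension_for("text-embedding-3-large")
print(store_dim)  # 3072
```

Passing the same model name and dimensions value to both the embedding request and the store constructor keeps the two from drifting apart.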
Code: broken vs fixed
# Broken: the request returns 1024-dim vectors, but the store expects 1536
from openai import OpenAI
import os
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
texts = ["Hello world", "Test document"]
# text-embedding-3-small returns 1536 dimensions by default; dimensions=1024 shortens the output
embeddings = [client.embeddings.create(input=text, model="text-embedding-3-small", dimensions=1024).data[0].embedding for text in texts]
# SomeVectorStore is a placeholder for your vector store class; it expects 1536-dim embeddings
vector_store = SomeVectorStore(dimension=1536)
vector_store.add_texts(texts, embeddings) # Raises ValueError: expected 1536 but got 1024

# Fixed: the embedding output matches the store's dimension
from openai import OpenAI
import os
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
texts = ["Hello world", "Test document"]
# text-embedding-3-large returns 3072 dimensions by default; dimensions=1536 shortens the output to match the vector store
embeddings = [client.embeddings.create(input=text, model="text-embedding-3-large", dimensions=1536).data[0].embedding for text in texts]
vector_store = SomeVectorStore(dimension=1536) # dimension matches embedding output
vector_store.add_texts(texts, embeddings) # Works without error
print("Embeddings added successfully with matching dimensions.")
Workaround
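Catch the ValueError on dimension mismatch, then log both the embedding vector length and the vector store's expected dimension. As a temporary measure, you can truncate or zero-pad embeddings to the expected size until the vector store is rebuilt. Resized vectors are not comparable with vectors produced by a different model, so treat this as a stopgap that keeps ingestion running, not as a fix.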
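The workaround could look roughly like the sketch below. TinyStore is a hypothetical in-memory stand-in for a real vector store, and fit_to_dim is an illustrative helper. Both are assumptions made so the example runs on its own.

```python
import logging
import math
from typing import List

logger = logging.getLogger(__name__)


def fit_to_dim(vector: List[float], dim: int) -> List[float]:
    """Zero-pad or truncate a vector to dim entries, then rescale it to unit length."""
    resized = (list(vector) + [0.0] * dim)[:dim]
    norm = math.sqrt(sum(x * x for x in resized))
    return [x / norm for x in resized] if norm else resized


class TinyStore:
    """Hypothetical in-memory stand-in for a real vector store."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self.rows = []

    def add_texts(self, texts, embeddings) -> None:
        # Validate every vector before storing anything.
        for embedding in embeddings:
            if len(embedding) != self.dim:
                raise ValueError(
                    f"embedding dimension mismatch: expected {self.dim} but got {len(embedding)}"
                )
        self.rows.extend(zip(texts, embeddings))


store = TinyStore(dim=4)
texts = ["Hello world"]
embeddings = [[3.0, 4.0]]  # too short for the store

try:
    store.add_texts(texts, embeddings)
except ValueError:
    logger.warning(
        "dimension mismatch: got %d, store expects %d; resizing as a stopgap",
        len(embeddings[0]), store.dim,
    )
    store.add_texts(texts, [fit_to_dim(v, store.dim) for v in embeddings])
```

The logged sizes tell you which side is misconfigured, which is usually the more valuable half of the workaround.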
Prevention
Standardize on a single OpenAI embedding model across your application and vector store lifecycle, and explicitly store and verify embedding dimensions when saving or loading vector stores.
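One possible way to store and verify dimensions is a small manifest file written next to the persisted store. This is a sketch under assumptions: the file name embedding_manifest.json and the helpers save_manifest and verify_manifest are hypothetical, and a real vector store may offer its own metadata mechanism.

```python
import json
import tempfile
from pathlib import Path

MANIFEST_NAME = "embedding_manifest.json"  # hypothetical file name


def save_manifest(store_dir: Path, model: str, dimension: int) -> None:
    """Record which model and vector size the store was built with."""
    manifest = {"embedding_model": model, "dimension": dimension}
    (store_dir / MANIFEST_NAME).write_text(json.dumps(manifest))


def verify_manifest(store_dir: Path, model: str, dimension: int) -> None:
    """Fail fast if the current configuration differs from the saved one."""
    saved = json.loads((store_dir / MANIFEST_NAME).read_text())
    current = {"embedding_model": model, "dimension": dimension}
    if saved != current:
        raise ValueError(f"store was built with {saved}, but the app is configured for {current}")


with tempfile.TemporaryDirectory() as tmp:
    store_dir = Path(tmp)
    save_manifest(store_dir, "text-embedding-3-small", 1536)
    verify_manifest(store_dir, "text-embedding-3-small", 1536)  # passes silently
```

Running the verification at load time surfaces a model change before any new vectors are embedded or added.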