ValueError: dimension_mismatch
ValueError: Query embedding dimension [X] does not match stored embedding dimension [Y]
Stack trace
Traceback (most recent call last):
  File "search.py", line 42, in query_embeddings
    results = index.query(query_vector, top_k=10)
  File "pinecone/index.py", line 156, in query
    self._validate_dimension(query_vector)
  File "pinecone/index.py", line 89, in _validate_dimension
    raise ValueError(f"Query embedding dimension {len(query_vector)} does not match index dimension {self.dimension}. Expected {self.dimension}, got {len(query_vector)}")
ValueError: Query embedding dimension 256 does not match index dimension 768. Expected 768, got 256
Why it happens
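Matryoshka embedding models such as `nomic-embed-text-v1.5` or OpenAI's `text-embedding-3` family support flexible output dimensions through truncation: the leading components of the full vector form a usable lower-dimensional embedding. If you indexed documents with `output_dimensions=768` but query with `output_dimensions=256` (or vice versa), the vectors have different lengths and the query is rejected. The parameter name varies by library (`truncate_dim` in sentence-transformers, `dimensions` in the OpenAI embeddings API); this page uses `output_dimensions` as a generic name. The vector database (Pinecone, Qdrant, Weaviate, or Milvus) requires strict dimensional consistency across all vectors. The mismatch is common when Matryoshka models are used for cost optimization and the dimension parameter is accidentally changed between indexing and query time.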
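To make the mismatch concrete, here is a minimal sketch of Matryoshka-style truncation using plain Python lists. The helper name `truncate_and_renormalize` and the stand-in vector are assumptions for illustration only; a real pipeline would take the vector from the embedding model, and some models call for extra model-specific steps before truncating.

```python
import math


def truncate_and_renormalize(vector, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    head = list(vector[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


# Stand-in for a 768-dim model output (arbitrary values, not a real embedding).
full_vector = [math.sin(i + 1) for i in range(768)]

indexed = truncate_and_renormalize(full_vector, 768)  # what the index stores
queried = truncate_and_renormalize(full_vector, 256)  # what the query sends

print(len(indexed), len(queried))  # 768 256
```

Both vectors come from the same model output, yet the index can only accept the first one, which is exactly the situation the error reports.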
Detection
Before querying, verify that both your indexing pipeline and query pipeline use identical `output_dimensions` parameters. Log the dimension of your query embeddings and compare it against the dimension reported by your vector index (for Pinecone, the response of `index.describe_index_stats()` includes the index dimension). Add an assertion: `assert len(query_vector) == expected_dim, f'Dimension mismatch: {len(query_vector)} vs {expected_dim}'`, where `expected_dim` is that reported value.
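The check described above could be wrapped in a small guard like the one below. This is a sketch, not part of any client library: `check_query_dimension` and `expected_dim` are hypothetical names, and `expected_dim` is assumed to come from your index metadata. It raises `ValueError` instead of using `assert` because assertions are skipped when Python runs with `-O`.

```python
import logging

logger = logging.getLogger("embedding_checks")


def check_query_dimension(query_vector, expected_dim):
    """Return the vector unchanged if its length matches, else raise ValueError."""
    actual_dim = len(query_vector)
    logger.debug("query dimension=%d, index dimension=%d", actual_dim, expected_dim)
    if actual_dim != expected_dim:
        raise ValueError(
            f"Dimension mismatch: query has {actual_dim}, index expects {expected_dim}"
        )
    return query_vector


# Usage: a 768-dim query passes, a 256-dim query is rejected before any network call.
check_query_dimension([0.0] * 768, expected_dim=768)
try:
    check_query_dimension([0.0] * 256, expected_dim=768)
except ValueError as exc:
    print(exc)
```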
Causes & fixes
Indexing used `output_dimensions=768` but query code uses `output_dimensions=256` due to code change or config mismatch
Add a centralized config file (e.g., `embedding_config.yaml` with `output_dimensions: 768`) and import it in both indexing and query scripts. Verify both pipelines read from the same config source.
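Model loaded with a different configuration in one pipeline (for example a different `truncate_dim` or an extra dense projection layer). Note that the pooling strategy alone (e.g., 'mean' vs 'cls') changes the vector values but not the dimension, so a pooling mismatch degrades results silently instead of raising this error
Load the model identically in both pipelines and compare the model's reported embedding dimension (`model.get_sentence_embedding_dimension()` in sentence-transformers) against the index dimension at startup. Store the exact model configuration, including pooling, alongside your index.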
Vector index was reindexed with different dimensions, but old query code still expects old dimension
After reindexing, explicitly update query code to match new dimensions. Add a schema version tag to your index metadata (e.g., `index_schema_v2_dim256`) and validate it at query time.
Using different Matryoshka model versions or model checkpoints between indexing (v1.0) and query (v2.0) with different default dimensions
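Pin the exact model in both pipelines: `SentenceTransformer('nomic-ai/nomic-embed-text-v1.5', trust_remote_code=True, revision='<commit-hash>')`, where `<commit-hash>` is a placeholder for the model repository revision you indexed with, and pin the `sentence-transformers` package version in requirements.txt. Avoid loading the model by name alone without a revision lock.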
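The shared-config fix from the first cause can be sketched as a tiny loader that both scripts import. This version reads JSON instead of the YAML file mentioned above so that it needs only the standard library; `EmbeddingConfig`, `load_embedding_config`, and the `model_revision` field are hypothetical names, and the revision value is a placeholder.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class EmbeddingConfig:
    model_name: str
    output_dimensions: int
    model_revision: str


def load_embedding_config(path):
    """Read the one config file that both indexing and query code rely on."""
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    return EmbeddingConfig(
        model_name=raw["model_name"],
        output_dimensions=int(raw["output_dimensions"]),
        model_revision=raw["model_revision"],
    )


# Demo: write a config to a temporary directory, then load it from "both pipelines".
config_path = Path(tempfile.mkdtemp()) / "embedding_config.json"
config_path.write_text(
    json.dumps(
        {
            "model_name": "nomic-ai/nomic-embed-text-v1.5",
            "output_dimensions": 768,
            "model_revision": "<commit-hash>",  # placeholder, not a real revision
        }
    ),
    encoding="utf-8",
)

indexing_config = load_embedding_config(config_path)  # used by the indexing script
query_config = load_embedding_config(config_path)     # used by the query script
print(indexing_config == query_config)  # True
```

Because the dataclass is frozen, neither pipeline can quietly override `output_dimensions` after loading.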
Code: broken vs fixed
# BROKEN version
import os
from sentence_transformers import SentenceTransformer
from pinecone import Pinecone

# Indexing pipeline (dimension 768)
model = SentenceTransformer('nomic-ai/nomic-embed-text-v1.5', trust_remote_code=True)
docs = ['example doc 1', 'example doc 2']
embeddings = model.encode(docs)  # Default: 768 dims
pc = Pinecone(api_key=os.environ['PINECONE_API_KEY'])
index = pc.Index('my-index')
# Upsert all documents in one batch; stores 768-dim vectors
index.upsert(vectors=[(f'doc-{i}', emb.tolist()) for i, emb in enumerate(embeddings)])

# Query pipeline (BROKEN: dimension 256)
query_text = 'search for something'
query_embedding = model.encode(query_text)[:256]  # BROKEN: truncated to 256
results = index.query(vector=query_embedding.tolist(), top_k=10)  # FAILS: 256 != 768

# FIXED version
import os
from sentence_transformers import SentenceTransformer
from pinecone import Pinecone

# FIXED: Shared config (in practice, load it from a file such as embedding_config.json)
config = {
    'model_name': 'nomic-ai/nomic-embed-text-v1.5',
    'output_dimensions': 768,
    'pooling_type': 'mean',  # recorded for reference; not passed to the model
}

# Indexing pipeline (dimension 768)
# truncate_dim makes the output dimension explicit instead of relying on the default
model = SentenceTransformer(
    config['model_name'],
    trust_remote_code=True,
    truncate_dim=config['output_dimensions'],
)
docs = ['example doc 1', 'example doc 2']
embeddings = model.encode(docs)  # 768 dims, as set in config
pc = Pinecone(api_key=os.environ['PINECONE_API_KEY'])
index = pc.Index('my-index')
# Upsert all documents in one batch; stores 768-dim vectors
index.upsert(vectors=[(f'doc-{i}', emb.tolist()) for i, emb in enumerate(embeddings)])

# Query pipeline (FIXED: matches indexing dimensions)
query_text = 'search for something'
query_embedding = model.encode(query_text)  # FIXED: no manual truncation

# FIXED: Validate dimension before query
assert len(query_embedding) == config['output_dimensions'], \
    f'Dimension mismatch: got {len(query_embedding)}, expected {config["output_dimensions"]}'
results = index.query(vector=query_embedding.tolist(), top_k=10)
print(f'Found {len(results["matches"])} results')
Workaround
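If you cannot re-index right away, a stopgap is to zero-pad a too-short query vector up to the index dimension: `if len(query_emb) < index_dim: query_emb = np.pad(query_emb, (0, index_dim - len(query_emb)))`, where `index_dim` is the index's configured dimension. Padding only covers the case where the query is shorter than the index, and the padded query ignores every stored component beyond its original length. This is brittle and reduces retrieval quality, but it unblocks queries while you fix the root cause by re-indexing.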
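A slightly fuller version of this stopgap, written with plain lists as the equivalent of the `np.pad` call, might look like the sketch below. `fit_to_dimension` is a hypothetical helper; the truncation branch is an added assumption that only makes sense for Matryoshka-trained models and for unit-length vectors compared by cosine or dot product.

```python
import math


def fit_to_dimension(vector, target_dim):
    """Temporary workaround: force a query vector to the index dimension."""
    values = [float(x) for x in vector]
    if len(values) < target_dim:
        # Zero-padding leaves the vector's norm unchanged.
        return values + [0.0] * (target_dim - len(values))
    if len(values) > target_dim:
        # Matryoshka-style truncation: keep the leading components, then re-normalize.
        head = values[:target_dim]
        norm = math.sqrt(sum(x * x for x in head))
        return [x / norm for x in head] if norm > 0.0 else head
    return values


padded = fit_to_dimension([0.6, 0.8], 4)
print(padded)  # [0.6, 0.8, 0.0, 0.0]
```

Keeping the workaround in one function makes it easy to delete once the index has been rebuilt with the correct dimension.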
Prevention
Store index metadata (model name, output_dimensions, pooling strategy, model version) alongside the vector index. At query time, load this metadata and validate it matches your embedding pipeline configuration. Use a schema versioning system: tag each index with `schema_version=2_dim768_v1.5` and fail loudly if query code uses a different version. For Matryoshka models specifically, always explicitly set `output_dimensions` in code rather than relying on defaults, and document it in your README.
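One way to "fail loudly" is to compare the metadata stored with the index against the query pipeline's settings at startup. The sketch below is an assumption about how that could be organized; `IndexSchema`, `SchemaMismatchError`, and `validate_schema` are hypothetical names, and where the stored metadata lives (a sidecar file, a database row, index tags) is left open.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class IndexSchema:
    model_name: str
    output_dimensions: int
    pooling: str
    schema_version: str


class SchemaMismatchError(RuntimeError):
    """Raised when the query pipeline does not match the index it targets."""


def validate_schema(stored, current):
    """Compare every field and report all differences in a single error."""
    stored_fields, current_fields = asdict(stored), asdict(current)
    diffs = {
        name: (stored_fields[name], current_fields[name])
        for name in stored_fields
        if stored_fields[name] != current_fields[name]
    }
    if diffs:
        details = "; ".join(
            f"{name}: index={old!r}, pipeline={new!r}"
            for name, (old, new) in sorted(diffs.items())
        )
        raise SchemaMismatchError(f"Index/pipeline schema mismatch ({details})")


stored = IndexSchema("nomic-embed-text-v1.5", 768, "mean", "2_dim768_v1.5")
current = IndexSchema("nomic-embed-text-v1.5", 256, "mean", "2_dim256_v1.5")

try:
    validate_schema(stored, current)
except SchemaMismatchError as exc:
    print(exc)
```

Running the check once at startup turns a confusing query-time failure into an immediate, descriptive configuration error.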