High severity · Intermediate · Fix: 5-15 min

ValueError: dimension_mismatch

ValueError: Query embedding dimension [X] does not match index dimension [Y]

What this error means

The output of a Matryoshka embedding model was truncated to one dimension at indexing time and a different dimension at query time, so the vector search fails because the query vector and the stored vectors have different lengths.

Stack trace

traceback

Traceback (most recent call last):
  File "search.py", line 42, in query_embeddings
    results = index.query(query_vector, top_k=10)
  File "pinecone/index.py", line 156, in query
    self._validate_dimension(query_vector)
  File "pinecone/index.py", line 89, in _validate_dimension
    raise ValueError(f"Query embedding dimension {len(query_vector)} does not match index dimension {self.dimension}. Expected {self.dimension}, got {len(query_vector)}")
ValueError: Query embedding dimension 256 does not match index dimension 768. Expected 768, got 256

QUICK FIX

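Add a dimension check before the query: `assert len(query_embedding) == index_dim, f'Got {len(query_embedding)}, expected {index_dim}'`, where `index_dim` is the dimension the index was created with (with the Pinecone client, for example, `pc.describe_index('my-index').dimension`). Then synchronize the `output_dimensions` setting across indexing and query code through a shared config file.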

Why it happens

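Matryoshka embedding models such as `nomic-ai/nomic-embed-text-v1.5` and `mixedbread-ai/mxbai-embed-large-v1` are trained so that a prefix of the full vector is itself a usable embedding, which lets you truncate the output to a smaller dimension. If you indexed documents with `output_dimensions=768` but query with `output_dimensions=256` (or vice versa), the vector lengths differ. Vector databases such as Pinecone, Qdrant, Weaviate, and Milvus require every vector in an index or collection to have the dimension fixed when it was created. The mismatch typically appears when truncation is adopted to cut storage or latency costs and the dimension setting is then changed in only one of the two pipelines. Here `output_dimensions` stands for whatever your embedding stack calls this setting; in sentence-transformers 2.7.0 and later it is the `truncate_dim` argument.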
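To make the mismatch concrete, here is a minimal sketch in plain Python of what Matryoshka-style truncation does: keep the first components and rescale to unit length. The helper name `truncate_embedding` and the toy 8-dimensional vector are illustrative assumptions, not part of any library.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    if dim > len(vec):
        raise ValueError(f"cannot truncate a {len(vec)}-dim vector to {dim} dims")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

# Toy stand-in for a full-size embedding (a real model would return e.g. 768 values)
full = [0.5, -0.5, 0.5, -0.5, 0.1, 0.1, 0.1, 0.1]

indexed = truncate_embedding(full, 8)  # indexing pipeline kept every component
queried = truncate_embedding(full, 4)  # query pipeline kept only the first half

# The two pipelines now produce vectors of different lengths,
# which is exactly what the vector database rejects.
print(len(indexed), len(queried))
```

Both vectors come from the same text and the same model; only the truncation setting differs, which is why the error can appear without any change to the model itself.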

Detection

Before querying, verify that your indexing pipeline and query pipeline use identical `output_dimensions` settings. Log the dimension of each query embedding and compare it against the dimension the index was created with (with the Pinecone client, for example, `pc.describe_index('my-index').dimension`). Add an assertion: `assert query_vector.shape[0] == index_dim, f'Dimension mismatch: {query_vector.shape[0]} vs {index_dim}'`, where `index_dim` holds that index dimension.
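Because `assert` statements are skipped when Python runs with `-O`, an explicit check may be safer in production. The sketch below assumes you have already fetched the index dimension from your vector database; `check_dimension` is a hypothetical helper name.

```python
def check_dimension(query_vector, index_dimension):
    """Return the vector unchanged, or raise if its length differs from the index."""
    got = len(query_vector)
    if got != index_dimension:
        raise ValueError(
            f"Dimension mismatch: query has {got} dims, index expects {index_dimension}"
        )
    return query_vector
```

Calling `check_dimension(query_vector, index_dim)` right before the query turns a remote failure into a local one with a message that names both numbers.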

Causes & fixes

Indexing used `output_dimensions=768` but query code uses `output_dimensions=256` due to code change or config mismatch

✓ Fix

Add a centralized config file (e.g., `embedding_config.yaml` with `output_dimensions: 768`) and import it in both indexing and query scripts. Verify both pipelines read from the same config source.
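A shared config could be loaded along these lines. This sketch uses JSON rather than YAML so it needs only the standard library; the file name, the key names, and both helper functions are assumptions for illustration.

```python
import json
from pathlib import Path

CONFIG_PATH = Path("embedding_config.json")  # hypothetical shared location
REQUIRED_KEYS = {"model_name", "output_dimensions"}

def save_embedding_config(config, path=CONFIG_PATH):
    """Write the config once, when the index is built."""
    Path(path).write_text(json.dumps(config, indent=2))

def load_embedding_config(path=CONFIG_PATH):
    """Load and sanity-check the config in both indexing and query scripts."""
    config = json.loads(Path(path).read_text())
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise KeyError(f"embedding config is missing keys: {sorted(missing)}")
    dims = config["output_dimensions"]
    if not isinstance(dims, int) or dims <= 0:
        raise ValueError("output_dimensions must be a positive integer")
    return config
```

Both scripts then call `load_embedding_config()` and read `output_dimensions` from the result, so the value is defined in exactly one place.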

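Matryoshka model loaded with a different pooling configuration between indexing and query. Concatenating pooling modes (e.g., mean plus max) changes the final dimension; switching only between 'mean' and 'cls' keeps the dimension the same but still makes query and document vectors incomparable.

✓ Fix

Ensure both indexing and query use identical pooling. In sentence-transformers, pooling comes from the checkpoint's saved `Pooling` module configuration rather than from an argument to the `SentenceTransformer(...)` constructor, so load the same checkpoint in both places and avoid overriding the `Pooling` module in only one pipeline. Store the exact pooling config alongside your index.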
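The effect of pooling on dimension can be seen with a toy example in plain Python. The three functions below are simplified stand-ins written for illustration, not the library's implementation: mean and first-token ("cls") pooling keep the token width, while concatenating two pooling modes doubles it.

```python
def mean_pool(token_vectors):
    """Average each column across tokens."""
    n = len(token_vectors)
    return [sum(col) / n for col in zip(*token_vectors)]

def cls_pool(token_vectors):
    """Use the first token's vector as the sentence embedding."""
    return list(token_vectors[0])

def mean_max_pool(token_vectors):
    """Concatenate mean pooling and max pooling."""
    max_part = [max(col) for col in zip(*token_vectors)]
    return mean_pool(token_vectors) + max_part

# Three tokens, each with a toy 4-dimensional vector
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
    [2.0, 2.0, 2.0, 2.0],
]

print(len(mean_pool(tokens)), len(cls_pool(tokens)), len(mean_max_pool(tokens)))
```

Note that `mean_pool` and `cls_pool` return vectors of the same length but different values, so a pooling change can silently degrade results even when no dimension error is raised.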

Vector index was reindexed with different dimensions, but old query code still expects old dimension

✓ Fix

After reindexing, explicitly update query code to match new dimensions. Add a schema version tag to your index metadata (e.g., `index_schema_v2_dim256`) and validate it at query time.
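One way to validate such a tag at query time is sketched below. The tag format follows the `index_schema_v2_dim256` example; the regular expression and both function names are assumptions, and where the tag is stored (index metadata, a sidecar file) is left to you.

```python
import re

# Matches tags like "index_schema_v2_dim256"
TAG_PATTERN = re.compile(r"^index_schema_v(?P<version>\d+)_dim(?P<dim>\d+)$")

def parse_schema_tag(tag):
    """Split a schema tag into (schema_version, dimension)."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unrecognized schema tag: {tag!r}")
    return int(match["version"]), int(match["dim"])

def validate_schema_tag(tag, expected_version, expected_dim):
    """Fail loudly when the index was built under a different schema."""
    version, dim = parse_schema_tag(tag)
    if (version, dim) != (expected_version, expected_dim):
        raise RuntimeError(
            f"index schema is v{version}/dim{dim}, "
            f"query code expects v{expected_version}/dim{expected_dim}"
        )
```

Running `validate_schema_tag(tag, 2, 256)` at query start-up surfaces a stale query deployment before the first search request is sent.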

Using different Matryoshka model versions or model checkpoints between indexing (v1.0) and query (v2.0) with different default dimensions

✓ Fix

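Pin the exact model checkpoint. Load it by its full Hub id, `SentenceTransformer('nomic-ai/nomic-embed-text-v1.5', trust_remote_code=True)`, and pass a fixed `revision` (a commit hash, supported in recent sentence-transformers releases) so the weights cannot change underneath you. Pin the library versions in requirements.txt as well; a library pin alone does not lock the model weights. Avoid loading a model by a floating name or branch without a revision lock.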
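On the Hugging Face Hub a revision can be a branch, a tag, or a commit hash, and only the hash is immutable. A small guard like the following can reject configs that leave the model unpinned; the `revision` config key and the function name are hypothetical.

```python
import re

COMMIT_HASH = re.compile(r"^[0-9a-f]{40}$")  # full-length git commit hash

def require_pinned_revision(config):
    """Return the configured revision, or raise if it is not a full commit hash."""
    revision = config.get("revision")
    if not revision or not COMMIT_HASH.match(revision):
        raise ValueError(
            f"model {config.get('model_name')!r} is not pinned to a commit hash "
            f"(revision={revision!r})"
        )
    return revision
```

The returned value can then be passed as the `revision` argument when the model is loaded, assuming your installed sentence-transformers version accepts it.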

Code: broken vs fixed

Broken - triggers the error

python

import os
from sentence_transformers import SentenceTransformer
from pinecone import Pinecone

# Indexing pipeline (dimension 768)
# nomic-embed-text-v1.5 ships custom model code, hence trust_remote_code=True
model = SentenceTransformer('nomic-ai/nomic-embed-text-v1.5', trust_remote_code=True)
docs = ['example doc 1', 'example doc 2']
embeddings = model.encode(docs)  # Default: 768 dims

pc = Pinecone(api_key=os.environ['PINECONE_API_KEY'])
index = pc.Index('my-index')  # Assumes the index was created with dimension=768
# Upsert all vectors in one request; each item is an (id, values) tuple
index.upsert(vectors=[(f'doc-{i}', emb.tolist()) for i, emb in enumerate(embeddings)])

# Query pipeline (BROKEN: dimension 256)
query_text = 'search for something'
query_embedding = model.encode(query_text)[:256]  # BROKEN: truncated to 256
results = index.query(vector=query_embedding.tolist(), top_k=10)  # FAILS: 256 != 768

Fixed - works correctly

python

import os
from sentence_transformers import SentenceTransformer
from pinecone import Pinecone

# FIXED: Shared config (in a real project, load it from embedding_config.json in both pipelines)
config = {
    'model_name': 'nomic-ai/nomic-embed-text-v1.5',
    'output_dimensions': 768,
    'pooling_type': 'mean'
}

# Indexing pipeline (dimension 768)
# truncate_dim (sentence-transformers >= 2.7.0) sets the output dimension explicitly
model = SentenceTransformer(
    config['model_name'],
    trust_remote_code=True,  # nomic-embed-text-v1.5 ships custom model code
    truncate_dim=config['output_dimensions'],
)
docs = ['example doc 1', 'example doc 2']
embeddings = model.encode(docs)  # 768 dims, as set in config

pc = Pinecone(api_key=os.environ['PINECONE_API_KEY'])
index = pc.Index('my-index')  # Assumes the index was created with dimension=768
index.upsert(vectors=[(f'doc-{i}', emb.tolist()) for i, emb in enumerate(embeddings)])

# Query pipeline (FIXED: same model object, same dimension)
query_text = 'search for something'
query_embedding = model.encode(query_text)  # FIXED: no manual truncation

# FIXED: Validate dimension before query
assert len(query_embedding) == config['output_dimensions'], \
    f'Dimension mismatch: got {len(query_embedding)}, expected {config["output_dimensions"]}'

results = index.query(vector=query_embedding.tolist(), top_k=10)
print(f'Found {len(results["matches"])} results')

Added a shared embedding config dict that both indexing and query pipelines read from, replaced the manual `[:256]` slice with an explicit `truncate_dim` taken from that config, and added an assertion to validate dimensional consistency before querying the index.

⚠ Workaround

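If you cannot re-index right away, a stopgap is to zero-pad short queries up to the index dimension: `if len(query_emb) < index_dim: query_emb = np.pad(query_emb, (0, index_dim - len(query_emb)))`, where `index_dim` is the dimension the index was created with. The padded components contribute nothing to the similarity score, so each stored vector is effectively matched on its leading components only and ranking quality drops. The padding also does nothing when the query is longer than the index dimension. Treat this as brittle: it unblocks queries while you fix the root cause by re-indexing.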
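A slightly more complete version of this stopgap, written in plain Python, handles both directions and rescales the result. It is a sketch under the assumption that the embeddings come from a Matryoshka model (truncating the output of a non-Matryoshka model is not meaningful); `fit_to_index` is a hypothetical helper name.

```python
import math

def fit_to_index(query_vec, index_dim):
    """Zero-pad a short query or truncate a long one, then rescale to unit length."""
    if len(query_vec) < index_dim:
        fitted = list(query_vec) + [0.0] * (index_dim - len(query_vec))
    else:
        fitted = list(query_vec[:index_dim])
    norm = math.sqrt(sum(x * x for x in fitted))
    return [x / norm for x in fitted] if norm > 0 else fitted
```

Logging a warning each time the helper actually changes a vector's length makes it easier to confirm when the workaround can be removed.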

✓ Prevention

Store index metadata (model name, output_dimensions, pooling strategy, model version) alongside the vector index. At query time, load this metadata and validate it matches your embedding pipeline configuration. Use a schema versioning system: tag each index with `schema_version=2_dim768_v1.5` and fail loudly if query code uses a different version. For Matryoshka models specifically, always explicitly set `output_dimensions` in code rather than relying on defaults, and document it in your README.
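The metadata comparison could look roughly like this. The field names mirror the list above but are assumptions, as are the two function names; how the index metadata is persisted depends on your vector database.

```python
TRACKED_KEYS = ("model_name", "output_dimensions", "pooling", "model_version")

def diff_metadata(index_meta, pipeline_config, keys=TRACKED_KEYS):
    """Return {key: (index_value, pipeline_value)} for every field that differs."""
    return {
        key: (index_meta.get(key), pipeline_config.get(key))
        for key in keys
        if index_meta.get(key) != pipeline_config.get(key)
    }

def assert_compatible(index_meta, pipeline_config):
    """Fail loudly, listing every mismatched field, before any query is sent."""
    mismatches = diff_metadata(index_meta, pipeline_config)
    if mismatches:
        details = ", ".join(
            f"{key}: index={a!r}, pipeline={b!r}" for key, (a, b) in mismatches.items()
        )
        raise RuntimeError(f"embedding pipeline does not match index: {details}")
```

Reporting every differing field at once, rather than stopping at the first, tends to shorten debugging when a model upgrade changes several settings together.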

Python 3.9+ · sentence-transformers>=2.2.0, pinecone-client>=3.0.0 · tested on sentence-transformers==2.7.0, pinecone-client==3.2.1

Verified 2026-04 · nomic-ai/nomic-embed-text-v1.5, intfloat/e5-large-v2, mixedbread-ai/mxbai-embed-large-v1
