ValueError
builtins.ValueError
Stack trace
ValueError: No documents found with similarity score above the threshold
File "app.py", line 42, in retrieve_docs
docs = vectorstore.similarity_search(query, score_threshold=0.8)
File "langchain/vectorstores/base.py", line 123, in similarity_search
raise ValueError("No documents found with similarity score above the threshold")
Why it happens
The similarity search filters out documents whose similarity score falls below a minimum threshold. When no document in the vector store scores above that threshold, nothing survives the filter and the search raises this ValueError. The usual causes are a threshold that is set too high or query embeddings that do not closely match the document embeddings.
Detection
Log how many results each similarity search returns, and catch ValueError around the call, so that a search with no documents above the threshold is detected before downstream processing.
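A minimal sketch of this detection step, assuming the search call raises ValueError as shown in the stack trace. The helper name `search_with_detection` and the logger name are hypothetical; `search_fn` stands for any callable with the same shape as `vectorstore.similarity_search`.

```python
import logging
from typing import Any, Callable, List, Tuple

logger = logging.getLogger("rag.retrieval")  # hypothetical logger name


def search_with_detection(
    search_fn: Callable[..., List[Any]],
    query: str,
    score_threshold: float,
) -> Tuple[List[Any], bool]:
    """Run a thresholded search and report whether it came back empty.

    Returns (docs, empty). `empty` is True when the search raised
    ValueError or returned an empty list.
    """
    try:
        docs = search_fn(query, score_threshold=score_threshold)
    except ValueError as exc:
        # The error described on this page: nothing scored above the threshold.
        logger.warning("No documents above threshold %.2f: %s", score_threshold, exc)
        return [], True
    if not docs:
        # Some code paths may return an empty list instead of raising.
        logger.warning("Search returned 0 documents at threshold %.2f", score_threshold)
    return docs, len(docs) == 0
```

The caller can branch on the `empty` flag, for example to skip the generation step or to trigger a fallback search.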
Causes & fixes
The similarity score threshold is set too high, filtering out all documents.
Lower the score_threshold parameter in your similarity_search call to a more permissive value, such as 0.5 or 0.6.
The query and document embeddings are poorly aligned, so even relevant documents receive low scores.
Embed queries and documents with the same model, and switch to a stronger or domain-tuned embedding model (re-indexing the documents afterwards) if scores stay low.
The vector store is empty or contains very few documents.
Ensure your vector store is properly populated with relevant documents before performing similarity search.
Query preprocessing differs from the preprocessing applied to documents, so the embeddings do not match.
Verify and standardize query preprocessing steps (e.g., tokenization, lowercasing) so they match the preprocessing used when the documents were embedded.
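For the preprocessing mismatch in the last cause, one option is to route both documents and queries through a single normalization function before embedding. The sketch below is illustrative only: `normalize_text` is a hypothetical helper, and the specific steps (Unicode normalization, lowercasing, whitespace collapsing) are assumptions that may not suit every embedding model.

```python
import re
import unicodedata


def normalize_text(text: str) -> str:
    """Normalize text the same way at indexing time and at query time."""
    # Fold compatibility characters (e.g. full-width letters) to a canonical form.
    text = unicodedata.normalize("NFKC", text)
    # Lowercase and trim; skip lowercasing if the embedding model is case-sensitive.
    text = text.lower().strip()
    # Collapse runs of whitespace (spaces, tabs, newlines) into single spaces.
    return re.sub(r"\s+", " ", text)
```

The function only helps if the documents in the index were embedded after the same normalization; otherwise the index needs to be rebuilt.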
Code: broken vs fixed
# Broken
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()  # must match the model used to build the index
vectorstore = FAISS.load_local("./faiss_index", embeddings)
query = "What is AI?"
docs = vectorstore.similarity_search(query, score_threshold=0.9)  # Raises ValueError here
print(docs)
# Fixed
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.load_local("./faiss_index", embeddings)
query = "What is AI?"
docs = vectorstore.similarity_search(query, score_threshold=0.6)  # Lowered threshold to fix error
print(docs)
Workaround
Catch the ValueError raised by similarity_search and fall back to a search with a lower threshold, or with no threshold at all, so that some results are still returned.
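The workaround could look roughly like the sketch below. `search_with_fallback` is a hypothetical helper, the threshold ladder `(0.8, 0.6)` is an arbitrary example, and `search_fn` is assumed to accept an optional `score_threshold` keyword the way `vectorstore.similarity_search` does in the snippets above.

```python
from typing import Any, Callable, List, Sequence


def search_with_fallback(
    search_fn: Callable[..., List[Any]],
    query: str,
    thresholds: Sequence[float] = (0.8, 0.6),  # tried in order, strictest first
) -> List[Any]:
    """Try each threshold in turn, then fall back to an unthresholded search."""
    for threshold in thresholds:
        try:
            docs = search_fn(query, score_threshold=threshold)
        except ValueError:
            # Nothing scored above this threshold; try the next, looser one.
            continue
        if docs:
            return docs
    # Last resort: search without any threshold.
    try:
        return search_fn(query)
    except ValueError:
        # Still nothing (e.g. an empty store); let the caller handle an empty list.
        return []
```

A call such as `search_with_fallback(vectorstore.similarity_search, "What is AI?")` would then return the strictest non-empty result set. Results from the unthresholded fallback may be weakly related to the query, so downstream prompts should not treat them as authoritative.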
Prevention
Design your RAG pipeline to dynamically adjust or validate similarity score thresholds based on vector store statistics and embedding quality to avoid empty search results.
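One way to validate a threshold against observed scores is to clamp it so that a minimum number of documents always pass. The sketch below is an assumption-laden illustration: `clamp_threshold` is a hypothetical helper, and `scores` is assumed to hold similarity scores where higher means more similar (for example, scores collected from a scored search over a sample of queries), with a filter that keeps scores greater than or equal to the threshold.

```python
from typing import Sequence


def clamp_threshold(
    requested: float,
    scores: Sequence[float],
    min_results: int = 1,
) -> float:
    """Lower `requested` just enough that at least `min_results` scores pass.

    Assumes higher scores mean higher similarity and that a document passes
    when its score is >= the threshold.
    """
    if min_results < 1:
        raise ValueError("min_results must be at least 1")
    if not scores:
        # No scores usually means the store is empty; lowering cannot help.
        raise ValueError("no scores available; is the vector store populated?")
    ranked = sorted(scores, reverse=True)
    # Score of the k-th best document (or the worst one if fewer exist).
    kth_best = ranked[min(min_results, len(ranked)) - 1]
    return min(requested, kth_best)
```

If the library's filter is strictly greater-than rather than greater-or-equal, the returned value would need to be nudged slightly lower. Stores that report distances instead of similarities (lower is better) would need the comparison reversed.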