Fix wrong results in memory retrieval
Quick answer
To fix wrong results in memory retrieval, ensure your embedding model matches your retrieval index and that vectors are normalized. Use precise query embeddings and tune similarity thresholds to improve accuracy. Also, verify your document preprocessing and chunking strategies to maintain relevant context.
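Normalization is the key step that makes inner-product scores behave as cosine similarity. A minimal sketch with NumPy using toy 2-D vectors (not real embeddings) to show the idea:

```python
import numpy as np

def normalize(vecs):
    # Divide each row by its L2 norm so a plain dot product
    # between rows equals cosine similarity
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / norms

a = np.array([[3.0, 4.0]])
b = np.array([[6.0, 8.0]])  # same direction as a, twice the magnitude
an, bn = normalize(a), normalize(b)
sim = float(an @ bn.T)  # ~1.0: identical direction, maximal cosine similarity
```

Without normalization, the raw inner product of `a` and `b` would be inflated by vector magnitude; after normalization only direction matters, which is what you want for semantic retrieval.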
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install openai>=1.0
- pip install faiss-cpu or chromadb
Setup
Install necessary packages for embeddings and vector search. Set your OpenAI API key as an environment variable.
pip install openai faiss-cpu

Output:
Collecting openai
Collecting faiss-cpu
Successfully installed openai-1.x faiss-cpu-1.x
Step by step
This example shows how to embed documents and queries consistently, build a FAISS index, and retrieve relevant memory with correct results.
import os
from openai import OpenAI
import faiss
import numpy as np

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample documents to store in memory
documents = [
    "Python is a versatile programming language.",
    "OpenAI provides powerful AI models.",
    "FAISS is a library for efficient similarity search.",
]

# Embed documents and queries with the same model
def embed_texts(texts):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
    )
    embeddings = [np.array(data.embedding, dtype=np.float32) for data in response.data]
    # Normalize embeddings so inner product equals cosine similarity
    embeddings = [emb / np.linalg.norm(emb) for emb in embeddings]
    return np.vstack(embeddings)

# Create FAISS index
doc_embeddings = embed_texts(documents)
index = faiss.IndexFlatIP(doc_embeddings.shape[1])  # Inner product on normalized vectors = cosine similarity
index.add(doc_embeddings)

# Embed the query with the same model and normalization
query = "What library helps with vector search?"
query_embedding = embed_texts([query])

# Search top 2 relevant docs
k = 2
D, I = index.search(query_embedding, k)
print("Query:", query)
print("Top documents:")
for idx in I[0]:
    print(f"- {documents[idx]}")

Output:
Query: What library helps with vector search?
Top documents:
- FAISS is a library for efficient similarity search.
- OpenAI provides powerful AI models.
Common variations
You can use ChromaDB or other vector stores instead of FAISS. For async calls, use the async OpenAI client methods. Adjust similarity thresholds or the number of retrieved results (`k` in FAISS, `n_results` in ChromaDB) to control how much memory is retrieved. Always use the same embedding model for both documents and queries to avoid mismatches.
import os
from openai import OpenAI
import chromadb

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Initialize Chroma client
chroma_client = chromadb.Client()
collection = chroma_client.create_collection(name="memory")

# Embed and add documents
texts = ["Python is versatile.", "OpenAI models are powerful."]
embeddings = client.embeddings.create(model="text-embedding-3-small", input=texts)
vectors = [data.embedding for data in embeddings.data]
for i, text in enumerate(texts):
    collection.add(ids=[str(i)], documents=[text], embeddings=[vectors[i]])

# Query with an embedding from the same model
query = "Which AI models are strong?"
query_embedding = client.embeddings.create(model="text-embedding-3-small", input=[query]).data[0].embedding
results = collection.query(query_embeddings=[query_embedding], n_results=1)
print("Top result:", results["documents"][0][0])

Output:
Top result: OpenAI models are powerful.
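When tuning similarity thresholds, it helps to see the mechanics in isolation. The sketch below implements thresholded top-k search in plain NumPy over toy 2-D vectors; the function name `search_with_threshold` and the `min_sim` cutoff of 0.5 are illustrative choices, not part of any library API. It assumes vectors are already L2-normalized, as in the FAISS example above:

```python
import numpy as np

def search_with_threshold(query_vec, doc_vecs, k=3, min_sim=0.3):
    # Dot product on L2-normalized vectors = cosine similarity
    sims = doc_vecs @ query_vec
    # Rank candidates by similarity, take the top k
    top = np.argsort(-sims)[:k]
    # Drop candidates that fall below the similarity threshold
    return [(int(i), float(sims[i])) for i in top if sims[i] >= min_sim]

docs = np.array([
    [1.0, 0.0],      # identical direction to the query
    [0.7071, 0.7071],  # 45 degrees away
    [0.0, 1.0],      # orthogonal: irrelevant
])
query = np.array([1.0, 0.0])
print(search_with_threshold(query, docs, k=3, min_sim=0.5))
# [(0, 1.0), (1, 0.7071)] -- the orthogonal document is filtered out
```

Raising `min_sim` trades recall for precision: irrelevant "memories" stop leaking into results, at the cost of occasionally dropping a borderline-relevant one.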
Troubleshooting
- If retrieval returns irrelevant results, verify that document and query embeddings use the same model and are normalized.
- Check that your vector store index is updated after adding new documents.
- Adjust similarity thresholds or increase `k` in search to get more candidates.
- Ensure text chunking preserves semantic context; avoid overly large or tiny chunks.
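One common chunking strategy is fixed-size word windows with overlap, so that context spanning a chunk boundary appears in both neighbors. A minimal sketch (the function `chunk_text` and its default sizes are illustrative, not from any library; production systems often chunk by tokens or sentences instead of words):

```python
def chunk_text(text, chunk_size=50, overlap=10):
    # Split on whitespace; each chunk shares `overlap` words with
    # the previous one so boundary context is preserved
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunk = words[start:start + chunk_size]
        if chunk:
            chunks.append(" ".join(chunk))
        if start + chunk_size >= len(words):
            break
    return chunks
```

For example, a 100-word document with `chunk_size=50` and `overlap=10` yields three chunks, where the last 10 words of each chunk reappear at the start of the next. Each chunk is then embedded and indexed individually, as in the FAISS example above.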
Key Takeaways
- Use the same embedding model for documents and queries to ensure vector compatibility.
- Normalize embeddings to improve cosine similarity accuracy in retrieval.
- Keep your vector index updated after adding or modifying documents.
- Tune similarity thresholds and number of retrieved candidates to balance recall and precision.