Fix wrong results in memory retrieval
Quick answer
To fix wrong results in memory retrieval, ensure your embedding model matches your retrieval index and that vectors are normalized. Use precise query embeddings and tune similarity thresholds to improve accuracy. Also, verify your document preprocessing and chunking strategies to maintain relevant context.
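Normalization is the key step that makes inner-product scores behave as cosine similarity. A minimal sketch with NumPy using toy 2-D vectors (not real embeddings) to show the idea:

```python
import numpy as np

def normalize(vecs):
    # Divide each row by its L2 norm so a plain dot product
    # between rows equals cosine similarity
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / norms

a = np.array([[3.0, 4.0]])
b = np.array([[6.0, 8.0]])  # same direction as a, twice the magnitude
an, bn = normalize(a), normalize(b)
sim = float(an @ bn.T)  # ~1.0: identical direction, maximal cosine similarity
```

Without normalization, the raw inner product of `a` and `b` would be inflated by vector magnitude; after normalization only direction matters, which is what you want for semantic retrieval.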
Prerequisites
- Python 3.8+
- OpenAI API key (free tier works)
- pip install openai>=1.0
- pip install faiss-cpu or chromadb
Setup
Install necessary packages for embeddings and vector search. Set your OpenAI API key as an environment variable.
pip install openai faiss-cpu

Output:
Collecting openai
Collecting faiss-cpu
Successfully installed openai-1.x faiss-cpu-1.x
Step by step
This example shows how to embed documents and queries consistently, build a FAISS index, and retrieve relevant memory with correct results.
import os
from openai import OpenAI
import faiss
import numpy as np

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample documents to store in memory
documents = [
    "Python is a versatile programming language.",
    "OpenAI provides powerful AI models.",
    "FAISS is a library for efficient similarity search.",
]

# Embed documents and queries with the same model
def embed_texts(texts):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts,
    )
    embeddings = [np.array(data.embedding, dtype=np.float32) for data in response.data]
    # Normalize embeddings so inner product equals cosine similarity
    embeddings = [emb / np.linalg.norm(emb) for emb in embeddings]
    return np.vstack(embeddings)

# Create FAISS index
doc_embeddings = embed_texts(documents)
index = faiss.IndexFlatIP(doc_embeddings.shape[1])  # Inner product on normalized vectors = cosine similarity
index.add(doc_embeddings)

# Embed the query with the same model and normalization
query = "What library helps with vector search?"
query_embedding = embed_texts([query])

# Search top 2 relevant docs
k = 2
D, I = index.search(query_embedding, k)
print("Query:", query)
print("Top documents:")
for idx in I[0]:
    print(f"- {documents[idx]}")

Output:
Query: What library helps with vector search?
Top documents:
- FAISS is a library for efficient similarity search.
- OpenAI provides powerful AI models.
Common variations
You can use ChromaDB or other vector stores instead of FAISS. For async calls, use the async OpenAI client methods. Adjust similarity thresholds or the number of retrieved results (`k` in FAISS, `n_results` in ChromaDB) to control how much memory is retrieved. Always use the same embedding model for both documents and queries to avoid mismatches.
import os
from openai import OpenAI
import chromadb

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Initialize Chroma client
chroma_client = chromadb.Client()
collection = chroma_client.create_collection(name="memory")

# Embed and add documents
texts = ["Python is versatile.", "OpenAI models are powerful."]
embeddings = client.embeddings.create(model="text-embedding-3-small", input=texts)
vectors = [data.embedding for data in embeddings.data]
for i, text in enumerate(texts):
    collection.add(ids=[str(i)], documents=[text], embeddings=[vectors[i]])

# Query with an embedding from the same model
query = "Which AI models are strong?"
query_embedding = client.embeddings.create(model="text-embedding-3-small", input=[query]).data[0].embedding
results = collection.query(query_embeddings=[query_embedding], n_results=1)
print("Top result:", results["documents"][0][0])

Output:
Top result: OpenAI models are powerful.
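When tuning similarity thresholds, it helps to see the mechanics in isolation. The sketch below implements thresholded top-k search in plain NumPy over toy 2-D vectors; the function name `search_with_threshold` and the `min_sim` cutoff of 0.5 are illustrative choices, not part of any library API. It assumes vectors are already L2-normalized, as in the FAISS example above:

```python
import numpy as np

def search_with_threshold(query_vec, doc_vecs, k=3, min_sim=0.3):
    # Dot product on L2-normalized vectors = cosine similarity
    sims = doc_vecs @ query_vec
    # Rank candidates by similarity, take the top k
    top = np.argsort(-sims)[:k]
    # Drop candidates that fall below the similarity threshold
    return [(int(i), float(sims[i])) for i in top if sims[i] >= min_sim]

docs = np.array([
    [1.0, 0.0],      # identical direction to the query
    [0.7071, 0.7071],  # 45 degrees away
    [0.0, 1.0],      # orthogonal: irrelevant
])
query = np.array([1.0, 0.0])
print(search_with_threshold(query, docs, k=3, min_sim=0.5))
# [(0, 1.0), (1, 0.7071)] -- the orthogonal document is filtered out
```

Raising `min_sim` trades recall for precision: irrelevant "memories" stop leaking into results, at the cost of occasionally dropping a borderline-relevant one.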
Troubleshooting
- If retrieval returns irrelevant results, verify that document and query embeddings use the same model and are normalized.
- Check that your vector store index is updated after adding new documents.
- Adjust similarity thresholds or increase `k` in search to get more candidates.
- Ensure text chunking preserves semantic context; avoid overly large or tiny chunks.
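One common chunking strategy is fixed-size word windows with overlap, so that context spanning a chunk boundary appears in both neighbors. A minimal sketch (the function `chunk_text` and its default sizes are illustrative, not from any library; production systems often chunk by tokens or sentences instead of words):

```python
def chunk_text(text, chunk_size=50, overlap=10):
    # Split on whitespace; each chunk shares `overlap` words with
    # the previous one so boundary context is preserved
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunk = words[start:start + chunk_size]
        if chunk:
            chunks.append(" ".join(chunk))
        if start + chunk_size >= len(words):
            break
    return chunks
```

For example, a 100-word document with `chunk_size=50` and `overlap=10` yields three chunks, where the last 10 words of each chunk reappear at the start of the next. Each chunk is then embedded and indexed individually, as in the FAISS example above.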
Key Takeaways
- Use the same embedding model for documents and queries to ensure vector compatibility.
- Normalize embeddings to improve cosine similarity accuracy in retrieval.
- Keep your vector index updated after adding or modifying documents.
- Tune similarity thresholds and number of retrieved candidates to balance recall and precision.