How to store and retrieve memories with Pinecone
Quick answer
Use Pinecone to store AI memories by embedding text with an embedding model (e.g., text-embedding-3-small) and upserting the vectors into a Pinecone index. Retrieve memories by querying the index with a query embedding to find the most relevant stored vectors.
Prerequisites
- Python 3.8+
- Pinecone API key
- OpenAI API key (for embeddings)
- pip install openai>=1.0 pinecone-client
Setup
Install the required Python packages and set environment variables for your Pinecone and OpenAI API keys.
- Install packages: pip install openai pinecone-client
- Set environment variables: export PINECONE_API_KEY=your_pinecone_key and export OPENAI_API_KEY=your_openai_key
Running pip install openai pinecone-client should produce output like:
Collecting openai
Collecting pinecone-client
Successfully installed openai-1.x.x pinecone-client-x.x.x
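To fail fast before running the examples, you can check that both keys are visible to Python. A minimal sketch, assuming only the two variables exported above:
import os

# Verify the API keys are set before creating any clients
for var in ("PINECONE_API_KEY", "OPENAI_API_KEY"):
    if not os.environ.get(var):
        raise RuntimeError(f"{var} is not set")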
Step by step
This example shows how to embed text memories using OpenAI embeddings, store them in a Pinecone index, and retrieve relevant memories by querying with a new text input.
import os
from openai import OpenAI
from pinecone import Pinecone, ServerlessSpec

# Initialize clients
openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Create the Pinecone index if it does not already exist
index_name = "memory-index"
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=1536,  # must match the embedding model (1536 for text-embedding-3-small)
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),  # adjust cloud/region to your project
    )
index = pc.Index(index_name)

# Function to embed text
def embed_text(text: str) -> list:
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding

# Store memories as (id, vector, metadata) tuples
memories = [
    ("id1", "Met Alice at the conference."),
    ("id2", "Discussed AI ethics with Bob."),
    ("id3", "Lunch meeting about project X."),
]
vectors = [(mem_id, embed_text(text), {"text": text}) for mem_id, text in memories]
index.upsert(vectors=vectors)
print("Memories stored in Pinecone index.")

# Retrieve memories: embed the query and search for the nearest stored vectors
query = "Who did I talk about AI ethics with?"
query_embedding = embed_text(query)
results = index.query(vector=query_embedding, top_k=2, include_metadata=True)
print("Retrieved memories:")
for match in results.matches:
    print(f"- {match.metadata['text']} (score: {match.score:.4f})")
Output:
Memories stored in Pinecone index.
Retrieved memories:
- Discussed AI ethics with Bob. (score: 0.9123)
- Met Alice at the conference. (score: 0.6789)
Common variations
You can make the embedding calls asynchronous with asyncio and OpenAI's AsyncOpenAI client, as shown below. You can also switch embedding models (create the index with the matching dimension, e.g., 3072 for text-embedding-3-large) or use other vector databases with similar APIs. For large-scale memory, batching upserts and queries improves performance; see the batch sketch after the async example.
import asyncio
import os
from openai import AsyncOpenAI
from pinecone import Pinecone

# AsyncOpenAI exposes embeddings.create as a coroutine
async def async_embed_text(client: AsyncOpenAI, text: str) -> list:
    response = await client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding

async def main():
    openai_client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
    index = pc.Index("memory-index")
    query = "Where did I meet Alice?"
    query_embedding = await async_embed_text(openai_client, query)
    # The Pinecone query itself runs synchronously; only the embedding is awaited
    results = index.query(vector=query_embedding, top_k=1, include_metadata=True)
    for match in results.matches:
        print(f"Async retrieved: {match.metadata['text']}")

asyncio.run(main())
Output:
Async retrieved: Met Alice at the conference.
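For bulk writes, it is usually faster to upsert vectors in fixed-size chunks than one record at a time. A minimal batching sketch, reusing embed_text and the index from the main example; the helper name store_memories_batched and the batch size of 100 are illustrative choices, not part of the Pinecone API:
BATCH_SIZE = 100  # illustrative; tune to your record sizes

# Hypothetical helper: embed all memories, then upsert them in chunks
def store_memories_batched(index, memories):
    vectors = [(mem_id, embed_text(text), {"text": text}) for mem_id, text in memories]
    for start in range(0, len(vectors), BATCH_SIZE):
        index.upsert(vectors=vectors[start:start + BATCH_SIZE])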
Troubleshooting
- If you get "Index does not exist", ensure you created the index with the correct name and dimension; the check below shows one way to verify.
- If embeddings fail, verify your OpenAI API key and model name.
- For slow queries, batch your upserts and queries, or increase top_k carefully.
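One way to confirm the index exists and that its dimension matches your embedding model, assuming the pc client from the setup above:
# Verify the index exists and its dimension matches the embedding size
if "memory-index" not in pc.list_indexes().names():
    print("Index missing: create it before upserting or querying.")
else:
    description = pc.describe_index("memory-index")
    print(f"Index dimension: {description.dimension}")  # should be 1536 for text-embedding-3-small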
Key takeaways
- Use OpenAI embeddings to convert text memories into vectors for Pinecone storage.
- Upsert vectors with metadata into a Pinecone index to store memories efficiently.
- Query Pinecone with an embedding of your query text to retrieve relevant memories.
- Batch operations and async calls improve performance for large memory sets.
- Always verify index existence and API keys to avoid common errors.