How-to · Beginner · 3 min read

How to store and retrieve memories with Pinecone

Quick answer
Use Pinecone to store AI memories by embedding text with an embedding model (e.g., text-embedding-3-small) and upserting vectors into a Pinecone index. Retrieve memories by querying the index with a query embedding to find the most relevant stored vectors.

PREREQUISITES

  • Python 3.8+
  • Pinecone API key
  • OpenAI API key (for embeddings)
  • pip install "openai>=1.0" pinecone-client (quote the version specifier so the shell doesn't treat >= as a redirect)

Setup

Install the required Python packages and set environment variables for your Pinecone and OpenAI API keys.

  • Install packages: pip install openai pinecone-client
  • Set environment variables: export PINECONE_API_KEY=your_pinecone_key and export OPENAI_API_KEY=your_openai_key
bash
pip install openai pinecone-client
output
Collecting openai
Collecting pinecone-client
Successfully installed openai-1.x.x pinecone-client-x.x.x
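Before running the examples, it's worth confirming both keys are actually visible to Python. A minimal check (the `missing_keys` helper is illustrative, not part of either SDK):

```python
import os

# Report which required API keys are absent from the environment.
def missing_keys(env, required=("PINECONE_API_KEY", "OPENAI_API_KEY")):
    return [k for k in required if not env.get(k)]

missing = missing_keys(os.environ)
if missing:
    print("Set these before continuing:", ", ".join(missing))
else:
    print("All required keys are set.")
```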

Step by step

This example shows how to embed text memories using OpenAI embeddings, store them in a Pinecone index, and retrieve relevant memories by querying with a new text input.

python
import os
from openai import OpenAI
import pinecone

# Initialize clients
openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pc = pinecone.Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Create or connect to Pinecone index
index_name = "memory-index"
if index_name not in pc.list_indexes().names():
    pc.create_index(
        index_name,
        dimension=1536,  # matches text-embedding-3-small
        metric="cosine",
        spec=pinecone.ServerlessSpec(cloud="aws", region="us-east-1"),
    )
index = pc.Index(index_name)

# Function to embed text with OpenAI
def embed_text(text: str) -> list:
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

# Store memories
memories = [
    ("id1", "Met Alice at the conference."),
    ("id2", "Discussed AI ethics with Bob."),
    ("id3", "Lunch meeting about project X.")
]

vectors = [(mem_id, embed_text(text), {"text": text}) for mem_id, text in memories]
index.upsert(vectors)
print("Memories stored in Pinecone index.")

# Retrieve memories
query = "Who did I talk about AI ethics with?"
query_embedding = embed_text(query)
results = index.query(vector=query_embedding, top_k=2, include_metadata=True)

print("Retrieved memories:")
for match in results.matches:
    print(f"- {match.metadata['text']} (score: {match.score:.4f})")
output
Memories stored in Pinecone index.
Retrieved memories:
- Discussed AI ethics with Bob. (score: 0.9123)
- Met Alice at the conference. (score: 0.6789)
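The scores in the output are cosine similarities, because the index was created with metric="cosine". A minimal sketch of the computation on toy 2-D vectors (real embeddings have 1536 dimensions):

```python
import math

# Cosine similarity: dot product divided by the product of vector lengths.
# Closer meaning -> embeddings point in closer directions -> higher score.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(round(cosine([1.0, 0.0], [1.0, 0.0]), 4))  # identical direction -> 1.0
print(round(cosine([1.0, 0.0], [0.0, 1.0]), 4))  # orthogonal -> 0.0
```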

Common variations

You can run embedding calls asynchronously with asyncio, swap in a different embedding model, or use another vector database with a similar API. For large memory sets, batching upserts and queries improves throughput.

python
import asyncio
import os

from openai import AsyncOpenAI
import pinecone

async def async_embed_text(client: AsyncOpenAI, text: str) -> list:
    response = await client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

async def main():
    openai_client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    pc = pinecone.Pinecone(api_key=os.environ["PINECONE_API_KEY"])
    index = pc.Index("memory-index")

    query = "Where did I meet Alice?"
    query_embedding = await async_embed_text(openai_client, query)
    results = index.query(vector=query_embedding, top_k=1, include_metadata=True)

    for match in results.matches:
        print(f"Async retrieved: {match.metadata['text']}")

asyncio.run(main())
output
Async retrieved: Met Alice at the conference.
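For the batching variation, upserting one vector at a time is slow; Pinecone accepts lists of vectors, and chunking into groups of around 100 is a common pattern. A sketch of the chunking logic (the `batched` helper is mine, and stub vectors stand in for real embeddings so it runs on its own; in real use, each batch goes to index.upsert):

```python
# Split a large vector list into fixed-size batches for upserting.
BATCH_SIZE = 100

def batched(items, size):
    """Yield successive fixed-size slices of a list."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Stub vectors in the (id, values, metadata) shape used above.
vectors = [(f"id{i}", [0.0] * 1536, {"text": f"memory {i}"}) for i in range(250)]

batches = list(batched(vectors, BATCH_SIZE))
print(f"{len(vectors)} vectors -> {len(batches)} batches")
# In real use: for batch in batches: index.upsert(batch)
```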

Troubleshooting

  • If you see an "Index does not exist" error, confirm the index name and dimension match what you created.
  • If embedding calls fail, verify your OpenAI API key and the model name.
  • For slow operations, batch your upserts and keep top_k as small as your use case allows.
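A frequent cause of the first two errors is a dimension mismatch between the index and the embedding model. A small lookup table (the helper is mine; the dimensions are the defaults for OpenAI's current embedding models) makes the expectation explicit:

```python
# Default output dimensions for OpenAI embedding models.
MODEL_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

def expected_dimension(model: str) -> int:
    """Return the index dimension required for a given embedding model."""
    if model not in MODEL_DIMS:
        raise ValueError(f"Unknown embedding model: {model}")
    return MODEL_DIMS[model]

print(expected_dimension("text-embedding-3-small"))  # 1536, as used above
```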

Key Takeaways

  • Use OpenAI embeddings to convert text memories into vectors for Pinecone storage.
  • Upsert vectors with metadata into a Pinecone index to store memories efficiently.
  • Query Pinecone with an embedding of your query text to retrieve relevant memories.
  • Batch operations and async calls improve performance for large memory sets.
  • Always verify index existence and API keys to avoid common errors.
Verified 2026-04 · text-embedding-3-small, gpt-4o