How-to · Intermediate · 3 min read

How to build knowledge graph memory

Quick answer
Build knowledge graph memory by extracting entity-relation triples from text, embedding them with a model such as text-embedding-3-small, storing the vectors in a similarity search index such as FAISS, and answering questions with semantic search. Use the OpenAI SDK (or a similar client) to generate embeddings and manage graph-based memory efficiently.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0" faiss-cpu pydantic

Setup

Install required packages and set your environment variable for the OpenAI API key.

  • Install packages: openai for embeddings, faiss-cpu for vector search, and pydantic for structured data.
  • Set OPENAI_API_KEY in your environment.
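On macOS or Linux the key can be exported in the shell before running any script (the value below is a placeholder, not a real key):

```shell
# Placeholder value -- substitute your actual API key
export OPENAI_API_KEY="sk-your-key-here"
```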
bash
pip install "openai>=1.0" faiss-cpu pydantic
output
Collecting openai
Collecting faiss-cpu
Collecting pydantic
Successfully installed openai faiss-cpu pydantic-2.x.x

Step by step

This example extracts entities and relations from text, generates embeddings with OpenAI, stores them in FAISS, and queries the knowledge graph memory.

python
import os
from openai import OpenAI
import faiss
import numpy as np
from typing import List, Tuple

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample knowledge graph triples (entity, relation, entity)
triples = [
    ("Python", "is a", "programming language"),
    ("OpenAI", "developed", "gpt-4o"),
    ("FAISS", "is a", "vector similarity search library"),
    ("gpt-4o", "is", "a multimodal model")
]

# Function to create embedding for a text

def get_embedding(text: str) -> List[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

# Build vectors and metadata
texts = [f"{subj} {rel} {obj}" for subj, rel, obj in triples]
vectors = np.array([get_embedding(text) for text in texts], dtype='float32')

# Create FAISS index
index = faiss.IndexFlatL2(vectors.shape[1])
index.add(vectors)

# Query function

def query_knowledge_graph(query: str, top_k: int = 2) -> List[Tuple[str, float]]:
    q_vec = np.array([get_embedding(query)], dtype='float32')
    distances, indices = index.search(q_vec, top_k)
    # FAISS pads the result with -1 when the index holds fewer than top_k vectors
    return [(texts[i], float(d)) for i, d in zip(indices[0], distances[0]) if i != -1]

# Example query
query = "Who developed gpt-4o?"
results = query_knowledge_graph(query)

print("Query:", query)
print("Top matches:")
for text, dist in results:
    print(f"- {text} (distance: {dist:.4f})")
output
Query: Who developed gpt-4o?
Top matches:
- OpenAI developed gpt-4o (distance: 0.61)
- gpt-4o is a multimodal model (distance: 1.12)
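The triples above are plain tuples. Since pydantic is already installed, one option is to validate them with a small model before embedding; this `Triple` class is a hypothetical sketch, not part of the example above:

```python
from pydantic import BaseModel


class Triple(BaseModel):
    subject: str
    relation: str
    object: str

    def as_text(self) -> str:
        # Flatten the triple into the sentence form that gets embedded
        return f"{self.subject} {self.relation} {self.object}"


t = Triple(subject="OpenAI", relation="developed", object="gpt-4o")
print(t.as_text())  # OpenAI developed gpt-4o
```

Because pydantic validates at construction time, a malformed triple raises an error before you pay for an embedding call.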

Common variations

You can parallelize embedding generation with asyncio to speed up batch processing. Alternative vector stores such as Chroma or Pinecone can replace FAISS when you need persistence or managed cloud hosting. Larger embedding models such as text-embedding-3-large improve retrieval accuracy at higher cost.

python
import asyncio
import os
from openai import AsyncOpenAI

# AsyncOpenAI exposes the same endpoints as OpenAI, but every call is awaitable
client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

async def get_embedding_async(text: str) -> list:
    response = await client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

async def main():
    texts = ["Python is a programming language", "OpenAI developed gpt-4o"]
    # gather fires the embedding requests concurrently
    embeddings = await asyncio.gather(*(get_embedding_async(t) for t in texts))
    print(embeddings)

if __name__ == "__main__":
    asyncio.run(main())
output
[[0.00123, 0.00456, ...], [0.00234, 0.00567, ...]]
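For large batches you can also send several texts in one request, since the embeddings endpoint accepts a list as `input`. A small pure-Python helper to split texts into request-sized batches (no API call involved) might look like:

```python
from typing import Iterable, List


def chunk(texts: List[str], size: int) -> Iterable[List[str]]:
    # Yield successive batches of at most `size` texts, each suitable
    # for passing as the `input` list of a single embeddings request
    for i in range(0, len(texts), size):
        yield texts[i:i + size]


batches = list(chunk(["a", "b", "c", "d", "e"], 2))
print(batches)  # [['a', 'b'], ['c', 'd'], ['e']]
```

Batching in one request cuts round-trip overhead compared with one request per text, with or without asyncio.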

Troubleshooting

  • If embeddings are empty or raise errors, verify your OPENAI_API_KEY is set correctly and has access.
  • If FAISS index search returns no results, ensure vectors are added before querying and that vector dimensions match.
  • For slow embedding generation, batch requests or use async calls to improve throughput.

Key Takeaways

  • Use OpenAI embeddings to convert knowledge graph triples into vector representations for memory storage.
  • Store and query embeddings efficiently with vector databases like FAISS for semantic retrieval.
  • Async embedding calls and alternative vector stores improve scalability and performance.
Verified 2026-04 · text-embedding-3-small, gpt-4o