How-to · Beginner · 3 min read

Semantic Kernel memory explained

Quick answer
In Semantic Kernel, memory refers to the capability to store, retrieve, and manage contextual information across AI interactions, enabling stateful conversations and knowledge retention. It abstracts persistent or transient storage to enhance AI responses by providing relevant prior context.
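Before touching the real API, it helps to see the idea in plain Python: a memory store is a lookup of saved context that you consult before each model call. This is a toy sketch of the concept, not the Semantic Kernel API:

```python
class ToyMemory:
    """Toy illustration of the memory idea: store text under a key, read it back later."""

    def __init__(self):
        self._store = {}

    def save(self, key, text):
        self._store[key] = text

    def get(self, key):
        # Returns None when nothing was saved under this key
        return self._store.get(key)

memory = ToyMemory()
memory.save("user_profile", "Alice likes sci-fi books")
print(memory.get("user_profile"))  # → Alice likes sci-fi books
```

Semantic Kernel layers embeddings on top of this idea, so entries can also be found by meaning rather than only by exact key.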

PREREQUISITES

  • Python 3.10+ (required by recent semantic-kernel releases)
  • pip install semantic-kernel
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0"

Setup

Install the semantic-kernel Python package and set your OpenAI API key as an environment variable to enable AI model access.
```bash
pip install semantic-kernel openai
```
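The setup note above mentions the environment variable but doesn't show it; a minimal way to set it for the current shell session (replace the placeholder with your real key):

```shell
# Expose the key so Python can read it via os.environ["OPENAI_API_KEY"]
export OPENAI_API_KEY="sk-your-key-here"
```

On Windows PowerShell, the equivalent is `$env:OPENAI_API_KEY = "sk-your-key-here"`.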

Step by step

This example creates a Kernel, registers a chat model and an embedding model (Semantic Kernel memory uses embeddings to index text), stores a fact about the user, and reuses it in a prompt. The snippet follows the 1.x Python API, where memory lives in a SemanticTextMemory wrapped around a store such as VolatileMemoryStore, and every memory call is awaited; exact imports may shift slightly between versions.
```python
import asyncio
import os

import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import (
    OpenAIChatCompletion,
    OpenAIChatPromptExecutionSettings,
    OpenAITextEmbedding,
)
from semantic_kernel.contents import ChatHistory
from semantic_kernel.memory import SemanticTextMemory, VolatileMemoryStore

async def main():
    # Initialize kernel
    kernel = sk.Kernel()

    # Add OpenAI chat completion service
    chat = OpenAIChatCompletion(
        service_id="chat",
        api_key=os.environ["OPENAI_API_KEY"],
        ai_model_id="gpt-4o-mini",
    )
    kernel.add_service(chat)

    # Memory needs an embedding model to index and look up text
    embeddings = OpenAITextEmbedding(
        service_id="embedding",
        api_key=os.environ["OPENAI_API_KEY"],
        ai_model_id="text-embedding-3-small",
    )

    # Create memory store (in-memory for demo)
    memory = SemanticTextMemory(
        storage=VolatileMemoryStore(), embeddings_generator=embeddings
    )

    # Save a memory entry
    await memory.save_information(
        collection="users", id="user_profile", text="Alice likes sci-fi books"
    )

    # Retrieve the entry by its key
    record = await memory.get(collection="users", key="user_profile")
    print("Retrieved memory:", record.text)

    # Use memory in a prompt
    history = ChatHistory()
    history.add_user_message(f"User info: {record.text}\nGenerate a book recommendation.")
    reply = await chat.get_chat_message_content(
        chat_history=history, settings=OpenAIChatPromptExecutionSettings()
    )
    print("AI response:", reply)

asyncio.run(main())
```
Output (the model's exact wording will vary):
```text
Retrieved memory: Alice likes sci-fi books
AI response: I recommend "Dune" by Frank Herbert, a classic sci-fi novel that explores complex themes and immersive world-building.
```

Common variations

VolatileMemoryStore keeps everything in process memory and is lost when the app exits; for long-term context, swap in one of the persistent memory connectors (for example Azure AI Search, Chroma, or Qdrant). Memory also pairs naturally with semantic search: save free-form text, then retrieve the most relevant entries by meaning rather than by exact key. Combine memory with custom plugins for dynamic, context-aware workflows; all memory methods are async, which suits scalable applications.
```python
import asyncio
import os

from semantic_kernel.connectors.ai.open_ai import OpenAITextEmbedding
from semantic_kernel.memory import SemanticTextMemory, VolatileMemoryStore

async def search_example():
    embeddings = OpenAITextEmbedding(
        api_key=os.environ["OPENAI_API_KEY"],
        ai_model_id="text-embedding-3-small",
    )
    memory = SemanticTextMemory(
        storage=VolatileMemoryStore(), embeddings_generator=embeddings
    )

    # Save free-form text; the store indexes it with embeddings
    await memory.save_information(
        collection="session", id="s1", text="The session topic is space exploration"
    )

    # Retrieve by meaning, not by key
    results = await memory.search(
        collection="session", query="What are we discussing?", limit=1
    )
    print("Best match:", results[0].text)

asyncio.run(search_example())
```
Output:
```text
Best match: The session topic is space exploration
```

Troubleshooting

If memory retrieval returns None, check that the entry was saved to the same collection under the same key, and that you awaited the save call. For persistent stores, verify the connection settings and credentials. Make sure OPENAI_API_KEY is set in the environment before any service is created, or os.environ["OPENAI_API_KEY"] will raise a KeyError.
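For the API-key case, a small fail-fast guard turns a confusing mid-run error into a clear message. This is a sketch; `require_api_key` is a hypothetical helper, not part of Semantic Kernel:

```python
import os

def require_api_key(env=os.environ):
    """Return the OpenAI key, or fail with a clear message before any kernel code runs."""
    key = env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before starting the app")
    return key

# Call once at startup, before constructing services:
# api_key = require_api_key()
```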

Key Takeaways

  • Semantic Kernel memory enables stateful AI interactions by storing and retrieving context.
  • Use in-memory or persistent stores depending on your application's needs.
  • Integrate memory with AI chat completions to provide personalized and context-aware responses.
Verified 2026-04 · gpt-4o-mini