How-to · Intermediate · 3 min read

How to use DeepSeek for RAG

Quick answer
Use DeepSeek's deepseek-chat model with a vector database to implement Retrieval-Augmented Generation (RAG). Query your vector store for relevant documents, then pass those as context in the messages parameter to client.chat.completions.create for informed responses.

PREREQUISITES

  • Python 3.8+
  • DeepSeek API key
  • pip install "openai>=1.0" (quoted so the shell doesn't treat >= as a redirect)
  • A vector store like FAISS or Chroma installed and populated

Setup

DeepSeek's API is OpenAI-compatible, so you use the official openai Python package. Install it and set your DeepSeek API key as an environment variable:

```bash
pip install "openai>=1.0"
export DEEPSEEK_API_KEY='your_api_key'
```

Step by step

This example shows how to query a vector store for relevant documents, then use deepseek-chat to generate an answer augmented by those documents.
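In a real pipeline, the document list comes from a similarity search over your vector store (FAISS, Chroma, etc.). As a dependency-free stand-in, here is a minimal sketch of retrieval using bag-of-words cosine similarity; the `retrieve` function and sample corpus are illustrative, not part of any DeepSeek or vector-store API:

```python
from collections import Counter
import math

def cosine_sim(a: Counter, b: Counter) -> float:
    # Dot product over shared terms, divided by the product of vector norms
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Rank documents by cosine similarity to the query and keep the top k
    q_vec = Counter(query.lower().split())
    scored = [(cosine_sim(q_vec, Counter(doc.lower().split())), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

corpus = [
    "DeepSeek is a Chinese AI company providing large language models.",
    "RAG combines retrieval with generation for better context-aware answers.",
    "Bananas are a good source of potassium.",
]
print(retrieve("How does RAG retrieval work?", corpus, k=1))
```

A production system would replace this with embedding vectors and an approximate-nearest-neighbor index, but the interface is the same: query in, ranked documents out.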

```python
import os
from openai import OpenAI

# Initialize DeepSeek client with base_url
client = OpenAI(api_key=os.environ["DEEPSEEK_API_KEY"], base_url="https://api.deepseek.com")

# Example: retrieved documents from your vector store (mocked here)
docs = [
    "DeepSeek is a Chinese AI company providing large language models.",
    "RAG combines retrieval with generation for better context-aware answers."
]

# Construct context prompt with retrieved docs
context = "\n\n".join(docs)

# User question
user_question = "What is DeepSeek and how does RAG work?"

# Prepare messages with context
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {user_question}"}
]

# Call DeepSeek chat completion
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages,
    max_tokens=512
)

print(response.choices[0].message.content)
```

Output:

```text
DeepSeek is a Chinese AI company that provides large language models. Retrieval-Augmented Generation (RAG) combines document retrieval with language generation to produce context-aware and accurate answers by leveraging relevant external information.
```
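Retrieved documents can overflow the model's context window, so it is common to trim them to a budget before joining them into the prompt. A minimal sketch; the 4-characters-per-token heuristic is a rough assumption, not DeepSeek's actual tokenizer:

```python
def fit_to_budget(docs: list[str], max_tokens: int = 3000) -> list[str]:
    """Keep docs in ranked order until the estimated token budget is spent."""
    budget_chars = max_tokens * 4  # rough heuristic: ~4 characters per token
    kept, used = [], 0
    for doc in docs:
        if used + len(doc) > budget_chars:
            break  # docs are assumed ranked, so stop at the first overflow
        kept.append(doc)
        used += len(doc)
    return kept

docs = ["short doc one", "short doc two", "x" * 10_000]
print(fit_to_budget(docs, max_tokens=100))  # the 10,000-char doc is dropped
```

Because the retriever returns documents in relevance order, truncating from the tail drops the least relevant material first.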

Common variations

You can adapt this pattern by:

  • Using different vector stores like FAISS or Chroma for document retrieval.
  • Adjusting max_tokens, or switching to another DeepSeek model such as deepseek-reasoner where its reasoning mode fits your queries.
  • Making concurrent requests with AsyncOpenAI from the same openai package.

Troubleshooting

  • If you get authentication errors, verify your DEEPSEEK_API_KEY environment variable is set correctly.
  • If responses are incomplete, increase max_tokens.
  • For connection issues, check your network and DeepSeek API status.
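For transient connection issues, wrapping the API call in retries with exponential backoff is a common pattern. A generic sketch: the `call` parameter stands in for any zero-argument function (e.g. a lambda around client.chat.completions.create), and the delays and broad `Exception` catch are assumptions you should tune:

```python
import time

def with_retries(call, attempts: int = 3, base_delay: float = 1.0):
    """Invoke call(); on failure, sleep base_delay * 2**i and retry."""
    for i in range(attempts):
        try:
            return call()
        except Exception:  # in practice, catch specific errors like openai.APIConnectionError
            if i == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** i)

# Usage (illustrative):
# answer = with_retries(lambda: client.chat.completions.create(
#     model="deepseek-chat", messages=messages, max_tokens=512))
```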

Key Takeaways

  • Use DeepSeek's deepseek-chat model with a vector store to implement RAG effectively.
  • Pass retrieved documents as context in the messages parameter for informed generation.
  • Always secure your API key via environment variables and handle errors gracefully.
Verified 2026-04 · deepseek-chat