How-to · Intermediate · 4 min read

How to add search to a chatbot

Quick answer
Add search to a chatbot by embedding your documents and each user query with OpenAI embeddings, storing the document vectors in a vector index such as FAISS, and retrieving the most similar documents at query time. Include the retrieved documents as context in the chat completion prompt so the model can ground its answers.

Prerequisites

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install openai faiss-cpu numpy

Setup

Install required packages and set your OpenAI API key as an environment variable.

  • Install packages: openai for embeddings and chat, faiss-cpu for vector search, and numpy for numeric operations.
  • Set environment variable OPENAI_API_KEY with your API key.
bash
pip install openai faiss-cpu numpy
output
Collecting openai
Collecting faiss-cpu
Collecting numpy
Successfully installed openai faiss-cpu numpy-1.25.0
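
Setting the API key (the environment variable mentioned above) can be done from the shell; on Linux or macOS, for example (the key value below is a placeholder, not a real key):

```shell
# Export the key for the current shell session
# (replace the placeholder with your real API key)
export OPENAI_API_KEY="sk-your-key-here"
```

On Windows PowerShell, use `$Env:OPENAI_API_KEY = "sk-your-key-here"` instead. For a persistent setting, add the export line to your shell profile.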

Step by step

This example shows how to embed documents, build a FAISS index, run a similarity search on the user query, and pass the retrieved context into a chatbot prompt with gpt-4o-mini.

python
import os
import numpy as np
import faiss
from openai import OpenAI

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample documents to index
documents = [
    "Python is a popular programming language.",
    "OpenAI provides powerful AI models.",
    "FAISS is a library for efficient similarity search.",
    "Chatbots can be enhanced with search capabilities."
]

# Embed documents
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=documents
)
embeddings = np.array([data.embedding for data in response.data], dtype=np.float32)

# Build FAISS index
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)

# Function to search documents
def search(query, k=2):
    query_resp = client.embeddings.create(model="text-embedding-3-small", input=[query])
    query_vec = np.array([query_resp.data[0].embedding], dtype=np.float32)
    distances, indices = index.search(query_vec, k)
    return [documents[i] for i in indices[0]]

# User query
user_query = "How can I improve my chatbot?"

# Retrieve relevant documents
results = search(user_query)
context = "\n".join(results)

# Create chat prompt with context
messages = [
    {"role": "system", "content": "You are a helpful assistant that uses the following context to answer questions."},
    {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {user_query}"}
]

# Generate chat completion
chat_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)

print("Chatbot response:", chat_response.choices[0].message.content)
output
Chatbot response: To improve your chatbot, you can integrate search capabilities that retrieve relevant information dynamically. Using embeddings and a vector search like FAISS allows your chatbot to provide more accurate and context-aware answers.

Common variations

You can adapt this approach by:

  • Using asynchronous calls with asyncio and the AsyncOpenAI client.
  • Switching to other vector stores like Chroma or Pinecone for scalable search.
  • Using different embedding models such as text-embedding-3-large for higher quality.
  • Streaming chat completions for real-time response display.
python
import asyncio
import os
from openai import AsyncOpenAI

async def async_search_and_chat():
    # The async client exposes the same methods, awaited
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

    # Embed documents asynchronously
    documents = ["Example doc 1", "Example doc 2"]
    embed_resp = await client.embeddings.create(model="text-embedding-3-small", input=documents)
    embeddings = [data.embedding for data in embed_resp.data]

    # (Vector store code omitted for brevity)

    # Async chat completion
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
    chat_resp = await client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    print(chat_resp.choices[0].message.content)

asyncio.run(async_search_and_chat())
output
Hello! How can I assist you today?

Troubleshooting

  • If embedding calls fail or return empty results, verify that OPENAI_API_KEY is set correctly.
  • If the FAISS search returns no results, confirm that embeddings were generated and added to the index before searching.
  • For slow responses, consider caching embeddings or using smaller models.
  • Check for API rate limits and handle exceptions gracefully.
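
For the caching suggestion above, a minimal in-memory sketch: `cached_embedding` and `embed_fn` are hypothetical names for illustration (not part of the OpenAI SDK), and `embed_fn` stands in for whatever function actually calls the embeddings API.

```python
import hashlib

_embedding_cache = {}

def cached_embedding(text, embed_fn):
    """Return the embedding for text, calling embed_fn only on a cache miss."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in _embedding_cache:
        _embedding_cache[key] = embed_fn(text)
    return _embedding_cache[key]

# Demo with a fake embedder that counts how often it is called
calls = []
def fake_embed(text):
    calls.append(text)
    return [float(len(text))]

cached_embedding("hello world", fake_embed)
cached_embedding("hello world", fake_embed)
print(len(calls))  # the second call hits the cache, so this prints 1
```

For a long-running service you would swap the dict for a persistent store (e.g. SQLite or Redis), but the cache-key idea is the same.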

Key Takeaways

  • Use OpenAI embeddings to convert text into vectors for semantic search.
  • Store embeddings in a vector index such as FAISS (or a managed vector database) for efficient similarity retrieval.
  • Incorporate retrieved documents as context in chatbot prompts to improve answer relevance.
Verified 2026-04 · gpt-4o-mini, text-embedding-3-small