How-to · Intermediate · 4 min read

How to add search to a chatbot

Quick answer
Add search to a chatbot by embedding your documents and each user query with OpenAI embeddings, storing the document vectors in a vector index such as FAISS, and retrieving the most similar documents at query time. Include the retrieved documents as context in the chat completion prompt so the model can ground its answers.

Prerequisites

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install openai faiss-cpu numpy

Setup

Install required packages and set your OpenAI API key as an environment variable.

  • Install packages: openai for embeddings and chat, faiss-cpu for vector search, and numpy for numeric operations.
  • Set environment variable OPENAI_API_KEY with your API key.
bash
pip install openai faiss-cpu numpy
output
Collecting openai
Collecting faiss-cpu
Collecting numpy
Successfully installed openai faiss-cpu numpy-1.25.0
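
Setting the API key (the environment variable mentioned above) can be done from the shell; on Linux or macOS, for example (the key value below is a placeholder, not a real key):

```shell
# Export the key for the current shell session
# (replace the placeholder with your real API key)
export OPENAI_API_KEY="sk-your-key-here"
```

On Windows PowerShell, use `$Env:OPENAI_API_KEY = "sk-your-key-here"` instead. For a persistent setting, add the export line to your shell profile.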

Step by step

This example shows how to embed documents, build a FAISS index, run a similarity search on the user query, and pass the retrieved context into a chatbot prompt with gpt-4o-mini.

python
import os
import numpy as np
import faiss
from openai import OpenAI

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample documents to index
documents = [
    "Python is a popular programming language.",
    "OpenAI provides powerful AI models.",
    "FAISS is a library for efficient similarity search.",
    "Chatbots can be enhanced with search capabilities."
]

# Embed documents
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=documents
)
embeddings = np.array([data.embedding for data in response.data], dtype=np.float32)

# Build FAISS index
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)

# Function to search documents
def search(query, k=2):
    query_resp = client.embeddings.create(model="text-embedding-3-small", input=[query])
    query_vec = np.array([query_resp.data[0].embedding], dtype=np.float32)
    distances, indices = index.search(query_vec, k)
    return [documents[i] for i in indices[0]]

# User query
user_query = "How can I improve my chatbot?"

# Retrieve relevant documents
results = search(user_query)
context = "\n".join(results)

# Create chat prompt with context
messages = [
    {"role": "system", "content": "You are a helpful assistant that uses the following context to answer questions."},
    {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {user_query}"}
]

# Generate chat completion
chat_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)

print("Chatbot response:", chat_response.choices[0].message.content)
output
Chatbot response: To improve your chatbot, you can integrate search capabilities that retrieve relevant information dynamically. Using embeddings and a vector search like FAISS allows your chatbot to provide more accurate and context-aware answers.

Common variations

You can adapt this approach by:

  • Using asynchronous calls with asyncio and the AsyncOpenAI client.
  • Switching to other vector stores like Chroma or Pinecone for scalable search.
  • Using different embedding models such as text-embedding-3-large for higher quality.
  • Streaming chat completions for real-time response display.
python
import asyncio
import os
from openai import AsyncOpenAI

async def async_search_and_chat():
    # The async client exposes the same methods, awaited
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

    # Embed documents asynchronously
    documents = ["Example doc 1", "Example doc 2"]
    embed_resp = await client.embeddings.create(model="text-embedding-3-small", input=documents)
    embeddings = [data.embedding for data in embed_resp.data]

    # (Vector store code omitted for brevity)

    # Async chat completion
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
    chat_resp = await client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    print(chat_resp.choices[0].message.content)

asyncio.run(async_search_and_chat())
output
Hello! How can I assist you today?

Troubleshooting

  • If embedding calls fail or return empty results, verify that OPENAI_API_KEY is set correctly.
  • If the FAISS search returns no results, confirm that embeddings were generated and added to the index before searching.
  • For slow responses, consider caching embeddings or using smaller models.
  • Check for API rate limits and handle exceptions gracefully.
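
For the caching suggestion above, a minimal in-memory sketch: `cached_embedding` and `embed_fn` are hypothetical names for illustration (not part of the OpenAI SDK), and `embed_fn` stands in for whatever function actually calls the embeddings API.

```python
import hashlib

_embedding_cache = {}

def cached_embedding(text, embed_fn):
    """Return the embedding for text, calling embed_fn only on a cache miss."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in _embedding_cache:
        _embedding_cache[key] = embed_fn(text)
    return _embedding_cache[key]

# Demo with a fake embedder that counts how often it is called
calls = []
def fake_embed(text):
    calls.append(text)
    return [float(len(text))]

cached_embedding("hello world", fake_embed)
cached_embedding("hello world", fake_embed)
print(len(calls))  # the second call hits the cache, so this prints 1
```

For a long-running service you would swap the dict for a persistent store (e.g. SQLite or Redis), but the cache-key idea is the same.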

Key Takeaways

  • Use OpenAI embeddings to convert text into vectors for semantic search.
  • Store embeddings in a vector index such as FAISS (or a managed vector database) for efficient similarity retrieval.
  • Incorporate retrieved documents as context in chatbot prompts to improve answer relevance.
Verified 2026-04 · gpt-4o-mini, text-embedding-3-small