How to search files with the OpenAI API
Quick answer
Use OpenAI embeddings to convert file contents into vectors, store them in a vector index such as FAISS, retrieve the most relevant file for a query, and pass that text as context to client.chat.completions.create. This enables semantic search over your files, with a chat model such as gpt-4o generating the final answer.
Prerequisites
- Python 3.8+
- OpenAI API key
- pip install openai>=1.0
- pip install faiss-cpu (or another vector store)
Setup
Install the OpenAI Python SDK and a vector store library like FAISS to handle embeddings and similarity search. Set your OpenAI API key as an environment variable.
- Install the OpenAI SDK and FAISS for vector search:
pip install openai faiss-cpu
- Set your API key as an environment variable:
export OPENAI_API_KEY='your_api_key' (Linux/macOS)
setx OPENAI_API_KEY "your_api_key" (Windows)
Step by step
This example loads text files, creates embeddings with OpenAI's text-embedding-3-small embedding model, indexes them with FAISS, and queries the index to find the most relevant file content for a user query. It then uses an OpenAI chat completion to generate a helpful answer based on the retrieved context.
import os
from openai import OpenAI
import faiss
import numpy as np
# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Sample files content
files = {
    "file1.txt": "Python is a versatile programming language used for web development, data analysis, and AI.",
    "file2.txt": "OpenAI provides powerful AI models like GPT-4 for natural language processing tasks.",
    "file3.txt": "FAISS is a library for efficient similarity search and clustering of dense vectors."
}
# Step 1: Create embeddings for each file content
embeddings = []
file_names = []
for fname, content in files.items():
    response = client.embeddings.create(
        input=content,
        model="text-embedding-3-small"  # dedicated embedding model (gpt-4o is a chat model)
    )
    vector = np.array(response.data[0].embedding, dtype=np.float32)
    embeddings.append(vector)
    file_names.append(fname)
embeddings = np.vstack(embeddings)
# Step 2: Build FAISS index
dimension = embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(embeddings)
# Step 3: Query vector for user question
query = "What library helps with vector similarity search?"
query_embedding_resp = client.embeddings.create(
    input=query,
    model="text-embedding-3-small"  # must match the model used to index the files
)
query_vector = np.array(query_embedding_resp.data[0].embedding, dtype=np.float32).reshape(1, -1)
# Step 4: Search top 1 nearest neighbor
k = 1
distances, indices = index.search(query_vector, k)
# Retrieve the most relevant file content
relevant_file = file_names[indices[0][0]]
relevant_text = files[relevant_file]
# Step 5: Use chat completion to answer based on retrieved context
messages = [
    {"role": "system", "content": "You are a helpful assistant that answers questions based on provided file content."},
    {"role": "user", "content": f"Context: {relevant_text}\n\nQuestion: {query}"}
]
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages
)
answer = response.choices[0].message.content
print(f"Answer:\n{answer}")
Output
Answer: FAISS is a library designed for efficient similarity search and clustering of dense vectors, making it ideal for vector similarity search tasks.
Common variations
- Use async calls with asyncio and await for embedding and chat requests to improve throughput on large datasets.
- Switch to other vector stores like Chroma or Pinecone for scalable, cloud-based search.
- Use a cheaper chat model like gpt-4o-mini, or a smaller embedding model like text-embedding-3-small instead of text-embedding-3-large, to reduce cost.
- Combine multiple file chunks with metadata for more granular search results.
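Swapping vector stores is easier once you see how small the core lookup actually is. As a minimal illustration of what FAISS's IndexFlatL2 computes (a brute-force sketch, not a replacement for FAISS at scale), nearest here is a hypothetical helper:

```python
import math

def nearest(query, vectors):
    """Return the index of the vector closest to query by L2 distance.
    Brute-force scan over all vectors; IndexFlatL2 performs the same
    exhaustive search, just heavily optimized."""
    def l2(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(range(len(vectors)), key=lambda i: l2(query, vectors[i]))

# Toy 2-D "embeddings" standing in for real 1536-dimensional ones
docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(nearest([0.9, 0.1], docs))  # closest to the first vector -> 0
```

Any vector store you switch to is ultimately answering this same question, usually with approximate indexes that trade a little accuracy for speed.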
Troubleshooting
- If embeddings are empty or errors occur, verify your API key is set correctly in os.environ["OPENAI_API_KEY"].
- If the FAISS index search returns no results, ensure embeddings are correctly generated and indexed.
- For large files, split text into smaller chunks before embedding to avoid token limits.
- Check network connectivity if API calls time out or fail.
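For the chunk-splitting step mentioned above, a simple character-based splitter with overlap is often enough as a starting point. chunk_text here is an illustrative helper (production pipelines usually split on tokens or sentence boundaries instead of raw characters):

```python
def chunk_text(text, max_chars=1000, overlap=100):
    """Split text into overlapping chunks so each stays under the model's
    token limit. max_chars is a rough character budget; the overlap keeps
    context that straddles a chunk boundary retrievable from either side."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # back up so consecutive chunks overlap
    return chunks
```

Each chunk is then embedded and indexed separately, with the source filename kept as metadata so search results can be traced back to the original file.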
Key Takeaways
- Use OpenAI embeddings to convert file contents into vectors for semantic search.
- Index embeddings with FAISS or other vector stores to enable fast similarity queries.
- Leverage OpenAI chat completions with retrieved context to generate precise answers.
- Split large files into chunks to handle token limits during embedding and chat.
- Always secure your API key via environment variables and handle errors gracefully.
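As one way to handle transient API errors gracefully, each call can be wrapped in a small retry helper with exponential backoff. with_retries is an illustrative sketch, not part of the OpenAI SDK:

```python
import time

def with_retries(fn, attempts=3, base_delay=0.5):
    """Call fn(); on failure, wait and retry with exponential backoff
    (0.5s, 1s, ...) before re-raising the last error."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts, surface the error to the caller
            time.sleep(base_delay * (2 ** i))

# Usage with the client from the example above:
# resp = with_retries(lambda: client.embeddings.create(
#     input=query, model="text-embedding-3-small"))
```

In real code you would catch the SDK's specific exception types rather than a bare Exception, so that non-retryable errors (such as an invalid API key) fail immediately.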