How to build a product recommendation system with AI
Use embedding models like text-embedding-3-small to convert products and user data into vectors, combined with a vector database for similarity search. Enhance recommendations by integrating retrieval-augmented generation (RAG) with LLMs such as gpt-4o for personalized, context-aware suggestions.

Recommendation

Use text-embedding-3-small for embeddings due to its cost-efficiency and quality, paired with a vector store like FAISS or Chroma, and optionally combine with gpt-4o for natural-language personalized recommendations.

| Use case | Best choice | Why | Runner-up |
|---|---|---|---|
| Personalized product suggestions | text-embedding-3-small + FAISS | Efficient vector search with quality embeddings for fast, relevant matches | text-embedding-3-large + Chroma |
| Context-aware recommendations with explanations | gpt-4o with RAG | LLM generates personalized suggestions and explanations using retrieved data | claude-3-5-sonnet-20241022 |
| Real-time recommendations at scale | text-embedding-3-small + FAISS | Low latency vector search optimized for large catalogs | deepseek-chat |
| Cold start with limited user data | gpt-4o prompt engineering | LLM can infer preferences from minimal input and product metadata | gemini-2.5-pro |
Top picks explained
Use text-embedding-3-small to convert product descriptions and user behavior into dense vectors for efficient similarity search. It balances cost and quality well, making it ideal for ecommerce scale. Combine this with vector databases like FAISS or Chroma to quickly find relevant products.
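The similarity search at the core of this setup can be sketched without any external services: once products are embedded, ranking them against a query is a nearest-neighbor lookup. The sketch below uses toy 3-dimensional vectors as stand-ins for real 1536-dimensional embeddings, just to make the ranking logic concrete.

```python
import numpy as np

# Toy stand-ins for real embeddings (real vectors would be 1536-d
# from text-embedding-3-small); one row per product.
product_vecs = np.array([
    [0.9, 0.1, 0.0],   # headphones
    [0.8, 0.3, 0.1],   # speaker
    [0.0, 0.2, 0.9],   # fitness watch
])
product_ids = ["p1", "p2", "p3"]

def top_k(query_vec: np.ndarray, k: int = 2) -> list[str]:
    """Rank products by cosine similarity to the query vector."""
    q = query_vec / np.linalg.norm(query_vec)
    p = product_vecs / np.linalg.norm(product_vecs, axis=1, keepdims=True)
    scores = p @ q                      # cosine similarity per product
    order = np.argsort(-scores)[:k]    # highest similarity first
    return [product_ids[i] for i in order]

print(top_k(np.array([1.0, 0.2, 0.0])))  # audio-like query -> ['p1', 'p2']
```

A vector database like FAISS or Chroma does the same ranking, but with index structures that stay fast as the catalog grows to millions of items.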
For richer, context-aware recommendations, integrate gpt-4o with retrieval-augmented generation (RAG). This lets the LLM generate personalized suggestions and explanations by conditioning on retrieved product data and user context.
In practice
Example: Use OpenAI's text-embedding-3-small to embed product catalog and user queries, then perform similarity search with FAISS. Use gpt-4o to generate personalized recommendations based on retrieved products.
```python
import os

import faiss
import numpy as np
from openai import OpenAI

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample product catalog
products = [
    {"id": "p1", "description": "Wireless noise-cancelling headphones"},
    {"id": "p2", "description": "Bluetooth portable speaker"},
    {"id": "p3", "description": "Smart fitness watch with heart rate monitor"},
]

# Embed product descriptions
product_texts = [p["description"] for p in products]
response = client.embeddings.create(model="text-embedding-3-small", input=product_texts)
embeddings = np.array([data.embedding for data in response.data]).astype("float32")

# Build FAISS index
dimension = embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(embeddings)

# Embed the user query
query = "Looking for headphones with good sound quality"
query_embedding_resp = client.embeddings.create(model="text-embedding-3-small", input=[query])
query_embedding = np.array(query_embedding_resp.data[0].embedding).astype("float32")

# Search for the top 2 most similar products
D, I = index.search(np.array([query_embedding]), k=2)
matched_products = [products[i] for i in I[0]]

# Generate a personalized recommendation with gpt-4o
prompt = f"Recommend products based on user query: '{query}'. Products: {matched_products}"
chat_response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)
print("Recommended products:", chat_response.choices[0].message.content)
```

Example output:

```
Recommended products: I recommend the Wireless noise-cancelling headphones as they match your desire for good sound quality. The Bluetooth portable speaker is also a great option for portable audio.
```
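A note on the distance metric: OpenAI embeddings are returned normalized to unit L2 length, so ranking by smallest L2 distance (as IndexFlatL2 does) and ranking by largest inner product or cosine similarity (as IndexFlatIP would) produce the same order, because for unit vectors ||q - v||² = 2 - 2(q·v). The quick numpy check below demonstrates this equivalence on random unit vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random unit vectors standing in for normalized embeddings
vecs = rng.normal(size=(5, 8))
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
query = rng.normal(size=8)
query /= np.linalg.norm(query)

# For unit vectors: ||q - v||^2 = 2 - 2 * (q . v), so sorting by
# ascending L2 distance equals sorting by descending dot product.
l2_rank = np.argsort(np.linalg.norm(vecs - query, axis=1))
ip_rank = np.argsort(-(vecs @ query))
print(np.array_equal(l2_rank, ip_rank))  # True
```

Either index type works here; the choice matters more if you mix in unnormalized vectors from another embedding source.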
Pricing and limits
| Option | Free tier | Cost | Limits | Context |
|---|---|---|---|---|
| text-embedding-3-small | No; pay-as-you-go API | ~$0.02 per 1M tokens | 1536-dimension vectors, fast inference | Best for embedding product data and queries |
| gpt-4o | No; pay-as-you-go API | ~$2.50 per 1M input tokens, ~$10 per 1M output tokens | 128K-token context window | Generates personalized recommendations and explanations |
| FAISS | Open source, free | Free | In-memory index; scales with RAM | Efficient vector similarity search |
| Chroma | Open source, free | Free | Disk-backed, scalable vector DB | Alternative vector store with persistence |
What to avoid
- Avoid using only keyword-based search or traditional collaborative filtering without embeddings, as they lack semantic understanding and personalization.
- Do not rely solely on large LLMs without retrieval augmentation; they can hallucinate or miss product details.
- Avoid embedding models with very high dimension vectors for large catalogs due to cost and latency.
- Steer clear of deprecated APIs or older models like gpt-3.5-turbo for production recommendation systems.
How to evaluate for your case
Measure recommendation quality by combining offline metrics like precision@k and recall@k on historical purchase data with online A/B testing for user engagement. Benchmark embedding similarity with your product catalog size and latency requirements. Test LLM-generated recommendations for relevance and factual accuracy using human evaluation or automated feedback loops.
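The offline metrics above are straightforward to compute from logged interactions. A minimal sketch, assuming you have a ranked list of recommended product IDs per user and the set of products that user actually purchased (the variable names here are illustrative):

```python
def precision_at_k(recommended: list[str], purchased: set[str], k: int) -> float:
    """Fraction of the top-k recommendations the user actually purchased."""
    return sum(1 for item in recommended[:k] if item in purchased) / k

def recall_at_k(recommended: list[str], purchased: set[str], k: int) -> float:
    """Fraction of the user's purchases recovered in the top-k recommendations."""
    if not purchased:
        return 0.0
    return len(set(recommended[:k]) & purchased) / len(purchased)

# Hypothetical logged data for one user
recs = ["p1", "p7", "p3", "p9"]
bought = {"p1", "p3", "p5"}
print(round(precision_at_k(recs, bought, k=3), 3))  # 2 of top 3 bought -> 0.667
print(round(recall_at_k(recs, bought, k=3), 3))     # 2 of 3 purchases found -> 0.667
```

Average these across users (and across several values of k) to get a catalog-level score you can track between model or index changes, then confirm offline gains with an online A/B test.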
Key Takeaways
- Use text-embedding-3-small for cost-effective, high-quality product embeddings.
- Combine vector search with FAISS or Chroma for scalable similarity retrieval.
- Enhance recommendations with gpt-4o using retrieval-augmented generation for personalized context.
- Avoid outdated models and purely keyword-based methods lacking semantic understanding.
- Evaluate with both offline metrics and live user feedback for best results.