How to · Intermediate · 3 min read

AI for personalized product search

Quick answer
Use LLMs combined with embedding-based semantic search and user behavior data to create personalized product search. This approach matches user queries with relevant products dynamically, improving relevance and conversion.

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0" (quote the spec so the shell doesn't interpret `>`)
  • pip install faiss-cpu
  • pip install numpy

Setup

Install required Python packages and set your OpenAI API key as an environment variable.

  • Use pip install openai faiss-cpu numpy to install dependencies.
  • Set OPENAI_API_KEY in your environment for authentication.
bash
pip install openai faiss-cpu numpy
output
Collecting openai
Collecting faiss-cpu
Collecting numpy
Successfully installed openai faiss-cpu numpy
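Before making any API calls, it can help to confirm the key is actually visible to your Python process. The `api_key_configured` helper below is illustrative, not part of the SDK; it only checks that the variable is set and roughly key-shaped:

```python
import os

def api_key_configured(env=os.environ) -> bool:
    """Return True if an OpenAI API key appears to be configured in env."""
    key = env.get("OPENAI_API_KEY", "")
    # OpenAI keys start with "sk-"; this is a shape check, not validation.
    return key.startswith("sk-") and len(key) > 20

if not api_key_configured():
    print("OPENAI_API_KEY is missing or malformed; export it before running the examples.")
```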

Step by step

This example builds a product search by embedding product descriptions and user queries, then retrieving the most relevant products with FAISS vector similarity search. A chat model such as GPT-4o can be layered on top for query understanding (see the sketch after the example), but the core retrieval pipeline below needs only the embedding model.

python
import os
import numpy as np
from openai import OpenAI
import faiss

# Initialize OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample product catalog
products = [
    {"id": 1, "name": "Wireless Bluetooth Headphones", "description": "High-quality wireless headphones with noise cancellation."},
    {"id": 2, "name": "Smart Fitness Watch", "description": "Track your workouts and health metrics with this smartwatch."},
    {"id": 3, "name": "Portable Charger 10000mAh", "description": "Compact power bank for charging devices on the go."},
    {"id": 4, "name": "Ergonomic Office Chair", "description": "Comfortable chair with lumbar support for long work hours."}
]

# Embed product descriptions
product_texts = [p["description"] for p in products]
response = client.embeddings.create(model="text-embedding-3-small", input=product_texts)
product_embeddings = np.array([data.embedding for data in response.data]).astype('float32')

# Build FAISS index (text-embedding-3 vectors are unit length,
# so L2 distance produces the same ranking as cosine similarity)
dimension = product_embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(product_embeddings)

# Function to embed user query
def embed_query(query: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=[query])
    return np.array(resp.data[0].embedding).astype('float32')

# Search function (semantic retrieval; personalization, e.g. re-ranking by
# user history, can be layered on top of these candidates)
def personalized_search(user_query: str, top_k: int = 3):
    query_vec = embed_query(user_query)
    distances, indices = index.search(np.array([query_vec]), top_k)
    results = []
    for idx in indices[0]:
        if idx == -1:  # FAISS pads with -1 when fewer than top_k vectors exist
            continue
        results.append(products[idx])
    return results

# Example usage
query = "Looking for noise cancelling headphones"
results = personalized_search(query)
print("Search results for query:", query)
for r in results:
    print(f"- {r['name']}: {r['description']}")
output
Search results for query: Looking for noise cancelling headphones
- Wireless Bluetooth Headphones: High-quality wireless headphones with noise cancellation.
- Smart Fitness Watch: Track your workouts and health metrics with this smartwatch.
- Portable Charger 10000mAh: Compact power bank for charging devices on the go.
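To add the GPT-4o query-understanding step, one common pattern is to rewrite the raw query into a cleaner search phrase before embedding it. A minimal sketch, assuming a rewrite-then-embed flow; the `rewrite_query` helper and its prompt are illustrative, not part of the SDK:

```python
def rewrite_query(client, user_query: str, model: str = "gpt-4o") -> str:
    """Ask a chat model to rewrite a raw shopping query into a concise search phrase."""
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "Rewrite the user's shopping query as a concise product-search phrase."},
            {"role": "user", "content": user_query},
        ],
    )
    return resp.choices[0].message.content.strip()

# Usage with the client from the main example (requires OPENAI_API_KEY):
# better_query = rewrite_query(client, "need smth for the gym to track runs")
# results = personalized_search(better_query)
```

This keeps the vector index unchanged; only the query side gets the extra LLM pass, so you pay the chat-model latency once per search.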

Common variations

You can enhance personalized search by:

  • Using async calls with the OpenAI SDK for better performance.
  • Streaming partial results for interactive search experiences.
  • Switching to other embedding models or larger LLMs like gpt-4o for richer query understanding.
  • Incorporating user profile data or past interactions to re-rank results.
python
import asyncio
import os
from openai import AsyncOpenAI

async def async_embed_query(client: AsyncOpenAI, query: str):
    # In the openai>=1.0 SDK, async calls go through AsyncOpenAI with the
    # same method names as the sync client (there is no embeddings.acreate).
    resp = await client.embeddings.create(model="text-embedding-3-small", input=[query])
    return resp.data[0].embedding

async def main():
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    query = "Ergonomic chair for office"
    embedding = await async_embed_query(client, query)
    print("Async embedding vector length:", len(embedding))

asyncio.run(main())
output
Async embedding vector length: 1536
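The re-ranking idea from the list above can be sketched without any API calls: take the candidates returned by the vector search and boost those matching the user's history. The user-profile shape, `category` field, and `boost` weight here are assumptions for illustration:

```python
def rerank(results, user_profile, boost=0.15):
    """Re-rank search candidates: boost products in categories the user has engaged with."""
    def score(item):
        base = item["similarity"]  # similarity score carried over from the vector search
        bonus = boost if item["category"] in user_profile["preferred_categories"] else 0.0
        return base + bonus
    return sorted(results, key=score, reverse=True)

# Example: a fitness-focused user bumps the watch above the headphones.
candidates = [
    {"name": "Wireless Bluetooth Headphones", "similarity": 0.80, "category": "audio"},
    {"name": "Smart Fitness Watch", "similarity": 0.78, "category": "fitness"},
]
profile = {"preferred_categories": {"fitness"}}
ranked = rerank(candidates, profile)
print([r["name"] for r in ranked])
```

Keeping personalization as a cheap post-retrieval step means the shared FAISS index stays user-agnostic, which is usually easier to operate than per-user indexes.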

Troubleshooting

  • If embeddings fail, verify your OPENAI_API_KEY is set correctly and has access to embedding models.
  • If FAISS index search returns no results, ensure embeddings are correctly computed as float32 numpy arrays.
  • For slow responses, consider caching embeddings or using smaller models.
  • Check for API rate limits and handle exceptions gracefully.
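For the rate-limit point, a generic exponential-backoff wrapper is one common pattern. This is a sketch, not an SDK feature (the openai v1 client also accepts a `max_retries` setting if you prefer built-in retries):

```python
import random
import time

def with_retries(fn, max_attempts=5, base_delay=1.0):
    """Call fn(), retrying with exponential backoff plus jitter on any exception."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the original error
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))

# Usage with the embedding helper from the main example:
# query_vec = with_retries(lambda: embed_query("noise cancelling headphones"))
```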

Key Takeaways

  • Use embedding vectors and FAISS for efficient semantic product search.
  • Combine LLMs for query understanding with vector search for personalized results.
  • Leverage user context and behavior data to improve search relevance dynamically.
Verified 2026-04 · gpt-4o, text-embedding-3-small