AI-powered product search explained
Quick answer
AI-powered product search uses large language models (LLMs) and vector embeddings to understand user queries semantically and retrieve relevant products beyond keyword matching. It combines embedding-based similarity search with natural language understanding to deliver more accurate and personalized search results.
Prerequisites
- Python 3.8+
- An OpenAI API key (free tier works)
- pip install openai>=1.0
Setup
Install the openai Python SDK and set your API key as an environment variable for secure access.
```shell
pip install openai
```

Output:

```
Collecting openai
  Downloading openai-1.x.x-py3-none-any.whl (xx kB)
Installing collected packages: openai
Successfully installed openai-1.x.x
```
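One way to set the environment variable in a bash/zsh session (the value shown is a placeholder, not a real key):

```shell
# Make the key available to any process started from this shell
export OPENAI_API_KEY="sk-your-key-here"

# Confirm it is set (prints the placeholder value)
printenv OPENAI_API_KEY
```

For a persistent setup, add the export line to your shell profile instead of retyping it each session.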
Step by step
This example demonstrates how to embed product descriptions and a user query with OpenAI's text-embedding-3-small model, then perform a cosine-similarity search to find the most relevant product.
```python
import os

import numpy as np
from openai import OpenAI


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Sample product catalog
products = [
    {"id": 1, "name": "Wireless Bluetooth Headphones",
     "description": "Noise-cancelling over-ear headphones with long battery life."},
    {"id": 2, "name": "Smartphone with OLED Display",
     "description": "Latest model smartphone featuring an OLED screen and fast charging."},
    {"id": 3, "name": "Running Shoes",
     "description": "Lightweight running shoes with breathable mesh and cushioned sole."},
]

# Embed product descriptions
product_embeddings = []
for product in products:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=product["description"],
    )
    embedding = response.data[0].embedding
    product_embeddings.append((product, embedding))

# Embed the user search query
query = "comfortable headphones for travel"
query_embedding_resp = client.embeddings.create(
    model="text-embedding-3-small",
    input=query,
)
query_embedding = query_embedding_resp.data[0].embedding

# Find the most similar product
best_product = None
best_score = -1.0
for product, embedding in product_embeddings:
    score = cosine_similarity(np.array(query_embedding), np.array(embedding))
    if score > best_score:
        best_score = score
        best_product = product

print(f"Best match: {best_product['name']} (score: {best_score:.4f})")
```

Output:

```
Best match: Wireless Bluetooth Headphones (score: 0.89)
```
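Scoring products one at a time is fine for a toy catalog, but the same cosine comparison can be vectorized: stack the embeddings into a matrix, normalize, and a single matrix multiply yields every score at once. A minimal sketch using tiny made-up vectors in place of real 1536-dimensional embeddings:

```python
import numpy as np

# Toy 3-dimensional stand-ins for real embedding vectors
product_matrix = np.array([
    [0.9, 0.1, 0.0],  # headphones
    [0.1, 0.9, 0.0],  # smartphone
    [0.0, 0.1, 0.9],  # running shoes
])
query_vec = np.array([0.8, 0.2, 0.1])

# Normalize rows and the query; one dot product then gives all cosine scores
row_norms = np.linalg.norm(product_matrix, axis=1, keepdims=True)
scores = (product_matrix / row_norms) @ (query_vec / np.linalg.norm(query_vec))
best_idx = int(np.argmax(scores))  # index of the best-matching product
```

This is the same computation the loop performs, just done for all products in one pass, which matters once the catalog grows.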
Common variations
You can enhance product search by integrating GPT-4o chat completions to interpret complex queries, or by streaming responses for real-time search suggestions. Async calls improve throughput in high-traffic applications, and vector search libraries such as FAISS (or dedicated vector databases) scale similarity search to large catalogs.
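The "interpret complex queries" idea can be sketched without a live API call: ask the chat model to rewrite a free-form query into a JSON object holding a concise search phrase plus structured filters, then embed only the phrase. The prompt wording, the JSON schema, and the build_messages/parse_rewrite helpers below are illustrative assumptions, not an official API:

```python
import json


def build_messages(user_query: str) -> list:
    """Build a chat payload asking the model to normalize a search query."""
    system_prompt = (
        "Rewrite the user's product search as JSON: "
        '{"phrase": "...", "filters": {"category": "...", "max_price": 0}}'
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_query},
    ]


def parse_rewrite(raw: str) -> dict:
    """Parse the model's JSON reply; fall back to the raw text if invalid."""
    try:
        data = json.loads(raw)
        return {"phrase": data.get("phrase", ""), "filters": data.get("filters", {})}
    except json.JSONDecodeError:
        return {"phrase": raw.strip(), "filters": {}}


# With a live client you would call, e.g.:
# resp = client.chat.completions.create(model="gpt-4o", messages=build_messages(query))
# parsed = parse_rewrite(resp.choices[0].message.content)
```

The rewritten phrase then flows into the same embedding pipeline shown above, while the filters can prune the catalog before scoring.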
```python
import asyncio
import os

from openai import AsyncOpenAI  # the synchronous OpenAI client is not awaitable


async def async_search():
    client = AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])
    query = "best shoes for marathon"
    query_embedding_resp = await client.embeddings.create(
        model="text-embedding-3-small",
        input=query,
    )
    query_embedding = query_embedding_resp.data[0].embedding
    # Assume product_embeddings loaded asynchronously
    # Similarity search logic here


asyncio.run(async_search())
```

Troubleshooting
- If embeddings are slow or time out, check your network and API quota.
- Low similarity scores may indicate poor query or product description quality; improve text clarity.
- Ensure the OPENAI_API_KEY environment variable is set correctly to avoid authentication errors.
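A fail-fast check for that last point can save debugging time; the helper below is a hypothetical sketch, not part of the openai SDK:

```python
import os


def require_api_key(var: str = "OPENAI_API_KEY") -> str:
    """Fail fast with a clear message if the API key is missing or blank."""
    key = os.environ.get(var, "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set; export it before running the search script.")
    return key
```

Calling this once at startup turns an opaque authentication failure deep inside an API call into an immediate, explicit error.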
Key Takeaways
- Use embedding models to convert product descriptions and queries into vectors for semantic similarity search.
- Combine embeddings with LLMs like GPT-4o to handle complex natural language queries.
- Async and streaming APIs improve responsiveness in production search systems.
- Vector search libraries such as FAISS (or dedicated vector databases) scale similarity search for large product catalogs.
- Proper API key management and query quality are critical for reliable AI-powered search.