How-to · Beginner · 3 min read

Vector database use cases

Quick answer
A vector database stores and indexes high-dimensional vectors for efficient similarity search, enabling use cases like semantic search, recommendation systems, and anomaly detection. These databases power AI applications by matching vector embeddings from models to find relevant or similar data quickly.
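To make "similarity search" concrete, here is a minimal sketch of the cosine-similarity comparison a vector database performs at scale. The 3-dimensional vectors are toy values for illustration; real model embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: similar documents point in similar directions
doc_a = [0.9, 0.1, 0.0]   # e.g., "about databases"
doc_b = [0.8, 0.2, 0.1]   # also "about databases"
doc_c = [0.0, 0.1, 0.9]   # unrelated topic

query = [0.85, 0.15, 0.05]
scores = {name: cosine_similarity(query, vec)
          for name, vec in [("doc_a", doc_a), ("doc_b", doc_b), ("doc_c", doc_c)]}
print(max(scores, key=scores.get))  # the document closest to the query
```

A vector database does exactly this comparison, but over millions of vectors, using approximate-nearest-neighbor indexes so it does not have to scan every vector.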

PREREQUISITES

  • Python 3.8+
  • OpenAI API key (free tier works)
  • pip install "openai>=1.0"
  • pip install "pinecone-client>=3.0"

Setup

Install the necessary Python packages and set your environment variables for API keys.

  • Install OpenAI and Pinecone clients:
bash
pip install "openai>=1.0" "pinecone-client>=3.0"

Step by step

This example demonstrates how to create vector embeddings with OpenAI and store them in a Pinecone vector database for a semantic search use case.

python
import os
from openai import OpenAI
from pinecone import Pinecone

# Initialize OpenAI client
openai_client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Initialize Pinecone client
pinecone_client = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Connect to an existing Pinecone index (create it first in the Pinecone
# console or with create_index; its dimension must match the embedding size)
index_name = "example-index"
index = pinecone_client.Index(index_name)

# Sample documents to index
documents = [
    "Machine learning enables computers to learn from data.",
    "Vector databases store embeddings for fast similarity search.",
    "Semantic search improves search relevance using AI embeddings."
]

# Generate embeddings for documents
embeddings = []
for doc in documents:
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=doc
    )
    vector = response.data[0].embedding
    embeddings.append(vector)

# Upsert vectors into Pinecone index
vectors_to_upsert = [(str(i), embeddings[i]) for i in range(len(embeddings))]
index.upsert(vectors=vectors_to_upsert)

# Query example: find documents similar to a query
query_text = "How do vector databases work?"
query_embedding = openai_client.embeddings.create(model="text-embedding-3-small", input=query_text).data[0].embedding

query_response = index.query(vector=query_embedding, top_k=2)

print("Top matches:")
for match in query_response.matches:
    print(f"ID: {match.id}, Score: {match.score}")
output
Top matches:
ID: 1, Score: 0.92
ID: 2, Score: 0.89

Common variations

You can adapt vector database use cases for:

  • Recommendation systems: Store user/item embeddings to find similar items.
  • Anomaly detection: Detect outliers by distance from normal data vectors.
  • Multimodal search: Combine text, image, and audio embeddings in one index.
  • Async usage: Use async clients for high throughput applications.
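Of these, anomaly detection is easy to sketch without a database at all: flag any vector whose distance from the centroid of "normal" embeddings exceeds a threshold. A minimal illustration with toy 2-dimensional vectors (the data and the threshold are made up for the example):

```python
import math

def euclidean(a, b):
    # Straight-line distance between two vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy "normal" embeddings clustered around one region
normal = [[1.0, 1.1], [0.9, 1.0], [1.1, 0.9], [1.0, 1.0]]

# Centroid (per-dimension mean) of the normal data
dim = len(normal[0])
centroid = [sum(v[i] for v in normal) / len(normal) for i in range(dim)]

# Flag vectors far from the centroid as anomalies (threshold is illustrative)
threshold = 0.5
candidates = [[1.05, 0.95], [3.0, -2.0]]
for vec in candidates:
    dist = euclidean(vec, centroid)
    label = "anomaly" if dist > threshold else "normal"
    print(vec, round(dist, 2), label)
```

In a real deployment the vector database replaces the brute-force distance loop: you query the index with the candidate vector and inspect the similarity score of its nearest neighbors.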

Troubleshooting

If you see Index not found errors, ensure your Pinecone index exists (create it in the Pinecone console or with create_index) and that the name matches exactly.

If embedding calls fail, verify your OPENAI_API_KEY is set and that you are calling an embedding model (such as text-embedding-3-small), not a chat model.

For dimension errors on upsert or query, check that your index dimension matches the embedding model's output size; for slow queries, keep top_k as small as your application allows.
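A quick way to catch dimension mismatches before upserting is to compare the embedding length against the index dimension. A sketch, where the expected sizes are the published output dimensions of OpenAI's embedding models and `index_dimension` stands in for the value from your index's configuration:

```python
# Published output dimensions for OpenAI embedding models
EXPECTED_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}

def check_dimension(model, embedding, index_dimension):
    # The model output and the index must agree on vector length
    expected = EXPECTED_DIMS.get(model)
    if expected is not None and len(embedding) != expected:
        raise ValueError(
            f"{model} should produce {expected}-d vectors, got {len(embedding)}"
        )
    if len(embedding) != index_dimension:
        raise ValueError(
            f"embedding is {len(embedding)}-d but index expects {index_dimension}-d"
        )

# Example with a placeholder embedding of the right length
fake_embedding = [0.0] * 1536
check_dimension("text-embedding-3-small", fake_embedding, index_dimension=1536)
print("dimensions match")
```

Running this check once at startup is cheaper than discovering the mismatch from a failed upsert in production.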

Key Takeaways

  • Use vector databases to enable fast similarity search on high-dimensional embeddings.
  • Semantic search, recommendations, and anomaly detection are primary vector database use cases.
  • Combine OpenAI embeddings with Pinecone for scalable AI-powered search.
  • Ensure embedding dimensions and index configurations match to avoid errors.
  • Async and multimodal vector search expand use cases beyond text.
Verified 2026-04 · gpt-4o-mini