What is Weaviate
Weaviate is an open-source vector database that stores and indexes high-dimensional vectors for fast similarity search. It enables AI applications to perform semantic search and retrieval by combining vector embeddings with rich metadata and a graph-like data model.
How it works
Weaviate stores each data object together with a vector embedding, a numerical representation of unstructured data such as text, images, or audio. It indexes these vectors with HNSW (Hierarchical Navigable Small World graphs) to enable fast approximate nearest neighbor search. Alongside the vectors, Weaviate maintains a graph-like schema that holds metadata and relationships, so a single query can combine semantic similarity with structured filters.
Think of it as a smart library catalog that not only knows the keywords but understands the meaning behind your query, retrieving the most relevant items based on context and similarity.
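To make the idea of combining similarity with structured filters concrete, here is a minimal sketch in plain Python. It is not how Weaviate is implemented: it scans every object exhaustively, whereas an HNSW index approximates the same ranking without visiting every vector. The function names (`cosine_similarity`, `search`), the `where` callback, and the toy vectors are all hypothetical and chosen only for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(objects, query_vector, where=None, limit=3):
    """Exhaustive nearest-neighbor search with an optional metadata filter.

    `where` is a hypothetical predicate over an object's properties;
    it stands in for a structured filter on metadata.
    """
    candidates = [
        obj for obj in objects
        if where is None or where(obj["properties"])
    ]
    ranked = sorted(
        candidates,
        key=lambda obj: cosine_similarity(obj["vector"], query_vector),
        reverse=True,  # most similar first
    )
    return ranked[:limit]

# Toy data: made-up 3-dimensional "embeddings" with metadata.
objects = [
    {"properties": {"title": "Cats", "lang": "en"}, "vector": [0.9, 0.1, 0.0]},
    {"properties": {"title": "Dogs", "lang": "en"}, "vector": [0.8, 0.3, 0.1]},
    {"properties": {"title": "Katzen", "lang": "de"}, "vector": [0.9, 0.1, 0.1]},
    {"properties": {"title": "Stocks", "lang": "en"}, "vector": [0.0, 0.2, 0.9]},
]

query = [1.0, 0.1, 0.0]
hits = search(objects, query, where=lambda p: p["lang"] == "en", limit=2)
titles = [hit["properties"]["title"] for hit in hits]
print(titles)  # ['Cats', 'Dogs']
```

The filter removes "Katzen" even though its vector is close to the query, which is the behavior the combination of semantic similarity and structured filters is meant to provide.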
Concrete example
This Python example uses the weaviate-client SDK (its v3-style weaviate.Client interface) to insert an object with a precomputed embedding into Weaviate and then run a vector similarity query against it.
```python
import os
import weaviate

# Initialize Weaviate client
client = weaviate.Client(
    url=os.environ.get('WEAVIATE_URL', 'http://localhost:8080'),
    auth_client_secret=None  # Add auth if needed
)

# Example vector and metadata
vector = [0.1, 0.2, 0.3, 0.4]  # Example embedding vector
obj = {
    "title": "Example document",
    "content": "This is a sample document stored in Weaviate."
}

# Create a class schema if not exists
class_obj = {
    "class": "Document",
    "vectorizer": "none"
}
if not client.schema.contains(class_obj):
    client.schema.create_class(class_obj)

# Add object with vector
client.data_object.create(
    data_object=obj,
    class_name="Document",
    vector=vector
)

# Semantic search query
near_vector = {"vector": [0.1, 0.2, 0.3, 0.4], "certainty": 0.7}
result = (
    client.query.get("Document", ["title", "content"])
    .with_near_vector(near_vector)
    .with_limit(3)
    .do()
)
print(result)
```
Output:
```
{'data': {'Get': {'Document': [{'title': 'Example document', 'content': 'This is a sample document stored in Weaviate.'}]}}}
```
When to use it
Use Weaviate when you need to build AI applications that require semantic search, recommendation systems, or similarity search over unstructured data like text, images, or audio. It excels when you want to combine vector search with rich metadata filtering and graph relationships.
Do not use Weaviate if your data is purely structured and relational without the need for semantic or vector search capabilities, where traditional relational databases are more appropriate.
Key terms
| Term | Definition |
|---|---|
| Vector database | A database optimized for storing and searching high-dimensional vector embeddings. |
| Embedding | A numerical vector representation of unstructured data capturing semantic meaning. |
| HNSW | Hierarchical Navigable Small World graph, an algorithm for fast approximate nearest neighbor search. |
| Semantic search | Search that retrieves results based on meaning and context rather than exact keyword matches. |
| Schema | The structure defining classes, properties, and relationships in Weaviate. |
Key Takeaways
- Weaviate combines vector search with a graph-like schema for rich, semantic queries.
- It uses efficient indexing like HNSW for fast similarity search on high-dimensional vectors.
- Use Weaviate for AI applications needing semantic search, recommendations, or multimodal data retrieval.