ValueError | RuntimeError (HNSW parameter constraint violation)
hnswlib.Index parameter validation error or Pinecone/Weaviate HNSW configuration error
Stack trace
ValueError: ef_construction must be >= M * 2 and <= index_size. Got ef_construction=50 for index with 100 vectors and M=32.
or
RuntimeError: ef_search parameter must be >= ef_construction. Got ef_search=20, ef_construction=200.
or
TypeError: ef_construction expected int, got <class 'float'>: 50.5
Why it happens
An HNSW (Hierarchical Navigable Small World) index takes two expansion factors: ef_construction (the candidate-list size used while building the graph) and ef_search (the candidate-list size used at query time). Validation layers commonly require ef_construction to be an integer of at least M*2, where M is the maximum number of connections per node, and no larger than the number of vectors in the index. ef_search trades recall for latency. The algorithm itself only needs ef_search >= k (the number of neighbors requested); HNSW does not require ef_search >= ef_construction, so an error demanding that comes from a stricter rule in the wrapper or service you are calling. These checks fail when parameters are tuned without regard to those relationships, or when a value arrives as a float (for example 50.5 from a config file) where an integer is expected.
Detection
Catch ValueError, RuntimeError, and TypeError during index creation and query operations. Log the actual parameter values alongside the error message to identify which constraint was violated (e.g., 'ef_construction=X, M=Y, index_size=Z'). Add pre-flight validation: assert isinstance(ef_construction, int) and ef_construction >= M*2 before calling index creation.
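The pre-flight check described above could look like the following minimal sketch. The function name `find_hnsw_violations` and its arguments are illustrative and not part of hnswlib or any vector database client; the rules it checks are the ones listed in this entry. It returns every violation instead of raising on the first one, so a single log line can show all of them.

```python
from typing import List, Optional


def find_hnsw_violations(m: int, ef_construction: int,
                         index_size: Optional[int] = None) -> List[str]:
    """Return one message per violated rule; an empty list means the values pass."""
    problems = []
    # bool is a subclass of int in Python, so reject it explicitly.
    for name, value in (("M", m), ("ef_construction", ef_construction)):
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(
                f"{name} expected int, got {type(value).__name__}: {value!r}"
            )
    if problems:
        # Range checks are not meaningful until the types are right.
        return problems
    if ef_construction < 2 * m:
        problems.append(f"ef_construction={ef_construction} is below 2*M={2 * m}")
    if index_size is not None and ef_construction > index_size:
        problems.append(
            f"ef_construction={ef_construction} exceeds index_size={index_size}"
        )
    return problems
```

Calling `find_hnsw_violations(m=32, ef_construction=50, index_size=100)` reports the first case from the stack trace section: 50 is below 2*M=64.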
Causes & fixes
ef_construction set below M*2 (M is max connections per node, default M=16, so minimum ef_construction=32)
Set ef_construction to at least M*2: for M=16, use ef_construction >= 32; for M=32, use ef_construction >= 64. In hnswlib, both values are passed to init_index, not to the constructor: `index = hnswlib.Index(space='l2', dim=768)` followed by `index.init_index(max_elements=10000, M=16, ef_construction=64)`.
ef_search set lower than ef_construction in a wrapper or service that enforces ef_search >= ef_construction (HNSW itself only needs ef_search >= k)
If the layer raising the error enforces ef_search >= ef_construction, raise ef_search to meet it or lower ef_construction. If the rule lives in your own validation code, relax it to ef_search >= k: a typical pattern is ef_construction=200 at build time with ef_search=50-100 at query time. In hnswlib, set the query-time value after the index is initialized: `index.set_ef(50)`.
ef_construction or ef_search passed as float (e.g., 50.5) instead of int, causing type validation failure
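Cast to int explicitly: `ef_construction=int(config['ef_construction'])` or `ef_search=int(ef_search_value)`. Note that `int(50.5)` silently truncates to 50, while `int('50.5')` raises ValueError, so decide up front whether fractional inputs should be rounded or rejected. Convert every parameter source (config files, env vars) to int before passing it to HNSW.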
ef_construction set higher than index_size (total vectors), which a validator rejects because the candidate list can never hold more entries than the index has vectors
Keep ef_construction within [M*2, index_size]. For small indexes (< 1000 vectors), cap it at the vector count: `ef_construction = max(2 * M, min(200, len(vectors)))`. For large indexes (> 1M vectors), ef_construction=128-256 is a common range; higher values mainly lengthen build time. If index_size is below M*2, no value satisfies both bounds, so lower M.
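The type and range fixes above can be combined into two small helpers. This is a sketch under stated assumptions: `coerce_ef` and `clamp_ef_construction` are hypothetical names, the default ceiling of 200 is an arbitrary choice, and the clamp range [2*M, min(ceiling, index_size)] is one reasonable reading of the bounds in this entry. `coerce_ef` is deliberately stricter than a bare `int(...)` cast: it rejects 50.5 instead of truncating it.

```python
def coerce_ef(raw, name="ef_construction"):
    """Convert a config/env value to int, rejecting fractional values."""
    # float() accepts 64, "64", "64.0" and 64.0, and raises ValueError on junk.
    value = float(raw)
    if not value.is_integer():
        raise ValueError(f"{name} must be a whole number, got {raw!r}")
    return int(value)


def clamp_ef_construction(requested, m, index_size, ceiling=200):
    """Clamp ef_construction into [2*M, min(ceiling, index_size)]."""
    floor = 2 * m
    upper = min(ceiling, index_size)
    if upper < floor:
        # No value can satisfy both bounds; the caller has to lower M.
        raise ValueError(
            f"index_size={index_size} is too small for M={m}; "
            f"use M <= {index_size // 2}"
        )
    return max(floor, min(requested, upper))
```

For example, `clamp_ef_construction(coerce_ef("20"), m=16, index_size=10000)` lifts the value to the floor of 32, while a request of 500 is cut to the ceiling of 200.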
Code: broken vs fixed
import hnswlib
import numpy as np
# BROKEN: ef_construction too low, ef_search < ef_construction
M = 16
max_elements = 10000
dim = 768
index = hnswlib.Index(space='l2', dim=dim)
index.init_index(max_elements=max_elements, M=M, ef_construction=20)  # ❌ 20 < M*2 (32)
index.set_ef(15)  # ❌ ef_search (15) < ef_construction (20)
data = np.random.random((100, dim)).astype(np.float32)
index.add_items(data, np.arange(100))
labels, distances = index.knn_query(data[:1], k=5)  # ❌ A validating wrapper rejects the parameters above

import hnswlib
import numpy as np
# FIXED: validate parameters, ensure correct types, respect constraints
M = 16
max_elements = 10000
dim = 768
# FIX 1: ef_construction must be >= M*2
ef_construction = max(M * 2, 128)  # At least 32, pragmatically 128 for quality
ef_search = 50  # May be < ef_construction; raise it if your wrapper enforces ef_search >= ef_construction
# FIX 2: Cast to int if coming from config/env
ef_construction = int(ef_construction)
ef_search = int(ef_search)
# FIX 3: Keep ef_construction between M*2 and a limit relative to capacity
ef_construction = max(M * 2, min(ef_construction, max_elements // 4 or 128))
# hnswlib takes space and dim in the constructor; M and ef_construction go to init_index
index = hnswlib.Index(space='l2', dim=dim)
index.init_index(max_elements=max_elements, M=M, ef_construction=ef_construction)
index.set_ef(ef_search)  # Set the query-time parameter after init_index
data = np.random.random((100, dim)).astype(np.float32)
index.add_items(data, np.arange(100))
labels, distances = index.knn_query(data[:1], k=5)
print(f"✓ Search succeeded with ef_construction={ef_construction}, ef_search={ef_search}")

Workaround
If you cannot refactor parameter handling immediately, wrap index initialization in try/except, catch ValueError and RuntimeError, fall back to ef_construction = M*2, and log a warning. In outline: call `index.init_index(max_elements=max_elements, M=M, ef_construction=user_param)` inside the `try`; in the `except (ValueError, RuntimeError)` branch, create a fresh `hnswlib.Index`, call `init_index` again with `ef_construction=M*2`, and log `f'Reset ef_construction to {M*2}'`.
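One way to package this workaround is a helper that takes the index-building step as a callable, so the retry logic stays independent of the library. This is a sketch: `build_with_fallback` and the logger name `hnsw` are hypothetical, and the helper retries only once to avoid masking unrelated errors.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("hnsw")


def build_with_fallback(build: Callable[[int], T], ef_construction: int, m: int) -> T:
    """Call build(ef_construction); on a parameter error, retry once with 2*M."""
    try:
        return build(ef_construction)
    except (ValueError, RuntimeError) as exc:
        fallback = 2 * m
        if ef_construction == fallback:
            # Already at the fallback value, so a retry cannot help.
            raise
        log.warning(
            "ef_construction=%s rejected (%s); retrying with %s",
            ef_construction, exc, fallback,
        )
        return build(fallback)
```

With hnswlib, `build` would typically create `hnswlib.Index(space=..., dim=...)`, call `init_index(max_elements=..., M=m, ef_construction=ef)` on it, and return the index, so that the retry starts from a fresh object.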
Prevention
Build a parameter validation layer at initialization: (1) Define constraints as a config struct (min_ef_construction=M*2, max_ef_construction=index_size//4, ef_search_min=k, or ef_construction if your backend enforces that, ef_search_max=1000). (2) Validate all user inputs against these constraints before passing them to HNSW. (3) Use a dataclass with validation or a Pydantic model: `class HNSWConfig(BaseModel): M: int; ef_construction: int; ef_search: int; @validator('ef_construction') def validate_ec(cls, v, values): assert v >= values['M']*2; return v`. This is Pydantic v1 syntax, and M must be declared before ef_construction so that it is present in `values`; Pydantic v2 replaces `validator` with `field_validator`. (4) Use environment variable defaults with type coercion and bounds checking. (5) Monitor query latency and recall metrics; if recall degrades after tuning ef_search, log an alert and revert to safe defaults.
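For projects that prefer the standard library over Pydantic, points (1) to (4) can be sketched with a frozen dataclass. Everything here is an assumption made for illustration: the class layout, the environment variable names (`HNSW_M`, `HNSW_EF_CONSTRUCTION`, `HNSW_EF_SEARCH`), and the defaults for `ef_search` and `k`. The lower bound on ef_search is k, following hnswlib's documented guidance; swap in `ef_construction` if your backend enforces that instead.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

EF_SEARCH_MAX = 1000  # Upper bound taken from the constraint list above.


@dataclass(frozen=True)
class HNSWConfig:
    m: int = 16
    ef_construction: int = 200
    ef_search: int = 50
    k: int = 10

    def __post_init__(self):
        # Type checks first: bool is an int subclass, so reject it explicitly.
        for name in ("m", "ef_construction", "ef_search", "k"):
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(
                    f"{name} expected int, got {type(value).__name__}: {value!r}"
                )
        if self.ef_construction < 2 * self.m:
            raise ValueError(
                f"ef_construction={self.ef_construction} must be >= 2*M={2 * self.m}"
            )
        if not self.k <= self.ef_search <= EF_SEARCH_MAX:
            raise ValueError(
                f"ef_search={self.ef_search} must be in [{self.k}, {EF_SEARCH_MAX}]"
            )

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "HNSWConfig":
        """Read overrides from environment-style strings, coercing them to int."""
        env = os.environ if env is None else env
        defaults = cls()
        return cls(
            m=int(env.get("HNSW_M", defaults.m)),
            ef_construction=int(env.get("HNSW_EF_CONSTRUCTION", defaults.ef_construction)),
            ef_search=int(env.get("HNSW_EF_SEARCH", defaults.ef_search)),
        )
```

Because validation runs in `__post_init__`, an invalid combination fails when the config object is created, before any index is built.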