To estimate the storage size of a vector index, start by calculating the raw data size. Multiply the number of vectors (N) by the dimension count (D) and the bytes per value. For 32-bit floats (common in embeddings), this is N × D × 4 bytes. For example, 1 million 768-dimensional vectors require 1,000,000 × 768 × 4 = 3.07 GB. This baseline assumes no compression or indexing overhead.
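This arithmetic is easy to script as a quick sanity check. The helper below is a minimal sketch; the function name and defaults are illustrative, not from any library:

```python
def raw_vector_bytes(n_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw storage for n_vectors embeddings of `dim` values each.

    bytes_per_value defaults to 4 (float32); use 2 for float16.
    """
    return n_vectors * dim * bytes_per_value

# 1 million 768-dimensional float32 vectors:
size = raw_vector_bytes(1_000_000, 768)   # 3,072,000,000 bytes
print(f"{size / 1e9:.2f} GB")             # ~3.07 GB (decimal)
```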
The chosen index type introduces additional storage costs. For flat indexes (exact search), storage matches the raw size. Inverted file (IVF) indexes add cluster centroids (e.g., 1,024 clusters × D × 4 bytes) and vector-to-cluster mappings (N × 2 bytes for cluster IDs). HNSW graph-based indexes store neighbor lists per vector, typically N × M × 4 bytes, where M is the number of links per node (e.g., 32 links × 4 bytes = 128 bytes per vector). Product quantization (PQ) reduces vector storage by splitting each vector into m subvectors and replacing each subvector with an 8-bit code, cutting per-vector storage to m bytes (plus codebook tables). For example, PQ with m=8 subvectors reduces a 768D float vector (3,072 bytes) to 8 bytes, while the shared codebook of 256 centroids per subquantizer adds 256 × 768 × 4 bytes (~768 KB) in total.
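These per-index formulas can be bundled into a rough estimator. The sketch below simply encodes the paragraph's assumptions (4-byte floats, 2-byte cluster IDs, 4-byte neighbor IDs, 256 PQ centroids per subquantizer); real libraries will deviate by their own overheads:

```python
def flat_bytes(n, d):
    # Exact search: raw vectors only.
    return n * d * 4

def ivf_bytes(n, d, nlist=1024):
    # Raw vectors + centroid table + 2-byte cluster ID per vector.
    return n * d * 4 + nlist * d * 4 + n * 2

def hnsw_bytes(n, d, m_links=32):
    # Raw vectors + M neighbor IDs (4 bytes each) per vector,
    # ignoring upper hierarchy layers.
    return n * d * 4 + n * m_links * 4

def pq_bytes(n, d, m_sub=8):
    # One 8-bit code per subvector, plus a shared codebook of
    # 256 centroids x d float32 values in total.
    return n * m_sub + 256 * d * 4

n, d = 1_000_000, 768
for name, fn in [("flat", flat_bytes), ("IVF", ivf_bytes),
                 ("HNSW", hnsw_bytes), ("PQ", pq_bytes)]:
    print(f"{name}: {fn(n, d) / 1e9:.3f} GB")
```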
Practical factors include alignment padding, metadata, and library-specific overhead; FAISS, for instance, adds roughly 5-10% for alignment. After construction, attributes like FAISS's index.ntotal and index.d (plus index.nlist for IVF) supply the inputs for these size formulas. For pre-build estimation, use a formula such as IVF-PQ size ≈ (N × m) + (nlist × D × 4) + (256 × D × 4), covering the PQ codes, the coarse centroids, and the shared codebook, where m is the number of PQ subvectors. Always test with a subset (e.g., 10k vectors) and extrapolate. For instance, 1B 768-dimensional vectors with HNSW-32 would need roughly 1B × (768 × 4 + 32 × 4) = 1B × (3,072 + 128) bytes ≈ 3.2 TB, excluding the graph's upper hierarchy layers.