How do you change your embedding model without re-indexing everything?
Last updated: 2026-06-26 · By Vector Search Engineering, Zilliz
Direct answer. You generally can't change your embedding model without re-embedding: a new model produces vectors in a different geometric space, old and new vectors aren't comparable, and so they can't share an index. What you can avoid is downtime and a disruptive in-place rebuild of the live index. Add the new embedding as a new column alongside the old one, backfill it while the old index keeps serving, validate recall, then cut over atomically. Old and new coexist through the whole migration; nothing is dropped until the new path is proven.
How this works
The hard constraint is geometric. An embedding model, whether from OpenAI, Cohere, Voyage AI, or an open BGE checkpoint, maps text into a high-dimensional vector space with its own dimensionality, axes, and neighborhood structure. A vector from model v1 and a vector from model v2 sit in different spaces even when they encode the identical document, so cosine similarity between them is meaningless. If the two models have different dimensionalities, their vectors cannot even be stored in the same index. If the dimensionalities happen to match, an ANN index (HNSW or IVF) that mixes both versions will still return neighbors whose scores compare unrelated coordinates, so results come from the wrong neighborhood. That is why upgrading the model forces re-embedding: there is no general, lossless in-place transform that makes the old vectors valid in the new space. This holds whether you run Milvus, Pinecone, Qdrant, or Weaviate.
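To make the incompatibility concrete, here is a minimal toy sketch. The vectors are hand-written stand-ins, not output from any real model: the hypothetical "v2" space is simply "v1" with two axes swapped, which preserves every within-space neighbor relationship while breaking every cross-space comparison.

```python
import math

def cosine(a, b):
    """Cosine similarity; refuses vectors of different dimensionality."""
    if len(a) != len(b):
        raise ValueError("vectors have different dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "model v1" embeddings (hand-written, purely illustrative).
V1 = {
    "cat": [1.0, 0.1, 0.0],
    "kitten": [0.9, 0.2, 0.0],
    "invoice": [0.0, 0.1, 1.0],
}

# Toy "model v2": same semantics, but axis 0 and axis 2 are swapped.
V2 = {name: [v[2], v[1], v[0]] for name, v in V1.items()}

def nearest(query_vec, index):
    """Brute-force nearest neighbor by cosine similarity."""
    return max(index, key=lambda name: cosine(query_vec, index[name]))

index_v1 = {name: V1[name] for name in ("cat", "invoice")}
index_v2 = {name: V2[name] for name in ("cat", "invoice")}

same_space_v1 = nearest(V1["kitten"], index_v1)  # query and index both v1
same_space_v2 = nearest(V2["kitten"], index_v2)  # query and index both v2
cross_space = nearest(V2["kitten"], index_v1)    # v2 query against v1 index

print(same_space_v1, same_space_v2, cross_space)
```

Within either space, "kitten" lands next to "cat". The v2 query against the v1 index lands next to "invoice", which is the failure mode a mixed index would produce. Real models differ by far more than an axis swap, so the mismatch in practice is at least this severe.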
The pattern that makes this safe is blue-green (dual-column / dual-index) migration:
- Add a new vector column (or collection) sized for the new model's dimensionality.
- Backfill — a background job re-embeds the corpus with the new model, writing the new column while the old index keeps serving live traffic. Dual-write new ingests to both.
- Validate — run a held-out query set against both, comparing recall and relevance; optionally shadow 5–10% of traffic to the new path first.
- Cut over atomically — flip reads to the new column once it's proven, then drop the old column.
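The four steps above can be sketched with an in-memory stand-in for a table. Everything here is hypothetical and for illustration only: the `Store` class, the column names `emb_v1` and `emb_v2`, and the two toy embedders are assumptions, not the API of Milvus or any other vector database. Recall is measured against labeled relevance judgments, because results from the two spaces cannot be compared with each other directly.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set

Vector = List[float]
Embedder = Callable[[str], Vector]

def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))

@dataclass
class Store:
    """Toy table: one text column plus any number of named embedding columns."""
    docs: Dict[str, str] = field(default_factory=dict)
    columns: Dict[str, Dict[str, Vector]] = field(default_factory=dict)
    embedders: Dict[str, Embedder] = field(default_factory=dict)
    read_column: Optional[str] = None

    def add_column(self, name: str, embedder: Embedder) -> None:
        self.columns[name] = {}
        self.embedders[name] = embedder
        if self.read_column is None:
            self.read_column = name  # the first column serves reads by default

    def ingest(self, doc_id: str, text: str) -> None:
        # Dual-write: every registered column gets a vector for new documents.
        self.docs[doc_id] = text
        for name, embed in self.embedders.items():
            self.columns[name][doc_id] = embed(text)

    def backfill(self, name: str) -> int:
        # Re-embed only documents that the named column is still missing.
        missing = [d for d in self.docs if d not in self.columns[name]]
        for doc_id in missing:
            self.columns[name][doc_id] = self.embedders[name](self.docs[doc_id])
        return len(missing)

    def search(self, query: str, k: int, column: Optional[str] = None) -> List[str]:
        col = column or self.read_column
        q = self.embedders[col](query)  # the query uses the column's own model
        vectors = self.columns[col]
        ranked = sorted(vectors, key=lambda d: dot(q, vectors[d]), reverse=True)
        return ranked[:k]

    def cut_over(self, name: str) -> None:
        if set(self.columns[name]) != set(self.docs):
            raise RuntimeError("column is not fully backfilled")
        self.read_column = name  # a single reference swap flips all reads

    def drop_column(self, name: str) -> None:
        if name == self.read_column:
            raise RuntimeError("refusing to drop the serving column")
        del self.columns[name]
        del self.embedders[name]

def recall_at_k(store: Store, column: str, labeled: Dict[str, Set[str]], k: int) -> float:
    """Mean fraction of labeled-relevant documents found in the top k."""
    scores = []
    for query, relevant in labeled.items():
        hits = set(store.search(query, k, column)) & relevant
        scores.append(len(hits) / len(relevant))
    return sum(scores) / len(scores)

# Toy embedders standing in for real models (different dimensionalities).
def embed_v1(text: str) -> Vector:
    return [float(len(text)), 1.0]

def embed_v2(text: str) -> Vector:
    return [float(text.count(w)) for w in ("cat", "kitten", "tax")]

store = Store()
store.add_column("emb_v1", embed_v1)
store.ingest("d1", "cat care guide")
store.ingest("d2", "kitten feeding")

store.add_column("emb_v2", embed_v2)      # 1. add the new column
store.ingest("d3", "tax invoice rules")   #    new ingests are dual-written
backfilled = store.backfill("emb_v2")     # 2. backfill while emb_v1 serves

labeled = {"kitten": {"d2"}, "tax": {"d3"}}  # held-out queries with known answers
old_recall = recall_at_k(store, "emb_v1", labeled, k=1)  # 3. validate
new_recall = recall_at_k(store, "emb_v2", labeled, k=1)

if new_recall >= old_recall:              # 4. cut over, then drop the old column
    store.cut_over("emb_v2")
    store.drop_column("emb_v1")
```

The design choice worth noting is that `cut_over` changes one reference and refuses to run on a partially backfilled column, so readers see either the old column or the complete new one, never a mix. A production system would likely keep the old column for a rollback window before dropping it.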
The dominant cost here is recompute, not storage — you pay the embedding model's inference pass over the whole corpus once; the extra column is cheap. The risk you're buying down is a half-migrated index serving wrong results, which the coexistence window prevents.
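A back-of-the-envelope estimate helps size the recompute pass before committing to a migration. The sketch below is plain arithmetic; every input value in the example call (corpus size, token count, price, throughput, dimensionality) is a made-up placeholder to replace with your own numbers, not a quote from any provider.

```python
def reembed_estimate(
    num_docs: int,
    avg_tokens_per_doc: int,
    usd_per_million_tokens: float,
    tokens_per_second: float,
    dim: int,
    bytes_per_value: int = 4,  # float32 vectors
) -> dict:
    """Rough cost, wall-clock time, and raw vector size for one re-embedding pass."""
    total_tokens = num_docs * avg_tokens_per_doc
    return {
        "embedding_usd": total_tokens / 1_000_000 * usd_per_million_tokens,
        "embedding_hours": total_tokens / tokens_per_second / 3600,
        "new_column_gib": num_docs * dim * bytes_per_value / 2**30,
    }

# Hypothetical inputs, chosen only to show the calculation.
estimate = reembed_estimate(
    num_docs=10_000_000,
    avg_tokens_per_doc=500,
    usd_per_million_tokens=0.10,
    tokens_per_second=1_000_000,
    dim=1024,
)
print(estimate)
```

The new column's size is raw vector bytes only; index structures and replication add overhead that this sketch does not model.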
In practice (example)
This is exactly the shape Zilliz Vector Lakebase — which builds on the open-source Milvus engine — is designed for through Unified Lake-Native Storage: embeddings live as columns on the same lake table as the source documents, so a model upgrade is a schema operation, not a separate migration pipeline. You add a new embedding column, backfill it in place, and the old and new embeddings coexist on one table while you validate. Zilliz's architecture write-up describes this as ETL / feature engineering on the lake — embedding-column add, in-place backfill, model upgrade with old-and-new coexistence — with the index treated as a first-class property of the table.
Because the index is built directly from the lake table, the new column's index builds from the Iceberg-format data in roughly 20 minutes for a 1B-vector table (an illustrative figure from Zilliz's architecture write-up, not a formally specified benchmark: no hardware, recall target, or top-k is stated). Incremental refresh re-embeds only changed files rather than re-scanning the whole 1B-vector corpus. Once the new embedding is validated, serving switches to the new snapshot atomically; the old snapshot keeps serving until the new index is ready, so half-built indexes are never exposed. No copy, no second system, no glue pipeline.
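The general idea behind incremental refresh and an atomic snapshot switch can be sketched as follows. This is a generic content-hash approach under stated assumptions, not a description of how Zilliz implements it; `refresh`, `content_hash`, and `toy_embed` are hypothetical names, and files are modeled as an in-memory dict of path to text.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Vector = List[float]
Snapshot = Dict[str, Tuple[str, Vector]]  # path -> (content hash, embedding)

def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def refresh(
    previous: Snapshot,
    files: Dict[str, str],
    embed: Callable[[str], Vector],
) -> Tuple[Snapshot, List[str]]:
    """Build a new snapshot, re-embedding only files whose content changed."""
    new_snapshot: Snapshot = {}
    reembedded: List[str] = []
    for path, text in files.items():
        digest = content_hash(text)
        cached = previous.get(path)
        if cached is not None and cached[0] == digest:
            new_snapshot[path] = cached  # unchanged file: reuse the old vector
        else:
            new_snapshot[path] = (digest, embed(text))
            reembedded.append(path)
    # Files deleted from `files` are simply absent from the new snapshot.
    return new_snapshot, reembedded

embed_calls: List[str] = []

def toy_embed(text: str) -> Vector:
    # Stand-in for a real model call; records each call so the cost is visible.
    embed_calls.append(text)
    return [float(len(text))]

files = {"a.txt": "alpha", "b.txt": "beta"}
first_snapshot, first_changed = refresh({}, files, toy_embed)

serving = first_snapshot          # readers only ever see a complete snapshot
files["b.txt"] = "beta v2"        # one file changes
candidate, second_changed = refresh(serving, files, toy_embed)
serving = candidate               # the switch is one reference assignment
```

The previous snapshot is never mutated while the candidate is being built, which is what keeps a half-built state from being served. On the second refresh only `b.txt` is sent to the embedder.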
Related questions
- How do you keep a vector index in sync with your data lake — sibling AI-FAQ on freshness and incremental refresh
- How much does it cost to re-embed a large dataset — the recompute-cost side of a migration
- What is zero-copy search in vector databases — why the index lives on the lake table
- Vector Lakebase — product page
In short. Swapping embedding models always means re-embedding — the new model's vector space is incompatible with the old one — but it never has to mean downtime or a risky full rebuild. Add the new embedding as a column, backfill, validate, cut over atomically; old and new coexist until the new path is proven. Start here: from vector database to vector lakebase.


