Yes, voyage-2 can scale for large datasets, but the real scaling story is less about the model call itself and more about how you design the embedding + storage + retrieval pipeline around it. voyage-2 produces fixed-size vectors, which is exactly what you want when you’re indexing millions of chunks: every record has a consistent schema (vector field + metadata), and retrieval can be done with standard nearest-neighbor search. The scaling approach is usually: embed content in batches (offline or async), store vectors in a system built for vector search, and keep online embedding limited to user queries (which are typically short). With that approach, the model remains a predictable “vectorization step” even as the dataset grows.
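To make that split concrete (batch embedding offline, single-query embedding online), here is a minimal sketch using the official voyageai Python client. The batch size of 128, the sample chunks, and the assumption that VOYAGE_API_KEY is set in the environment are illustrative choices, not requirements:

```python
import voyageai

# Assumes VOYAGE_API_KEY is set in the environment; the client picks it up.
vo = voyageai.Client()

def embed_batch(texts: list[str]) -> list[list[float]]:
    # Offline/async path: embed document chunks in batches.
    # voyage-2 returns fixed-size (1024-dim) vectors, so every record
    # shares one schema regardless of input length.
    result = vo.embed(texts, model="voyage-2", input_type="document")
    return result.embeddings

def embed_query(query: str) -> list[float]:
    # Online path: user queries are short, so this stays cheap.
    result = vo.embed([query], model="voyage-2", input_type="query")
    return result.embeddings[0]

# Ingestion walks the corpus in batches; 128 here is just a tuning knob.
chunks = ["first chunk of text ...", "second chunk of text ..."]
for start in range(0, len(chunks), 128):
    vectors = embed_batch(chunks[start : start + 128])
    # hand vectors + metadata to the vector store (sketched further below)
```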
To make this work at scale, you need a disciplined ingestion strategy. Large corpora (docs, tickets, emails, logs) should be chunked and assigned stable IDs like (doc_id, chunk_id) so you can upsert and delete embeddings when source text changes. Your embedding job should batch requests, run with retries, and be idempotent so failures don’t create duplicates. You’ll also want to plan for incremental updates: re-embed only changed chunks, not the entire corpus every time. If you support multiple languages or data sources, store metadata fields like source, lang, created_at, tenant_id, and maybe access_level so retrieval can be filtered. None of that is voyage-2-specific, but it determines whether “large dataset” stays manageable or becomes a pile of vectors you can’t govern.
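Concretely, an idempotent, retrying ingestion job might look like the sketch below. It assumes the embed_batch helper above, pymilvus's MilvusClient, and a collection named chunks with a string primary key (created in the next sketch); the endpoint, retry count, and metadata values are all placeholders:

```python
import hashlib
import time

from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")  # or a Zilliz Cloud URI + token

def content_hash(text: str) -> str:
    # Fingerprint stored alongside each chunk so incremental runs can
    # skip re-embedding text that hasn't changed.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def upsert_doc(doc_id: str, chunks: list[str], lang: str, tenant_id: str,
               max_retries: int = 3) -> None:
    vectors = embed_batch(chunks)  # from the first sketch
    rows = [{
        "id": f"{doc_id}:{i}",            # stable (doc_id, chunk_id) key
        "vector": vec,
        "source": doc_id,
        "lang": lang,
        "tenant_id": tenant_id,
        "created_at": int(time.time()),
        "content_hash": content_hash(text),
    } for i, (text, vec) in enumerate(zip(chunks, vectors))]

    # Upsert keyed on the primary id is idempotent: a retried batch
    # overwrites the same rows instead of creating duplicates.
    for attempt in range(max_retries):
        try:
            client.upsert(collection_name="chunks", data=rows)
            return
        except Exception:
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)  # simple exponential backoff
```

The same stable keys are what make deletes and incremental updates tractable: when a document shrinks or disappears, you delete its old chunk IDs rather than rebuilding the collection.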
For retrieval at large scale, you typically rely on a vector database to handle indexing and fast approximate nearest-neighbor search. A vector database such as Milvus or Zilliz Cloud (managed Milvus) is designed to store large vector collections, build indexes (e.g., IVF- or graph-based structures), and run top-k similarity queries efficiently while supporting metadata filtering. The most common scaling pattern is: keep embedding generation horizontally scalable (workers that embed chunks) and let the vector database handle query-time performance via appropriate index choices, shard/partition strategy, and resource sizing. In other words, voyage-2 scales as well as your ingestion throughput and your vector DB architecture allow, and those are both solvable engineering problems with well-known patterns.
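A hedged sketch of that query path with pymilvus might look like the following. It uses MilvusClient's quick-setup create_collection (which also builds a default index and loads the collection) plus a filtered top-k search; the collection name, the 1024-dim sizing for voyage-2, and the filter expression are assumptions carried over from the sketches above:

```python
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")

# One-time setup: quick-create a collection sized for voyage-2's 1024-dim
# vectors, with a string primary key so "doc_id:chunk_id" fits directly.
if not client.has_collection("chunks"):
    client.create_collection(
        collection_name="chunks",
        dimension=1024,
        primary_field_name="id",
        id_type="string",
        max_length=256,
        metric_type="COSINE",
    )

# Query time: embed the user query online (embed_query from the first
# sketch), then run a filtered top-k similarity search.
query_vec = embed_query("how do I rotate an API key?")
results = client.search(
    collection_name="chunks",
    data=[query_vec],
    limit=5,
    filter='tenant_id == "acme" and lang == "en"',
    output_fields=["source", "created_at"],
)
for hit in results[0]:  # one result list per query vector
    print(hit["id"], hit["distance"], hit["entity"]["source"])
```

For larger collections you would typically swap the default index for an explicitly tuned IVF or graph-based one and partition by a field like tenant_id, but the query shape stays the same.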
For more information, see: https://zilliz.com/ai-models/voyage-2
