voyage-2 works with vector databases by providing the “text → vector” step that makes semantic similarity search possible, while the vector database provides the “store → index → retrieve” step that makes it fast and scalable. In practice, voyage-2 outputs a dense embedding vector for each input text. That vector becomes the primary searchable field in your vector database. When a user submits a query, you embed the query with voyage-2 and then run a nearest-neighbor search in the database to retrieve the most similar vectors. The database returns the IDs and similarity scores of the closest matches, and your application maps those IDs back to the original text chunks (and any metadata) to display results or feed a downstream pipeline.
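To make that query path concrete, here is a minimal sketch assuming the voyageai and pymilvus (MilvusClient) Python clients; the collection name "docs", the output fields, and the example query are illustrative, not required by either library:

```python
import voyageai
from pymilvus import MilvusClient

vo = voyageai.Client()                               # reads VOYAGE_API_KEY from the environment
milvus = MilvusClient(uri="http://localhost:19530")  # or a Zilliz Cloud URI + token

# 1. Embed the user query with voyage-2 (query-side input type).
query = "How do I rotate API keys?"
query_vec = vo.embed([query], model="voyage-2", input_type="query").embeddings[0]

# 2. Nearest-neighbor search in the vector database.
hits = milvus.search(
    collection_name="docs",          # hypothetical collection indexed earlier
    data=[query_vec],
    limit=5,                         # top_k
    output_fields=["doc_id", "title", "chunk_text"],
)

# 3. Map the returned IDs and stored fields back to the original chunks
#    for display or a downstream pipeline.
for hit in hits[0]:
    print(hit["distance"], hit["entity"]["title"])
```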
Implementation-wise, the integration is usually a two-phase workflow. First, indexing: you chunk content, call voyage-2 to generate embeddings, and insert records into the vector database where each record includes a vector field plus metadata fields like doc_id, title, url, chunk_text, and chunk_index. Second, querying: you embed the query text and call the database’s search API with parameters like top_k, the similarity metric, and optional filters (e.g., lang == "en" and tenant_id == "acme"). If you need “semantic search plus keyword constraints,” the usual approach is metadata filtering (and potentially hybrid logic in your app) rather than trying to force embeddings to behave like a keyword engine.
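A sketch of the indexing phase under the same assumptions (the quick-setup `create_collection` path with dynamic fields is assumed so that metadata keys like `doc_id`, `lang`, and `tenant_id` can be stored alongside the vector; the query phase is shown in the snippets before and after this one):

```python
import voyageai
from pymilvus import MilvusClient

vo = voyageai.Client()
milvus = MilvusClient(uri="http://localhost:19530")

# voyage-2 produces 1024-dimensional vectors; verify against the current Voyage AI docs.
milvus.create_collection(
    collection_name="docs",
    dimension=1024,
    metric_type="COSINE",   # the similarity metric used at query time
    auto_id=True,
)

# Chunked content plus the metadata fields you want to filter on later.
chunks = [
    {"doc_id": "guide-1", "title": "Admin guide", "url": "https://example.com/guide",
     "chunk_text": "Rotate API keys every 90 days...", "chunk_index": 0,
     "lang": "en", "tenant_id": "acme"},
]

# One voyage-2 call per batch of chunk texts, then one record per chunk.
vectors = vo.embed([c["chunk_text"] for c in chunks],
                   model="voyage-2", input_type="document").embeddings
milvus.insert(collection_name="docs",
              data=[{**c, "vector": v} for c, v in zip(chunks, vectors)])
```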
A vector database such as Milvus or Zilliz Cloud typically handles the critical pieces that make this workable in production: approximate nearest-neighbor indexes, data partitioning, efficient vector storage, and high-throughput query execution. This division of responsibilities is important. voyage-2 does not manage indexes, does not know about your corpus size, and does not enforce access controls; it simply generates vectors. The vector database is where you implement production concerns like: “only return documents the user can access,” “filter to a specific product line,” “limit results to the latest version,” and “keep latency stable under load.” So “working with vector databases” really means: treat voyage-2 as an embedding generator, and let the vector DB do what databases do—persist, index, and query at scale with predictable performance.
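Those production concerns typically surface as filter expressions and index settings on the database side, not as anything voyage-2 is aware of. A hedged example of what that might look like with pymilvus; the `acl_group`, `product_line`, and `version` fields are hypothetical and would have been stored as metadata at indexing time:

```python
import voyageai
from pymilvus import MilvusClient

vo = voyageai.Client()
milvus = MilvusClient(uri="http://localhost:19530")

q_vec = vo.embed(["key rotation policy"], model="voyage-2",
                 input_type="query").embeddings[0]

results = milvus.search(
    collection_name="docs",
    data=[q_vec],
    limit=10,
    # Database-side constraints: access control, product scoping, version pinning.
    filter='acl_group in ["support-team", "public"] '
           'and product_line == "widgets" and version == "2024.2"',
    output_fields=["doc_id", "title", "url"],
)
```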
For more information, see https://zilliz.com/ai-models/voyage-2
