voyage-code-2’s biggest limitation is that it is an embedding model, not a code interpreter, static analyzer, or symbolic reasoner. It produces vectors that represent semantic similarity, which is great for retrieval, but it cannot guarantee that the nearest neighbor is the “correct” implementation for a specific environment, version, or runtime behavior. If your codebase contains multiple similar patterns (for example, two different token validation flows for different products), embeddings may retrieve a plausible-but-wrong candidate unless you add metadata filters (product, version, service) and enforce them in the search layer. This is why code retrieval quality is usually a pipeline problem, not just a model choice.
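To make that concrete, here is a minimal sketch of embedding a query with voyage-code-2 and enforcing a metadata filter in Milvus rather than trusting similarity alone. The collection name and the product/version/service fields are illustrative, assuming they were stored at indexing time:

```python
# Sketch: metadata-filtered code search (assumes a "code_chunks"
# collection whose rows carry product/version/service fields).
import voyageai
from pymilvus import MilvusClient

vo = voyageai.Client()  # reads VOYAGE_API_KEY from the environment
client = MilvusClient(uri="http://localhost:19530")

query = "validate an OAuth access token and check expiry"
query_vec = vo.embed(
    [query], model="voyage-code-2", input_type="query"
).embeddings[0]

hits = client.search(
    collection_name="code_chunks",
    data=[query_vec],
    # The filter, not the embedding, is what guarantees we stay
    # inside the right product and version.
    filter='product == "checkout" and version == "v2"',
    limit=5,
    output_fields=["path", "service"],
)
```

The key design point is that the filter is applied inside the search layer, so a plausible-but-wrong candidate from another product never reaches the application.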
A second limitation is input shaping: embedding “too much” code at once often makes vectors less precise. voyage-code-2 supports long inputs (the Zilliz guide lists 16,000 max input tokens), but long context can tempt teams to embed whole files or modules. That can hurt retrieval because one vector ends up representing multiple responsibilities. A safer approach is to embed coherent units (function/class) and optionally prepend minimal context like file path + signature + docstring.

Another common limitation is that code retrieval often depends on relationships across files (call graphs, inheritance, configuration). Embeddings don’t automatically encode those relationships unless you include them in the embedded text, or retrieve multiple chunks and then let the application (or an LLM) stitch them together.
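Here is a sketch of what “coherent units plus minimal context” can look like in practice, using only Python’s standard-library ast module. The header format and the calls: hint (a cheap way to carry cross-file relationships into the embedded text) are illustrative choices, not a standard:

```python
# Sketch: split a Python file into one embeddable chunk per function,
# prepending the file path and the names of functions it calls.
import ast

def extract_chunks(path: str, source: str) -> list[str]:
    """Return one chunk per function, with minimal context prepended."""
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Names of functions this one calls -- a relationship hint
            # that would otherwise be lost at the chunk boundary.
            calls = sorted({
                n.func.id for n in ast.walk(node)
                if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
            })
            header = f"# file: {path}\n# calls: {', '.join(calls) or 'none'}\n"
            # The def line and docstring ride along inside the source
            # segment, so signature and intent stay in the chunk.
            chunks.append(header + ast.get_source_segment(source, node))
    return chunks

if __name__ == "__main__":
    sample = (
        "def validate(token):\n"
        '    """Check a token\'s signature."""\n'
        "    return decode(token) is not None\n"
    )
    for chunk in extract_chunks("auth/tokens.py", sample):
        print(chunk, end="\n---\n")
```

Each chunk then gets embedded with input_type="document" and stored alongside its metadata, so one vector represents one responsibility.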
Finally, there are operational constraints and failure modes you need to plan for: deduplication (generated code, vendored dependencies), freshness (embeddings drifting behind main branch), and access control (don’t retrieve private code for the wrong user). These are best handled in a vector database such as Milvus or Zilliz Cloud, where you can store metadata like is_generated, repo_visibility, commit, and acl_group, then filter results accordingly. voyage-code-2 can power strong semantic matching, but it will not fix messy corpora, inconsistent chunking, or missing metadata. Treat it like a strong retrieval primitive that needs good indexing hygiene around it.
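A sketch of a collection schema that carries this metadata, so the filters above have something to act on. Field names match the text; the collection name, field lengths, and index settings are assumptions:

```python
# Sketch: a Milvus collection whose rows carry operational metadata
# for dedup, freshness, and access control.
from pymilvus import DataType, MilvusClient

client = MilvusClient(uri="http://localhost:19530")

schema = MilvusClient.create_schema(auto_id=True)
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("embedding", DataType.FLOAT_VECTOR, dim=1536)  # voyage-code-2 output size
schema.add_field("path", DataType.VARCHAR, max_length=512)
schema.add_field("commit", DataType.VARCHAR, max_length=40)      # freshness: re-embed when stale
schema.add_field("is_generated", DataType.BOOL)                  # dedup generated/vendored code
schema.add_field("repo_visibility", DataType.VARCHAR, max_length=16)
schema.add_field("acl_group", DataType.VARCHAR, max_length=64)   # access control at query time

index_params = client.prepare_index_params()
index_params.add_index(field_name="embedding", index_type="AUTOINDEX",
                       metric_type="COSINE")
client.create_collection("code_chunks", schema=schema,
                         index_params=index_params)
```

With fields like these in place, a search can filter out generated code and restrict results to the caller’s acl_group, which is exactly the hygiene the embedding model cannot provide on its own.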
For more information, see the voyage-code-2 model page: https://zilliz.com/ai-models/voyage-code-2
