Yes, voyage-code-2 is specifically designed to work with source code. It can embed functions, classes, configuration files, and even partial snippets, turning them into vectors that reflect their semantic purpose rather than just their surface syntax. This allows developers to search for code based on intent, such as “authentication middleware” or “rate limiting logic,” instead of relying on exact function names or comments.
In practice, voyage-code-2 works best when code is chunked into meaningful units. A common approach is to embed one function or one class per vector, optionally including the function signature and file path as part of the input text. This gives the model enough context to understand what the code does without mixing unrelated logic. Embedding at this granularity also improves retrieval precision, because each vector represents a single responsibility rather than an entire file.
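The chunking approach described above can be sketched for Python source using the standard-library `ast` module. This is a minimal illustration, not a prescribed pipeline: the helper name `chunk_python_source`, the `# file:` / `# name:` header format, and the sample module are all assumptions made for the example, and other languages would need their own parser.

```python
import ast


def chunk_python_source(source: str, file_path: str) -> list[dict]:
    """Split a Python module into one chunk per top-level function or class.

    Hypothetical helper: each chunk's text is prefixed with the file path and
    the definition's name so the embedded input carries some context.
    """
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Recover the exact source text of this definition (signature included).
            code = ast.get_source_segment(source, node)
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            text = f"# file: {file_path}\n# name: {node.name}\n{code}"
            chunks.append(
                {"name": node.name, "kind": kind, "file_path": file_path, "text": text}
            )
    return chunks


# Made-up sample module used only to demonstrate the helper.
SAMPLE = '''
import time

def rate_limit(calls, per_seconds):
    """Allow at most `calls` per window."""
    return calls / per_seconds

class AuthMiddleware:
    def __call__(self, request):
        return request
'''

for chunk in chunk_python_source(SAMPLE, "app/middleware.py"):
    print(chunk["kind"], chunk["name"])
```

Each `text` value would then be sent to the embedding model as one input, so a vector corresponds to exactly one function or class. Module-level statements such as imports are deliberately left out of the chunks in this sketch.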
Once embedded, code vectors are usually stored in a vector database such as Milvus or Zilliz Cloud. This allows fast similarity search across large codebases and supports metadata filtering, such as restricting results to a specific repository or language. voyage-code-2 provides the semantic understanding of code, while the vector database provides the scalability and performance needed for real-world developer tools.
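To make the idea of similarity search with metadata filtering concrete, here is a small in-memory sketch. It is a brute-force stand-in for what a vector database does at scale, not Milvus or Zilliz Cloud client code; the `CodeVector` type, the `search` function, and the three-dimensional toy vectors are assumptions for illustration (they are not real voyage-code-2 embeddings).

```python
import math
from dataclasses import dataclass


@dataclass
class CodeVector:
    """Hypothetical record: an embedding plus the metadata stored beside it."""
    vector: list[float]
    repo: str
    language: str
    name: str


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: list[float], items: list[CodeVector], top_k: int = 3, **filters):
    """Apply exact-match metadata filters, then rank the rest by similarity."""
    candidates = [
        item for item in items
        if all(getattr(item, key) == value for key, value in filters.items())
    ]
    ranked = sorted(candidates, key=lambda item: cosine(query, item.vector), reverse=True)
    return ranked[:top_k]


# Toy data: made-up vectors standing in for real embeddings.
ITEMS = [
    CodeVector([1.0, 0.0, 0.0], "api", "python", "auth_middleware"),
    CodeVector([0.9, 0.1, 0.0], "api", "go", "AuthHandler"),
    CodeVector([0.0, 1.0, 0.0], "api", "python", "rate_limiter"),
]

for hit in search([1.0, 0.0, 0.0], ITEMS, top_k=2, language="python"):
    print(hit.name)
```

A linear scan like this becomes impractical for large codebases; a vector database replaces it with indexed search while keeping the same shape of query, a vector plus a filter on fields such as repository or language.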
For more information, see https://zilliz.com/ai-models/voyage-code-2
