Yes, voyage-code-2 handles documentation well, especially technical documentation that is closely related to code. This includes README files, API references, inline comments, design docs, and even issue descriptions. The model embeds documentation text into the same semantic space as code, so a natural-language query can retrieve relevant code, and a code-related question can retrieve relevant documentation.
This capability is important because developers often search using natural language, not code. For example, a developer might search for “how do we refresh access tokens,” even though the code uses a function named renew_session_credentials. voyage-code-2 can bridge this gap by embedding both the documentation text and the code in a way that aligns their meaning. This is especially useful for RAG systems that answer developer questions by retrieving both explanatory text and implementation details.
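To make the idea of a shared semantic space concrete, here is a minimal sketch of how retrieval over such embeddings typically works: every text becomes a vector, and candidates are ranked by cosine similarity to the query vector. The three-dimensional vectors below are hand-written placeholders invented for illustration, not real voyage-code-2 output, and the helper names (cosine_similarity, rank_by_similarity) and sample texts are assumptions of this sketch.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: Sequence[float],
    corpus: List[Tuple[str, Sequence[float]]],
) -> List[Tuple[float, str]]:
    """Return (score, text) pairs sorted from most to least similar."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in corpus]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Placeholder vectors (hypothetical): in a real system each one would be
# produced by the embedding model for the corresponding text.
corpus = [
    ("def renew_session_credentials(session): ...", [0.9, 0.1, 0.0]),
    ("README: how access tokens are refreshed", [0.8, 0.2, 0.1]),
    ("def render_invoice_pdf(invoice): ...", [0.0, 0.1, 0.9]),
]

# Placeholder vector standing in for the query "how do we refresh access tokens".
query_vec = [1.0, 0.0, 0.0]

ranked = rank_by_similarity(query_vec, corpus)
for score, text in ranked:
    print(f"{score:.3f}  {text}")
```

With these placeholder values, the token-refresh function and the README passage rank above the unrelated invoice code, even though the query shares no identifier with renew_session_credentials. In a real deployment the same ranking logic would run on vectors returned by the model.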
As with code, documentation embeddings become most useful when stored in a vector database such as Milvus or Zilliz Cloud. You can store code and documentation in the same collection or in separate ones, and use metadata fields (for example, a source type or file path) to filter or scope what is retrieved. This setup enables unified search across code and docs, which is often more valuable than searching either source in isolation.
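The single-collection layout with metadata can be sketched as follows. This is an in-memory stand-in written with the Python standard library only; it is not the Milvus or Zilliz Cloud API. The Record class, the source_type field, the search function, and the placeholder vectors are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Record:
    text: str
    source_type: str      # hypothetical metadata field: "code" or "doc"
    vector: List[float]   # placeholder embedding, not real model output


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def search(
    records: List[Record],
    query_vec: Sequence[float],
    limit: int = 3,
    source_type: Optional[str] = None,
) -> List[Record]:
    """Rank records by similarity, optionally restricted by metadata."""
    candidates = [
        r for r in records if source_type is None or r.source_type == source_type
    ]
    candidates.sort(key=lambda r: cosine(query_vec, r.vector), reverse=True)
    return candidates[:limit]


# One collection holding both code and documentation entries.
records = [
    Record("def renew_session_credentials(session): ...", "code", [0.9, 0.1, 0.0]),
    Record("Auth guide: how access tokens are refreshed", "doc", [0.8, 0.2, 0.1]),
    Record("def render_invoice_pdf(invoice): ...", "code", [0.0, 0.1, 0.9]),
    Record("Billing guide: invoice layout", "doc", [0.1, 0.0, 0.9]),
]
query_vec = [1.0, 0.0, 0.0]  # placeholder for an embedded token-refresh question

unified = search(records, query_vec, limit=2)                        # code and docs
docs_only = search(records, query_vec, limit=2, source_type="doc")   # docs only

print([r.text for r in unified])
print([r.text for r in docs_only])
```

The unified call returns the best matches regardless of origin, while the filtered call restricts results to documentation. A vector database applies the same idea at scale, with the filter expressed through its own metadata or scalar-field mechanism.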
For more information, see https://zilliz.com/ai-models/voyage-code-2
