To build hybrid search with Vertex AI and Milvus, developers can combine lexical (keyword-based) and vector (semantic) search techniques into a single pipeline. The goal of hybrid search is to capture both exact keyword matches and semantic similarities between queries and documents. In this setup, Vertex AI handles text embedding generation, Milvus stores and indexes those embeddings for vector similarity search, and a traditional search engine or a SQL full-text query handles keyword-based scoring. The results from both systems are then merged and re-ranked according to relevance.
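The two retrieval paths can be illustrated with a minimal, self-contained sketch. Everything here is illustrative: the `DOCS` corpus, its three-dimensional vectors, and the query embedding are all fabricated stand-ins for what a Vertex AI embedding model would produce and what a Milvus collection would store and search at scale.

```python
from math import sqrt

# Hypothetical toy corpus; in a real pipeline the vectors would come from a
# Vertex AI embedding model and live in a Milvus collection with metadata.
DOCS = {
    "doc1": {"text": "reset your account password", "vec": [0.9, 0.1, 0.0]},
    "doc2": {"text": "billing and invoice questions", "vec": [0.1, 0.8, 0.3]},
    "doc3": {"text": "password policy for enterprise accounts", "vec": [0.7, 0.2, 0.4]},
}

def lexical_scores(query: str) -> dict[str, float]:
    """Keyword path: naive substring check -- fraction of query terms
    that appear in each document's text."""
    terms = query.lower().split()
    return {
        doc_id: sum(t in doc["text"] for t in terms) / len(terms)
        for doc_id, doc in DOCS.items()
    }

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def semantic_scores(query_vec: list[float]) -> dict[str, float]:
    """Semantic path: brute-force cosine similarity against each stored
    embedding (Milvus would do this over an ANN index instead)."""
    return {doc_id: cosine(query_vec, doc["vec"]) for doc_id, doc in DOCS.items()}

# Both paths run on the same query; the query vector here is fabricated,
# standing in for the embedding a Vertex AI model would return.
lex = lexical_scores("password reset")
sem = semantic_scores([0.8, 0.1, 0.1])
```

In this toy run, the lexical path scores `doc1` highest because both query terms appear verbatim, while the semantic path also surfaces `doc3`, which shares meaning but not every keyword; the divergence between the two score maps is exactly what the downstream merge step reconciles.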
The process typically involves three steps. First, data (e.g., documents, FAQs, or transcripts) are embedded using an embedding model hosted on Vertex AI, and the resulting vectors are stored in Milvus alongside metadata such as titles, tags, or timestamps. Second, when a query arrives, it is processed along two parallel paths: one performs lexical search to match exact terms, and the other performs semantic search through Milvus. Finally, results from both paths are merged—often using weighted scoring or learning-to-rank—to produce balanced outputs that reward both precision (from lexical matches) and recall (from semantic matches).
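The merge step above can be sketched as a simple weighted-score fusion. This is one of several reasonable strategies (reciprocal rank fusion and learned rankers are common alternatives); the function names, the `alpha` weight, and the raw scores below are illustrative assumptions, not part of any Vertex AI or Milvus API.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale raw scores to [0, 1] so the lexical and semantic paths,
    which use different score scales, become comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}

def fuse(lexical: dict[str, float], semantic: dict[str, float],
         alpha: float = 0.4) -> list[tuple[str, float]]:
    """Weighted-sum fusion: alpha weights the lexical path, (1 - alpha)
    the semantic path. A document returned by only one path contributes
    0 from the path that missed it."""
    lex, sem = min_max(lexical), min_max(semantic)
    ids = set(lex) | set(sem)
    fused = {i: alpha * lex.get(i, 0.0) + (1 - alpha) * sem.get(i, 0.0)
             for i in ids}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# Fabricated raw scores standing in for real lexical-engine hits
# (e.g., BM25 scores) and Milvus similarity scores.
ranked = fuse(
    lexical={"doc1": 12.0, "doc3": 5.0},
    semantic={"doc1": 0.92, "doc2": 0.81, "doc3": 0.75},
)
```

Here `doc1` ranks first because both paths agree on it, while `doc2` (found only semantically) still surfaces ahead of `doc3`; tuning `alpha` shifts the balance between exact-match precision and semantic recall.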
This hybrid design is especially effective for enterprise knowledge retrieval or AI assistant use cases. For example, a Vertex AI–powered assistant could use Milvus to find semantically related documents while also ensuring that exact keyword hits (e.g., product names or legal terms) aren’t missed. This combination produces search results that feel both intuitive and accurate. Milvus’s ability to handle millions of embeddings efficiently makes it ideal for scaling hybrid search, while Vertex AI provides the modeling and inference capabilities that keep the pipeline intelligent and adaptive.
