Hybrid retrieval combines lexical search and vector search to achieve both precision and semantic understanding in document retrieval. The general idea is that lexical search handles exact word matches efficiently, while vector search captures meaning and context. In a typical pipeline, the system first executes a lexical query using BM25 to retrieve a small candidate set of documents that contain the query terms. This step keeps the results topically relevant and limits how many documents the more expensive semantic stage has to score.
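The lexical first stage can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not how any particular search engine implements it: the whitespace tokenizer, the parameter defaults `k1=1.5` and `b=0.75`, the toy corpus, and the helper names `bm25_scores` and `top_k_candidates` are all hypothetical choices made for the example.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive lowercase whitespace tokenizer (assumption); real systems use a proper analyzer.
    return text.lower().split()

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores

def top_k_candidates(query, docs, k=2):
    """Indices of the k best-scoring documents that match at least one query term."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [i for i in ranked if scores[i] > 0][:k]

# Toy corpus (made up for illustration).
docs = [
    "database query optimization techniques",
    "baking sourdough bread at home",
    "indexes speed up database queries",
]
query = "how to speed up database queries"
candidates = top_k_candidates(query, docs, k=2)
```

Here the document that shares no terms with the query is dropped, and the remaining two form the candidate set that the semantic stage would go on to score.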
Once those candidates are selected, the query is converted to an embedding and compared against the candidates' embeddings stored in a vector database like Milvus. These embeddings encode semantic meaning learned from large text corpora, allowing the system to identify conceptual relationships that don’t rely on identical words. Milvus efficiently computes vector similarity, often cosine similarity or inner product, to determine which documents are most semantically aligned with the query. Developers then combine the BM25 and embedding scores to produce a final hybrid ranking, often using weighted averages or learned fusion models. Because BM25 scores are unbounded while cosine similarity falls between -1 and 1, the two are usually normalized to a common scale before being averaged.
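The weighted-average fusion mentioned above might look like the following sketch. It makes several assumptions that are not from the original text: the BM25 scores and the two-dimensional vectors are hand-made toy values (in practice the document vectors would come from the vector database and an embedding model), both score lists are min-max normalized before mixing, and the helper names and the `alpha` weight are hypothetical.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two equal-length, non-zero vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def min_max(values):
    # Rescale to [0, 1] so lexical and semantic scores are comparable.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def hybrid_rank(bm25, query_vec, doc_vecs, alpha=0.5):
    """Rank candidates by alpha * lexical + (1 - alpha) * semantic score."""
    semantic = [cosine_similarity(query_vec, v) for v in doc_vecs]
    lex_n, sem_n = min_max(bm25), min_max(semantic)
    fused = [alpha * l + (1 - alpha) * s for l, s in zip(lex_n, sem_n)]
    order = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
    return order, fused

# Toy inputs (assumed values, one entry per candidate document).
bm25 = [2.0, 0.5, 1.0]
query_vec = [1.0, 0.0]
doc_vecs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

# alpha = 0.3 weights the semantic score more heavily than the lexical one.
order, fused = hybrid_rank(bm25, query_vec, doc_vecs, alpha=0.3)
```

With these inputs, the candidate with the lowest BM25 score ranks first because its vector points in the same direction as the query, while setting `alpha=1.0` reproduces the pure BM25 ordering. A learned fusion model would replace the fixed `alpha` with weights fitted to relevance judgments.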
This hybrid approach is widely used in production search systems because it combines complementary strengths. Lexical search provides exact-match accuracy and transparency, since a BM25 score can be traced back to specific matching terms, while Milvus-based vector retrieval adds deeper contextual understanding. For example, a query like “how to speed up database queries” might retrieve documents about “query optimization” even if the exact words “speed up” don’t appear. The result is a search system that stays fast while returning results that match both user intent and literal phrasing, an essential feature for AI assistants, document retrieval engines, and enterprise knowledge systems.
