Milvus supports hybrid lexical search by letting developers combine keyword-based retrieval with semantic vector similarity in a single pipeline. Although Milvus is primarily a vector database, it provides metadata filtering, scalar field queries, and hybrid scoring, which make it possible to bring lexical search logic into the workflow. Typically, a lexical search using an algorithm such as BM25 retrieves candidate documents based on exact or weighted keyword matches. Those candidates are then mapped to their embeddings stored in Milvus, where semantic re-ranking reorders them by meaning rather than by word overlap alone.
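The re-ranking step can be illustrated with a minimal sketch. It assumes the lexical stage has already returned candidate IDs and uses a plain in-memory dictionary as a stand-in for the embeddings stored in Milvus. The names `rerank_candidates` and `embeddings_by_id`, the document IDs, and the vectors are all hypothetical and are not part of any Milvus API.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank_candidates(query_embedding, candidate_ids, embeddings_by_id):
    """Order lexical candidates by similarity to the query embedding, best first."""
    scored = [
        (doc_id, cosine_similarity(query_embedding, embeddings_by_id[doc_id]))
        for doc_id in candidate_ids
        if doc_id in embeddings_by_id  # skip candidates that have no stored embedding
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: in a real system the embeddings would live in Milvus
# and the candidate list would come from a BM25-style lexical index.
embeddings_by_id = {
    "doc-a": [1.0, 0.0],
    "doc-b": [0.6, 0.8],
    "doc-c": [0.0, 1.0],
}
lexical_candidates = ["doc-a", "doc-b", "doc-c"]  # order returned by the lexical stage
query_embedding = [0.0, 1.0]

reranked = rerank_candidates(query_embedding, lexical_candidates, embeddings_by_id)
for doc_id, score in reranked:
    print(f"{doc_id}: {score:.3f}")
```

In this toy example the document the lexical stage ranked last moves to the top, because its embedding is closest to the query. In a deployment, the similarity computation would be delegated to Milvus, restricted to the candidate IDs, instead of being done in application code.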
Developers can implement hybrid lexical search by indexing each document twice: as text in an external lexical index, and as an embedding in Milvus. At query time, the system first runs a lexical search to identify the top candidates, and Milvus then computes similarity scores between the query embedding and the candidate embeddings. The two scores, lexical relevance and vector similarity, can be fused using a weighted formula or a normalization strategy, so that both keyword precision and contextual understanding influence the final ranking. In this arrangement, Milvus acts as the semantic backbone of the hybrid retrieval system.
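One possible fusion scheme is sketched below: min-max normalize each score set to the range 0 to 1, then take a weighted sum. This is an illustrative choice, not a method prescribed by Milvus. The function names, the `lexical_weight` parameter, and the sample scores are hypothetical, and the sketch assumes that higher vector scores mean closer matches (as with cosine or inner-product similarity); distance metrics would need to be inverted first.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so no document is preferred over another.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (value - lo) / (hi - lo) for doc_id, value in scores.items()}


def fuse_scores(lexical_scores, vector_scores, lexical_weight=0.5):
    """Weighted sum of normalized lexical and vector scores, best first.

    A document missing from one score set contributes 0.0 for that component.
    """
    if not 0.0 <= lexical_weight <= 1.0:
        raise ValueError("lexical_weight must be between 0 and 1")
    lexical = min_max_normalize(lexical_scores)
    vector = min_max_normalize(vector_scores)
    fused = {
        doc_id: lexical_weight * lexical.get(doc_id, 0.0)
        + (1.0 - lexical_weight) * vector.get(doc_id, 0.0)
        for doc_id in set(lexical) | set(vector)
    }
    # Sort by descending fused score; break ties by document ID for stable output.
    return sorted(fused.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical scores: BM25-style relevance and vector similarity for three documents.
lexical_scores = {"doc-a": 12.0, "doc-b": 8.0, "doc-c": 4.0}
vector_scores = {"doc-a": 0.2, "doc-b": 0.9, "doc-c": 0.6}

ranking = fuse_scores(lexical_scores, vector_scores, lexical_weight=0.5)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

Normalizing first matters because raw BM25 scores are unbounded while similarity scores usually fall in a narrow range; without it, the lexical component would dominate the sum regardless of the weight. Raising `lexical_weight` favors exact keyword matches, and lowering it favors semantic closeness.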
Hybrid lexical search with Milvus is particularly effective for use cases that require both exact-term precision and contextual recall, such as customer support search, research paper retrieval, and enterprise knowledge management. Lexical search ensures that explicit terms (such as product codes or legal phrases) are not missed, while Milvus improves relevance by surfacing semantically related material. Because the two stages are separate, developers can tune and inspect each one independently, which keeps the pipeline's performance controllable and its rankings explainable. Milvus thus bridges token-based and meaning-based search in a scalable, production-ready way.
