Fusing lexical search with embeddings presents several technical challenges, primarily around score normalization, data alignment, and ranking balance. Lexical search and vector similarity use fundamentally different scoring systems: BM25 produces unbounded, positive scores derived from term frequency and document statistics, while embedding similarity in Milvus produces continuous values on a fixed scale (for example, cosine similarity in [-1, 1] or inner product on normalized vectors). If these scores are combined directly without scaling, one signal can overshadow the other, producing biased rankings. Developers need to apply normalization or weighted fusion so that both signals contribute appropriately to the final ranking.
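A minimal sketch of this normalization and weighted-fusion step is shown below. The function names (`min_max_normalize`, `weighted_fusion`) and the `alpha` weight are illustrative choices, not part of any Milvus or BM25 library API; the score dicts stand in for the outputs of the two retrieval stages.

```python
def min_max_normalize(scores):
    """Scale a list of scores into [0, 1]; a constant list maps to all 1.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def weighted_fusion(bm25_scores, vector_scores, alpha=0.5):
    """Fuse per-document score dicts after normalizing each signal separately.

    alpha weights the lexical signal; (1 - alpha) weights the vector signal.
    A document missing from one signal contributes 0 for that signal.
    """
    docs = sorted(set(bm25_scores) | set(vector_scores))
    bm25_norm = dict(zip(docs, min_max_normalize([bm25_scores.get(d, 0.0) for d in docs])))
    vec_norm = dict(zip(docs, min_max_normalize([vector_scores.get(d, 0.0) for d in docs])))
    fused = {d: alpha * bm25_norm[d] + (1 - alpha) * vec_norm[d] for d in docs}
    # Highest fused score first.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

Because each signal is normalized independently before the weighted sum, a raw BM25 score of 12.3 can no longer dominate a cosine similarity of 0.92; `alpha` becomes the single tuning knob for ranking balance.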
Another challenge lies in keeping the lexical index and the embedding store synchronized. Both systems must represent the same document set, and updates (document additions, edits, or deletions) must propagate to both consistently. If a document exists in the lexical index but not in Milvus, or vice versa, hybrid retrieval may return incomplete or inconsistent results. Developers can address this by using shared document IDs and ensuring atomic updates across both stores. Embedding generation must also remain consistent: switching embedding models causes ranking drift unless every stored vector is recomputed with the new model.
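The shared-ID discipline described above can be sketched as follows. `HybridIndex` and its plain-dict stores are hypothetical stand-ins for a real BM25 index and a Milvus collection; the point is the sync logic, where every write and delete goes through one method keyed by the same document ID, and a failed second write rolls back the first.

```python
class HybridIndex:
    """Toy pair of stores kept in sync via shared document IDs."""

    def __init__(self):
        self.lexical = {}  # doc_id -> document text (stands in for a BM25 index)
        self.vectors = {}  # doc_id -> embedding (stands in for a Milvus collection)

    def upsert(self, doc_id, text, embedding):
        # Write both stores under the same ID; if the vector write fails,
        # undo the lexical write so neither store drifts ahead of the other.
        self.lexical[doc_id] = text
        try:
            self.vectors[doc_id] = embedding
        except Exception:
            del self.lexical[doc_id]
            raise

    def delete(self, doc_id):
        # Deletions must also hit both stores, or orphaned entries remain.
        self.lexical.pop(doc_id, None)
        self.vectors.pop(doc_id, None)

    def consistent(self):
        """True when both stores cover exactly the same document IDs."""
        return set(self.lexical) == set(self.vectors)
```

In a production system the rollback would be replaced by a transactional outbox or a reconciliation job, but the invariant checked by `consistent()` is the same: the two stores must agree on the document set.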
Finally, fusing lexical and embedding results introduces complexity in performance optimization and interpretability. Vector similarity searches, especially over large datasets, are computationally more expensive than inverted-index lookups, so developers must design efficient pipelines in which lexical search acts as a pre-filter that shrinks the vector search space. Interpretability is another concern: users can easily understand lexical results ("this word matched that document"), but explaining embedding-based matches is harder. Combining Milvus for semantic depth with lexical search for transparency requires careful tuning and UI design so that users can trust and understand why results appear. When managed properly, these challenges lead to hybrid systems that balance precision, performance, and explainability.
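The pre-filter pipeline described above can be sketched as follows. `lexical_prefilter`, `hybrid_search`, and the toy inverted index are illustrative constructs, not a Milvus API; in a real deployment the second stage would be a Milvus search restricted to the candidate IDs, while here cosine similarity is computed directly so the control flow stays visible.

```python
import math

def lexical_prefilter(query_terms, inverted_index, top_n=100):
    """Toy lexical stage: collect IDs of documents matching any query term.

    inverted_index maps term -> set of doc IDs; a real system would rank
    candidates with BM25 and keep only the top_n.
    """
    candidates = set()
    for term in query_terms:
        candidates |= inverted_index.get(term, set())
    return list(candidates)[:top_n]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hybrid_search(query_terms, query_vec, inverted_index, embeddings, top_k=10):
    """Score vectors only for the lexical candidates, not the full corpus."""
    candidates = lexical_prefilter(query_terms, inverted_index)
    scored = [(doc_id, cosine(query_vec, embeddings[doc_id])) for doc_id in candidates]
    return sorted(scored, key=lambda kv: kv[1], reverse=True)[:top_k]
```

Because the expensive similarity computation runs only over the lexical candidates, cost scales with the candidate count rather than the corpus size, and each result can be explained by the term match that admitted it.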
