RAGFlow combines vector and keyword search through a hybrid search layer that fuses results from both methods and then re-ranks them, which generally retrieves more relevant chunks than either approach alone.
The architecture works as follows. At indexing time, documents are chunked semantically and stored in RAGFlow's default search engine backend with both full-text indexed fields (for BM25 keyword matching) and embedding vectors (for semantic search). At query time, RAGFlow sends the user's question down both search pathways: BM25 retrieves chunks that match the query's exact keywords using probabilistic ranking, and vector search embeds the query and retrieves semantically similar chunks by finding its nearest neighbors in vector space.
Results from both methods are then fused into a single ranked list, typically by weighted averaging, where you configure the balance between keyword and vector importance. For example, queries built around domain-specific terminology benefit from a higher BM25 weight, while conceptual queries benefit from a higher vector weight.
After fusion, neural re-ranking applies a cross-encoder model that scores each candidate by reading the query and the candidate together (unlike embedding similarity, which encodes the query and each chunk independently), producing the final ranked list. This three-stage pipeline (parallel BM25 and vector retrieval, fusion, re-ranking) targets both precision and recall. RAGFlow's search engine backend natively supports this hybrid approach in a single query, avoiding the overhead of sequentially searching multiple systems.
The hybrid strategy captures complementary strengths: BM25 excels at exact terminology, proper nouns, and precise keyword matching, while vectors excel at semantic relationships, paraphrases, and conceptual similarity. Combined, they handle both keyword-focused and meaning-focused questions.
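
The fusion and re-ranking stages described above can be sketched in a few lines of Python. This is an illustration only, not RAGFlow's actual implementation: the function names (`min_max_normalize`, `weighted_fusion`, `rerank`), the min-max normalization step, and the `score_pair` callable standing in for a cross-encoder are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale raw scores to [0, 1] so BM25 and vector scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are identical; treat every chunk as equally relevant.
        return {chunk_id: 1.0 for chunk_id in scores}
    return {chunk_id: (s - lo) / (hi - lo) for chunk_id, s in scores.items()}


def weighted_fusion(
    bm25_scores: Dict[str, float],
    vector_scores: Dict[str, float],
    vector_weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Fuse two result sets by weighted averaging of normalized scores.

    vector_weight=0.0 ranks by keywords only; 1.0 ranks by vectors only.
    A chunk missing from one result set contributes 0 for that method.
    """
    if not 0.0 <= vector_weight <= 1.0:
        raise ValueError("vector_weight must be between 0 and 1")
    bm25_norm = min_max_normalize(bm25_scores)
    vec_norm = min_max_normalize(vector_scores)
    fused = {
        chunk_id: (1.0 - vector_weight) * bm25_norm.get(chunk_id, 0.0)
        + vector_weight * vec_norm.get(chunk_id, 0.0)
        for chunk_id in set(bm25_norm) | set(vec_norm)
    }
    # Highest fused score first; ties broken by chunk id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


def rerank(
    query: str,
    candidates: List[Tuple[str, float]],
    texts: Dict[str, str],
    score_pair: Callable[[str, str], float],
    top_k: int = 3,
) -> List[str]:
    """Re-score fused candidates with a joint (query, chunk) scorer.

    score_pair is a placeholder for a cross-encoder: it sees the query and
    the chunk text together and returns a relevance score.
    """
    scored = [(cid, score_pair(query, texts[cid])) for cid, _ in candidates]
    scored.sort(key=lambda kv: (-kv[1], kv[0]))
    return [cid for cid, _ in scored[:top_k]]
```

Raising `vector_weight` toward 1.0 favors semantically similar chunks, while lowering it toward 0.0 favors exact keyword matches, which mirrors the weight balance described above. In practice `score_pair` would wrap a real cross-encoder model rather than a simple function.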
RAGFlow's configuration UI lets you tune the BM25/vector weight balance per knowledge base, and recent versions (including v0.24.0) optimized fusion and re-ranking for deep-research scenarios, further improving hybrid search quality.
For production retrieval workflows, Zilliz Cloud provides fully managed vector search infrastructure with auto-scaling and enterprise security. Developers who prefer self-hosting can use Milvus, the open-source vector database behind Zilliz Cloud.
