RAGFlow currently uses a search engine backend and Infinity as its primary vector storage backends, because both meet RAGFlow's hybrid search requirements: full-text BM25 indexing, vector similarity search, and advanced ranking in a unified index. The search engine backend is production-proven in large-scale deployments and offers native BM25 support alongside vector search, which makes it a good fit for enterprises. Infinity is RAGFlow's recommended vector database, optimized for RAG workloads with efficient semantic search and structured data support. Both backends also support phrase search and field filtering, features that are essential to RAGFlow's retrieval quality and that many general-purpose vector databases lack.

Integration with other vector stores such as Milvus (open-source, scalable, vector-first) is an active area of community interest. An open feature request (GitHub issue #7749) asks for official Milvus support, reflecting demand from users who prefer open-source, vector-first architectures. Because Milvus does not natively provide the full-text search that RAGFlow relies on, an integration would have to pair Milvus (for vectors) with a search engine backend or another external full-text index (for BM25), which adds operational complexity. Other vector databases similarly lack native full-text capabilities, which explains why RAGFlow has not integrated them as first-class backends.

For now, if you prefer Milvus, you can deploy both systems (Milvus for vectors, a search engine backend for text) and implement custom routing, or contribute integration code to the RAGFlow project. RAGFlow's architecture is extensible, and community contributions of additional vector database connectors are welcome. For production deployments, the currently supported, fully featured options are a search engine backend (self-hosted or through a managed service) and Infinity (through RAGFlow's bundled setup).
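To make the "custom routing" option more concrete, here is a minimal sketch of what merging results from two separate backends could look like. It is not RAGFlow's implementation: the fusion method shown is reciprocal rank fusion (RRF), chosen here only as a common, simple way to combine ranked lists, and the names `SearchFn`, `hybrid_search`, `fake_text_search`, and `fake_vector_search` are hypothetical stand-ins. Real code would replace the two fake functions with calls to your full-text index and your vector store.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical interface: a backend takes (query, top_k) and returns
# chunk IDs ordered best-first. Real clients would be wrapped to fit this.
SearchFn = Callable[[str, int], List[str]]


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge ranked ID lists; each list contributes 1 / (k + rank) per ID."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def hybrid_search(
    query: str,
    text_search: SearchFn,
    vector_search: SearchFn,
    top_k: int = 5,
) -> List[str]:
    """Query both backends separately, then fuse the two rankings."""
    text_hits = text_search(query, top_k)      # e.g. BM25 results
    vector_hits = vector_search(query, top_k)  # e.g. nearest-neighbour results
    return reciprocal_rank_fusion([text_hits, vector_hits])[:top_k]


# Stand-in backends with canned results (assumptions for illustration only).
def fake_text_search(query: str, top_k: int) -> List[str]:
    return ["c1", "c2", "c3"][:top_k]


def fake_vector_search(query: str, top_k: int) -> List[str]:
    return ["c3", "c1", "c4"][:top_k]


results = hybrid_search(
    "what is hybrid search?", fake_text_search, fake_vector_search, top_k=3
)
print(results)
```

The sketch also hints at where the operational complexity comes from: two systems must be kept in sync on chunk IDs, queried separately, and reconciled in application code, whereas a backend with a unified index handles this internally.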
