How does RAGFlow build knowledge graphs?

RAGFlow builds knowledge graphs by extracting entities from documents and mapping the relationships between them, which improves multi-hop reasoning and cross-document question answering. The feature, introduced in v0.9, applies GraphRAG principles to move beyond plain text-chunk retrieval toward explicit knowledge representation.

The process works as follows. After documents are chunked using RAGFlow's semantic chunking, an optional knowledge graph construction step extracts entities (people, organizations, concepts, events) and identifies relationships between them, such as "person X works at company Y" or "concept A influences concept B". The knowledge graph layer sits between the raw chunks and the retrieval index, creating explicit connections that vector similarity alone might miss.

Knowledge graphs are most useful for multi-hop questions that require reasoning across multiple documents or complex concept hierarchies. For example, "What products does a company owned by a person I read about last week make?" can only be answered by connecting documents through intermediate entities: the person, then the company, then its products. GraphRAG offers two query modes that suit different question types: Global Search answers holistic questions using community summaries, and Local Search answers entity-specific questions by navigating an entity's neighborhood in the graph.

Once the feature is enabled, graph construction runs automatically in RAGFlow's pipeline between extraction and indexing and requires no further configuration. The cost is indexing time, since graph construction is more computationally intensive than chunking alone, but the gain in retrieval accuracy for complex domains often justifies it. Knowledge graphs are especially valuable for research material, legal documents, organizational hierarchies, and scientific papers, where relationships between concepts are crucial. RAGFlow's visual workflow builder exposes knowledge graph construction as a configurable step, so you can enable or disable it and tune it per knowledge base.
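To make the multi-hop idea concrete, here is a minimal sketch in plain Python. It is not RAGFlow's API or internal schema: the triples, the chunk IDs, and the helper names `build_graph` and `follow` are all hypothetical, and a real pipeline would obtain the triples from an LLM-based extraction step rather than a hard-coded list. It only illustrates how explicit entity relationships let a query hop from a person to a company to its products while tracking the chunks that support each hop.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object, chunk_id) triples standing in for
# the output of an entity/relationship extraction step. Illustrative only.
TRIPLES = [
    ("Alice Chen", "owns", "Acme Robotics", "doc1#3"),
    ("Acme Robotics", "makes", "warehouse drones", "doc2#7"),
    ("Acme Robotics", "makes", "sorting arms", "doc2#8"),
    ("Bob Ortiz", "works_at", "Acme Robotics", "doc3#1"),
]


def build_graph(triples):
    """Index triples as an adjacency list keyed by subject entity."""
    graph = defaultdict(list)
    for subj, rel, obj, chunk_id in triples:
        graph[subj].append((rel, obj, chunk_id))
    return graph


def follow(graph, start, relations):
    """Follow a fixed sequence of relation types starting at `start`.

    Returns a list of (entity, supporting_chunk_ids) pairs, one per
    end point reached after traversing every relation in order.
    """
    frontier = [(start, [])]
    for rel in relations:
        next_frontier = []
        for entity, chunks in frontier:
            # .get avoids creating empty entries in the defaultdict
            for edge_rel, target, chunk_id in graph.get(entity, []):
                if edge_rel == rel:
                    next_frontier.append((target, chunks + [chunk_id]))
        frontier = next_frontier
    return frontier


graph = build_graph(TRIPLES)

# "What products does a company owned by Alice Chen make?"
results = follow(graph, "Alice Chen", ["owns", "makes"])
for product, chunks in results:
    print(product, "supported by", chunks)
```

The chunk IDs returned with each answer come from different documents, which is the kind of connection a pure vector-similarity lookup over isolated chunks could miss. This roughly corresponds to an entity-centered (local) style of search; community-summary (global) search is not sketched here.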

Teams building AI-powered search and retrieval systems can leverage Zilliz Cloud for managed vector database infrastructure that scales with their data. The underlying technology, Milvus, is also available as an open-source option.
