RAGFlow supports configurable embedding models, letting you select the best option for your data type, language, latency, and cost constraints. OpenAI embeddings (text-embedding-3-small, text-embedding-3-large) are popular for their quality and multilingual support, but require API calls and incur per-token costs. Ollama enables local embedding inference with open-source models (nomic-embed-text, all-minilm), eliminating API dependencies and costs for privacy-sensitive deployments. You can also integrate any custom embedding service by configuring HTTP endpoints in RAGFlow's service_conf.yaml, which supports proprietary models your organization develops or fine-tunes. Popular open-source options include Sentence Transformers (fast, offline), multilingual models (mBERT, XLM-RoBERTa for cross-lingual retrieval), and domain-specific embeddings fine-tuned for legal, medical, or technical documents.

The embedding model choice directly impacts retrieval quality and speed: larger models (OpenAI text-embedding-3-large) generally produce higher-quality embeddings but are slower and more expensive, while smaller models (text-embedding-3-small, all-minilm) are faster and cheaper but may sacrifice quality. RAGFlow's no-code interface lets you configure embeddings through the UI without editing code, and its programmatic APIs support model selection per knowledge base, so different document types can use different embeddings. Multimodal embeddings supporting text and images can be integrated for knowledge bases containing both modalities.

For production deployments, this flexibility means you can start with a cost-effective model, measure retrieval quality, then upgrade to premium models for critical use cases. Because the embedding configuration is decoupled from other RAG components, you can swap models without changing chunking, search, or re-ranking logic. This modularity makes it easy to experiment and optimize the quality/cost/latency tradeoff for your specific requirements.
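As a minimal sketch of the local-inference path described above, the helper below requests an embedding from a locally running Ollama server (default port 11434) via its embeddings API, and pairs it with a small `cosine_similarity` helper you might use to compare how two candidate models score the same query/chunk pair. The model name, host, and helper names here are illustrative assumptions, not RAGFlow internals; verify the endpoint shape against your Ollama version.

```python
import json
import math
import urllib.request

def embed_with_ollama(text, model="nomic-embed-text", host="http://localhost:11434"):
    """Fetch an embedding from a local Ollama server (assumes the model
    has already been pulled, e.g. `ollama pull nomic-embed-text`)."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

To gauge whether a cheaper model is "good enough," you could embed a sample of queries and their known-relevant chunks with each candidate model and compare the similarity rankings they produce before committing to one per knowledge base.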
Teams building AI-powered search and retrieval systems can use Zilliz Cloud for managed vector database infrastructure that scales with their data; the underlying engine, Milvus, is also available as an open-source option.
