OpenEvidence Powers Medical AI with Zilliz Cloud

"We believe AI is becoming a meaningful support layer for physicians, but the experience has to feel trustworthy, reliable, and seamless. Building that kind of product requires a strong foundation behind the scenes. Zilliz Cloud has helped us create that foundation as we continue to grow and serve hundreds of thousands of clinicians."
Jagath Kumar
About OpenEvidence
OpenEvidence is an AI-powered clinical decision tool used by more than 800,000 verified clinicians across 10,000+ hospitals and health systems. The platform delivers evidence-based answers from leading medical sources, including the New England Journal of Medicine, JAMA, PubMed, the FDA, and the CDC, at the exact moment a physician needs them.
In March 2026 alone, OpenEvidence supported 25 million clinical consultations, and the number is still growing rapidly. Averaged over the 31-day month, that is roughly one consultation every 0.11 seconds (about nine per second), around the clock. At that scale, the retrieval layer is not just infrastructure; it is part of patient care.
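As a quick check of that arithmetic, here is a minimal sketch that assumes the 25 million consultations were spread evenly across the 31 days of March. That is an idealization; real clinical traffic peaks during working hours, so the peak rate is higher than this average.

```python
# Back-of-the-envelope average rate for 25 million consultations in March.
# Assumes perfectly uniform traffic, which real clinical usage is not.
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(consultations: int, days: int) -> tuple[float, float]:
    """Return (consultations per second, seconds between consultations)."""
    total_seconds = days * SECONDS_PER_DAY
    per_second = consultations / total_seconds
    return per_second, 1.0 / per_second


per_second, interval = average_rate(25_000_000, 31)
print(f"{per_second:.1f} consultations/s, one every {interval:.3f} s")
```

Under the uniform-traffic assumption this works out to about 9.3 consultations per second, or one every 0.107 seconds.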
| < 10ms | 4,000 QPS | 800,000+ | 25M+ | 99%+ |
|---|---|---|---|---|
| Search latency with high recall | Vector search traffic at peak load | Verified clinicians served with quality answers | Consultations every month | Search recall guaranteed |
The Challenge
As a clinical AI platform, OpenEvidence operates under requirements that go far beyond those of a typical consumer application. The system must deliver accurate, low-latency answers from heterogeneous medical corpora, handle high concurrency during peak usage, and maintain strict HIPAA compliance for the storage and transmission of sensitive medical data.
Before adopting Zilliz Cloud, OpenEvidence ran on an established vector search product. As its requirements for scale, performance, and operational flexibility grew, that system fell short in four areas:
| Area | Challenge |
|---|---|
| Latency | Query latency could not reliably meet the sub-10ms requirement for a real-time physician-facing product. |
| Scale | Traffic reached thousands of QPS, and the existing system could not scale without performance degradation. |
| Search Quality | High recall could not be maintained at scale without significant latency trade-offs, which is risky in clinical settings. |
| Compliance & Operations | Required HIPAA compliance and stable ingestion without index rebuild disruptions, both of which the previous system struggled to support. |
Why Zilliz Cloud
After evaluating alternatives, OpenEvidence chose Zilliz Cloud, the fully managed vector database service built on Milvus. Five factors drove the decision:
- Horizontal Scalability for High-QPS Workloads: Zilliz Cloud handles OpenEvidence's mission-critical workload at 4,000 QPS, serving more than 800,000 verified clinicians across 10,000+ hospitals.
- Ultra-Low Latency: Head-to-head benchmarking showed that Zilliz Cloud can serve 4,000 QPS at 10ms P50 latency and 50ms P99 latency, significantly better than the vector database OpenEvidence was using at the time.
- Search Quality at Scale: Zilliz Cloud maintains high recall without sacrificing speed. OpenEvidence needs clinically reliable retrieval, where a missed evidence match is not just a product failure but a patient safety risk. Using quantization and a tuned search strategy, Zilliz Cloud achieves 99% recall on an aggressively compressed index while serving a high-QPS, low-latency workload, unlike systems that trade accuracy for speed.
- HIPAA-Ready Out of the Box: Zilliz Cloud is HIPAA-ready. Using GCP Private Service Connect, OpenEvidence established a fully private data path to Zilliz Cloud and maintained HIPAA compliance.
- Fully Managed Operations and Unbeatable Developer Experience: Infrastructure scaling, high availability, and maintenance are fully managed by Zilliz Cloud. A 24/7 on-call team ensures production reliability around the clock. On the Business Critical plan, OpenEvidence receives dedicated engineering support for performance tuning, schema design, and production best practices.
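The P50 and P99 figures quoted in the list above are latency percentiles: half of all queries finish at or below the P50 value, and 99% finish at or below the P99 value. The sketch below shows one common way to compute them from a benchmark run, using the nearest-rank definition. The sample latencies are hypothetical and chosen only for illustration; they are not OpenEvidence or Zilliz Cloud measurements, and the benchmark described here may have used a different percentile method.

```python
import math


def percentile(samples_ms: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical per-query latencies in milliseconds (illustration only).
latencies_ms = [6.0, 7.5, 8.0, 8.2, 9.0, 9.4, 9.8, 10.0, 12.0, 48.0]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"P50 = {p50} ms, P99 = {p99} ms")
```

The single slow query in this sample barely moves P50 but fully determines P99, which is why tail percentiles are reported alongside the median for a real-time product.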
Results & Benefits
Sub-10ms latency with 99%+ recall
Zilliz Cloud consistently delivers sub-10ms latency with 99%+ recall. Each user query triggers multiple vector searches, yet responses return fast enough that the latency is not visible to clinicians.
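Recall here compares the approximate search results with the results an exact, brute-force search would return. The following is a minimal sketch of how recall@k is typically measured, assuming the exact top-k IDs for each query are already available as ground truth. The function names and the example IDs are hypothetical and are not part of any Zilliz Cloud or Milvus API.

```python
def recall_at_k(approx_ids: list[int], exact_ids: list[int], k: int) -> float:
    """Fraction of the exact top-k neighbours that the approximate
    search also returned within its own top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(exact_ids[:k])
    if not truth:
        raise ValueError("exact_ids must not be empty")
    found = truth & set(approx_ids[:k])
    return len(found) / len(truth)


def mean_recall(pairs: list[tuple[list[int], list[int]]], k: int) -> float:
    """Average recall@k over (approximate, exact) result pairs, one per query."""
    if not pairs:
        raise ValueError("need at least one query")
    return sum(recall_at_k(a, e, k) for a, e in pairs) / len(pairs)


# Hypothetical document IDs for two queries (illustration only).
queries = [
    ([1, 2, 3, 5, 9], [1, 2, 3, 4, 5]),       # misses ID 4: recall 0.8
    ([7, 8, 10, 11, 12], [7, 8, 10, 11, 12]),  # perfect match: recall 1.0
]
print(f"mean recall@5 = {mean_recall(queries, 5):.2f}")
```

A 99%+ recall figure means that, averaged over a test set of queries, fewer than one in a hundred of the true nearest neighbours is missing from the approximate results.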
Scales to production clinical workloads
The system supports roughly 4,000 QPS and serves more than 800,000 clinicians. This level of scale is required to handle millions of consultations per month without degradation.
HIPAA-compliant deployment
Using GCP Private Service Connect, OpenEvidence maintains a fully private data path between application and database layers, meeting HIPAA requirements for sensitive medical data.
Reduced operational overhead
With Zilliz Cloud fully managing scaling, availability, and maintenance, the engineering team can focus on product development instead of infrastructure.