Vector Search Latency and QPS at $1,000 Monthly Cost
Dataset: 1M Dataset
Streaming Performance
This chart benchmarks search performance on the Cohere-10M dataset under constant ingestion pressure. We measure p99 serial search latency and maximum concurrent QPS at 90% data capacity while the insertion workload remains active. These 'streaming' results are then contrasted with the 'static' performance on a fully indexed database.
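The two reported metrics can be illustrated with a minimal sketch. It assumes p99 is taken by the nearest-rank method over per-query latencies from a serial run, and that maximum QPS is the best throughput seen across several concurrency levels. The function names and the input layout are hypothetical, chosen for illustration; this is not the benchmark's actual implementation.

```python
from typing import Dict, Sequence, Tuple


def p99_latency(latencies_ms: Sequence[float]) -> float:
    """Nearest-rank 99th percentile of per-query latencies (assumed method)."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # 1-based nearest rank: ceil(0.99 * n), computed with integers
    # to avoid floating-point rounding surprises.
    rank = -(-99 * len(ordered) // 100)
    return ordered[rank - 1]


def qps(completed_queries: int, wall_seconds: float) -> float:
    """Throughput of one run: completed queries divided by wall-clock time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return completed_queries / wall_seconds


def max_qps(runs: Dict[int, Tuple[int, float]]) -> float:
    """Best throughput across concurrency levels.

    `runs` maps a concurrency level to (completed_queries, wall_seconds);
    this layout is a hypothetical one for the sketch.
    """
    if not runs:
        raise ValueError("need at least one run")
    return max(qps(done, secs) for done, secs in runs.values())
```

For a streaming measurement, the same functions would be applied to queries issued while the insertion workload is still running, and again to a fully indexed database for the static baseline.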
Plot value: QPS
Series: static; constant ingestion (500 rows/s); constant ingestion (1000 rows/s)
Systems (configuration): ZillizCloud (8cu-perf), Pinecone (p2.x8-1node), OpenSearch (16c128g), QdrantCloud (16c64g), Milvus (16c64g-sq8)
Performance vs. Recall
Dataset: 1M Dataset
Filtering Performance
Dataset: 1M Dataset