Balancing Precision and Performance: How Zilliz Cloud's New Parameters Help You Optimize Vector Search

Introduction
Over the past few years, large language models (LLMs) and vector databases have become the backbone of countless AI-powered applications, from e-commerce recommendations to facial recognition and RAG systems. However, these diverse use cases share a common challenge: how do you balance search accuracy (recall rate) against performance (latency and throughput)?
This is precisely why we're excited to introduce two powerful new features in the latest release of Zilliz Cloud, our fully managed vector database built on Milvus:
The level parameter - A simple yet powerful knob to fine-tune search accuracy
The enable_recall_calculation parameter - A built-in tool to estimate and validate recall rates
These additions empower developers to find the perfect balance for their specific use cases—whether you need lightning-fast recommendations or highly accurate security applications. In this blog, we'll show you exactly how to leverage these features to optimize your vector search implementations.
Different Use Cases, Different Requirements
As vector search proliferates across industries, we've observed that requirements for recall, latency, and queries per second (QPS) vary dramatically between applications. Let's take a look at two contrasting examples:
Recommendation Systems: Speed over Perfect Accuracy
Recommendation systems filter vast content libraries to suggest relevant items based on user preferences. In these systems, recall rate isn't the top priority. While recommendations should be relevant, introducing some variety often enhances user discovery and engagement.
Instead, these systems must handle thousands of concurrent requests in real-time, requiring:
High QPS to serve many users simultaneously
Very low latency for responsive user experiences
Moderate recall (85-95%) with tolerance for some imperfect matches
The business impact of slow recommendations usually outweighs occasional imperfect suggestions, making performance optimization critical.
Facial Recognition: Accuracy is Non-Negotiable
Facial recognition systems, especially in security contexts, have entirely different requirements. They must accurately identify authorized users to prevent both false positives (security breaches) and false negatives (legitimate users being denied).
These systems need:
Very high recall (99%+) for accurate identification
Tolerance for moderate latency (users accept a short verification delay)
Lower QPS demands (verification is a relatively infrequent task)
The consequences of misidentification are significant, making accuracy the non-negotiable priority even at the cost of some performance.
The Common Thread: Finding Your Balance
These contrasting examples highlight why a one-size-fits-all approach to vector search configuration falls short. Every application sits somewhere on this spectrum, requiring thoughtful optimization of recall, latency, and QPS based on business needs.
This is precisely why we've introduced the level and enable_recall_calculation parameters: to give developers the tools to find their optimal balance.
Introducing Zilliz Cloud's Precision Tuning Features
The level Parameter: Fine-Tuning Search Accuracy
The level parameter offers simple yet powerful control over search accuracy, with values ranging from 1 to 10:
Level Value | Ideal For | Typical Recall | Performance Impact |
---|---|---|---|
Lower (1-3) | Performance-focused applications | 90-97% | Minimal latency impact, highest QPS |
Medium (4-7) | Balanced applications | 97-99.5% | Moderate latency impact, good QPS |
Higher (8-10) | Accuracy-critical applications | 99.5%+ | Higher latency, reduced QPS |
We've extended the upper limit from 5 to 10 based on user feedback, enabling even higher precision for security, risk control, and many other accuracy-demanding scenarios.
Please note that a higher recall rate isn't always better for your use case. For example, if level=3 or level=5 already satisfies your needs, increasing the level further only adds unnecessary resource usage and latency.
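As a minimal sketch of how the level knob might be wired into application code, the helper below builds the search_params dictionary used later in this post and rejects values outside the 1 to 10 range described above. The function name build_search_params and its estimate_recall argument are hypothetical and not part of PyMilvus.

```python
def build_search_params(level: int = 1, estimate_recall: bool = False) -> dict:
    """Build a search_params dict for a search request (hypothetical helper)."""
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(level, bool) or not isinstance(level, int) or not 1 <= level <= 10:
        raise ValueError(f"level must be an integer from 1 to 10, got {level!r}")
    params = {"level": level}
    if estimate_recall:
        # Only add the flag when a recall estimate is wanted
        params["enable_recall_calculation"] = True
    return {"params": params}


print(build_search_params(7, estimate_recall=True))
# {'params': {'level': 7, 'enable_recall_calculation': True}}
```

Validating the value in one place keeps an out-of-range level from reaching the search call unnoticed.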
The enable_recall_calculation Parameter: Validating Your Accuracy
While the level parameter adjusts search accuracy, how do you verify you're hitting your target recall rate? That's where enable_recall_calculation comes in.
When enabled during a search operation, this parameter:
Estimates the actual recall rate of your current configuration
Returns this value alongside your search results
Enables data-driven decisions about configuration changes
This one-time calculation helps you validate whether your current settings meet your accuracy requirements without requiring external benchmarking tools.
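For intuition about what a recall rate measures, here is the standard recall@k definition in a few lines: the fraction of the true nearest neighbors that an approximate search actually returned. This is only an illustration with made-up IDs; how Zilliz Cloud computes its estimate internally is not covered in this post.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k results that also appear in the approximate top-k."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    hits = len(exact & set(approx_ids))
    return hits / len(exact)


# Made-up IDs: the approximate search found 4 of the 5 true neighbors
exact_top5 = [11, 42, 7, 93, 58]
approx_top5 = [11, 42, 7, 93, 64]
print(recall_at_k(approx_top5, exact_top5))  # 0.8
```

A reported recall of 0.9887 with TopK 16,000 therefore means that roughly 1.1% of the true top-16,000 neighbors were missing from the result.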
Practical Implementation: A Step-by-Step Guide
Let's walk through a real-world example of optimizing a vector search implementation with these new features.
Prerequisites
A Zilliz Cloud instance (sign up if you haven't already)
PyMilvus 2.5.4 or later
Your test dataset (we'll use the cohere-1M dataset in this example)
Step 1: Define Your Requirements
Before diving into parameter tuning, clearly define your requirements:
Target recall rate: 99.5%
Maximum acceptable latency: 5ms
Other considerations: QPS, resource utilization
Query pattern: Pure vector search
TopK: 16,000
Step 2: Import Test Data
For accurate testing, you should use data that closely resembles your production environment. In this example, we'll use the cohere-1M dataset and import it into Zilliz Cloud via VectorDBBench.
Step 3: Estimate the Baseline Recall Rate
Let's first check how the default settings (level=1) perform:
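from pymilvus import MilvusClient

# Connect to your Zilliz Cloud cluster.
# The uri and token values are placeholders: replace them with your own endpoint and API key.
client = MilvusClient(
    uri="YOUR_CLUSTER_ENDPOINT",
    token="YOUR_API_KEY"
)

# Search at the default level and ask Zilliz Cloud to estimate the recall rate
search_params = {
    "params": {
        "level": 1,
        "enable_recall_calculation": True
    }
}
res = client.search(
    collection_name = "ZillizCloudVectorDBBench",
    # test data
    data = [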
[0.22252834,0.26758388,0.3414864,0.31775144,0.25819996,-0.06176423,0.60313016,-0.31930527,-0.05070293,0.80085576,-0.7066278,-0.14704825,0.07324219,-0.051405758,0.24823247,0.20365287,-0.005265507,0.24754052,0.06302843,-0.24397966,-0.2805658,0.543768,0.018544307,0.14154078,0.03093845,-0.25058296,-0.61569184,0.08389459,-0.27519965,0.2121497,0.26527727,-0.03291658,-0.2631627,0.026973603,-0.22165383,0.38862047,0.012616088,-0.066382475,-0.013436819,-0.59001106,-0.08682751,0.13704056,0.08583454,0.0802483,0.01096984,0.20214474,0.11156094,0.5482859,-0.0807617,-0.16539982,-0.29261217,0.08269717,0.03385099,-0.48874223,-0.013168857,0.01616468,-0.6270225,0.13169415,1.0166928,0.6573267,0.40487188,0.2235163,-0.68331105,-0.24911362,-0.1763628,0.34692895,0.077760294,0.96388775,-0.10841275,0.3977706,0.08965021,0.29019687,0.106024966,0.11854912,-0.070255764,-0.24960922,-0.5354312,-0.70186704,0.25364435,0.43369204,0.42516047,0.15078346,-0.35151976,0.4886603,0.37026608,0.39485633,-0.046821307,0.18807457,0.13850202,-0.15630293,0.1469321,0.13219471,-0.31201053,0.22278261,-0.23063508,0.42379102,0.66762435,-0.11903275,0.22101034,-0.10102379,0.21675844,-0.2571628,-0.26546052,0.6185626,0.241754,-0.15792729,-0.37045696,0.23996626,-0.27011713,-0.11512929,0.31577003,0.11118326,-0.76019096,-0.1682175,0.5329204,0.3614348,-0.2348311,0.108711846,0.068240434,-1.3124024,0.20385003,0.6049856,-0.2924265,-0.07293733,0.5181119,0.15742214,0.7537572,-0.656809,0.57201636,0.09775318,0.5414663,-0.53804034,-0.0802571,0.62367904,0.023681821,0.6950041,0.3207407,0.36089638,0.53875273,-0.7288189,-0.12956187,0.15076943,0.057313107,0.41065657,-0.00928412,0.090415776,0.18091775,-0.010793066,-0.010142676,-0.20986095,-0.3740831,0.25086942,-0.31494206,0.17761512,0.04850758,-0.06098805,0.7605751,0.3038707,0.5178377,-0.5769507,0.80365956,0.22879237,-0.34868854,-0.2688102,-0.20910782,0.3392469,0.2990533,0.5502763,0.6561665,0.04177933,-0.45408615,-0.055697974,0.05596695,-0.22720425,0.63778013,-0.1921504,-0.16227654,-0.053658
817,0.04536426,-0.28570235,0.30350783,0.5217574,0.0025516534,0.10135456,0.5973671,0.09276529,0.7803261,0.45648357,-0.21722879,-0.3496141,0.18574907,0.1729008,0.6754883,-0.5101994,0.16308193,-0.32053986,-0.0013728795,0.0371755,-0.114131995,0.19870742,0.3973309,0.17016156,0.016581284,0.3074155,0.32889378,-0.6682561,0.36933577,0.45571226,-0.19315217,-0.5065343,0.15996625,0.026897952,0.046443015,0.2667398,-0.18946062,-0.4283052,-0.44281873,-0.24062063,0.41703427,-0.30064407,0.35975343,0.31060407,0.18125875,0.14912511,0.10962614,-0.06901708,-0.2846222,-0.027887726,0.037055127,0.031954445,0.56672156,-0.0863331,0.1497875,-0.1635759,-0.25121027,0.6303942,0.17385906,0.4313834,0.15800661,-0.6267578,-0.03539913,0.32520285,0.42759246,0.24401832,0.2115575,-0.8652025,0.13317755,-0.5719402,0.17294376,-0.12595764,0.34818307,0.24802469,0.05904272,0.1538172,-0.57994705,0.2582915,0.45511153,-0.44164076,-0.074042775,0.04943926,-0.1648779,0.3280813,0.5601293,-0.0018850226,-0.140464,-0.07845455,0.44026145,0.56197315,0.1462102,-0.18595229,0.014953136,0.46956787,-0.14819877,0.1859354,0.019512085,-0.01712815,0.5366789,0.7835224,-0.7423546,0.6503855,0.44647282,0.3631722,-0.66614413,0.10151727,-0.1348695,0.32992417,0.10387001,-0.26746857,-0.33413792,-0.5662058,0.36110422,0.7741211,-0.039930806,-0.15249825,0.09454683,0.4891987,0.0062028184,0.06745152,0.55465925,-0.06739082,0.5588079,-0.43696547,0.555966,0.56702,0.056295626,-0.62005293,-0.3722073,0.21030217,-0.017268468,0.95288086,0.51696795,-0.25066343,-0.3169728,0.42543235,0.31396082,0.17551036,0.3922707,0.07407632,0.91187936,0.38888615,-0.12070266,0.011815081,-0.45720986,0.04727247,0.62094855,-0.45443076,0.16062841,-0.40287957,-0.55417335,-0.3830013,0.055438586,0.1718703,-0.6422826,0.22917171,-0.5290951,0.1585279,0.07934802,0.50577295,-0.035466444,0.088082545,0.5693788,0.11773129,0.1821725,0.41347143,-0.2278498,0.50422746,0.29794943,-0.9369089,0.47065943,0.28594327,-0.6866015,-0.63375616,-0.15243755,-0.46409172,-0.4630497,-0.25025153,0.63758
35,0.54886156,0.19831929,-0.03725618,0.20592122,0.36338213,0.31409082,-0.05410012,0.14887711,0.09740482,0.05067692,0.14775206,0.28025475,-0.34377113,-0.27423778,-0.354568,-0.20043556,0.3899774,-0.19812085,-0.36292082,-0.18255037,0.07038237,0.642794,-0.060884897,0.2948623,0.68766963,0.6928454,-0.3849391,0.17996079,0.12549743,0.10299729,-0.25861496,-0.09246836,-0.32353002,-0.01378604,-0.095313616,-0.04558251,0.20014873,-0.4066689,-0.08052519,-0.4618455,0.37693843,0.45283204,-0.114583425,0.050728872,0.13196129,-0.1941961,-0.11727777,3.9586966,0.05150596,0.11701303,0.5739518,0.07567582,0.48247826,0.25156844,0.38180268,0.12796494,0.009531708,-0.04081167,-0.30954623,-0.035167653,0.43064785,0.24091315,-0.11113215,0.027972942,0.3501582,0.54151994,0.14281327,-0.6420307,0.48611403,0.5221564,0.47878447,0.8510151,0.5528693,0.27463847,0.7548287,0.760392,0.4057206,0.5247366,0.6815815,0.46189928,-0.0665814,0.29575244,-0.13240056,0.44400057,-0.26975283,-0.15510945,0.15475176,-0.46221858,0.054507546,0.14640503,0.66453534,0.19300742,-0.3626597,-0.16279799,0.3795997,0.122737944,-0.20419496,0.18285695,0.027228406,-0.22584598,-0.16478994,0.28747237,0.53937024,0.44095138,0.6340223,-0.41380823,0.38367343,0.39497304,-0.043954037,0.38885015,-0.33315817,-0.4766579,0.17371525,-0.23392603,0.7948543,0.3054392,-0.72041094,0.2532946,0.415873,0.80443436,-0.34634262,-0.4886025,0.30351955,-0.049782824,-0.47253707,-0.11401102,-0.096243046,0.19083612,-0.34427363,-0.24545296,0.5773733,0.16357873,0.38620606,0.39995435,-0.65907687,0.6957725,0.24120355,0.34054404,-0.039899644,0.80393964,0.06337182,0.14144897,0.117613785,-0.019442292,-3.7490542,-0.38971332,0.14894387,-0.61240107,0.19039957,0.23817067,0.022639165,0.015894404,-0.6198486,0.14320132,0.041371442,-0.30882874,-0.30676636,1.0463533,-0.034157425,0.31748047,0.4891939,0.5333419,-0.3289819,0.14962271,0.2807266,0.35519713,0.4001028,-0.18559772,-0.7066097,0.14664957,-0.565848,0.013109448,-0.18452193,-0.07372118,0.28156808,-0.36035228,0.8867393,-0.163066
67,-0.04191513,0.4594507,0.43135175,-0.091903865,-0.042651527,-0.32555583,-0.19054003,-0.06525034,0.16911364,0.04686202,-0.038171988,0.1336097,0.3761719,-0.050084345,-0.2679286,0.64759475,0.7107872,0.2074471,-0.27312976,0.40090975,0.5491712,-0.10747743,0.74496686,0.18130445,-0.09431538,0.19524746,-0.21418755,-0.12488151,0.15227054,-0.3852693,-0.7784234,-0.14571632,0.041122317,-0.16407914,0.03949264,-0.1925929,0.32901394,0.12069722,0.23391949,-0.16763128,-0.12962814,0.5088096,0.21486548,-0.20993523,0.603585,0.24633685,-0.14029086,-0.27401388,-0.49189645,-0.10249644,2.3196032,-0.12417316,2.1448603,0.058190174,-0.6551869,0.6827868,0.6356786,0.7710372,0.3722568,0.8363127,0.3799041,0.26085538,-0.20764771,0.512162,0.08349497,-0.15835808,0.5738307,-0.66654295,0.18993358,0.32188657,0.0764867,0.64592606,-0.2310478,0.18350935,-0.3915338,0.028645294,-0.101273224,0.8696747,-0.50792813,-0.39119712,-0.30162883,0.7319297,0.71813834,0.39383802,-0.012138247,0.3298783,0.23386809,4.5470805,-0.049004212,0.107414484,0.052308656,0.2678271,-0.15366946,0.5438965,-0.47809094,-0.14649442,0.022792917,0.1358324,-0.4503206,0.57014287,-0.13368002,0.23805767,-0.22125027,-0.08700341,0.045676652,0.16678812,-0.27974084,0.45245427,-0.2107062,0.6667994,0.036875203,0.54632777,0.20104687,0.5349449,0.06913179,-0.086024776,0.76876926,0.16203642,5.155099,0.2797164,0.21450946,-0.17529553,-0.038863413,0.5156995,0.08603405,-0.516439,-0.35604522,0.10131945,0.008194211,0.084706515,-0.34049395,0.21572115,-0.83385843,-0.046860088,-0.48247585,0.023293016,0.22008015,-0.5305121,0.5061096,0.0183293,0.1326365,0.22057603,-0.43027383,-0.3885953,0.1500542,-0.1449458,-0.38747045,0.2789606,0.27069542,-0.37978157,-0.58541,0.5139468,-0.60586643,-0.5236463,0.22003366,0.15764758,0.3512009,0.13694952,0.7772281,0.28431293,0.113065295,0.14233269,-0.047996823,0.0024461043,0.06218189,-0.28065726,-0.2061346,-0.36278206,0.24291486,-0.0869041,0.7448049,0.36513415,0.61559093,0.42820337,0.41123256,-0.32082868,-0.10876272,-0.028618973,0.
6750199,-0.048880983,-0.12521495,0.1926665,0.6695621,0.21937566,0.46856737,0.30544627,0.2650348,-0.11578811,-0.15696093,-0.047148716,0.19283816,0.12149068,-0.03274016,0.021503512,0.008024155,0.19709297,0.15727529,0.14134975,-0.16997191,-0.063695885,-0.39591065,-0.11891319,-0.04673462,0.16978487,-0.09345571,0.11924938,0.13301763,-0.2266567,0.4164705,0.3571622,0.09038913,0.18044233,0.09119875,-0.23754075,0.45051736,0.35435763,0.20957275,0.5704436,-0.36682,0.26963162,0.15532929,-0.24306794,0.17486432,0.39116114,0.12234816,0.21448524,-0.019066956,-0.09756305,0.4465544,0.3394048,-0.7088385,-0.5032021,0.03529406]
    ],
    limit = 16000,
    search_params = search_params
)
print(f"recall: {res.recalls}")
The output is:
recall: [0.9886875152587891]
With a recall of 98.87%, we're below our target of 99.5%. Let's measure latency too (optional):
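import time

# `data` is the same one-element list of query vectors used in the previous search
limit = 16000
search_params = {
    "params": {
        "level": 1,
        # Recall estimation is left disabled here so that only the search itself is timed
        # "enable_recall_calculation": True
    }
}
start_time = time.perf_counter()
client.search(
    collection_name = "ZillizCloudVectorDBBench",
    data = data,
    limit = limit,
    search_params = search_params
)
end_time = time.perf_counter()
elapsed_time = end_time - start_time
print(f"latency at level=1: {elapsed_time:.2f} seconds")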
And the output is:
latency at level=1: 0.03 seconds
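A single timed call is noisy, and a client-side timer also includes the network round trip. One possible refinement, sketched below, is to repeat the query several times and report the median. The helper name measure_latency_ms and the repeat counts are assumptions for illustration, and a stand-in workload replaces the real search so the sketch runs without a cluster.

```python
import statistics
import time


def measure_latency_ms(run_query, repeats=20, warmup=2):
    """Time run_query() several times and summarize the latency in milliseconds."""
    for _ in range(warmup):
        run_query()  # warm-up calls are executed but not recorded
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


# Stand-in workload; against a real cluster this would be something like
# lambda: client.search(collection_name=..., data=data, limit=limit, search_params=search_params)
stats = measure_latency_ms(lambda: sum(range(10_000)))
print(stats)
```

The median is less sensitive to a single slow request than a one-off measurement or the mean.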
Step 4: Find the Optimal Level Value
Now, let's systematically test different level values to find the optimal configuration:
recall at level=1: [0.9886875152587891]
latency at level=1: 0.03 seconds
...
recall at level=6: [0.9947500228881836]
latency at level=6: 0.04 seconds
recall at level=7: [0.9961249828338623]
latency at level=7: 0.04 seconds
...
recall at level=10: [1.0]
latency at level=10: 0.06 seconds
The experiments show that level=7 meets our recall target (99.61% > 99.5%) while keeping latency within acceptable limits.
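The output above could be produced by a loop along the following lines. This is a sketch rather than the exact script used for this post: sweep_levels is a hypothetical helper, and it assumes a client whose search method accepts the same arguments and returns an object with a recalls list, as in Step 3.

```python
import time


def sweep_levels(client, collection_name, data, limit, levels=range(1, 11)):
    """Return a list of (level, estimated_recall, latency_seconds) tuples."""
    results = []
    for level in levels:
        # First search: request the recall estimate for this level
        res = client.search(
            collection_name=collection_name,
            data=data,
            limit=limit,
            search_params={"params": {"level": level, "enable_recall_calculation": True}},
        )
        recall = res.recalls[0]

        # Second search: time the query with recall estimation switched off
        start = time.perf_counter()
        client.search(
            collection_name=collection_name,
            data=data,
            limit=limit,
            search_params={"params": {"level": level}},
        )
        latency = time.perf_counter() - start

        results.append((level, recall, latency))
    return results


# Usage against a real cluster (requires the client and data from Step 3):
# for level, recall, latency in sweep_levels(client, "ZillizCloudVectorDBBench", data, 16000):
#     print(f"recall at level={level}: {recall}")
#     print(f"latency at level={level}: {latency:.2f} seconds")
```

Running the recall estimate and the timed search as separate calls mirrors the two-step approach used in Step 3.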
Step 5: Monitor System Performance (Optional)
Zilliz Cloud offers a comprehensive set of system metrics to help you monitor performance effectively. When adjusting the level parameter and applying new settings, Zilliz Cloud provides real-time tracking of key metrics like QPS and latency. This allows you to assess the impact of parameter adjustments on overall system performance, enabling more informed decision-making.
Step 6: Validate with Production Workload
For production validation, we recommend:
Running tests with representative query vectors
Monitoring the Zilliz Cloud dashboard for:
Average latency
QPS
Resource utilization
Step 7: Conduct Comprehensive Benchmarking (Optional)
For a more rigorous evaluation, we used VectorDBBench, an open-source benchmarking tool, to test various configurations with 1,000 search operations:
level | Avg latency (ms) | QPS | Recall from VectorDBBench | Recall from Zilliz Cloud |
---|---|---|---|---|
1 | 3.1 | 1266 | 0.9519 | 0.953 |
2 | 3.2 | 1080 | 0.9644 | 0.9669 |
3 | 3.4 | 972 | 0.9728 | 0.9755 |
4 | 3.6 | 814 | 0.9816 | 0.9846 |
5 | 3.9 | 704 | 0.9871 | 0.99 |
6 | 4.4 | 549 | 0.9916 | 0.995 |
7 | 5 | 448 | 0.9936 | 0.9971 |
8 | 5.4 | 375 | 0.9945 | 0.9983 |
9 | 5.9 | 340 | 0.9952 | 0.9991 |
10 | 6.3 | 296 | 0.9958 | 0.9995 |
Table 1: Test results based on 1,000 search operations from VectorDBBench
These benchmarking results confirm that:
The enable_recall_calculation estimates closely match actual recall performance
Higher recall comes at the cost of increased latency and reduced QPS
For our target requirements, level=7 provides the optimal balance
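How closely the estimates match can be quantified directly from Table 1. The short script below copies the two recall columns from the table and reports the largest gap between the Zilliz Cloud estimate and the VectorDBBench measurement; the variable names are illustrative only.

```python
# (level, recall measured by VectorDBBench, recall estimated by Zilliz Cloud), copied from Table 1
table_1 = [
    (1, 0.9519, 0.953),
    (2, 0.9644, 0.9669),
    (3, 0.9728, 0.9755),
    (4, 0.9816, 0.9846),
    (5, 0.9871, 0.99),
    (6, 0.9916, 0.995),
    (7, 0.9936, 0.9971),
    (8, 0.9945, 0.9983),
    (9, 0.9952, 0.9991),
    (10, 0.9958, 0.9995),
]

# Signed gap per level: positive means the estimate is above the benchmark value
gaps = {level: estimated - measured for level, measured, estimated in table_1}
worst_level = max(gaps, key=lambda lv: abs(gaps[lv]))

print(f"largest gap: {abs(gaps[worst_level]):.4f} at level={worst_level}")
print(f"estimate >= benchmark at every level: {all(g >= 0 for g in gaps.values())}")
# largest gap: 0.0039 at level=9
# estimate >= benchmark at every level: True
```

In this dataset the estimate stays within about 0.4 percentage points of the benchmark and is slightly higher at every level, which is worth keeping in mind when a target sits very close to the measured value.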
It is important to note that improvements in recall generally come at the cost of increased latency and reduced QPS. If the current recall meets your business needs but QPS is suboptimal, you may consider scaling up CU resources or adding replicas to enhance throughput.
Step 8: Confirm the Optimal Configuration
Based on the experimental results, the optimal configuration in this example is level=7, which delivers 99.6% recall while maintaining acceptable latency.
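The selection rule used in this walkthrough, picking the lowest level whose recall meets the target, can be written down in a few lines. The function name lowest_level_meeting_target is hypothetical, and the recall values are the ones printed in Step 4 (levels elided there are omitted here as well).

```python
def lowest_level_meeting_target(recall_by_level, target_recall):
    """Return the smallest level whose recall is at least target_recall, or None."""
    for level in sorted(recall_by_level):
        if recall_by_level[level] >= target_recall:
            return level
    return None


# Recall values reported in Step 4 for a single query vector
step4_recalls = {
    1: 0.9886875152587891,
    6: 0.9947500228881836,
    7: 0.9961249828338623,
    10: 1.0,
}
print(lowest_level_meeting_target(step4_recalls, 0.995))  # 7
```

Choosing the lowest qualifying level, rather than the highest available one, avoids paying extra latency and QPS for recall the application does not need.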
Conclusion
Finding the perfect balance between search accuracy and performance has long been a challenge for vector database users. With Zilliz Cloud's new level and enable_recall_calculation parameters, developers now have powerful tools to optimize their vector search implementations for their specific requirements.
Whether you're building recommendation systems that prioritize speed, security applications that demand high accuracy, or anything in between, these features enable you to:
Precisely tune search accuracy to meet your requirements
Validate actual recall rates with built-in estimation
Make data-driven decisions about configuration trade-offs
We're committed to continuing to enhance Zilliz Cloud with features that make vector search more powerful, flexible, and easy to optimize. These new parameters are just the beginning of our journey to empower developers to build exceptional AI-powered applications.
Ready to start optimizing your vector searches? Sign up for Zilliz Cloud today or log in to your existing account to explore these new features.