What’s New In Milvus 2.3 Beta - 10X faster with GPUs

Mar 21, 2023

We are proud to announce the Beta release of Milvus 2.3 on behalf of the Milvus community. This Beta release contains new features and improvements that we are sure will boost the performance of your AI-powered applications. We appreciate your help testing some of these capabilities to quickly get us to the general release! This blog post will highlight some of the more prominent features. For a complete list of changes, check the release notes.

📦 PyPI: https://pypi.org/project/milvus/
📚 Docs: https://milvus.io/docs
🛠️ Release Notes: https://github.com/milvus-io/milvus/releases
🐳 Docker Image: docker pull milvusdb/milvus
🚀 Release: https://github.com/milvus-io/milvus/releases/tag/v2.3.0-beta

One of the headline features of Milvus 2.3 Beta is GPU acceleration through its RAFT-based integration, which lets Milvus take advantage of modern graphics processing units. GPU-accelerated Milvus delivers 10X faster performance than the CPU-only version. This can significantly improve the speed and responsiveness of your AI and machine learning-powered applications.

Another critical feature of Milvus 2.3 Beta is range search, which lets users retrieve all data within a specified distance of a query. This is particularly useful for applications that need more precise queries than a fixed-size result list allows. In addition, Milvus 2.3 Beta supports memory-mapped (mmap) file I/O and incremental backups, both of which can further improve the performance and efficiency of your AI applications. By managing and storing data more efficiently, these features help keep your AI systems running at peak levels.

Overall, the improvements in this release are essential for any developer building applications with similarity search capabilities.

Nvidia GPU support

This feature adds support for heterogeneous computing, which can significantly accelerate specialized workloads. Users can expect faster and more efficient vector searches. We compared RAFT-IVF-Flat (GPU) with IVF-Flat (CPU) and HNSW (CPU) on four datasets at 95% recall. On average, the GPU index achieved 32x higher throughput than IVF-Flat and 8x higher throughput than HNSW. Evaluation results are shown in Table 1. (These benchmarks ran against Knowhere on a host with an 8-core CPU, 32 GB of RAM, and an Nvidia A100 GPU.)

Table 1. The QPS of IVF-Flat, HNSW, RAFT-IVF-Flat on four datasets at 95% recall

Index	SIFT	GIST	GLOVE	DEEP
IVF-Flat (CPU)	3,097	142	791	723
HNSW (CPU)	14,537	791	1,516	5,761
RAFT-IVF-Flat (GPU)	121,568	5,737	20,163	16,557
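
The average speedups quoted above can be checked directly against Table 1. The short sketch below assumes that "average" means the arithmetic mean of the per-dataset QPS ratios (the post does not say which average was used); the function name `mean_speedup` is made up for this illustration.

```python
# QPS figures copied from Table 1, in the order SIFT, GIST, GLOVE, DEEP.
qps = {
    "IVF-Flat (CPU)": [3097, 142, 791, 723],
    "HNSW (CPU)": [14537, 791, 1516, 5761],
    "RAFT-IVF-Flat (GPU)": [121568, 5737, 20163, 16557],
}


def mean_speedup(baseline, target="RAFT-IVF-Flat (GPU)"):
    """Arithmetic mean of per-dataset QPS ratios (assumed averaging method)."""
    ratios = [t / b for t, b in zip(qps[target], qps[baseline])]
    return sum(ratios) / len(ratios)


print(round(mean_speedup("IVF-Flat (CPU)")))  # 32
print(round(mean_speedup("HNSW (CPU)")))      # 8
```

Under that assumption the table reproduces the 32x and 8x figures. Note that the per-dataset ratios vary widely, so the speedup on any one workload may differ from the average.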

Special thanks go to @wphicks and @cjnolet from Nvidia for their valuable contributions to the RAFT code.

Range Search

Range search differs from a k-NN query. A k-NN query returns a fixed number of nearest neighbors. A range search, given a query q and a distance threshold R, returns all entities within distance R of q. Range search is commonly used to find every relevant result within a specified range. For instance, it can help with (but is not limited to) data deduplication or detecting copyright infringement without missing similar candidates.
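
To make the difference concrete, here is a minimal brute-force sketch in plain Python. It is not how Milvus implements either search; the function names, the toy data, and the choice of Euclidean (L2) distance are assumptions made for illustration.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_search(vectors, q, k):
    """Return the ids of the k entities closest to q, nearest first."""
    ranked = sorted(vectors.items(), key=lambda item: l2(item[1], q))
    return [entity_id for entity_id, _ in ranked[:k]]


def range_search(vectors, q, radius):
    """Return the ids of all entities within `radius` of q, nearest first."""
    hits = sorted((l2(v, q), entity_id) for entity_id, v in vectors.items())
    return [entity_id for dist, entity_id in hits if dist <= radius]


# Toy collection: three near-duplicates of the origin and one far-away vector.
vectors = {1: [0.0, 0.0], 2: [0.1, 0.0], 3: [0.0, 0.2], 4: [3.0, 4.0]}
q = [0.0, 0.0]

print(knn_search(vectors, q, k=2))         # [1, 2]
print(range_search(vectors, q, radius=0.5))  # [1, 2, 3]
```

With k=2 the k-NN query misses entity 3 even though it is nearly identical to the query, while the range search returns every candidate inside the threshold. That is the property that matters for deduplication.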

Upsert

Upsert is an operation that updates an entity if it already exists in a collection and inserts it if it does not. Milvus offers a lot of flexibility in how you add data to your collections. For now, there are three options:

Bulk insert for high throughput in offline cases.
Insert for low latency in online streaming cases.
Upsert for cases where you are unsure whether an entity needs to be updated or inserted.
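
The upsert semantics described above can be sketched with a toy in-memory collection keyed by primary key. This is only an illustration of the update-or-insert behavior, not Milvus's implementation or client API; the `upsert` function and the entity fields are hypothetical.

```python
def upsert(collection, pk, entity):
    """Update the entity stored under pk if present, otherwise insert it.

    `collection` is a plain dict standing in for a real collection.
    Returns which of the two actions was taken.
    """
    existed = pk in collection
    collection[pk] = entity
    return "updated" if existed else "inserted"


collection = {}
print(upsert(collection, 7, {"vector": [0.1, 0.2], "title": "draft"}))  # inserted
print(upsert(collection, 7, {"vector": [0.1, 0.3], "title": "final"}))  # updated
print(len(collection))  # 1
```

The caller never has to check whether primary key 7 already exists, and the collection ends up with exactly one entity for that key.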

Change Data Capture (CDC)

Change Data Capture (CDC) is the process of identifying and capturing changes to data in a vector database in real time and delivering those changes to downstream systems. Milvus now offers zero-downtime backup and synchronization based on this mechanism. Developers can also use CDC to provide a continuous stream of changes to downstream workloads, such as data analytics or customized auditing.
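
As a rough illustration of the general CDC idea, the sketch below records every change to a source store in an ordered log and lets a downstream replica replay the log from the last offset it has seen. The `Source` and `Replica` classes and the event format are hypothetical and say nothing about how Milvus implements CDC.

```python
class Source:
    """Toy store that records every change in an ordered log."""

    def __init__(self):
        self.data = {}
        self.log = []  # change events, oldest first

    def upsert(self, pk, value):
        self.data[pk] = value
        self.log.append(("upsert", pk, value))

    def delete(self, pk):
        self.data.pop(pk, None)
        self.log.append(("delete", pk, None))


class Replica:
    """Downstream consumer that applies only the events it has not seen."""

    def __init__(self):
        self.data = {}
        self.offset = 0  # position in the log already applied

    def sync(self, log):
        for op, pk, value in log[self.offset:]:
            if op == "upsert":
                self.data[pk] = value
            else:
                self.data.pop(pk, None)
        self.offset = len(log)


source, replica = Source(), Replica()
source.upsert(1, [0.1, 0.2])
source.upsert(2, [0.3, 0.4])
replica.sync(source.log)   # replica catches up while the source keeps serving
source.delete(1)
replica.sync(source.log)   # only the new delete event is applied
print(replica.data == source.data)  # True
```

Because the replica tracks its own offset, it can be synchronized incrementally without pausing writes on the source.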

Memory-mapped (mmap) file I/O

When a data set is too large to fit in memory and query performance is not critical, Milvus can use mmap to let the system treat parts of a file as if they were in memory. This reduces memory usage and improves performance if all of the data is in the system page cache.
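
The mechanism can be illustrated with Python's standard `mmap` module. The sketch below is a generic example of memory-mapped reads, not Milvus code; the file layout (fixed-size little-endian float32 vectors), the dimension, and the helper names are assumptions.

```python
import mmap
import os
import struct
import tempfile

DIM = 4              # floats per vector (assumed for the example)
VEC_BYTES = DIM * 4  # each float32 takes 4 bytes


def write_vectors(path, vectors):
    """Store vectors back to back as little-endian float32 values."""
    with open(path, "wb") as f:
        for v in vectors:
            f.write(struct.pack(f"<{DIM}f", *v))


def read_vector(path, index):
    """Read one vector through a memory map instead of loading the file."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Only the pages that back this slice need to be resident.
            return struct.unpack_from(f"<{DIM}f", mm, index * VEC_BYTES)


path = os.path.join(tempfile.mkdtemp(), "vectors.bin")
write_vectors(path, [[float(i)] * DIM for i in range(1000)])
print(read_vector(path, 42))  # (42.0, 42.0, 42.0, 42.0)
```

The operating system pages data in on demand, so the process does not need enough memory to hold the whole file. Repeated reads of the same region are served from the page cache.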

Summary

In addition to all of the features listed above, Milvus 2.3 Beta includes several bug fixes and improvements. To learn more:

See the release notes for version 2.3 Beta for the complete list of changes
Download Milvus and get started
Check out the Milvus benchmarks in this paper

Updated on Mar 28, 2025

Chris Churilo
Chris Churilo is the VP of Marketing & Community at Zilliz, where she leads all community, developer relations, and marketing efforts. Prior to Zilliz, Chris was a founding member of InfluxData's go-to-market efforts and helped propel the time series database platform to dominance in the market. In earlier roles she defined and designed a SaaS monitoring solution at Centroid, and prior to that she was the VP of product management at iPass and was the LOB for several cloud services, which required her to track business and operational metrics and analytics to help identify and resolve issues.
