Getting Started with GPU-Powered Milvus: Unlocking 10x Higher Performance
Are you ready to enhance your vector search with GPU acceleration? Milvus 2.3, the latest release, officially supports NVIDIA A100 GPUs, delivering a 10x increase in throughput and significantly lower latency. This blog post delves into the motivations behind this strategic move and shows you how to get started with the Milvus GPU version.
Why does Milvus introduce GPU support?
Vector databases play a crucial role in large-scale data retrieval and similarity searching. However, traditional CPU-based indexing strategies struggle to keep up with the growing demand for high performance and low latency, particularly with the rise of Large Language Models (LLMs) like GPT-3. Recognizing the potential synergy between Milvus and NVIDIA GPUs, the Milvus team decided to introduce GPU support in Milvus 2.3.
Thanks to the support from the NVIDIA team (Special thanks go to @wphicks and @cjnolet from NVIDIA for their valuable contributions to the RAFT code), GPU support in Milvus has become a reality, making it possible to quickly and efficiently search through massive datasets and expand the AI landscape.
Getting started with Milvus GPU version
Let's dive into the steps to kickstart your journey with the Milvus GPU version.
Installing CUDA driver
First and foremost, ensure that your host machine recognizes your NVIDIA GPU. You can verify this by running the following command:

lspci
You're good to go if you see an "NVIDIA" entry in the device output. Below is the output from my machine, which shows an NVIDIA T4 graphics card:

00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma]
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08)
00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111
00:04.0 Non-Volatile memory controller: Amazon.com, Inc. Device 8061
00:05.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA)
00:1e.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)
00:1f.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe SSD Controller
Next, install the necessary CUDA drivers. You can find the appropriate driver for your system on the NVIDIA website.
For example, if you use the Ubuntu Linux 20.04 operating system (OS), you can download and install the driver by executing the following commands:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-drivers
You can skip the CUDA installation step if your host machine already has the appropriate CUDA drivers installed.
The minimum driver version required depends on your GPU type:
NVIDIA Tesla series professional GPUs: >=450.80.02
Gaming GPUs: >=520.61.05
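To check the requirement above, here is a small version-comparison sketch. The installed value is an assumed example; on a real machine you would query it with `nvidia-smi --query-gpu=driver_version --format=csv,noheader`:

```shell
# Compare an installed driver version against the required minimum.
installed="470.57.02"   # example value; substitute your own
minimum="450.80.02"     # Tesla-series minimum

# sort -V orders version strings numerically; if the minimum sorts
# first, the installed driver is new enough.
if [ "$(printf '%s\n%s\n' "$minimum" "$installed" | sort -V | head -n1)" = "$minimum" ]; then
  echo "driver $installed meets the $minimum minimum"
else
  echo "driver $installed is too old"
fi
```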
After installing the driver, you must restart the system for it to take effect. Once the restart is complete, verify that the driver is working by entering the following command:

nvidia-smi
Note: The Milvus GPU image supports NVIDIA graphics cards with Compute Capability 6.1, 7.0, 7.5, and 8.0.
To learn your GPU Compute Capability, visit NVIDIA GPU Compute Capability.
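As a quick sketch of that check: on recent drivers you can query the compute capability directly with `nvidia-smi --query-gpu=compute_cap --format=csv,noheader`; the `7.5` below is an assumed example (a Tesla T4):

```shell
# Check a card's compute capability against the list the
# Milvus GPU image supports (6.1, 7.0, 7.5, 8.0).
cap="7.5"   # example value; substitute your card's capability

case "$cap" in
  6.1|7.0|7.5|8.0) echo "compute capability $cap is supported" ;;
  *)               echo "compute capability $cap is not supported" ;;
esac
```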
For instructions on installing NVIDIA Container Toolkit, refer to NVIDIA documentation.
Milvus GPU configuration
The Milvus GPU version supports only a single Milvus process and a single GPU by default. To utilize multiple GPUs, run multiple Milvus processes or containers and set the CUDA_VISIBLE_DEVICES environment variable for each.
- In a container, you can set this environment variable when starting the container:

sudo docker run --rm -e NVIDIA_VISIBLE_DEVICES=3 milvusdb/milvus:v2.3.0-gpu-beta
- In Docker Compose, you can set this using the device_ids field. Refer to the GPU access with Docker Compose documentation for more information.
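For illustration, here is a minimal docker-compose.yml fragment using the device_ids field; the service name and GPU ID are assumptions for a single-GPU host:

```yaml
services:
  standalone:
    image: milvusdb/milvus:v2.3.0-gpu-beta
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]      # expose only GPU 0 to this container
              capabilities: [gpu]
```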
Even if you configure multiple graphics cards for a single Milvus process or container, Milvus can only utilize one.
You can fine-tune performance by adjusting environment variables such as KNOWHERE_STREAMS_PER_GPU (the number of concurrent CUDA streams per GPU) and KNOWHERE_GPU_MEM_POOL_SIZE (the GPU memory pool size).
We strongly recommend adjusting KNOWHERE_GPU_MEM_POOL_SIZE if you deploy two Milvus processes on a single graphics card. Otherwise, Milvus may crash due to competition for GPU memory.
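As a sketch of the multi-GPU setup above: CUDA_VISIBLE_DEVICES is set per command, so each process sees only its own card. The echo stands in for the actual Milvus launch command:

```shell
# Each command gets its own CUDA_VISIBLE_DEVICES value, so each
# Milvus process would only see (and allocate memory on) one GPU.
# Replace the echo with your actual Milvus launch command.
CUDA_VISIBLE_DEVICES=0 sh -c 'echo "process 1 sees GPU(s): $CUDA_VISIBLE_DEVICES"'
CUDA_VISIBLE_DEVICES=1 sh -c 'echo "process 2 sees GPU(s): $CUDA_VISIBLE_DEVICES"'
```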
Building Milvus GPU version locally
Before you build Milvus locally, ensure you have installed the necessary dependencies.
- CUDA Toolkit
sudo apt install --no-install-recommends cuda-toolkit
- Python 3, pip, libopenblas-dev, libtbb-dev, and pkg-config:
sudo apt install python3-pip libopenblas-dev libtbb-dev pkg-config
- Conan, a C/C++ package manager:
pip3 install conan==1.59.0 --user
export PATH=$PATH:~/.local/bin
- CMake (>=3.23): refer to the Kitware APT Repository for more details.
- Golang: refer to the Go documentation for more details.
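Before building, you can quickly confirm the dependencies above are on your PATH with a loop like this (the tool list is an assumption; nvcc comes from the CUDA toolkit):

```shell
# Quick sanity check that the build prerequisites are installed.
for tool in nvcc python3 pip3 conan cmake go; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "missing: $tool"
  fi
done
```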
After you have installed all the necessary tools, build the Milvus GPU version using the following command:

make milvus-gpu
Start Milvus in standalone mode by running the following command:
cd bin
sudo ./milvus run standalone
If you prefer containerization, you can use the provided docker-compose.yml file for deployment:

docker-compose up -d
The introduction of GPU support in Milvus 2.3 opens up exciting possibilities for accelerating vector database performance. With NVIDIA A100 GPUs at your disposal, you can achieve remarkable gains in both throughput and latency, making it a compelling choice for data-intensive applications and AI workloads.