Getting Started with GPU-Powered Milvus: Unlocking 10x Higher Performance
Are you ready to enhance your vector search with GPU acceleration? Milvus 2.3, the latest release of Milvus, officially supports NVIDIA A100 GPUs, providing a 10x increase in throughput and significant reductions in latency. This blog post delves into the motivations behind this strategic innovation and shows you how to get started with the Milvus GPU version.
Why does Milvus introduce GPU support?
Vector databases play a crucial role in large-scale data retrieval and similarity search. However, traditional CPU-based indexing strategies struggle to keep up with the increasing demand for high performance and low latency, particularly with the rise of Large Language Models (LLMs) like GPT-3. Recognizing the potential synergy between Milvus and NVIDIA GPUs, the Milvus team decided to introduce GPU support in Milvus 2.3.
Thanks to the support from the NVIDIA team (Special thanks go to @wphicks and @cjnolet from NVIDIA for their valuable contributions to the RAFT code), GPU support in Milvus has become a reality, making it possible to quickly and efficiently search through massive datasets and expand the AI landscape.
Getting started with Milvus GPU version
Let's dive into the steps to kickstart your journey with the Milvus GPU version.
Installing CUDA driver
First and foremost, ensure that your host machine recognizes your NVIDIA GPU. You can verify this by running the following command:
lspci
You're good to go if you see the "NVIDIA" field in the device output. Below is my device's result, which recognizes an NVIDIA T4 graphics card.
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma]
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08)
00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111
00:04.0 Non-Volatile memory controller: Amazon.com, Inc. Device 8061
00:05.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA)
00:1e.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)
00:1f.0 Non-Volatile memory controller: Amazon.com, Inc. NVMe SSD Controller
Next, install the necessary CUDA drivers. You can find the appropriate driver for your system on the NVIDIA website.
For example, if you use the Ubuntu Linux 20.04 operating system (OS), you can download and install the driver by executing the following commands:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-drivers
Note:
You can skip the CUDA driver installation step if your host machine already has the appropriate drivers installed.
The minimum driver version required depends on your GPU type:
NVIDIA Tesla series professional GPUs: >=450.80.02
Gaming GPUs: >=520.61.05
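The minimum-version floors above can also be checked programmatically. Here is a minimal sketch; the installed version string is a placeholder, so substitute whatever `nvidia-smi` reports on your machine:

```python
def parse_version(v: str) -> tuple:
    """Split a driver version string like '525.60.13' into comparable integers."""
    return tuple(int(part) for part in v.split("."))

def meets_minimum(installed: str, minimum: str) -> bool:
    """True if the installed driver version is at least the required minimum."""
    return parse_version(installed) >= parse_version(minimum)

# Check a hypothetical installed driver against the Tesla-series floor above.
print(meets_minimum("525.60.13", "450.80.02"))  # True
```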
After installing the driver, you must restart the system for it to take effect. Once the restart is complete, you can proceed by entering the following command:
nvidia-smi
Note: The Milvus GPU image supports NVIDIA graphics cards with Compute Capability 6.1, 7.0, 7.5, and 8.0.
To learn your GPU Compute Capability, visit NVIDIA GPU Compute Capability.
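As a quick sanity check, you can compare your card's compute capability against the supported set from the note above. The capability value for your GPU comes from the NVIDIA page linked here (for example, a Tesla T4 is 7.5):

```python
# Compute capabilities supported by the Milvus GPU image, per the note above.
SUPPORTED_CAPABILITIES = {"6.1", "7.0", "7.5", "8.0"}

def is_supported(compute_cap: str) -> bool:
    """True if this compute capability is covered by the Milvus GPU image."""
    return compute_cap in SUPPORTED_CAPABILITIES

# A Tesla T4 reports compute capability 7.5.
print(is_supported("7.5"))  # True
```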
For instructions on installing NVIDIA Container Toolkit, refer to NVIDIA documentation.
Milvus GPU configuration
The Milvus GPU version supports only a single Milvus process with a single GPU by default. To utilize multiple GPUs, run multiple Milvus processes or containers and pin each one to a GPU through the CUDA_VISIBLE_DEVICES environment variable.
- In a container, you can set this environment variable using -e:
sudo docker run --rm -e NVIDIA_VISIBLE_DEVICES=3 milvusdb/milvus:v2.3.0-gpu-beta
- In Docker Compose, you can set it using the device_ids field. Refer to GPU access with Docker Compose Documentation for more information.
Note:
Even if you configure multiple graphics cards for a single Milvus process or container, Milvus can only utilize one.
You can fine-tune performance by adjusting environment variables such as KNOWHERE_STREAMS_PER_GPU (CUDA stream concurrency) and KNOWHERE_GPU_MEM_POOL_SIZE (GPU memory pool size). We strongly recommend adjusting these variables if you deploy two Milvus processes on a single graphics card; otherwise, Milvus may crash due to memory contention.
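Putting the pieces above together, a Docker Compose service that pins Milvus to one GPU and sizes the memory pool might look like the sketch below. The device ID and pool sizes are placeholders, and the "initial;maximum" megabyte format for KNOWHERE_GPU_MEM_POOL_SIZE is an assumption to verify against the Milvus documentation for your release:

```yaml
# Sketch: pin the Milvus container to GPU 0 and size its GPU memory pool.
services:
  standalone:
    image: milvusdb/milvus:v2.3.0-gpu-beta
    environment:
      # assumed "initialMB;maximumMB" format; check the Milvus docs
      KNOWHERE_GPU_MEM_POOL_SIZE: "2048;4096"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]   # pin this service to a single GPU
              capabilities: ["gpu"]
```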
Building Milvus GPU version locally
Before you build Milvus locally, ensure you have installed the necessary dependencies.
- CUDA Toolkit
sudo apt install --no-install-recommends cuda-toolkit
- Python 3, pip, libopenblas-dev, libtbb-dev, and pkg-config:
sudo apt install python3-pip libopenblas-dev libtbb-dev pkg-config
- Conan, a C/C++ package manager:
pip3 install conan==1.59.0 --user
export PATH=$PATH:~/.local/bin
- CMake (>=3.23): refer to Kitware APT Repository for more details.
- Golang: refer to Go documentation for more details.
After you have installed all the necessary tools, build the Milvus GPU version using the following command:
make milvus-gpu
Running Milvus
Start Milvus in standalone mode by running the following command:
cd bin
sudo ./milvus run standalone
If you prefer containerization, you can use the provided docker-compose.yml
file for deployment.
docker-compose up -d
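With Milvus running, you would typically build a GPU index through the pymilvus client. As a minimal sketch, the parameter dictionaries for the GPU index types introduced in 2.3 look like this; the nlist and nprobe values are illustrative starting points, and the collection and field you pass them to are your own (the create_index and search calls themselves are not shown):

```python
# Index parameters for Milvus 2.3's GPU-accelerated indexes.
# GPU_IVF_FLAT (and GPU_IVF_PQ) are the GPU index types added in 2.3;
# you would pass this dict to pymilvus's create_index.
gpu_index_params = {
    "index_type": "GPU_IVF_FLAT",
    "metric_type": "L2",
    "params": {"nlist": 1024},  # number of IVF clusters built on the GPU
}

# Matching search parameters for query time.
search_params = {
    "metric_type": "L2",
    "params": {"nprobe": 16},  # clusters probed per query; raise for recall
}

print(gpu_index_params["index_type"])  # GPU_IVF_FLAT
```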
Conclusion
The introduction of GPU support in Milvus 2.3 opens up exciting possibilities for accelerating vector database performance. With NVIDIA A100 GPUs at your disposal, you can achieve remarkable gains in both throughput and latency, making it a compelling choice for data-intensive applications and AI workloads.