Efficiently Deploying Milvus on GCP Kubernetes: A Guide to Open Source Database Management
Self-hosting Milvus on Kubernetes (K8s), especially in the Google Cloud Platform (GCP) environment, offers numerous benefits. This post covers those benefits and walks through setting up a Kubernetes cluster on GCP.
Read the entire series
- Effortless AI Workflows: A Beginner's Guide to Hugging Face and PyMilvus
- Building a RAG Pipeline with Milvus and Haystack 2.0
- How to Pick a Vector Index in Your Milvus Instance: A Visual Guide
- Semantic Search with Milvus and OpenAI
- Efficiently Deploying Milvus on GCP Kubernetes: A Guide to Open Source Database Management
- Building RAG with Snowflake Arctic and Transformers on Milvus
- Vectorizing JSON Data with Milvus for Similarity Search
- Building a Multimodal RAG with Gemini 1.5, BGE-M3, Milvus Lite, and LangChain
1. Introduction
Vector databases like Milvus enable powerful similarity searches, revolutionizing industries with content-based retrieval capabilities. Milvus, an open-source vector database, excels at efficiently storing, managing, and searching vast collections of high-dimensional vector embeddings. This unlocks the potential of unstructured data and enhances the ability of AI applications and machine learning models to understand context, nuance, and semantic relationships.
Self-hosting Milvus on Kubernetes (K8s), especially in the Google Cloud Platform (GCP) environment, offers numerous benefits. It provides flexibility and control over the deployment, allowing organizations to tailor the setup according to their specific requirements. K8s enables seamless scaling and load balancing, ensuring efficient resource utilization and high availability. The GCP ecosystem offers robust security, monitoring, and integration with other GCP services, streamlining the overall management and operation of Milvus deployments. Self-hosting empowers organizations to leverage Milvus’s full potential while maintaining control over their data and infrastructure.
2. Why Kubernetes for Milvus on GCP?
Deploying Milvus, an open-source vector database designed for scalable and fast vector data search, on K8s offers significant advantages in terms of scalability, resilience, and manageability. Below are several reasons why K8s is an ideal choice for Milvus deployments, particularly on GCP:
Scalability
K8s excels at managing horizontal scaling, allowing Milvus pods to scale out seamlessly to handle varying workloads and large datasets. This capability is important for maintaining performance and efficiency in vector search operations, making it a key component of successful cloud deployment strategies.
Resilience
K8s improves the resilience of Milvus through its self-healing mechanisms and support for persistent volumes. These features ensure high availability and data durability, which are vital for maintaining robust database security.
Manageability
From a manageability perspective, K8s simplifies deployments with automated rollouts, rollbacks, and declarative configurations. This streamlines the maintenance of complex applications like Milvus, which is critical in any open-source project.
Integration with GCP Services
Deploying Milvus on K8s within the GCP environment unlocks synergies and benefits through integration with GCP services:
Secure and Scalable Networking: K8s clusters on GCP can leverage Google’s Virtual Private Cloud (VPC) for secure networking, crucial for maintaining database security.
Efficient Load Balancing: GCP’s load balancing solutions ensure efficient request distribution across Milvus instances, optimizing resource use and response times.
High-Performance Storage: GCP’s Persistent Disks, ideal for databases like Milvus, can be easily attached to K8s pods, simplifying storage management and enhancing performance.
Comprehensive Monitoring: Integrating with GCP’s operations suite enables comprehensive monitoring, logging, and diagnostics for Milvus deployments, crucial for maintaining system health and performance.
Cost-Effective Scaling: GCP also offers custom machine types and preemptible VMs, allowing cost-effective scaling of Milvus deployments based on workload demands.
These factors make K8s an excellent platform for deploying Milvus on GCP, enhancing scalability, resilience, manageability, and integration with robust cloud services, all while supporting the principles of open-source software development.
3. Prerequisites
Before deploying Milvus on GCP, ensure you have a GCP project already set up, named `milvus-testing-nonprod` for the purposes of this guide. If you do not have a project, you can create one by following the instructions for Creating and managing projects.
You’ll need the `gcloud` CLI installed on your local machine, or you can use the browser-based Google Cloud Shell. Both `kubectl` and Helm should also be installed locally, as they are essential for managing K8s clusters and applications. Make sure to initialize the `gcloud` CLI with your GCP account credentials to manage your GCP resources effectively.
Required Tools and Access
To deploy Milvus on GCP, the following tools and access rights are necessary:
Google Cloud Platform Account: Access to a GCP account with permissions to create and manage resources is essential.
gcloud CLI: This command-line interface is necessary for managing GCP services, including networks and VMs. You must install this tool on your local machine or use Google Cloud Shell.
kubectl: This is a command-line tool for K8s cluster management. It is crucial for deploying and managing the K8s resources.
Helm: Helm is used to manage K8s applications through charts, simplifying the deployment and maintenance of applications.
Network Access: You need sufficient permissions to configure networks and firewall rules within your GCP project. Below are the commands to set up a virtual network and configure firewall rules to ensure secure and functional connectivity for Milvus:
Create a VPC Network:

```shell
gcloud compute networks create milvus-network \
    --project=milvus-testing-nonprod \
    --subnet-mode=auto \
    --mtu=1460 \
    --bgp-routing-mode=regional
```

Set Up Firewall Rules:

```shell
gcloud compute firewall-rules create milvus-network-allow-icmp \
    --project=milvus-testing-nonprod \
    --network=projects/milvus-testing-nonprod/global/networks/milvus-network \
    --description="Allows ICMP connections from any source to any instance on the network." \
    --direction=INGRESS \
    --priority=65534 \
    --source-ranges=0.0.0.0/0 \
    --action=ALLOW \
    --rules=icmp

gcloud compute firewall-rules create milvus-network-allow-internal \
    --project=milvus-testing-nonprod \
    --network=projects/milvus-testing-nonprod/global/networks/milvus-network \
    --description="Allows connections from any source in the network IP range to any instance on the network using all protocols." \
    --direction=INGRESS \
    --priority=65534 \
    --source-ranges=10.128.0.0/9 \
    --action=ALLOW \
    --rules=all

gcloud compute firewall-rules create milvus-network-allow-rdp \
    --project=milvus-testing-nonprod \
    --network=projects/milvus-testing-nonprod/global/networks/milvus-network \
    --description="Allows RDP connections from any source to any instance on the network using port 3389." \
    --direction=INGRESS \
    --priority=65534 \
    --source-ranges=0.0.0.0/0 \
    --action=ALLOW \
    --rules=tcp:3389

gcloud compute firewall-rules create milvus-network-allow-ssh \
    --project=milvus-testing-nonprod \
    --network=projects/milvus-testing-nonprod/global/networks/milvus-network \
    --description="Allows TCP connections from any source to any instance on the network using port 22." \
    --direction=INGRESS \
    --priority=65534 \
    --source-ranges=0.0.0.0/0 \
    --action=ALLOW \
    --rules=tcp:22
```

Allow Milvus Traffic on Port 19530:

```shell
gcloud compute firewall-rules create allow-milvus-in \
    --project=milvus-testing-nonprod \
    --description="Allow ingress traffic for Milvus on port 19530" \
    --direction=INGRESS \
    --priority=1000 \
    --network=projects/milvus-testing-nonprod/global/networks/milvus-network \
    --action=ALLOW \
    --rules=tcp:19530 \
    --source-ranges=0.0.0.0/0
```
Ensure these tools are configured correctly and that you have the necessary administrative access to perform the operations within your GCP project.
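One way to confirm the `allow-milvus-in` rule end to end, once a Milvus instance is running behind an external IP, is a plain TCP probe of port 19530. Below is a minimal sketch using only the Python standard library; the address in the example comment is a placeholder, not a real endpoint.

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

# Example (placeholder address): probe Milvus's default port once the
# service has an external IP.
# port_open("34.123.45.67", 19530)
```

If the probe returns `False` for a deployed instance, re-check the firewall rule and the service's external IP before debugging Milvus itself.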
4. Setting Up the Kubernetes Cluster on GCP
Creating and Configuring the Cluster
To set up a K8s cluster suitable for Milvus on GCP using Google Kubernetes Engine (GKE), follow these detailed instructions:
Open the Google Cloud Console: Navigate to the Google Cloud Console and log in with your Google account.
Access GKE: From the navigation menu, select “Kubernetes Engine”, then “Clusters”.
Create Cluster: Click on the “Create” button to begin configuring your K8s cluster.
Basic Configuration:
Name your cluster: Enter “milvus-cluster-1” as the cluster name.
Choose the zone: Select “us-west1-a” for the cluster zone to ensure proximity to your data sources or users.
Node Configuration:
Machine type: Choose “c2-standard-4” for each node, which provides 4 vCPUs and 16 GB of memory per node, ensuring solid performance.
Image type: Select “COS_CONTAINERD”.
Disk type and size: Set the disk type to “pd-standard” with a size of 100 GB.
Advanced Settings:
Disable basic authentication by unchecking the “Enable basic authentication” box.
Set the K8s version to “1.27.3-gke.100” under the “Master version” dropdown.
Enable IP aliasing for better network performance and management.
Deploy the Cluster: Click “Create” to provision your cluster. It will take a few minutes for the cluster to be ready.
Once the cluster is active, fetch its credentials for remote management:
```shell
gcloud container clusters get-credentials milvus-cluster-1 --zone "us-west1-a"
```
Cluster Optimization for Milvus
To optimize your GKE cluster for enhanced Milvus performance, consider implementing the following settings:
Resource Allocation
- Node Adjustment: Based on the expected workload, adjust the number of nodes. For optimal performance and resilience, maintaining at least 3 nodes in your cluster is advisable. This setup helps distribute the load and ensures system resilience.
Network Performance
- VPC and Network Tiers: Leverage Google’s Virtual Private Cloud (VPC) and premium network tiers. This enhancement will improve the throughput between the nodes and facilitate efficient communication with other GCP services.
Storage Performance
- SSDs vs. Standard Disks: For the I/O performance crucial to Milvus operations, opt for high-performance SSDs (`pd-ssd`) rather than standard disks (`pd-standard`), budget permitting.
- Disk Size: If you anticipate handling large datasets, consider scaling up the disk size to accommodate your data storage needs without performance degradation.
Scaling and Auto-Scaling
- Auto-Scaling: Enable auto-scaling within your cluster to efficiently manage workload fluctuations. It’s important to configure auto-scaling parameters carefully, based on observed CPU and memory usage patterns, to maintain performance without unnecessary resource expenditure.
Pod Density
- Maximum Pods per Node: Depending on the operational requirements of your Milvus instance and the capacity of the nodes, you may need to increase the maximum number of pods per node. Up to 110 pods per node are supported, which allows for significant scaling and flexibility in deployment.
By carefully configuring these parameters, you can significantly enhance both the performance and stability of Milvus on your GKE cluster, ensuring a robust environment for managing and querying vector data.
5. Deploying Milvus on Kubernetes
Deploying Milvus on a K8s cluster involves setting up Helm, adding the Milvus Helm repository, and using Helm to deploy the Milvus application. Here’s a step-by-step guide:
Helm Installation and Setup
Helm is a package manager for Kubernetes that simplifies the deployment and management of applications. To install Milvus using Helm, you must first install Helm itself and then add the Milvus repository.
Install Helm: Follow the official Helm installation guide at Helm Docs to install Helm on your system.
Add Milvus Repository: Run the following commands to add the Milvus Helm repository. This repository contains all the necessary charts to deploy Milvus on K8s.
```shell
helm repo add milvus https://zilliztech.github.io/milvus-helm/
helm repo update
```
Deploying Milvus with Helm
Once Helm is set up and the Milvus repository has been added, you can proceed to deploy Milvus on your K8s cluster.
Prepare Configuration
Create a `values.yaml` file that specifies the configuration settings for your Milvus deployment. This file controls various aspects such as persistence, resource allocation, and service type.

```shell
helm show values milvus/milvus > values.yaml
```
Deploy Milvus
Execute the following command to deploy Milvus:
```shell
helm install -f values.yaml my-release milvus/milvus
```
In this command:
- `-f values.yaml` specifies the configuration file.
- `my-release` is the name of your Helm release.
- `milvus/milvus` identifies the chart to install.
Service Type
Ensure the `service.type` value in `values.yaml` is set to `LoadBalancer` if you plan to expose the Milvus instance externally through a Layer-4 load balancer.
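For reference, the relevant fragment of `values.yaml` might look like the following. This is a minimal excerpt, and exact keys can vary between chart versions, so check the output of `helm show values` for your version.

```yaml
service:
  type: LoadBalancer   # expose Milvus through a Layer-4 load balancer
  port: 19530          # Milvus's default port
```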
Verify the Deployment
Run the following command to find the external IP address of your Milvus deployment:
```shell
kubectl get services | grep my-release-milvus | grep LoadBalancer | awk '{print $4}'
```
This command prints the external IP address of the `LoadBalancer` service. If an IP is listed, the service has been provisioned and is reachable from outside the cluster.
6. Configuring Persistent Storage for Milvus on GCP
Persistent Volume Setup
To set up a Persistent Volume (PV) for Milvus on GCP, start by creating a GCP Persistent Disk in the Google Cloud Console. Choose an appropriate disk type and size. Next, define a PV in K8s that references this disk, specifying properties like storage capacity, access modes, and the disk name. Finally, create a Persistent Volume Claim (PVC) to request storage from your PV, ensuring it matches your Milvus deployment requirements.
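As a sketch, the objects described above might look like the manifests below. On GKE, a common alternative to pre-creating the disk and PV by hand is to define a StorageClass backed by the Persistent Disk CSI driver and let the PVC provision the disk dynamically. The names here are hypothetical, and exact fields depend on your cluster version.

```yaml
# StorageClass that dynamically provisions GCP SSD Persistent Disks
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: milvus-ssd
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
volumeBindingMode: WaitForFirstConsumer
---
# Claim storage from the class above for Milvus components
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: milvus-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: milvus-ssd
  resources:
    requests:
      storage: 100Gi
```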
Storage Management Practices
For effective storage management on GCP, regularly back up your Milvus data using GCP snapshots to safeguard against data loss. Automate these backups and monitor their status through Google Cloud’s operations suite. Additionally, keep an eye on disk utilization and performance metrics to timely scale your storage resources. Implement alerts for abnormal activities to maintain data integrity and optimize costs, ensuring your Milvus installation continues to run efficiently and securely.
7. Accessing and Managing Milvus
Access Techniques
Internal Access:
- Use internal network configurations and service discovery mechanisms to access Milvus within a cluster.
External Access:
- Access Milvus via exposed endpoints such as the RESTful API or SDKs.
- Connect through public IP addresses or domain names.
Management Tools
Milvus Insight (now Attu):
- A GUI-based tool for interactive management.
- Monitor system health and query performance.
- Manage resources efficiently.
Command Line Interface (CLI):
- Provides direct command control for advanced users.
8. Example Usage
Import Necessary Libraries
First, import the necessary libraries and modules.
```python
from pymilvus import MilvusClient, DataType
import random
```
Connect to Milvus
Once you have obtained the cluster credentials or an API key, you can use them to connect to your Milvus instance.

```python
# 1. Set up a Milvus client
client = MilvusClient(
    uri="http://my-cluster.com/endpoint:19530",  # or "http://localhost:19530" for a local instance
    token="mypassword"
)
```
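The `uri` is simply the external IP of the `LoadBalancer` service (printed by the `kubectl get services` command in section 5) combined with Milvus's default port, 19530. A trivial sketch with a placeholder address:

```python
# Build the connection URI from the LoadBalancer's external IP.
external_ip = "34.123.45.67"  # placeholder: substitute the IP printed by kubectl
milvus_port = 19530           # Milvus's default port
uri = f"http://{external_ip}:{milvus_port}"
print(uri)
```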
Create a Collection
In Milvus, you store your vector embeddings in collections. All vector embeddings stored in a collection share the same dimensionality and distance metric for measuring similarity. You can create a collection in either of the following ways.
Quick Setup
To set up a collection in quick setup mode, you only need to set the collection name and the dimension of the vector field of the collection.
```python
# 2. Create a collection in quick setup mode
client.create_collection(
    collection_name="quick_setup",
    dimension=5
)
```
Customized Setup
To define the collection schema by yourself, use the customized setup. In this manner, you can define the attributes of each field in the collection, including its name, data type, and extra attributes of a specific field.
```python
# 3. Create a collection in customized setup mode

# 3.1. Create schema
schema = MilvusClient.create_schema(
    auto_id=False,
    enable_dynamic_field=True,
)

# 3.2. Add fields to schema
schema.add_field(field_name="my_id", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="my_vector", datatype=DataType.FLOAT_VECTOR, dim=5)

# 3.3. Prepare index parameters
index_params = client.prepare_index_params()

# 3.4. Add indexes
index_params.add_index(
    field_name="my_id"
)
index_params.add_index(
    field_name="my_vector",
    index_type="AUTOINDEX",
    metric_type="IP"
)

# 3.5. Create a collection
client.create_collection(
    collection_name="customized_setup",
    schema=schema,
    index_params=index_params
)
```
In the above setup, you have the flexibility to define various aspects of the collection during its creation, including its schema and index parameters.
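The `metric_type="IP"` above means similarity is scored by inner product: larger dot products rank as more similar. A small pure-Python illustration of that ranking behavior (the vectors are made up for the example):

```python
def inner_product(a, b):
    """Inner-product similarity: higher means more similar."""
    return sum(x * y for x, y in zip(a, b))

query = [1.0, 0.0, 0.0, 0.0, 0.0]
candidates = {
    "aligned":    [0.9, 0.1, 0.0, 0.0, 0.0],
    "orthogonal": [0.0, 1.0, 0.0, 0.0, 0.0],
    "opposite":   [-0.9, 0.0, 0.0, 0.0, 0.0],
}

# Sort candidates by similarity to the query, most similar first.
ranked = sorted(candidates, key=lambda k: inner_product(query, candidates[k]), reverse=True)
print(ranked)  # ['aligned', 'orthogonal', 'opposite']
```

Milvus's index does this at scale with approximate search structures rather than a brute-force scan, but the scoring principle is the same.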
Insert Data
Collections created in both Quick Setup and Customized Setup modes are automatically indexed and loaded. Once you are ready, you can insert some example data.
```python
# 4. Insert data into the collection
# 4.1. Prepare data
data = [
    {"id": 0, "vector": [0.3580376395471989, -0.6023495712049978, 0.18414012509913835, -0.26286205330961354, 0.9029438446296592], "color": "pink_8682"},
    # More data...
]

# 4.2. Insert data
res = client.insert(
    collection_name="quick_setup",
    data=data
)
print(res)
```
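The `random` module imported at the start of this section can be used to generate larger batches of example rows. A minimal sketch under the same assumptions as the snippet above (the field names match the quick-setup collection, and `color` is just an illustrative scalar field):

```python
import random

def make_rows(n, dim=5):
    """Generate n example rows with dim-dimensional random vectors."""
    return [
        {
            "id": i,
            "vector": [random.uniform(-1.0, 1.0) for _ in range(dim)],
            "color": random.choice(["pink", "red", "green", "blue"]),
        }
        for i in range(n)
    ]

rows = make_rows(10)
# res = client.insert(collection_name="quick_setup", data=rows)
```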
Get Entities
If you know the IDs of the entities to retrieve, you can get entities by their IDs as follows:
```python
# 5. Get entities by IDs
res = client.get(
    collection_name="quick_setup",
    ids=[1, 2, 3],
    output_fields=["color", "vector"]
)
print(res)
```
Delete Entities
Milvus allows deleting entities by IDs and by filters.
```python
# 6. Delete entities by IDs
res = client.delete(
    collection_name="quick_setup",
    ids=[0, 1, 2, 3, 4]
)
print(res)
```
Drop the Collection
Remember to drop the collection once you’re done:
```python
# 7. Drop collection
client.drop_collection(
    collection_name="quick_setup"
)
client.drop_collection(
    collection_name="customized_setup"
)
```
This guide provided an overview of how to create collections and insert, retrieve, and delete data with Milvus. Be aware that the data insertion process may take some time; it is recommended to wait a few seconds after inserting data before conducting similarity searches. Filter expressions can be used in both search and query requests, but they are mandatory for query requests.
9. Submitting an Open Source Issue on Milvus
When you encounter issues with the Milvus vector database, the process to submit an issue is straightforward but requires attention to detail to help maintainers understand and address the problem efficiently:
Check Existing Issues: Before submitting a new issue, visit the Milvus GitHub Issues page to check if someone else has already reported the same problem. This helps in avoiding duplicate issue reports.
Create a New Issue: If the issue is not already reported, create a new issue. Be sure to provide a detailed description that includes:
Steps to Reproduce: Clearly describe the steps to reproduce the bug. This is crucial for maintainers to see the problem in action.
Expected Outcome: Describe what you expect to happen when following the steps above.
Actual Outcome: Clearly state what actually happens, including any error messages or incorrect behaviors.
Environment Details: Include details like the Milvus version, operating system, and any relevant configurations.
Logs/Error Messages: Attach logs or error messages if available. Use Markdown to format logs and make them easier to read.
Submit the Issue: Once you have compiled all the necessary information, submit the issue. The Milvus community and maintainers will review it and provide feedback or ask for further information if needed.
Troubleshooting Common Milvus Deployment Issues
For common deployment issues with Milvus, consider the following troubleshooting steps:
Check Docker Logs: If you are running Milvus in containers, use the command `docker logs <container_id>` to check for any error messages that might indicate what is going wrong.
Review Compatibility: Ensure that all components of your Milvus deployment are compatible with each other. The Milvus GitHub repository’s documentation often includes specific version requirements and compatibility information.
Configuration Checks: Verify that all configuration files are set up correctly according to the documentation provided in the Milvus GitHub repository.
Milvus Community and Support Resources
If you need further assistance with Milvus, you can access several support resources:
Milvus Community on Discord: Join the Milvus community on Discord for real-time discussions and support from other users and the Milvus team. You can join using this invitation link: Milvus Discord.
GitHub Discussions: For more structured discussions or to share ideas, check out Milvus Discussions on GitHub.
GCP Deployments: If your deployment issues are specific to GCP, refer to the Google Cloud Platform documentation or contact GCP support for specialized help.
Leveraging these resources effectively will help you resolve issues and make the most of your experience with Milvus.