The Evolution and Future of Vector Databases: Insights from Charles, CEO of Zilliz
This is the first installment of our two-part blog series about the evolution and future of vector databases and AI.
Vector databases have emerged as a critical innovation in the rapidly evolving area of data science and artificial intelligence, driven by the surge in complex, unstructured data and the rise of large language models (LLMs). This new type of database is pivotal in managing and semantically querying unstructured data through vector embeddings, modernizing data accessibility and analysis, and catering to the demands of next-generation AI applications. Guided by insights from Charles, CEO of Zilliz, this blog examines vector databases’ evolution, current dynamics, and future trajectory.
What is a Vector Database?
A vector database is a cutting-edge data infrastructure designed to manage and query unstructured data such as images, videos, and natural language text. We can use deep learning algorithms to transform unstructured data into a data format known as vector embeddings. We then store these embeddings in a vector database, which allows us to perform semantic queries on that data. This capability is powerful because, unlike traditional keyword-based search, it lets us query the semantics of unstructured data, offering a more nuanced and contextually rich search experience.
Developers widely use vector databases in building smart and personalized recommendation systems, AI-powered chatbots, and semantic search. With the rise of LLMs, vector databases have emerged as a key infrastructure component of Retrieval-Augmented Generation (RAG), providing LLMs with additional knowledge as query context for generating highly relevant answers.
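To make this concrete, here is a minimal sketch of the embed-store-search loop behind semantic search and RAG. It assumes pymilvus's MilvusClient running Milvus Lite against a local file and a sentence-transformers embedding model; the file name, collection name, and documents are placeholders, and any embedding model of matching dimensionality would do.

```python
# Minimal embed-store-search sketch (assumptions: pymilvus MilvusClient, Milvus Lite,
# the all-MiniLM-L6-v2 sentence-transformers model; names below are placeholders).
from pymilvus import MilvusClient
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # 384-dimensional embeddings
client = MilvusClient("rag_demo.db")              # Milvus Lite: data lives in a local file

docs = [
    "Milvus is a purpose-built vector database.",
    "Vector embeddings capture the semantics of unstructured data.",
]

client.create_collection(collection_name="docs", dimension=384)
client.insert(
    collection_name="docs",
    data=[{"id": i, "vector": model.encode(d).tolist(), "text": d}
          for i, d in enumerate(docs)],
)

# Semantic query: retrieve the documents closest in meaning to the question,
# then hand them to an LLM as context (the RAG pattern).
question = "What does a vector database store?"
hits = client.search(
    collection_name="docs",
    data=[model.encode(question).tolist()],
    limit=2,
    output_fields=["text"],
)
context = "\n".join(h["entity"]["text"] for h in hits[0])
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
```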
The Current Landscape of Vector Databases
Currently, the market is populated with numerous "vector databases," including purpose-built vector databases like Milvus, traditional databases with a vector search plugin like Elasticsearch, lightweight vector databases like Chroma, and many other technologies with vector search capabilities like FAISS. Even with this variety, not all vector databases are created equal.
Vector Search Technologies
Some offerings take an embedded, library-style approach; Chroma stands out as a leader in this regard. Its advantage lies in its minimal footprint, making it exceptionally straightforward for users to set up and start operating. However, like SQLite, Chroma is not a complete database system but a runtime library. Consequently, it lacks essential database functionalities such as data persistence, data recovery, and, notably, scalability.
Databases like PGVector and Pinecone have embraced a scale-up approach. When deployed on more advanced processors, they achieve superior performance within a single-node instance, which can inspire confidence in the short term. However, scaling up has limitations, primarily the physical constraints imposed by a single-node machine. Pinecone, for instance, supports a substantial number of pods but remains constrained by the CPU capacity of a single x86 machine. Surpassing these limits requires opting for the latest, more expensive CPU architectures.
Another drawback of the scale-up model is the risk of a single point of failure. If a node fails, all data associated with that node is lost. In contrast, the distributed architecture inherent to the scale-out approach allows for efficient data replication and failover mechanisms. In the worst case of losing a node in a distributed system with, say, 16 nodes, only a fraction (1/16) of the data is compromised. Recovering this smaller portion of data is more manageable and quicker, minimizing the risk of total data loss.
How Did We Build the Milvus Vector Database?
Before exploring our journey of building Milvus, let's discuss the essence of database systems. Broadly speaking, a comprehensive database system comprises a storage layer, a specified storage format, a data orchestration layer responsible for placing or caching data in appropriate locations, a query optimizer, and an efficient execution engine. Given the proliferation of heterogeneous architectures over the last decade, the execution engine and query optimizer must be flexible enough to adapt to and optimize for a broad spectrum of hardware infrastructures. This means accounting for a variety of underlying processors, spanning modern x86 CPUs, ARM processors, GPUs, and an array of accelerators designed explicitly for AI applications. Such integration enables the crafting of optimal execution plans that exploit the unique strengths of each processor type, thereby significantly enhancing the efficiency and performance of query execution.
Then, what is our philosophy for building the Milvus vector database system?
Embracing heterogeneous computing
Since its inception, Milvus has fully committed to heterogeneous computing, showcasing its versatility and high performance across various modern processors. It supports a range of hardware, from Intel and AMD CPUs to ARM processors and Nvidia GPUs. Milvus's integration capabilities cover AI vector processing tasks from basic linear algebra to intricate graph-based computations. This compatibility is crucial because each processor type has a unique instruction set, cache architecture, and execution model. Tailoring algorithms and optimizing the execution engine to match these distinct features maximizes performance and efficiency.
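To illustrate how this surfaces to users, here is a hedged sketch of choosing an index type to match the hardware at hand. It assumes pymilvus's MilvusClient, an existing collection with a vector field named "vector", and a GPU-enabled Milvus build for the GPU branch; the parameter values are illustrative, not tuned recommendations.

```python
# Hedged sketch: pick a CPU or GPU index depending on the deployment.
# Assumptions: a running Milvus at localhost:19530, a collection "docs" with a
# vector field "vector", and GPU index types available only in GPU-enabled builds.
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")

use_gpu = False  # assumption: flip to True on a GPU-enabled Milvus deployment

index_params = client.prepare_index_params()
if use_gpu:
    index_params.add_index(
        field_name="vector",
        index_type="GPU_CAGRA",          # graph index built and searched on Nvidia GPUs
        metric_type="L2",
        params={"intermediate_graph_degree": 64, "graph_degree": 32},
    )
else:
    index_params.add_index(
        field_name="vector",
        index_type="HNSW",               # graph index executed on CPUs
        metric_type="L2",
        params={"M": 16, "efConstruction": 200},
    )

client.create_index(collection_name="docs", index_params=index_params)
```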
Supporting both vertical and horizontal scalability
As data volumes continue to grow, scalability becomes a critical concern. We designed the Milvus system to address this challenge through both vertical (scaling up) and horizontal (scaling out) scalability. This capability entails developing diverse distributed algorithms to facilitate scaling out and adopting robust strategies for data consistency, synchronization, replication, and recovery in case of unexpected system failures.
Offering smooth developer experience from prototyping to production
We provide a suite of Milvus deployment modes to meet the unique needs of different stages of development: Milvus Lite for quick prototyping, Milvus Standalone for smaller-scale applications, Milvus Cluster for horizontal scalability, and Zilliz Cloud (the fully managed Milvus) for ease of management. In addition to maintaining the market-leading position in high-performance vector databases, we are committed to improving the experience of AI developers who are new to search. We will soon upgrade Milvus Lite to an even more beginner-friendly, easy-to-use deployment mode.
The core philosophy is straightforward: implement client-side code once and use it in any stage of your application development with tailored Milvus instances, from prototyping in a Jupyter notebook to a production service serving billions of documents, wherever vector search is needed.
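As a rough sketch of that philosophy, the snippet below shows the same client code pointed at different deployment targets simply by changing the connection URI. It assumes pymilvus's MilvusClient interface; the endpoints and token are placeholders, not real credentials.

```python
# The same client code runs against any Milvus deployment mode; only the
# connection target changes. URIs and the token below are placeholders.
from pymilvus import MilvusClient

# Prototyping: Milvus Lite stores everything in a local file.
client = MilvusClient("milvus_demo.db")

# Smaller-scale apps: point at a Milvus Standalone instance instead.
# client = MilvusClient(uri="http://localhost:19530")

# Production: Milvus Cluster, or fully managed Zilliz Cloud.
# client = MilvusClient(uri="https://<your-cluster-endpoint>", token="<your-api-key>")

# Everything below stays identical regardless of the deployment mode.
client.create_collection(collection_name="articles", dimension=768)
```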
For a detailed guide on choosing from the four different Milvus versions, see this blog about what Milvus version to start with.
How to Choose the Right Vector Database for Your Business?
When considering the shift to a vector database, there are two primary aspects to evaluate:
First, assess whether the performance of vector search is critical to your business. For example, if you're building a Retrieval Augmented Generation (RAG) solution that serves millions of users daily and is core to your business, the performance of vector computing becomes paramount. In such a case, opting for a pure vector database system is recommended. A specialized vector database like Zilliz Cloud not only ensures consistent, optimal performance but also aligns with your SLA requirements, providing reassurance for mission-critical services where performance is non-negotiable.
Second, consider the projected growth in data volume over time. As your service runs for an extended period, the volume of your datasets grows, making cost optimization an inevitable concern in your decision-making. Most pure vector database systems on the market deliver superior performance while requiring fewer resources, making them highly cost-effective. In this context, Milvus stands out, showcasing over 100 times greater cost-effectiveness than alternatives such as PGVector, OpenSearch, and other solutions without native vector search.
In addition to the above factors, performance, scalability, and functionality are among the top metrics for assessing a vector database. For a more detailed guide on evaluating vector databases, refer to this benchmarking blog.
When is a Full-Scale, Distributed Vector Database Unnecessary?
A full-scale vector database might be overkill for developers and organizations prototyping or testing RAG solutions; a locally running, lightweight vector database often suffices. To give these users a better experience, Milvus will offer greater support for local deployment, tailored for quicker setup during the initial stages of development.
Our commitment extends to providing a unified experience for developers, regardless of their project's scale or complexity. Whether you're trying out the AI stack on your laptop or looking for a scalable, production-ready vector search solution, Milvus ensures a smooth journey. As you progress from prototype to production, you can easily migrate to Docker- or Kubernetes-based deployments, gaining superior performance and customizability from the distributed architecture while keeping the same SDK and API interfaces. On this well-lit path, you only need to write your program once, and it runs seamlessly across environments, from laptops to data centers and public clouds. We aim to empower developers at every stage, offering flexibility without compromising the user experience.
What is the Future of Vector Databases?
We have witnessed an expansion in the functionalities offered by vector database systems. In the past few years, these systems primarily focused on providing a single functionality: approximate nearest neighbor search (ANN search). However, the landscape is evolving, and in the next two years, we will see a broader array of functionalities.
Traditionally, vector databases supported similarity-based search. Now, they are extending their capabilities to include exact search or matching. This versatility allows you to analyze your data through two lenses: a similarity search for a broader understanding and an exact search for nuances. By combining these two approaches, users can fine-tune the balance between obtaining a high-level overview and delving into specific details.
In some situations, obtaining a sketch of the data is sufficient, and semantic-based search works well. In others, where minute differences matter, users can zoom in on the data and scrutinize each entry for subtle features.
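For illustration, here is a hedged sketch of combining the two lenses in a single query: an approximate similarity search narrowed by an exact match on structured metadata. It assumes pymilvus's MilvusClient and a hypothetical collection named "articles" with a "category" scalar field; the query vector is a placeholder.

```python
# Hedged sketch: similarity search for the broad view, exact match for the nuances.
# Assumptions: an "articles" collection with 768-d vectors, a "title" field, and a
# "category" scalar field; the query vector below is a placeholder embedding.
from pymilvus import MilvusClient

client = MilvusClient("milvus_demo.db")          # any deployment mode works
query_vector = [0.1] * 768                       # placeholder: embed your query text here

results = client.search(
    collection_name="articles",
    data=[query_vector],                         # approximate, semantic similarity
    filter='category == "fraud_report"',         # exact match on structured metadata
    limit=5,
    output_fields=["title", "category"],
)
```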
Vector databases will likely support additional vector computing workloads, such as vector clustering and classification. These functionalities are not just additional features but are particularly relevant and impactful in applications like fraud detection and anomaly detection. Here, unsupervised learning techniques can be applied to cluster or classify vector embeddings, identifying common patterns and potentially preventing significant losses.
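Today this workload typically runs client-side: embeddings are pulled out of the vector database and clustered with a general-purpose library. The sketch below illustrates the idea with scikit-learn's KMeans; the collection and field names are assumptions, and built-in clustering of this kind is the forward-looking capability described above rather than something this sketch claims already ships.

```python
# Hedged sketch of anomaly detection over embeddings stored in a vector database.
# Assumptions: a "transactions" collection with an int64 "id" primary key and a
# "vector" field; cluster counts and thresholds are illustrative only.
import numpy as np
from pymilvus import MilvusClient
from sklearn.cluster import KMeans

client = MilvusClient("milvus_demo.db")
rows = client.query(
    collection_name="transactions",
    filter="id >= 0",
    output_fields=["vector"],
    limit=10_000,
)
embeddings = np.array([r["vector"] for r in rows])

# Unsupervised clustering of the embeddings; points far from their cluster
# centroid are flagged as candidate anomalies for review.
kmeans = KMeans(n_clusters=8, n_init="auto").fit(embeddings)
distances = np.linalg.norm(embeddings - kmeans.cluster_centers_[kmeans.labels_], axis=1)
suspects = np.argsort(distances)[-20:]
```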
In the next part of the blog series, I will share my insights into the evolution of AI technologies and how they influence the future of vector databases. Stay tuned!