Democratizing Vector Databases: Empowering Access & Equality
This post was originally published in TheSequence and is reposted here with permission.
The 21st century is all about the democratization of technology. The internet boom enabled large-scale collaboration, which made open source a typical software adoption pattern. As the pace and scale of technological innovation grow, we must work to make it more accessible.
To me, as a software engineer, democratizing technology means making it as widely available as possible. It means using what I know to make creating, adopting, and understanding technological advances easier for others. Here at Zilliz, we have always been about accelerating the adoption of vector databases, not just about increasing adoption of the open source project Milvus.
In this article, I’m going to cover:
What democratizing vector databases means for devs
Pillars of technology democratization
Education on vector databases and related tooling
Increasing accessibility to vector databases
Technology evangelism
A summary of democratizing vector databases
What does democratizing vector databases mean for devs?
When I hear “democratize” in a phrase like “democratizing XYZ technology,” I think of expanding access to that technology. Democratizing vector databases, then, means expanding access to them. Traditionally, vector databases have been available only to software developers at enterprises.
Milvus began the process of democratizing vector databases when it became an open-source Linux Foundation project. It was one of the first vector databases that developers could access as open source. As the project has grown, more and more developers have been able to use, learn about, and contribute to vector databases.
Pillars of technology democratization
Democratization of technology comes with specific challenges — especially the democratization of complex tools like vector databases. There are three pillars to look at when it comes to democratizing technology. They are education, increasing accessibility, and evangelism.
Education on vector databases and related tooling
Education is the most crucial pillar, and one that many companies get wrong. It covers not only your specific product but also the technology at large and related tools. That’s why here at Zilliz, we create content about many things, not just Milvus.
The content we write reflects our desire to accelerate the adoption of vector databases through education. Therefore, we have content about essential concepts like Hierarchical Navigable Small Worlds (HNSW), scalar and product quantization, and inverted file indices.
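To give a flavor of one of these concepts, here is a minimal sketch of scalar quantization in plain Python. It assumes a simple per-vector min/max mapping of floats to 8-bit integer codes; the function names are hypothetical, and this is an illustration of the general idea rather than how Milvus implements it.

```python
from typing import List, Tuple


def scalar_quantize(vector: List[float]) -> Tuple[List[int], float, float]:
    """Map each float to an integer code in [0, 255] using the vector's min and max."""
    lo, hi = min(vector), max(vector)
    # For a constant vector hi == lo, so fall back to a scale of 1.0
    # to avoid dividing by zero below.
    scale = (hi - lo) / 255 or 1.0
    codes = [round((x - lo) / scale) for x in vector]
    return codes, lo, scale


def scalar_dequantize(codes: List[int], lo: float, scale: float) -> List[float]:
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]


if __name__ == "__main__":
    original = [0.12, -0.50, 0.98, 0.33]
    codes, lo, scale = scalar_quantize(original)
    restored = scalar_dequantize(codes, lo, scale)
    print(codes)     # one small integer per dimension
    print(restored)  # close to the original values, not identical
```

Each dimension is stored as a single byte-sized code instead of a full float, at the cost of a rounding error of at most about half of `scale` per value. That trade of a little accuracy for a smaller memory footprint is the point of the technique.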
In addition to providing resources for understanding the concepts behind vector databases, we must provide resources for related tooling. For example, the popularity of large language models (LLMs) has ushered in a range of new tools.
Some new tools that have come to the forefront include LlamaIndex, Auto-GPT, and LangChain. Additionally, in line with our goal of providing educational resources for the community, many of our content pieces go out to third parties, such as TheSequence, The New Stack, and some Medium publications.
Increasing accessibility to vector databases
While providing education about technology is excellent, it’s not helpful unless you offer ways to access it. In our case, open-sourcing Milvus was the first step to increasing access to vector databases. Moving beyond simply being open source, the Milvus project has also pursued other avenues of increasing accessibility.
Milvus is also available through Docker images with templates for Docker Compose and Helm. In addition, we recently made it available through pip install as Milvus Lite. Milvus has garnered over 3.5M Docker downloads, 18.8k GitHub stars, and 212 contributors.
Zilliz has also worked to increase accessibility beyond Milvus. Initially, Zilliz provided $400 in free credits, and it now offers a free tier that allows up to half a million vectors. With Zilliz Cloud’s free tier, almost any developer can get started with vector databases at no cost.
Technology evangelism
The last pillar to address in democratizing technology is to evangelize it. What use is making technology available and providing education about it if you don’t tell people why it’s useful? In terms of accelerating adoption, education explains the how, and increasing the accessibility accounts for the what — evangelizing provides the why.
We do evangelism mainly through content that shows the power of vector databases. You can see this in some of the educational material I mentioned above. We also give talks about vector databases, some virtual and some in person. I recently gave a talk in Seattle about using vector databases to solve data problems with LLMs.
Summing up: democratizing vector databases
Democratizing vector databases is critical because vector databases solve many problems in unstructured data. They were previously only available to developers at large companies due to the sheer complexity and scale of such a project. However, the popularity of LLMs has thrust the idea of vector databases into the mainstream and given rise to countless use cases that they didn’t have before. This makes democratization even more critical.
At Zilliz, we approached democratization with three pillars — education, accessibility, and evangelism. Education is the most crucial part of these pillars for us. For developers, educational material provides the “how” to use vector databases and complementary tools.
Additionally, we’ve always worked to increase accessibility and continue to do so. Open-sourcing the software was the first step. Other steps to increase accessibility include providing templates and images for containerization. We recently released Milvus Lite, a vector database that can run directly in your Jupyter Notebook.
Finally, we engage in technical evangelism to spread the word about the use and power of vector databases. We do this by hosting webinars, speaking at community events, and being present on social media.
Zilliz continues to work on democratizing vector databases, which is exciting given how important they are. Vector databases are critical for solving data problems in LLMs and are the best existing solution for things like reverse image search, semantic text search, and product recommendations. I’m personally excited to be part of a team that’s helping to grow the vector database space, and I look forward to all the amazing things being built for and by the community!