Beyond PGVector: When Your Vector Database Needs a Formula 1 Upgrade

Jan 13, 2025 · 6 min read

Postgres, a stalwart of the relational database world, has served developers faithfully for more than 28 years. With the pgvector extension, Postgres now supports vector embeddings, offering a convenient entry point for basic vector similarity search.

However, while pgvector provides a practical starting point, it still falls short compared to purpose-built vector databases like Milvus, especially when handling large-scale applications and complex search requirements. Relying solely on Postgres with pgvector for demanding vector search workloads is like trying to enter a Formula 1 race with a souped-up family sedan—it’s a step up, but it’s simply not built for that level of competition.

As AI applications explode in popularity, developers are encountering growing pains. What starts as a convenient solution with pgvector quickly becomes a frustrating bottleneck as data grows and search requirements become more sophisticated. Search quality declines, index updates drag on, and frustration rises as you struggle to meet your application's demands.

This blog explores why Postgres, with its vector search add-on, pgvector, works well for smaller projects and simpler use cases but reaches its limits for large-scale vector search. We’ll also discuss why purpose-built vector databases like Milvus are indispensable for tackling the unique challenges of this rapidly advancing field.

The Postgres and Pgvector Bottleneck

Think of Postgres as a sedan: it has been around for years and it is dependable, but it was not built for top speed. While pgvector adds vector storage and basic similarity search to Postgres, it inherits some fundamental limitations:

Performance at Scale: pgvector supports only two index types: HNSW and IVFFlat. While HNSW is a popular algorithm, it comes with significant trade-offs, including long index build times and higher memory requirements. IVFFlat, on the other hand, builds faster but struggles to maintain query performance as the dataset grows. The lack of on-disk indexes like DiskANN and of GPU-based index types further limits performance and flexibility on large-scale datasets.
High-Dimensional Embeddings: pgvector struggles with high-dimensional vector embeddings because of architectural constraints. Postgres stores data in fixed 8 KB pages, which restricts the number of dimensions an indexed vector can have. Each dimension takes 4 bytes as a float, so 8,192 bytes leaves room for at most 2,048 dimensions before any metadata is counted, and pgvector's indexes accept at most 2,000 dimensions for its standard vector type. In contrast, purpose-built databases like Milvus are designed to handle high-dimensional embeddings easily. Workarounds such as quantization exist in pgvector, but they often require compromising on precision.
Lack of Advanced Features: pgvector lacks the comprehensive feature set of purpose-built vector databases. For example, Milvus supports advanced metadata filtering, a broader range of distance metrics beyond L2 and inner product, hybrid sparse and dense search, and even full-text search (available in Milvus 2.5).
Scalability Challenges: Scaling pgvector to handle large datasets and high query loads is non-trivial. It often requires substantial effort to implement sharding and manage indexes across multiple nodes, introducing additional complexity and operational overhead. Purpose-built vector databases are designed with scalability in mind, offering seamless performance even as datasets and query demands grow.
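The page-size arithmetic behind the dimension limit can be checked in a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the default 8 KB (8,192-byte) Postgres page and the 4-byte floats described above, it ignores page and tuple metadata entirely, and the function name and sample dimension counts are illustrative rather than taken from pgvector.

```python
PAGE_SIZE_BYTES = 8192  # assumed default Postgres page size (8 KB)


def max_dims_per_page(bytes_per_dim: int, page_size: int = PAGE_SIZE_BYTES) -> int:
    """Upper bound on dimensions that fit in one page, ignoring metadata overhead."""
    return page_size // bytes_per_dim


float32_bound = max_dims_per_page(4)  # 4-byte floats
float16_bound = max_dims_per_page(2)  # 2-byte floats, i.e. halved precision

# Hypothetical embedding sizes, chosen only to show both outcomes.
for dims in (768, 1536, 3072):
    size_bytes = dims * 4
    fits = dims <= float32_bound
    print(f"{dims} dims -> {size_bytes} bytes, fits in one page: {fits}")
```

Since metadata also needs room on the page, the practical ceiling sits somewhat below these bounds. The second bound shows why lower-precision storage is a workaround: halving the bytes per dimension doubles the headroom, at the cost of precision.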

Milvus: The Formula 1

Milvus is an open-source vector database engineered from the ground up to address the specific demands of vector similarity search at scale. Think of it as a Formula 1 car, meticulously designed for speed and performance in the high-stakes world of vector data.

Here's how Milvus outperforms Postgres with pgvector:

Blazing Fast Search: Milvus supports 11 state-of-the-art index types, including FLAT, HNSW, DiskANN, and GPU-accelerated indexes such as CAGRA, to deliver fast search even with tens of billions of vectors.
Effortless Scalability: Milvus has a distributed and Kubernetes-native architecture. It enables seamless horizontal scaling, allowing you to handle massive datasets and high query throughput without the complexities of manual sharding.
Comprehensive Feature Set: Milvus offers a comprehensive suite of features, including metadata filtering, support for various distance metrics, full-text search, hybrid search, and flexible indexing options to tailor your search strategy to your specific needs.
Optimized for the Future of Data: Milvus is designed to handle the scale and complexity of the ever-growing volume of unstructured data represented as vectors, making it the ideal solution for the next generation of AI applications.
Continuous Innovation: Just like a Formula 1 team constantly pushes the boundaries of performance, Milvus is continually evolving with cutting-edge indexing algorithms, hardware acceleration support, and machine learning-driven optimizations.
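To make "metadata filtering" and "distance metrics" concrete, here is a toy, brute-force sketch in pure Python. It is not the Milvus API and not how Milvus implements search (a real vector database uses approximate indexes instead of scanning every record); the function names, record layout, and sample data are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def filtered_search(records, query, predicate, top_k=2):
    """Keep records whose metadata passes `predicate`, then rank by similarity."""
    candidates = [r for r in records if predicate(r["meta"])]
    candidates.sort(key=lambda r: cosine_similarity(r["vector"], query), reverse=True)
    return [r["id"] for r in candidates[:top_k]]


# Hypothetical toy data: three 2-dimensional embeddings with one metadata field.
records = [
    {"id": "a", "vector": [1.0, 0.0], "meta": {"lang": "en"}},
    {"id": "b", "vector": [0.9, 0.1], "meta": {"lang": "fr"}},
    {"id": "c", "vector": [0.0, 1.0], "meta": {"lang": "en"}},
]

# Record "b" is close to the query but is excluded by the metadata filter.
hits = filtered_search(records, query=[1.0, 0.0], predicate=lambda m: m["lang"] == "en")
print(hits)
```

Swapping `cosine_similarity` for another scoring function (L2 distance or inner product, for example) changes the ranking without touching the filter, which is the kind of flexibility the feature list above describes.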

Making the Right Choice: When to Use What

While Postgres with pgvector might not be a Formula 1 car, it still has its place in the garage. Let's explore when to use each solution:

Choose pgvector when:

You're building a proof of concept or MVP with small to medium datasets.
Your vector search needs are simple and don't require complex filtering.
Your embedding models produce vectors with dimensions under the Postgres page size limits.
You need ACID compliance and strong transactional guarantees.

Choose Milvus when:

You're working with large-scale datasets (millions to billions of vectors).
You need high-dimensional embeddings beyond pgvector's limitations.
Query performance is critical to your application.
You require advanced features like diverse indexing options or GPU acceleration.
You anticipate rapid growth and need a solution that scales horizontally.
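The two checklists above can be folded into a small helper for a first-pass decision. This is a minimal sketch, not an official sizing guide: the `Workload` fields, the `suggest_backend` name, and both thresholds are assumptions made for illustration (one million vectors stands in for "large-scale", and 2,000 dimensions stands in for pgvector's indexable limit).

```python
from dataclasses import dataclass


@dataclass
class Workload:
    num_vectors: int
    dimensions: int
    needs_gpu_or_disk_index: bool = False
    expects_rapid_growth: bool = False


# Illustrative thresholds only; tune them to your own measurements.
LARGE_SCALE_VECTORS = 1_000_000
MAX_INDEXABLE_DIMS = 2_000


def suggest_backend(w: Workload) -> str:
    """Return "milvus" if any checklist item points that way, else "pgvector"."""
    reasons_for_milvus = [
        w.num_vectors >= LARGE_SCALE_VECTORS,
        w.dimensions > MAX_INDEXABLE_DIMS,
        w.needs_gpu_or_disk_index,
        w.expects_rapid_growth,
    ]
    return "milvus" if any(reasons_for_milvus) else "pgvector"


print(suggest_backend(Workload(num_vectors=50_000, dimensions=768)))
print(suggest_backend(Workload(num_vectors=500_000_000, dimensions=768)))
```

A helper like this cannot capture requirements such as ACID guarantees or query latency targets, so treat its answer as a prompt for benchmarking rather than a verdict.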

Moving Your Vectors to Milvus with Our Migration Service

If you are using pgvector and running into these issues, we offer an open-source migration tool called VTS (short for Vector Transport Service) to help you move your vectors and unstructured data to Milvus or to Zilliz Cloud, its managed service.

Built on top of Apache SeaTunnel, VTS offers:

Rich, extensible connectors
Unified stream and batch processing for real-time synchronization and offline batch imports
Distributed snapshot support for data consistency
High performance, low latency, and scalability
Real-time monitoring and visual management

In addition to pgvector, VTS supports migrating vector data from various sources, including Elasticsearch, Pinecone, Qdrant, and Tencent Cloud VDB, to purpose-built vector databases like Milvus. It also enables seamless vector migration between open-source Milvus and Zilliz Cloud in both directions.

To simplify the migration process, VTS automatically handles schema conversion, eliminating the need for complex setup and development efforts. In 2025, VTS will expand its capabilities to support data migration from additional sources like MongoDB and Weaviate. Future versions will also introduce the ability to generate vector embeddings on the fly, allowing unstructured data to be easily converted and ported to vector databases for accelerated approximate nearest neighbor (ANN) search. Stay tuned for these exciting updates!

Figure: How VTS works

The Road Ahead

The landscape of vector databases continues to evolve alongside the rapid advancement of AI technologies. While pgvector provides a convenient entry point, the demands of production-scale AI applications often necessitate purpose-built solutions.

The choice between pgvector and Milvus is more than a technical decision. It is a strategic investment in your application's future scalability. Just as a Formula 1 team selects its equipment based on performance requirements, organizations must evaluate their vector search needs against their growth trajectory.

With tools like VTS streamlining the migration process, companies can confidently move their vector search workloads once their requirements outgrow what pgvector offers. Whether architecting new applications or scaling existing ones, early consideration of vector search requirements can prevent technical debt and ensure sustainable growth.

We'd Love to Hear What You Think!

If you like this blog post, please consider:

⭐ Giving us a star on GitHub
💬 Joining our Milvus Discord community to share your experiences or get help moving from pgvector
🔍 Exploring our Bootcamp repository for examples of applications using Milvus

Updated on Mar 24, 2025

Stephen Batifol
Stephen Batifol is a Developer Advocate at Zilliz. He previously worked as a Machine Learning Engineer at Wolt, where he worked on the ML Platform, and as a Data Scientist at Brevo. Stephen studied Computer Science and Artificial Intelligence. He enjoys dancing and surfing.
