How a Major Indian Online Retailer Scaled Product Matching with Milvus

75% Lower Memory Usage
Shifted embeddings from RAM to object storage, removing memory as a scaling bottleneck.
6x Faster Processing Times
Cut large catalog-matching jobs from up to 12 days to about 2 days.
~200ms Latency
Delivered stronger precision at scale while meeting strict latency targets.
Greater Operational Flexibility
Independent node scaling eliminates full index rebuilds during updates.
About the Company
The customer is one of India’s largest online retail platforms, often referred to as the “Amazon of India,” serving a nationwide audience across categories such as electronics, fashion, groceries, and household essentials. Beyond its consumer-facing marketplace, the company also operates a SaaS division that delivers end-to-end commerce solutions for enterprises and online sellers. A key component of this offering is a pricing intelligence system that helps retailers stay competitive in a market where price remains one of the strongest influencers of customer decision-making.
Supporting accurate, real-time pricing at a national scale, however, introduced significant engineering challenges. The team needed to match products across an ever-expanding digital catalog that grew from a few million SKUs to tens of millions, and to refresh those matches daily against several large competitors, each with an extensive inventory of its own. This explosive growth overwhelmed the existing keyword- and FAISS-based architecture, driving up infrastructure costs and slowing update cycles.
To address these bottlenecks, the team migrated its product matching pipeline to the Milvus Vector Database. With Milvus’s disk-based indexing and distributed architecture, they significantly reduced processing times and lowered operational costs, enabling a more scalable, sustainable, and high-performance system for enterprise-grade pricing management.
The Challenge: Scaling Product Matching at the Enterprise Level
The company’s pricing intelligence platform is powered by three core modules: competitive intelligence (tracking competitor prices using scraping and product matching), dynamic pricing (rule-driven adjustments based on market signals), and assortment intelligence (identifying gaps in a retailer’s catalog). As the platform onboarded more enterprise customers, the product-matching engine underpinning these capabilities began to show signs of strain.
Retailers managed more than 20 million SKUs, updated daily or even hourly, while tracking 10 or more competitors. Each competitor catalog added roughly 5 million more SKUs to crawl and compare, so the workload grew multiplicatively with every new client and competitor.
On top of the scale, the unstructured data itself was messy: product images in inconsistent resolutions, descriptions written in different styles, and the need to support both exact matches and “close enough” variants.
Precision was also critical. Even a small error rate could skew pricing recommendations, causing retailers to lose sales when items were priced too high, lose money when they were priced too low, or sit on inventory that wouldn’t move, directly reducing revenue and eroding retailers’ confidence in the system.
Resource optimization presented another major hurdle. The system needed to efficiently manage computational resources, including CPU, memory, and storage for continuous data processing and querying at a massive enterprise scale.
The legacy architecture, built on a single-server FAISS-based in-memory index, simply wasn’t designed for this level of data growth. Competitor embeddings were stored on local disk, then periodically loaded into memory for similarity search. While functional at smaller volumes, the design broke down at scale. Memory usage ballooned: the raw vectors for roughly 20 million 1,024-dimensional float32 embeddings occupy about 82 GB on their own (20 million × 1,024 dimensions × 4 bytes), and the full in-memory index with its working copies pushed the footprint to nearly 400 GB of RAM, causing infrastructure costs to spike. Performance degraded as well, with certain end-to-end processing jobs taking up to 12 days to complete. For customers expecting timely competitive insights, the system had clearly hit its limits.
Evaluating Paths to Scale and Choosing Milvus
Faced with these critical limitations, the engineering team considered three potential paths forward.
Vertical scaling: moving to larger, memory-heavy VMs. This could temporarily meet performance requirements, but hardware ceilings would quickly resurface and costs would climb sharply. It was clearly a stopgap, not a solution.
Extending their SQL database with vector search functionality. From an integration perspective, this approach was appealing, but the team quickly realized the risks: overloading their primary SQL database could slow down both transactional operations and vector queries, undermining the entire system’s reliability.
Adopting a dedicated vector database designed for similarity search at scale. This proved to be the most promising option. The team ran extensive benchmarks across Milvus, Pinecone, Qdrant, and Weaviate, testing their insertion speed, query latency, filtering precision, and deployment flexibility. In these evaluations, Milvus emerged as the clear leader.
Why Milvus: Key Decision Factors
During the evaluation, Milvus emerged as the only solution that checked every box, as it directly addressed the platform’s scaling and cost challenges. Below are the key factors:
Milvus’s distributed architecture allowed horizontal scaling and efficient resource utilization, giving the team the flexibility to handle billions of embeddings without overprovisioning infrastructure. In addition, Milvus’s tunable design enabled the engineering team to optimize the system to meet the exact workload demands of their applications.
A second differentiator was DiskANN, Milvus’s disk-based indexing algorithm. By reducing memory requirements by up to 75% compared to in-memory methods like HNSW, DiskANN made large-scale search not only feasible but cost-efficient. Combined with support for object storage such as S3, this provided the platform with a scalable and affordable foundation.
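To make this concrete, here is a minimal sketch of how a DiskANN index can be declared through pymilvus, assuming a Milvus 2.3+ deployment and an existing collection with a 1,024-dimensional `embedding` field; the collection name and connection details are illustrative, not taken from the customer’s system:

```python
# Minimal sketch: declaring a DiskANN index with cosine similarity
# via pymilvus. Names and connection details are illustrative.
from pymilvus import Collection, connections

connections.connect(host="localhost", port="19530")

collection = Collection("product_embeddings")  # hypothetical collection

collection.create_index(
    field_name="embedding",
    index_params={
        "index_type": "DISKANN",   # disk-resident graph index
        "metric_type": "COSINE",   # matches the cosine-based matching pipeline
        "params": {},              # DiskANN needs no special build parameters
    },
)
collection.load()  # the bulk of the index stays on disk, not in RAM
```

At query time, DiskANN exposes a `search_list` parameter that trades latency for recall, one knob for tuning toward a latency target like the ~200 ms mentioned later.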
Finally, Milvus’s pre-filtering capabilities aligned perfectly with the team’s search optimization strategy, enabling them to narrow down candidate sets by category, brand, or price before conducting a vector search. This significantly reduced the search space, improving both performance and accuracy.
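In pymilvus, this pre-filtering is expressed as a boolean `expr` passed alongside the vector search, so only rows that pass the scalar filter are considered during index traversal. A hedged sketch, reusing the hypothetical collection from the previous example and assuming scalar fields such as `sku_id`, `category`, `brand`, and `price`:

```python
# Hedged sketch of a pre-filtered similarity search. Field names and
# filter values are illustrative assumptions.
client_embedding = [0.0] * 1024  # placeholder for a real product vector

hits = collection.search(
    data=[client_embedding],
    anns_field="embedding",
    param={"metric_type": "COSINE", "params": {"search_list": 100}},
    limit=50,
    expr='category == "electronics" and price >= 1000 and price <= 5000',
    output_fields=["sku_id", "brand", "price"],
)
```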
The Solution: Building Product Matching at Scale with Milvus
After selecting Milvus, the customer’s engineering team designed a new architecture optimized for both scale and precision.
The pipeline begins with data ingestion, where client catalogs and competitor data are crawled and stored. A normalization layer then processes messy, unstructured data, such as inconsistent product images and descriptions, into standardized formats. Next, the team’s proprietary machine learning models, trained specifically on e-commerce data, generate 1,024-dimensional float32 vectors that capture the key attributes of each product. These embeddings are indexed and stored in Milvus, where similarity search compares client product embeddings against competitor catalogs using cosine similarity.
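A minimal sketch of what such a collection could look like in pymilvus follows. Only the 1,024-dimensional float32 embedding is taken from the description above; field names, types, and sizes are illustrative assumptions:

```python
# Hedged sketch of a product-matching collection. Only the 1,024-dim
# float32 vector is from the source; everything else is illustrative.
import random

from pymilvus import (
    Collection, CollectionSchema, DataType, FieldSchema, connections,
)

connections.connect(host="localhost", port="19530")

schema = CollectionSchema([
    FieldSchema("sku_id", DataType.VARCHAR, is_primary=True, max_length=64),
    FieldSchema("category", DataType.VARCHAR, max_length=128),
    FieldSchema("brand", DataType.VARCHAR, max_length=128),
    FieldSchema("price", DataType.DOUBLE),
    FieldSchema("embedding", DataType.FLOAT_VECTOR, dim=1024),
])
collection = Collection("product_embeddings", schema)

# Insert one normalized product; in production the embedding comes
# from the team's proprietary model (a random vector stands in here).
collection.insert([
    ["SKU-001"],                               # sku_id
    ["electronics"],                           # category
    ["acme"],                                  # brand (hypothetical)
    [1999.0],                                  # price
    [[random.random() for _ in range(1024)]],  # embedding
])
collection.flush()
```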
The search pipeline follows a multi-stage process. It begins with pre-filtering based on structured attributes such as category, brand, and price range, narrowing the candidate set. Milvus then performs vector similarity search within this filtered subset, followed by post-processing and scoring of the results. Finally, a threshold-based filter generates recommendations, with manual review applied to high-confidence matches. This layered approach strikes a balance between automation and oversight, ensuring both speed and accuracy at enterprise scale.
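As a rough illustration of the final stages, the sketch below consumes the `hits` from the filtered search shown earlier and applies a similarity threshold; the cutoff value and the routing logic are assumptions for illustration, not the team’s actual configuration:

```python
# Hedged sketch of scoring and threshold-based recommendation.
AUTO_MATCH_THRESHOLD = 0.92  # assumed value, not from the source

auto_matches, low_confidence = [], []
for hit in hits[0]:  # results for the first (and only) query vector
    # With the COSINE metric, Milvus reports similarity in
    # hit.distance: higher means a closer match.
    sku = hit.entity.get("sku_id")
    if hit.distance >= AUTO_MATCH_THRESHOLD:
        auto_matches.append((sku, hit.distance))
    else:
        low_confidence.append((sku, hit.distance))

# Per the pipeline above, high-confidence matches still pass through
# manual review before they feed pricing recommendations.
```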
By adopting Milvus, the team accelerated product-matching cycles while significantly reducing infrastructure costs. More importantly, they established a future-proof foundation capable of supporting enterprise customers with massive catalogs and highly dynamic competitive landscapes.
The Results: Scalable, Cost-Efficient, and Accurate Product Matching
Migrating to Milvus reshaped how the pricing intelligence platform handles product matching at enterprise scale. What was once constrained by memory limits, long processing cycles, and rigid operations has become efficient, accurate, and ready to scale with enterprise growth.
Lower Infrastructure Costs: The previous FAISS setup required loading all embeddings into memory, making scaling both expensive and impractical. By moving to Milvus, the customer’s engineering team cut memory requirements by up to 75%, shifting storage to S3 and GCP buckets. What used to be a cost barrier is now a sustainable foundation for expansion.
6x Faster Processing Times: Large catalog-to-catalog matching jobs that once stretched to 12 days now finish in about 2 days across 20M searches. While still batch-based, this sixfold improvement ensures competitive intelligence stays current enough to inform real-time pricing decisions.
Higher Accuracy at Scale: For pricing, precision is non-negotiable. Milvus’s DiskANN index delivered stronger accuracy than in-memory alternatives such as HNSW while meeting the team’s ~200 ms latency target for batch queries. By combining structured filters (category, brand, price) with vector similarity search, the team minimized costly mismatches and built confidence in its recommendations.
Greater Operational Flexibility: With Milvus, the team no longer needs full index rebuilds to handle updates. Its distributed architecture allows query, index, and data nodes to scale independently. A hybrid integration with MySQL further streamlined the workflow, pairing structured filtering with vector search for maximum efficiency.
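One hedged way such a hybrid can look in code, with MySQL selecting candidates and Milvus ranking them by similarity; every table, column, and connection detail below is an illustrative assumption, and in practice the candidate list would be bounded or the filter pushed into Milvus directly:

```python
# Hedged sketch of the hybrid MySQL + Milvus pattern: structured
# candidate selection in SQL, vector ranking in Milvus.
import pymysql
from pymilvus import Collection, connections

def match_product(client_embedding, category, max_price):
    # Stage 1: narrow candidates with an ordinary SQL query.
    db = pymysql.connect(host="mysql-host", user="app",
                         password="***", database="catalog")
    with db.cursor() as cur:
        cur.execute(
            "SELECT sku_id FROM competitor_products "
            "WHERE category = %s AND price <= %s",
            (category, max_price),
        )
        candidate_ids = [row[0] for row in cur.fetchall()]

    # Stage 2: rank only those candidates by vector similarity.
    connections.connect(host="milvus-host", port="19530")
    collection = Collection("product_embeddings")
    return collection.search(
        data=[client_embedding],
        anns_field="embedding",
        param={"metric_type": "COSINE", "params": {"search_list": 100}},
        limit=20,
        expr=f"sku_id in {candidate_ids}",
    )
```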
Conclusion
For the team behind this platform, better product matching meant more than just faster processing — it created a stronger foundation for their entire pricing engine. By adopting Milvus, they gained the ability to handle massive, messy catalogs with accuracy and at a sustainable cost. Using DiskANN for indexing, a self-hosted distributed architecture for scale, and a hybrid approach integrated with their existing databases, the team built a system that is both practical and resilient.
This shift has allowed them to deliver reliable competitive insights and pricing recommendations that enterprise clients can act on with confidence. As e-commerce catalogs grow and competition intensifies, this experience demonstrates that vector databases provide a practical approach to achieving both scale and precision — qualities that are now essential for staying competitive in rapidly evolving markets.