How Gorgias Scaled Its Conversational AI Agents for 15,000+ Merchants with Zilliz Cloud

Real-Time Responses: Fast retrieval across product and knowledge searches.
Smarter Answers: Richer metadata, higher relevance, better personalization.
Simpler Operations: No manual indexing or workarounds.
More Developer Focus: Time spent improving AI, not managing infra.
About Gorgias
Gorgias builds conversational AI agents for e-commerce brands, tightly integrated with Shopify and other commerce ecosystems. The platform powers over 15,000 merchants, helping them deliver personalized, efficient customer experiences at scale. At the core of Gorgias’ product is an AI agent designed to replicate the warmth and precision of in-store service — answering questions, recommending products, and handling tasks such as returns and order tracking, all through conversational interfaces.
To deliver this level of personalization, Gorgias relies heavily on vector search. The AI agent must instantly retrieve relevant information from each merchant’s product catalog, customer history, and help center content — all while maintaining accuracy and context across thousands of unique stores. As usage grew, the team struggled to keep searches fast and consistent while supporting thousands of merchants simultaneously.
To overcome these limitations, Gorgias migrated to Zilliz Cloud, the fully managed service for Milvus. This shift allowed the company to consolidate its AI search infrastructure, enabling real-time semantic retrieval and recommendation across millions of customer interactions. With Zilliz Cloud, Gorgias reduced operational complexity, improved response quality, and gained the flexibility to support rapid product evolution, all while maintaining consistent performance for its growing merchant network.
The Legacy System Hit Its Limits at Massive Tenant Scale
Gorgias initially relied on a competing vector database for its vector search infrastructure. However, that platform's metadata size restrictions made it difficult to represent Shopify's complex product variants, such as combinations of color, size, and gender. It also imposed limitations on query depth and filtering capabilities, which constrained Gorgias' ability to deliver highly contextual, brand-specific experiences. To address performance ceilings on the previous database's dedicated tier, the team transitioned to its serverless version, but encountered even steeper costs and additional feature limitations. These challenges ultimately led them to migrate the majority of their vector workloads to Zilliz Cloud.
At the same time, Gorgias was scaling to support millions of end customers across more than 15,000 merchants, each operating their own unique brand. While Gorgias' customers are merchants, its AI agent must act on behalf of each merchant's brand, capturing tone, voice, catalog, and customer context. This meant every interaction needed to retrieve results that aligned not only with the merchant's data but also with how that brand presents itself to shoppers. Supporting that level of brand-specific personalization across a multi-tenant architecture pushed the limits of the existing vector infrastructure, underscoring the need for a more flexible, performant, and reliable solution.
Scaling the Customer Support Agent with Zilliz Cloud
Gorgias built its AI agent around a modular command center that processes customer messages and delegates them to specialized task workflows. Depending on the nature of the request, whether a support inquiry, a product question, or a sales opportunity, the agent retrieves relevant knowledge, identifies matching products, or surfaces past tickets. These workflows rely on embedding the input query, retrieving candidates from Zilliz Cloud, re-ranking them, and then prompting an LLM to synthesize a response.
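In outline, that loop is straightforward to express with pymilvus against a Zilliz Cloud cluster. The following is a minimal sketch, not Gorgias' actual code: the collection name, filter field, and the `embed`, `rerank`, and `generate_reply` helpers are all illustrative assumptions standing in for proprietary components.

```python
# Illustrative sketch of the embed -> retrieve -> re-rank -> generate loop.
# Collection and field names are assumptions, not Gorgias' actual schema.
from pymilvus import MilvusClient

client = MilvusClient(
    uri="https://<your-zilliz-cloud-endpoint>",  # Zilliz Cloud cluster endpoint
    token="<your-api-key>",
)

# Hypothetical stand-ins for Gorgias' proprietary models and LLM calls.
def embed(text: str) -> list[float]: ...
def rerank(query: str, hits: list[dict]) -> list[dict]: ...
def generate_reply(query: str, context: list[dict]) -> str: ...

def answer(message: str, tenant_id: str) -> str:
    hits = client.search(
        collection_name="knowledge_articles",    # assumed collection name
        data=[embed(message)],
        limit=20,                                # over-fetch for re-ranking
        filter=f'tenant_id == "{tenant_id}"',    # stay within one merchant
        output_fields=["title", "body"],
    )[0]
    return generate_reply(message, rerank(message, hits)[:5])
```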
For support tasks, the agent retrieves knowledge articles and past ticket examples from multiple collections hosted in Zilliz Cloud. These include both merchant-authored content and data automatically scraped from the merchant's site. For sales and product-related tasks, Gorgias stores entire product catalogs as embeddings and filters recommendations based on customer behavior and preferences, including exclusion logic such as avoiding certain colors or allergens, as sketched below. All results are ultimately composed into a unified message by a final LLM step that aggregates insights from the individual workflows.
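Exclusion logic of this kind maps naturally onto Milvus metadata filtering. A hedged sketch, reusing the `client` and `embed` stand-ins from above and assuming hypothetical `color` (scalar) and `allergens` (array) fields on the product collection:

```python
# Sketch: product search with exclusion logic pushed into the Milvus filter.
# The "color" scalar field and "allergens" array field are assumptions.
results = client.search(
    collection_name="products",
    data=[embed("insulated coffee mug")],
    limit=10,
    filter='color != "white" and not ARRAY_CONTAINS(allergens, "peanut")',
    output_fields=["name", "price", "color"],
)
```

Because the filter is evaluated server-side, excluded items never reach the re-ranking or LLM stages at all.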
The Gorgias architecture enables parallel processing of thousands of customer interactions, with tenant data isolated through partition keys in Zilliz Cloud. A feedback loop continuously refines the relevance of knowledge retrieval by mapping historical customer phrasing to specific knowledge resources. This reinforcement mechanism improves response accuracy even when the customer's language deviates from standard prompts.
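Partition-key-based multi-tenancy is a documented Milvus pattern. A sketch of what such a collection definition could look like; the vector dimension and field names are assumptions:

```python
# Sketch: collection schema using a partition key for tenant isolation.
from pymilvus import MilvusClient, DataType

schema = MilvusClient.create_schema(auto_id=True)
schema.add_field("id", DataType.INT64, is_primary=True)
schema.add_field("tenant_id", DataType.VARCHAR, max_length=64,
                 is_partition_key=True)           # routes rows per merchant
schema.add_field("body", DataType.VARCHAR, max_length=65535)
schema.add_field("embedding", DataType.FLOAT_VECTOR, dim=768)  # assumed dim

client.create_collection(
    collection_name="knowledge_articles",
    schema=schema,
    num_partitions=64,  # tenants are hashed across physical partitions
)
```

Searches then include the partition key in their filter (as in the earlier sketch), so each query touches only the requesting merchant's slice of the index.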
For example, if a customer says, “I’m wondering why my delivery is so late,” the system learns to associate that phrasing with the appropriate knowledge article typically linked to the more common query, “Where is my order?” On the product side, Gorgias is exploring ways to improve recommendations by filtering out undesired traits — such as avoiding mugs described as “white” when a customer says “I hate the color white” — essentially reversing typical vector search to prioritize dissimilar results when context calls for it.
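One way to approximate that reversed search is to over-fetch candidates and re-score them against an embedding of the disliked trait. A sketch under the assumptions of normalized vectors and a cosine or inner-product metric (where a higher score means more similar), again reusing `client` and `embed`; the penalty weight is a guess:

```python
# Sketch: penalize candidates that sit close to a "disliked" embedding.
# Assumes a COSINE/IP metric and that the stored vector field is returned.
import numpy as np

def search_with_dislike(query: str, dislike: str, tenant_id: str) -> list[dict]:
    hits = client.search(
        collection_name="products",
        data=[embed(query)],
        limit=50,                                # over-fetch, then re-score
        filter=f'tenant_id == "{tenant_id}"',
        output_fields=["name", "embedding"],
    )[0]
    dislike_vec = np.asarray(embed(dislike))     # e.g. embed("white")
    def score(hit: dict) -> float:
        closeness = float(np.dot(hit["entity"]["embedding"], dislike_vec))
        return hit["distance"] - 0.5 * closeness  # 0.5 weight is illustrative
    return sorted(hits, key=score, reverse=True)[:10]
```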
Technical Implementation Details
Gorgias’ AI agent workflow begins with message intake. An internal orchestration layer routes incoming messages through a “command center” of LLMs, which classify the request and determine the appropriate downstream tasks. Each task — whether it involves retrieving support knowledge, related past tickets, or relevant products — uses vector embeddings and queries one or more indexes in Zilliz Cloud.
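At its simplest, that routing step is a classification call followed by a dispatch table. A hypothetical sketch; the intent labels and handler functions are assumptions about how such a command center could be organized:

```python
# Sketch: command-center routing from classified intent to a task workflow.
def classify_intent(message: str) -> str:
    """Hypothetical LLM call returning 'support', 'product', or 'sales'."""
    ...

def handle_support(message: str, tenant_id: str) -> str: ...
def handle_product(message: str, tenant_id: str) -> str: ...
def handle_sales(message: str, tenant_id: str) -> str: ...

TASKS = {
    "support": handle_support,   # knowledge + past-ticket retrieval
    "product": handle_product,   # catalog search and recommendations
    "sales": handle_sales,       # conversion-oriented flows
}

def route(message: str, tenant_id: str) -> str:
    intent = classify_intent(message)
    return TASKS.get(intent, handle_support)(message, tenant_id)
```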
These embeddings are generated using proprietary models hosted on Hugging Face. Retrieval results are re-ranked based on context, and the final LLM composes a complete response. In production, this system supports high concurrency and handles merchant-specific customization automatically through metadata, including language, tone of voice, product features, and business rules.
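Since the actual models are proprietary, here is what the embed-and-re-rank step could look like using public Hugging Face stand-ins; the model choices below are assumptions, not the models Gorgias runs:

```python
# Sketch: embedding and cross-encoder re-ranking with public stand-in models
# in place of Gorgias' proprietary Hugging Face-hosted models.
from sentence_transformers import SentenceTransformer, CrossEncoder

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def embed(text: str) -> list[float]:
    return embedder.encode(text, normalize_embeddings=True).tolist()

def rerank_texts(query: str, docs: list[str], top_k: int = 5) -> list[str]:
    scores = reranker.predict([(query, doc) for doc in docs])
    ranked = sorted(zip(scores, docs), key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in ranked[:top_k]]
```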
During development, the team uses batch ingestion and parallel workflows to validate retrieval logic. Monitoring and observability are ongoing areas of investment, particularly as new product categories and merchant types are onboarded.
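For illustration, batch ingestion against Zilliz Cloud might look like the following sketch; the batch size, concurrency level, and field names are assumptions:

```python
# Sketch: parallel batch ingestion of a merchant catalog into Milvus.
from concurrent.futures import ThreadPoolExecutor

def ingest_catalog(products: list[dict], tenant_id: str,
                   batch_size: int = 500) -> None:
    rows = [
        {"tenant_id": tenant_id,
         "embedding": embed(p["description"]),   # embed helper from above
         "name": p["name"]}
        for p in products
    ]
    batches = [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(
            lambda batch: client.insert(collection_name="products", data=batch),
            batches,
        ))
```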
Production Results: Simplified Architecture, Faster AI at Scale
A simpler system that freed engineers to work on the AI agent: After migrating to Zilliz Cloud, Gorgias removed many of the workarounds and custom indexing logic required by the previous system. This reduced infrastructure complexity, allowing developers to spend more time improving the AI agent rather than maintaining the vector search layer.
Faster searches with better results: Search latency decreased for both product data and knowledge content. At the same time, the system could store and query richer metadata, thereby improving search relevance and enabling more accurate, personalized responses.
More efficient parallel task execution: The platform now handles parallel workflows more efficiently, retrieving, ranking, and generating responses at scale without performance bottlenecks.
Lower operational overhead and more predictable costs: With fewer moving parts and constraints, infrastructure overhead was reduced, and cost management became more predictable as usage grew.
Better customer experiences: These improvements led to faster response times, higher-quality support, and the ability to personalize interactions at scale, helping merchants convert more shoppers and build deeper customer relationships.
Developer/Engineering Insights
Firas Jarboui, the ML engineering lead at Gorgias, shared that reliability and flexibility were two of the most critical needs when selecting a new vector database provider. The limitations of their legacy system forced the team to consider alternatives, and a conference session by the Zilliz team introduced them to Milvus and Zilliz Cloud at just the right moment. Firas also noted that multi-representation search, the ability to store and weight multiple embeddings per item, is not yet in use but is a strategic capability that Gorgias plans to adopt; it would enable more nuanced product matching across varied customer contexts.
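In Milvus terms, multi-representation search corresponds to multi-vector hybrid search: each item carries several vector fields, and a ranker weights the sub-searches. A sketch of that pattern, with the caveat that the field names and weights are assumptions and, per the above, this is not something Gorgias runs in production today:

```python
# Sketch: weighted multi-vector hybrid search in Milvus, one possible shape
# of "multi-representation" retrieval. Field names and weights are assumed.
from pymilvus import AnnSearchRequest, WeightedRanker

query_vec = embed("minimalist black mug")
text_req = AnnSearchRequest(
    data=[query_vec],
    anns_field="text_vector",         # e.g. description embedding
    param={"metric_type": "COSINE"},
    limit=20,
)
style_req = AnnSearchRequest(
    data=[query_vec],
    anns_field="style_vector",        # e.g. brand/style embedding
    param={"metric_type": "COSINE"},
    limit=20,
)
results = client.hybrid_search(
    collection_name="products",
    reqs=[text_req, style_req],
    ranker=WeightedRanker(0.7, 0.3),  # weight text match over style match
    limit=10,
)
```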
He also emphasized the importance of maintaining clean multi-tenant separation, which Zilliz Cloud enables through partition-key-level isolation. For future improvements, Gorgias is particularly interested in expanding filtering logic and negative similarity search, such as recommending products that are explicitly not similar to user dislikes.
Future Plans & Roadmap
Looking ahead, Gorgias is building a new merchant-facing AI tool — one that lets sellers ask questions about their own customers, such as sentiment trends and product-specific feedback. This complements the existing customer-facing agent and aims to bring lightweight BI-style insights into the conversational interface, without requiring a data science team. To support this, the team will index entire ticket histories and extract product-specific sentiment embeddings.
On the retrieval side, Gorgias is working to implement advanced filtering and contextual recommendation logic. This includes expanding current capabilities for exclusion search and edge-case discovery (e.g., “products least like this”), and enabling more merchant control over how the AI agent surfaces recommendations.
The long-term vision is to make personalized, AI-powered service accessible to all merchants, even small teams without data scientists, and to keep the digital retail experience as intimate and helpful as the local tailor from its founder's childhood story.