How Zilliz Cloud Powers CX Genie’s Global Growth with Fast, Reliable Vector Search

2× Faster: query performance compared to their previous solution
5-10ms Latency: for vector similarity searches across 1M+ embeddings
70% Cost Savings: versus self-hosted infrastructure
Zero Downtime: since migration, compared to daily outages with their previous solution
With Zilliz Cloud, we've achieved query latencies as low as 5-10ms across our million-vector database. This represents performance that's twice as fast as our previous solution, which directly translates to more responsive chatbots for our customers.
Nguyễn Ngọc Hải Đăng, Nguyễn Nhật Khoa
About CX Genie
CX Genie is a conversational AI startup delivering chatbot solutions for customer support teams. Designed for both SMEs and enterprises, their platform helps businesses automate customer service with intelligent, personalized, and scalable AI interactions.
Headquartered in Vietnam but serving a primarily international customer base—with a strong presence in the United States and other global markets—CX Genie supports over 100,000 users, establishing itself as a fast-rising player in the AI-driven customer experience space.
The Technical Challenge: Scaling a RAG-Based AI Chatbot with Performance and Reliability
The core of CX Genie’s operation is built on the Retrieval-Augmented Generation (RAG) technique, which relies on vector search to pull relevant information from knowledge bases and generate accurate, real-time responses. However, as their user base grew, their original setup—using open-source vector search solutions like Qdrant and Chroma—struggled to keep up.
Nguyễn Ngọc Hải Đăng, AI Engineer at CX Genie, explained: "Before Zilliz, we experienced several minutes of downtime almost daily with our previous vector database solution. When you're handling customer support interactions that need to be available 24/7, this was simply unacceptable for our business growth."
The engineering team encountered several major challenges:
Increased latency during query execution as data volumes grew
Slow indexing times that couldn't keep pace with expanding knowledge bases
Hidden costs and complexity of managing infrastructure in-house
Daily system downtimes impacting reliability and customer experience
Engineering resources diverted to database management instead of product innovation
These issues made it increasingly difficult to deliver the fast, responsive chatbot experience their customers expected. A new vector database solution was needed—one that could scale seamlessly, reduce operational burden, and improve reliability without compromising performance.
Why Choose Zilliz Cloud: Performance, Simplicity, and Cost Efficiency
When CX Genie set out to find a new vector database, they weren’t just looking for better speed—they were looking for a platform that could keep up with their growing technical demands without increasing operational complexity.
Their evaluation centered on six key criteria:
Query performance and latency, especially at million-scale vector workloads
Low operational overhead to free up engineering resources
Cost-efficiency compared to self-hosted solutions
Scalability to support business growth
Easy integration with their existing LangChain-based architecture
Rich feature set, including metadata filtering and advanced indexing
Zilliz Cloud delivered on all fronts. Compared to their open-source stack with Chroma and Qdrant—which required manual management and frequent troubleshooting—Zilliz Cloud offered a fully managed platform that removed infrastructure overhead and let the team focus on building their core product.
The onboarding process was refreshingly simple. Thanks to detailed documentation and well-designed APIs, the team was able to connect and test queries within minutes, accelerating development and cutting friction from both migration and integration phases.
How Zilliz Cloud Powers CX Genie’s RAG System
At the heart of CX Genie’s conversational AI platform is a two-phase pipeline powered by Retrieval-Augmented Generation (RAG): the data ingestion phase and the retrieval phase. Zilliz Cloud plays a critical role in ensuring both are performant and scalable.
Data Ingestion Phase
In the data ingestion phase, various business knowledge sources—including HTML pages, documents, FAQs, and articles—are first broken into manageable chunks. These chunks are passed through an embedding model (such as OpenAI’s embedder) to generate dense vector representations. The resulting embeddings are then ingested into Zilliz Cloud, where they are stored and indexed efficiently.
This allows CX Genie to maintain an up-to-date vector database that reflects each customer’s evolving knowledge base, with rich metadata support and partitioning based on business attributes like region or product type.
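The ingestion steps described above (chunk, embed, attach metadata, insert into a partition) can be sketched roughly as follows. This is an illustration under stated assumptions, not CX Genie's actual pipeline: the field names (`vector`, `text`, `source`, `region`), the collection name, and the fixed-size character chunking are all hypothetical, and `toy_embed` is a deterministic stand-in for a real embedding model. The `client` argument is assumed to expose an `insert(collection_name=..., data=..., partition_name=...)` method, as `pymilvus.MilvusClient` does, and the target collection is assumed to use an auto-generated primary key.

```python
import hashlib
from typing import Any, Callable, Dict, List, Sequence

Embedder = Callable[[Sequence[str]], List[List[float]]]


def chunk_text(text: str, max_chars: int = 200, overlap: int = 20) -> List[str]:
    """Split text into fixed-size character windows that overlap slightly."""
    if max_chars <= overlap:
        raise ValueError("max_chars must be larger than overlap")
    chunks: List[str] = []
    start = 0
    while start < len(text):
        piece = text[start:start + max_chars]
        if piece.strip():
            chunks.append(piece)
        if start + max_chars >= len(text):
            break  # the current window already reached the end of the text
        start += max_chars - overlap
    return chunks


def toy_embed(texts: Sequence[str], dim: int = 8) -> List[List[float]]:
    """Hypothetical stand-in for a real embedding model (NOT semantically meaningful)."""
    vectors = []
    for t in texts:
        digest = hashlib.sha256(t.encode("utf-8")).digest()
        vectors.append([b / 255.0 for b in digest[:dim]])
    return vectors


def build_rows(chunks: Sequence[str], embed: Embedder,
               source: str, region: str) -> List[Dict[str, Any]]:
    """Pair each chunk with its embedding and business metadata (hypothetical fields)."""
    vectors = embed(chunks)
    return [
        {"vector": list(vec), "text": chunk, "source": source, "region": region}
        for chunk, vec in zip(chunks, vectors)
    ]


def ingest(client: Any, collection_name: str,
           rows: List[Dict[str, Any]], partition_name: str) -> Any:
    """Insert rows into one partition; assumes the partition already exists."""
    return client.insert(
        collection_name=collection_name,
        data=rows,
        partition_name=partition_name,
    )
```

In a real deployment, `client` would presumably be a `pymilvus.MilvusClient` created with the cluster URI and API token, and `toy_embed` would be replaced by calls to the chosen embedding provider. Keeping the embedder as a parameter makes it easy to guarantee that ingestion and retrieval use the same model.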
Retrieval Phase
When a user submits a question, it is converted into an embedding with the same embedding model used during ingestion. This query embedding is sent to Zilliz Cloud, which performs a top-k similarity search across the stored vectors. Zilliz Cloud returns the most relevant chunks, which are then passed to a large language model (LLM) as context for generating the response.
Thanks to Zilliz Cloud’s low-latency search, rich filtering capabilities, and scalable architecture, CX Genie is able to retrieve the most relevant context in milliseconds, enabling chatbots to respond with accuracy and speed, even at high traffic volumes.
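A minimal sketch of the retrieval flow described above, combining a top-k search, a metadata filter, and prompt assembly. The field names, collection name, and prompt wording are hypothetical. The `client` argument is assumed to expose a `search(collection_name=..., data=..., limit=..., filter=..., output_fields=...)` method that returns one list of hits per query vector, each hit carrying an `entity` mapping, which is how `pymilvus.MilvusClient.search` behaves. The filter builder assumes string values that contain no quote characters.

```python
from typing import Any, Dict, List, Optional, Sequence


def build_filter(conditions: Dict[str, str]) -> str:
    """Join equality conditions into one boolean filter expression."""
    parts = [f'{field} == "{value}"' for field, value in sorted(conditions.items())]
    return " and ".join(parts)


def retrieve_chunks(client: Any, collection_name: str,
                    query_vector: Sequence[float], top_k: int = 5,
                    conditions: Optional[Dict[str, str]] = None) -> List[str]:
    """Run a filtered top-k similarity search and return the matching chunk texts."""
    results = client.search(
        collection_name=collection_name,
        data=[list(query_vector)],       # one query vector per search request
        limit=top_k,
        filter=build_filter(conditions or {}),
        output_fields=["text"],          # hypothetical field holding the chunk text
    )
    hits = results[0] if results else []
    return [hit["entity"]["text"] for hit in hits]


def build_prompt(question: str, chunks: Sequence[str]) -> str:
    """Assemble the retrieved chunks and the user question into an LLM prompt."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Applying the filter inside the search call, rather than filtering results afterwards, means the top-k slots are only spent on chunks that already match the business constraints such as region or product type.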
Measurable Results: Speed, Scale, and Cost Savings
Since switching to Zilliz Cloud, CX Genie has seen significant technical and business benefits:
5-10ms latency for vector similarity searches across 1M+ embeddings
2× faster query performance compared to their previous Chroma implementation
70% cost savings versus self-hosted infrastructure
Zero downtime since migration, compared to daily outages with their previous solution
More accurate retrieval using metadata filtering and partitioning capabilities
Reduced engineering burden with fully managed infrastructure
"By migrating to Zilliz Cloud, we've reduced our vector database infrastructure costs by approximately 70% compared to our self-hosted setup. This allows us to reinvest those savings into improving our core AI capabilities rather than managing database infrastructure," said Nguyễn Ngọc Hải Đăng.
These enhancements have helped CX Genie deliver faster and more relevant customer interactions—without stretching their engineering team or cloud budget—ultimately creating a better experience for their end users.
Developer Experience and Seamless Migration
For CX Genie’s engineering team, moving to Zilliz Cloud streamlined both development and system management. The onboarding process was smooth, with the Python SDK and API references making it straightforward to get up and running. Previously, maintaining their self-hosted setup required ongoing effort across multiple teams. With Zilliz Cloud’s managed infrastructure, the core operations are now handled with minimal oversight. Features like multi-condition filtering and partitioned collections have made it easier for the team to organize and retrieve embeddings by region or business context.
Although the team handled the migration manually, they found the process efficient. They used the REST API and bulk insert capabilities to move data from PostgreSQL while preserving the structure of their metadata and embeddings. By aligning collections and partitions with their internal logic, they ensured the system remained organized and performant post-migration.
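The general shape of such a migration, reading rows in batches and routing each one to the partition that matches its metadata, might look like the sketch below. It is not the team's actual script: it uses an SDK-style `insert` call on a duck-typed `client` rather than the REST bulk-insert path mentioned above, and the row layout `(id, text, embedding, region)`, the field names, and the batch size are all hypothetical. The rows could come from any iterable, for example a database cursor over the PostgreSQL table, and the target partitions are assumed to exist already.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List, Tuple

# Hypothetical source row layout: (id, text, embedding, region)
Row = Tuple[int, str, List[float], str]


def batched(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most `size` rows without loading everything."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def migrate(client: Any, collection_name: str,
            rows: Iterable[Row], batch_size: int = 500) -> int:
    """Copy rows into the collection, one insert per (batch, partition) pair."""
    total = 0
    for batch in batched(rows, batch_size):
        # Group the batch by region so each insert targets a single partition.
        by_partition: Dict[str, List[Dict[str, Any]]] = {}
        for row_id, text, embedding, region in batch:
            by_partition.setdefault(region, []).append(
                {"id": row_id, "text": text, "vector": list(embedding)}
            )
        for partition_name, data in by_partition.items():
            client.insert(
                collection_name=collection_name,
                data=data,
                partition_name=partition_name,
            )
            total += len(data)
    return total
```

Streaming the source rows in bounded batches keeps memory use flat regardless of table size, and carrying the original IDs across makes it possible to re-run a failed batch without guessing which rows were already copied.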
What’s Next: Expanding Capabilities with Zilliz Cloud
CX Genie is continuing to evolve its AI chatbot platform and is eager to expand its usage of Zilliz Cloud. They plan to explore improvements in indexing management and anticipate UI enhancements—particularly around the API playground, which is a critical interface when dealing with large-scale embeddings from providers like OpenAI.
As they scale to serve more global customers, the partnership with Zilliz Cloud will remain a cornerstone of their infrastructure strategy.
Conclusion
CX Genie's journey underscores the power of combining a strategic RAG architecture with a performant, reliable vector database. With Zilliz Cloud, they've been able to maintain their startup agility while operating at global scale—delivering faster, smarter, and more cost-effective AI-driven customer experiences to businesses around the world.
By focusing on solving both technical challenges (vector search performance, system reliability) and business challenges (customer support efficiency, cost reduction), CX Genie exemplifies how the right infrastructure choices can directly impact customer satisfaction and business growth.
Industry
Internet Services
Thanks to the well-designed Python SDK and REST API, we were able to integrate Zilliz Cloud with our LangChain-based architecture in a matter of days. The schema-based collections perfectly aligned with how we structure our data, making the transition nearly seamless.
Nguyễn Ngọc Hải Đăng, Nguyễn Nhật Khoa