Building AI Agents at Scale: How Tanka Leveraged Zilliz Cloud for Intelligent Enterprise Communication

0 production issues related to infrastructure instability
Thousands of vector search operations per second during peak times
A semantic memory engine that understands concepts and relationships across tools and time
100% of team bandwidth redirected to AI innovation
About Tanka
Tanka is an enterprise agent platform designed to act as your AI Co-Founder. Built for the modern workplace, Tanka helps startup teams navigate the chaos of scattered messages, messy deliverables, and repetitive tasks. With its unique long-term memory, Tanka captures and connects conversations across tools like Slack, Gmail, Notion, and more, transforming fragmented communication into shared organizational knowledge.
By delivering context-aware responses, summarizing meetings, and sending timely task reminders, Tanka boosts team productivity and decision-making. Since launching its beta in 2024, Tanka has been adopted by over 1,000 teams and has delivered more than 35,000 AI-generated responses.
The Semantic Understanding Challenge: When Keyword Search Isn’t Enough
Tanka’s rapid adoption validated its core value proposition: enabling AI assistants that don’t just respond—they remember, learn, and understand business context over time. But scaling that vision came with technical challenges, especially as users demanded more from the platform.
In its early days, the Tanka team relied on BM25 keyword-based retrieval: a pragmatic choice that handled simple queries well and helped them ship a functional MVP fast.
However, as the platform expanded to support Outlook, Gmail, Slack, Telegram, Notion, and other tools, the complexity of data models—and user expectations—grew significantly. Teams were no longer searching for isolated keywords. They were asking nuanced, contextual questions that required the system to understand relationships across messages, meetings, documents, and apps.
“We started with keyword search for simple queries,” says Wu Junjie, AI Architect at Tanka. “But as our users’ needs evolved, it became clear they were expecting semantic answers—not just string matches.”
This marked the beginning of a deeper challenge: bridging the semantic gap between what users meant and what keyword-based systems could deliver. For example, users might ask for a summary of “what changed after the product launch” or “follow-ups from last Friday’s sales meeting”—queries that required cross-referencing emails, chats, and meeting notes. To humans, the connections were obvious. To the search engine, they were invisible.
Meanwhile, the stakes were rising. In a competitive space where speed and intelligence were key differentiators, Tanka’s existing search infrastructure began to show signs of strain. Performance suffered as data volumes grew. Search latency increased, especially for broad queries without clear filters. But the deeper issue wasn’t speed—it was strategic misalignment.
The engineering team found themselves bogged down maintaining brittle search infrastructure instead of building the core AI features on their roadmap. Visionary capabilities—like multi-source insights, intelligent weekly digests, or predictive follow-ups—remained just out of reach.
The Tanka team realized that keyword search had taken them far—but not far enough. To unlock the next phase of their product vision, they needed a system that could truly understand user intent across time, tools, and context.
The Solution: Scaling Memory with Performance and Reliability
Faced with mounting technical and operational challenges, the Tanka team set out to find a solution that could support their ambitious product roadmap while delivering the reliability users expected from an AI assistant with long-term memory.
Evaluating Vector Search Options
Tanka launched a structured, in-depth evaluation process to explore potential solutions. Among the early contenders were PostgreSQL with pgvector and Elasticsearch plugins—attractive at first glance due to compatibility with their existing stack. But performance testing quickly revealed their limitations, especially for memory-intensive workloads.
The team conducted head-to-head comparisons across key criteria: response time, throughput, CPU utilization, and overall scalability. While most platforms offered similar accuracy—since vector similarity algorithms are largely standardized—Milvus stood out for its superior speed and resource efficiency.
"While accuracy was comparable across platforms, Milvus clearly won on speed and resource efficiency," says Wu Junjie, AI Architect at Tanka.
During evaluation, the team prioritized the following criteria (see the benchmark sketch after this list):
Query speed, which was critical for maintaining the illusion of real-time intelligence;
CPU efficiency, to keep infrastructure costs sustainable at scale;
Operational reliability, essential for trust in a memory-first assistant;
Scalability, to support fast-growing data volumes and user bases.
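To make those criteria concrete, here is a minimal sketch of the kind of query-latency check such an evaluation typically involves, using the pymilvus MilvusClient. The endpoint, credentials, collection name, and vector dimension are placeholders rather than Tanka's actual benchmark setup.

```python
# Minimal query-latency check against a Milvus/Zilliz Cloud collection.
# URI, token, collection name, and dimension are placeholders, not Tanka's setup.
import random
import time

from pymilvus import MilvusClient

client = MilvusClient(
    uri="https://<your-cluster>.zillizcloud.com",  # or http://localhost:19530 for self-hosted Milvus
    token="<api-key-or-user:password>",
)

DIM = 768                      # must match the collection's embedding dimension
COLLECTION = "memory_chunks"   # hypothetical collection name

latencies = []
for _ in range(100):
    query_vec = [random.random() for _ in range(DIM)]  # stand-in for a real query embedding
    start = time.perf_counter()
    client.search(collection_name=COLLECTION, data=[query_vec], limit=10)
    latencies.append(time.perf_counter() - start)

latencies.sort()
print(f"p50: {latencies[49] * 1000:.1f} ms, p95: {latencies[94] * 1000:.1f} ms")
```

Throughput and CPU efficiency can be probed the same way by issuing such queries concurrently and watching cluster metrics while they run.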
From Self-Hosting to a Managed Service
Tanka initially deployed self-hosted Milvus in development and early production. It was the right choice at the time, delivering the powerful, low-latency similarity search the product required at scale.
However, as the platform matured and usage grew, the burden of managing infrastructure started to become a distraction. Running and maintaining Milvus clusters in-house meant the engineering team had to manage everything from scaling and failover to monitoring and recovery.
While the Milvus engine itself remained reliable, infrastructure-related incidents—like node failures or network issues—introduced risk and downtime that directly impacted the user experience.
Over time, the trade-off became clear: the team needed to focus on building product features, not maintaining database infrastructure.
Migrating to Zilliz Cloud
Moving to Zilliz Cloud, the fully managed version of Milvus, became the clear next step. It delivered the same high-performance core with enterprise-grade reliability, without the overhead of managing clusters in-house.
For a lean, fast-moving team like Tanka’s, offloading operational complexity was a game-changer:
No more firefighting infrastructure issues
Higher uptime and consistency for memory-critical applications
More engineering time to focus on innovation and user experience
Zilliz’s built-in migration service made the transition smooth and low-risk. With prompt tech support and seamless S3 integration, the Tanka team moved to the cloud with minimal disruption.
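From the application's point of view, the switch is largely a change of connection target, since Zilliz Cloud runs the same Milvus core; the data transfer itself was handled by the migration service and S3 integration mentioned above. A minimal sketch, with placeholder endpoints and credentials:

```python
# Application code stays the same; only the connection target changes.
# Endpoint, token, and any collection names are placeholders.
from pymilvus import MilvusClient

# Before: self-hosted Milvus cluster
# client = MilvusClient(uri="http://milvus.internal:19530")

# After: Zilliz Cloud (fully managed Milvus)
client = MilvusClient(
    uri="https://<cluster-id>.api.<region>.zillizcloud.com",
    token="<zilliz-cloud-api-key>",
)

# Post-migration sanity check: confirm collections and row counts carried over.
for name in client.list_collections():
    stats = client.get_collection_stats(collection_name=name)
    print(name, stats.get("row_count"))
```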
The Implementation: Powering Advanced AI Memory with Zilliz Cloud
With a reliable infrastructure in place, Tanka was finally able to shift its focus to what truly set it apart: building advanced memory capabilities for its AI-native messaging platform that go far beyond simple search. Powered by Zilliz Cloud, Tanka’s implementation is designed to support rich, context-aware applications that make organizational knowledge actionable at scale.
Beyond Basic Retrieval: Semantic Memory at Scale
At the core of Tanka’s system is a retrieval-augmented generation (RAG) pipeline, enabling users to access relevant information across their connected workplace tools, such as Slack, Gmail, and Notion. However, unlike typical RAG systems that retrieve documents based on surface-level similarity, Tanka takes it a step further.
During preprocessing, Tanka performs entity and relationship extraction to capture higher-level concepts from raw content. These extracted entities and relationships are then converted into vector embeddings and stored in Zilliz Cloud, enabling retrieval based not only on what was said, but also on how different ideas, people, and actions are connected.
This allows users to ask complex, abstract questions—such as “What were the key follow-ups from our Q3 planning efforts?”—and receive answers grounded in structured knowledge rather than keyword matches.
This approach transforms Zilliz Cloud from a storage layer into a semantic memory engine, helping the assistant understand context, history, and patterns across the organization.
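As a rough illustration of this pattern, the sketch below stores each piece of content alongside its extracted entities as metadata next to the embedding, so retrieval can draw on both. The collection name, field names, embedding dimension, and the stub embed() and extract_entities() helpers are assumptions made for the example, not Tanka's production schema.

```python
# Simplified sketch: index content together with extracted entities as metadata.
# Collection name, fields, dimension, and helper functions are illustrative only.
import hashlib
import random

from pymilvus import MilvusClient

DIM = 768  # must match whichever embedding model is actually used

def embed(text: str) -> list[float]:
    # Placeholder embedding: deterministic pseudo-vector; swap in a real embedding model.
    random.seed(int(hashlib.md5(text.encode()).hexdigest(), 16))
    return [random.random() for _ in range(DIM)]

def extract_entities(text: str) -> list[str]:
    # Placeholder extraction: a real pipeline would use an LLM or NER model.
    return [w.strip(".,:;") for w in text.split() if w[:1].isupper()]

client = MilvusClient(uri="https://<cluster>.zillizcloud.com", token="<api-key>")
client.create_collection(collection_name="org_memory", dimension=DIM, auto_id=True)

message = "Post-launch retro: Maria owns the pricing-page fixes before Friday."
client.insert(
    collection_name="org_memory",
    data=[{
        "vector": embed(message),
        "text": message,
        "source": "slack",
        "entities": extract_entities(message),  # e.g. ["Post-launch", "Maria", "Friday"]
    }],
)

# At query time, the question is embedded the same way and matched against stored memories.
hits = client.search(
    collection_name="org_memory",
    data=[embed("What follow-ups came out of the product launch?")],
    limit=5,
    output_fields=["text", "source", "entities"],
)
print(hits)
```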
Real-Time Processing and Continuous Updates
Tanka’s system ingests and processes data from connected platforms in real time, ensuring the AI assistant always reflects the latest organizational state. As teams communicate and collaborate, new vectors are generated and indexed in Zilliz Cloud, keeping the assistant up to date without requiring manual intervention.
The pipeline includes:
Multi-source ingestion from emails, chats, and documents
Preprocessing for entity and relationship extraction
Vector embedding and indexing in Zilliz Cloud for fast, semantic retrieval
This enables the AI assistant to act as a living memory layer—helping users surface insights, recall decisions, and understand evolving team dynamics.
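Continuing the previous sketch (same client, embed() helper, and hypothetical org_memory collection), a handler for incoming workspace events might look like the following; newly inserted vectors in Milvus/Zilliz Cloud become searchable without a separate reindexing pass, which is what keeps the memory current.

```python
# Hypothetical handler for incoming workspace events (new Slack message, email, etc.).
# Reuses the client, embed(), and "org_memory" collection from the previous sketch.
def on_new_content(text: str, source: str, author: str) -> None:
    client.insert(
        collection_name="org_memory",
        data=[{
            "vector": embed(text),
            "text": text,
            "source": source,
            "author": author,
        }],
    )

on_new_content("Shipped the pricing-page fix; closing the launch follow-up.", "slack", "maria")
```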
A Flexible, Multi-Model AI Stack
To complement this infrastructure, Tanka uses a flexible, multi-model LLM strategy. The system primarily relies on Gemini 2 Flash and Claude 3.7 Sonnet for reasoning and summarization, with OpenAI models selectively applied for instruction-heavy tasks. To avoid rate limiting and ensure resilient performance across providers, Tanka uses OpenRouter to manage API access and routing.
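Because OpenRouter exposes an OpenAI-compatible API, this kind of routing can be sketched with the standard openai client. The model identifiers and the selection rule below are illustrative assumptions, not Tanka's actual configuration.

```python
# Illustrative model routing through OpenRouter's OpenAI-compatible endpoint.
# Model IDs and the selection policy are examples, not Tanka's production setup.
from openai import OpenAI

router = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="<openrouter-api-key>",
)

def pick_model(task: str) -> str:
    # Simple example policy: fast summarization, instruction-heavy tasks, default reasoning.
    if task == "summarize":
        return "google/gemini-2.0-flash-001"
    if task == "instruction":
        return "openai/gpt-4o"
    return "anthropic/claude-3.7-sonnet"

response = router.chat.completions.create(
    model=pick_model("summarize"),
    messages=[{"role": "user", "content": "Summarize this week's launch-related threads."}],
)
print(response.choices[0].message.content)
```

Routing through a single gateway also sidesteps per-provider rate limits, which is the resilience benefit described above.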
The Benefits and Results: Transformational Business Impact
Partnering with Zilliz Cloud didn’t just solve technical pain points for Tanka—it reshaped the company’s trajectory. With infrastructure stabilized and performance optimized, the team could finally shift focus from operational firefighting to AI innovation. The benefits touched every layer of the organization, unlocking new levels of speed, reliability, and scale.
Immediate Operational Relief
The most immediate—and dramatic—impact was the elimination of production issues related to infrastructure instability. Before migrating to Zilliz Cloud, database-related incidents occasionally disrupted service and compromised user trust. That’s no longer the case.
“After moving to Zilliz Cloud, we basically eliminated production issues related to database failures,” says Wu Junjie, AI Architect at Tanka. “We used to have occasional incidents that affected users. Now, those problems are gone.”
This improvement was critical for a platform built on the promise of persistent organizational memory. With database reliability no longer a concern, users could depend on fast, uninterrupted access to their accumulated knowledge—day in and day out.
Refocusing Engineering on Innovation
With infrastructure worries behind them, Tanka’s engineering team was able to reallocate its time toward product development and innovation. Instead of handling failovers, backups, and alerts, engineers could focus on building the features that define Tanka’s competitive edge.
“Zilliz’s performance and reliability fully meet our RAG requirements,” says Wu Junjie. “That allows us to focus our technical efforts on building differentiated AI memory capabilities—where our real value lies.”
The shift led to faster iteration cycles, more ambitious feature launches, and a tighter alignment between engineering effort and business strategy.
Consistent Performance at Massive Scale
As Tanka’s user base grew, so did the demands on its backend. The system now handles thousands of concurrent vector search operations per second during peak times, drawing from over three years of organizational data that spans millions of messages, documents, and events.
This performance consistency has removed infrastructure limits as a factor in product planning. Tanka’s team can now build and scale without hesitation, knowing that their backend will keep up.
Conclusion
Tanka’s journey—from early infrastructure challenges to production success with Zilliz Cloud—highlights a powerful lesson: the right vector database foundation doesn’t just improve performance; it unlocks innovation.
By partnering with Zilliz Cloud, Tanka eliminated production incidents, boosted engineering productivity, and achieved consistent performance at scale. More importantly, the shift allowed Tanka to focus entirely on its core mission: building next-generation memory capabilities for AI Assistants that go far beyond basic retrieval.
For AI companies developing memory-intensive applications, Tanka’s experience shows how infrastructure decisions directly impact innovation velocity and product success. Performance, reliability, and operational simplicity aren’t just technical requirements—they’re strategic enablers.
With the right foundation in place, Tanka has transformed its ambitious vision into a market-leading reality—proving that when infrastructure empowers rather than constrains, breakthrough AI is not just possible, but inevitable.