Letta and Zilliz Cloud Integration
Letta (previously MemGPT) integrates with Zilliz Cloud to build stateful LLM agents with extended context windows: Letta's agent framework manages hierarchical memory tiers, while Zilliz Cloud's high-performance vector database provides scalable archival memory storage and RAG retrieval.
What is Letta
Letta (previously MemGPT) makes it easy to build and deploy stateful LLM agents. It uses a technique called virtual context management, inspired by hierarchical memory systems in traditional operating systems, to provide extended context within an LLM's limited context window by intelligently managing different storage tiers. Letta agents can connect to external data sources for RAG; each agent lives on a server, is accessed via a REST API, and keeps its interactions and queries in a stateful database.
By integrating with Zilliz Cloud (fully managed Milvus), Letta gains access to a scalable vector database for archival memory storage, enabling agents to efficiently store and retrieve external data sources at scale while significantly reducing token consumption compared to dumping entire conversation history in the prompt.
Benefits of the Letta + Zilliz Cloud Integration
- Extended LLM context window: Letta's virtual context management overcomes LLM context window limitations by using Zilliz Cloud as the archival memory tier, enabling agents to access vast knowledge bases without exceeding token limits.
- Reduced token consumption: Using Zilliz Cloud to manage agent memory significantly reduces token consumption compared to including the entire conversation history in the prompt, making agents more cost-effective.
- Stateful agent memory: Letta maintains agent state across interactions, with Zilliz Cloud providing scalable storage for the archival memory layer that persists even after the system is closed.
- RAG-powered agents: The integration enables building agents that can connect to external data sources, with documents loaded into Zilliz Cloud's vector store and retrieved through similarity search when the agent needs context.
How the Integration Works
Letta serves as the agent framework, providing the stateful LLM agent with hierarchical memory management (core memory, recall memory, and archival memory), conversation orchestration, and REST API access. It handles agent creation, persona management, and intelligent function calling to search archival memory when needed.
Zilliz Cloud serves as the archival memory backend, storing and indexing document embeddings loaded from external data sources. When the agent needs to search its archival memory, Zilliz Cloud provides fast similarity search to retrieve the most relevant passages.
Together, Letta and Zilliz Cloud create intelligent, stateful agents: external documents are loaded and embedded into Zilliz Cloud as the agent's archival memory. During conversations, the agent intelligently decides when to search this archival memory, Zilliz Cloud retrieves the most relevant context, and the agent generates informed responses — all while maintaining conversation state and managing token usage efficiently.
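The retrieval flow described above can be sketched with a toy, standard-library-only example. The `ArchivalMemory` class, the bag-of-words `embed` function, and the sample passages below are all illustrative stand-ins: a real deployment uses a learned embedding model and Zilliz Cloud's ANN index rather than an in-process list and cosine loop.

```python
# Toy stand-in for the archival-memory flow: passages are embedded and
# stored (the role Zilliz Cloud plays), then retrieved by similarity
# when the agent searches its archival memory.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Hypothetical stand-in for an embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ArchivalMemory:
    """Stores embedded passages; search() mimics similarity search in Zilliz Cloud."""
    def __init__(self):
        self.passages = []  # list of (text, embedding)

    def insert(self, passage: str):
        self.passages.append((passage, embed(passage)))

    def search(self, query: str, top_k: int = 1):
        q = embed(query)
        ranked = sorted(self.passages, key=lambda p: cosine(q, p[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]

memory = ArchivalMemory()
memory.insert("MemGPT manages hierarchical memory tiers for LLM agents.")
memory.insert("Milvus Lite runs a local vector database from a file path.")

# The agent's archival_memory_search step reduces to a top-k lookup like this:
print(memory.search("How does MemGPT manage memory tiers?"))
```

The point of offloading this lookup to Zilliz Cloud is that only the few retrieved passages enter the prompt, which is how the integration keeps token usage low as the knowledge base grows.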
Step-by-Step Guide
1. Install Dependencies
Make sure your Python version is >= 3.10, then install the required dependencies with Milvus backend support:

```shell
$ pip install 'pymemgpt[milvus]'
```

2. Configure Milvus as the Archival Storage Backend
Configure the Milvus connection via the following command:

```shell
$ memgpt configure
...
? Select storage backend for archival data: milvus
? Enter the Milvus connection URI (Default: ~/.memgpt/milvus.db): ~/.memgpt/milvus.db
```

Setting the URI to a local file path such as `~/.memgpt/milvus.db` automatically invokes a local Milvus service instance through Milvus Lite. If you have data at larger scale, such as more than a million documents, we recommend setting up a more performant Milvus server on Docker or Kubernetes; in that case, the URI should be the server URI, e.g. `http://localhost:19530`.
3. Create an External Data Source

Download a document and create a data source using `memgpt load`:

```shell
# Download the MemGPT research paper
$ curl -L -o memgpt_research_paper.pdf https://arxiv.org/pdf/2310.08560.pdf

# Load it as a data source
$ memgpt load directory --name memgpt_research_paper --input-files=memgpt_research_paper.pdf
```

```
Loading files: 100%|███████████████████████████████████| 1/1 [00:00<00:00, 3.94file/s]
Loaded 74 passages and 13 documents from memgpt_research_paper
```

4. Create an Agent and Attach the Data Source
Create a new chatbot and attach the data source to its archival memory:
```shell
$ memgpt run --persona memgpt_doc
```

Once chatting with the agent, attach the data source:

```
? Would you like to select an existing agent? No
🧬 Creating new agent...
-> 🤖 Using persona profile: 'sam_pov'
-> 🧑 Using human profile: 'basic'
🎉 Created new agent 'PoliteButterfly'
Hit enter to begin (will request first MemGPT message)

💭 Interesting, I've got a first-time user. Time to present myself and get to understand the user's needs.
🤖 Greetings Chad! I'm MemGPT. How may I assist you today?

> Enter your message: /attach
? Select data source memgpt_research_paper
100%|███████████████████████████████████| 1/1 [00:00<00:00, 4.81it/s]
```

5. Chat with Your Documents
Now that the data has been loaded into the agent's archival memory, ask questions about it:
```
> Enter your message: The paper I loaded into your archival memory is called MemGPT. Can you tell me more about it?

💭 It seems Chad would like to know more about the paper titled 'MemGPT' that's stored in my archival memory.
⚡🧠 [function] searching memory with archival_memory_search
💭 Based on the found results, MemGPT is an AI system designed for tasks like extended conversations and document analysis.
🤖 Certainly, Chad. MemGPT, or MemoryGPT, is a system specially designed for tasks like extended conversations and document analysis which are traditionally hindered by the limited context windows of modern Large Language Models (LLMs). It uses a technique inspired by hierarchical memory systems in traditional operating systems, called virtual context management.
```

Learn More
- MemGPT with Milvus Integration — Official Milvus tutorial for MemGPT integration
- Introduction to MemGPT and Its Integration with Milvus — Zilliz blog on MemGPT and Milvus integration
- Letta Documentation — Official Letta documentation
- Letta GitHub Repository — Letta source code and community resources
- MemGPT Research Paper — Original MemGPT research paper