Building a Conversational AI Agent with Long-Term Memory Using LangChain and Milvus
Large language models (LLMs) have changed the game in artificial intelligence (AI). These advanced models can easily understand and generate human-like text with impressive accuracy, making AI assistants and chatbots much smarter and more useful. Thanks to LLMs, we now have AI tools that can handle complex language tasks, from answering questions to translating languages.
Conversational agents are software programs that chat with users in natural language, just like talking to a real person. They power things like chatbots and virtual assistants, helping us with everyday tasks by understanding and responding to our questions and commands.
LangChain is an open-source framework that makes it easier to build these conversational agents. It provides handy tools and templates to create smart, context-aware chatbots and other AI applications quickly and efficiently.
Introduction to LangChain Agents
LangChain agents are advanced systems that use an LLM to interact with different tools and data sources to complete complex tasks. These agents can understand user inputs, make decisions, and create responses, using the LLM to offer more flexible and adaptive decision-making than traditional methods.
A big advantage of LangChain Agents is their ability to use external tools and data sources. This means they can gather information, perform calculations, and take actions beyond just processing language, making them more powerful and effective for various applications.
LangChain Agents vs. Chains
Chains and agents are the two main tools used in LangChain. Chains allow you to create a pre-defined sequence of tool usage, which is useful for tasks that require a specific order of operation.
How LangChain Chains work
On the other hand, Agents enable the large language model to use tools in a loop, allowing it to decide how many times to use tools. This flexibility is ideal for tasks that require iterative processing or dynamic decision-making.
How LangChain Agents work
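The chain-versus-agent distinction above can be sketched in plain Python. This is a conceptual illustration only, not LangChain's actual implementation: the stub_llm_decide function and the toy tools stand in for the LLM's decision-making and real tool integrations.

```python
# Conceptual sketch of the agent loop: the "LLM" (a stub here) picks a tool,
# observes the result, and repeats until it decides it has enough to answer.
# A chain, by contrast, would run a fixed, pre-defined sequence of steps.

def calculator(expr: str) -> str:
    return str(eval(expr))  # toy tool; never eval untrusted input in real code

def word_count(text: str) -> str:
    return str(len(text.split()))

TOOLS = {"calculator": calculator, "word_count": word_count}

def stub_llm_decide(question: str, observations: list) -> tuple:
    """Stand-in for the LLM: returns ("final", answer) or (tool_name, tool_input)."""
    if not observations:
        return ("calculator", "2 + 2")  # first step: gather information via a tool
    return ("final", f"The answer is {observations[-1]}")  # enough info: finish

def run_agent(question: str, max_steps: int = 5) -> str:
    observations = []
    for _ in range(max_steps):  # the loop is what makes it an agent, not a chain
        action, payload = stub_llm_decide(question, observations)
        if action == "final":
            return payload
        observations.append(TOOLS[action](payload))
    return "Gave up after max_steps."

print(run_agent("What is 2 + 2?"))  # → The answer is 4
```

The max_steps cap mirrors a safeguard real agent frameworks apply so a looping agent cannot run forever.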
Build a Conversational Agent Using LangChain
Let’s build a conversational agent using LangChain in Python.
Install Dependencies
To build a LangChain agent, we need to install the following dependencies:
- LangChain: an open-source framework that helps developers create applications using large language models (LLMs).
- LangChain OpenAI: contains the LangChain integrations for OpenAI through their openai SDK.
- OpenAI API SDK: the OpenAI Python library provides convenient access to the OpenAI REST API from any Python 3.7+ application.
- Python-dotenv: reads key-value pairs from a .env file and can set them as environment variables.
- Milvus: an open-source vector database built for billion-scale vector storage and similarity search. It is also a popular infrastructure component for building Retrieval Augmented Generation (RAG) applications.
- Pymilvus: the Python SDK for Milvus. It integrates many popular embedding and reranking models, which streamlines the building of RAG applications.
- LangChain Milvus: the LangChain integration package for using Milvus as a vector store.
- Tiktoken: a fast BPE tokeniser for use with OpenAI's models.
You can install them by executing the following command:
pip install langchain==0.1.20 langchain-openai openai python-dotenv pymilvus langchain_milvus tiktoken
Please note that we will specifically use LangChain version 0.1.20 in this example.
After installing all the dependencies, let's write the code to set up a simple conversational agent.
Load Environment Variables
First, we'll load environment variables using the dotenv package. This package helps secure sensitive information like API keys.
from dotenv import load_dotenv
load_dotenv()
Ensure you have a .env file in your project directory containing your OpenAI API key.
OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Initialize the OpenAI LLM
Next, we'll initialize the OpenAI LLM using the langchain_openai package.
from langchain_openai import OpenAI
llm = OpenAI()
Create a Conversation Chain
We use the ConversationChain class from langchain.chains to create a conversation agent. This chain will handle the dialogue with the user.
from langchain.chains import ConversationChain
conversation = ConversationChain(
    llm=llm,
)
Make Predictions
Finally, we can make predictions by passing user input to the conversation chain. In this example, we ask the agent a simple question.
answer = conversation.predict(input="What's my name?")
print(answer)
The Full Code Example
Combining all the steps, here's the complete code.
from dotenv import load_dotenv
from langchain_openai import OpenAI
from langchain.chains import ConversationChain
load_dotenv()
llm = OpenAI()
conversation = ConversationChain(
    llm=llm,
)
answer = conversation.predict(input="What's my name?")
print(answer)
Running the Code
To run the code, make sure you have set up your .env file with your OpenAI API key. Then, execute your Python script. You should see an output where the agent responds to the question, showcasing the conversational capabilities of the LangChain framework.
After running the Python script, you should get a response along these lines:
> I do not have access to your personal information, so I am unable to answer that question accurately. Could you please provide me with your name so I can address you properly?
Congratulations! You've successfully built a basic conversational agent using LangChain. This example is just the beginning. With LangChain, you can build more complex and capable AI agents tailored to your needs.
The Importance of Memory in Conversational Agents
However, when we asked our agent, "What's my name?" it couldn't answer correctly because it had no memory of previous interactions. This lack of memory limits the usefulness of conversational agents, as they can't retain important information about the user or the context of the conversation. By integrating memory, our agent can remember key details from past interactions, making responses more accurate and personalized.
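The core idea behind vector-based long-term memory can be shown with a minimal sketch: store past exchanges, embed them, and recall the most similar one for each new query. The bag-of-words "embedding" and cosine similarity below are toy stand-ins for the learned embeddings and vector search that the Milvus setup in the next section provides.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': bag-of-words counts. Real systems use learned dense vectors."""
    return Counter(text.lower().replace("?", "").replace(".", "").split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Stored "memories" from earlier turns of the conversation.
memories = ["My name is Bob.", "I'm from San Francisco."]

def recall(query: str, k: int = 1) -> list:
    """Return the k stored memories most similar to the query."""
    scored = sorted(memories, key=lambda m: cosine(embed(query), embed(m)), reverse=True)
    return scored[:k]

print(recall("What's my name?"))  # → ['My name is Bob.']
```

Because recall is driven by similarity rather than recency, the agent can surface a relevant fact no matter how long ago it was mentioned, which is exactly what a vector database makes efficient at scale.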
Build a Conversational Agent with Long-Term Memory using LangChain and Milvus
Milvus is a high-performance open-source vector database built to efficiently store and retrieve billion-scale vectors. It is widely used for GenAI use cases like semantic search and Retrieval Augmented Generation (RAG). It is also a vital infrastructure component for adding long-term memories for LLMs.
Milvus Lite is a lightweight version of Milvus that can run on your local devices. In this example, we will use Milvus Lite as the vector store to store and retrieve your private data.
Now, let’s enhance our conversational agent with long-term memory using LangChain and Milvus Lite.
Install Requirements
First, install the required packages if you don’t have them.
pip install langchain==0.1.20 langchain-openai python-dotenv openai pymilvus langchain_milvus tiktoken
Now, let's write the code step-by-step.
Load Environment Variables
Load the environment variables using the dotenv package. This step helps secure sensitive information like API keys.
from dotenv import load_dotenv
load_dotenv()
Ensure you have a .env file in your project directory containing your OpenAI API key.
OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Initialize the OpenAI LLM and Embeddings
Initialize the OpenAI LLM and embeddings using the langchain_openai package.
from langchain_openai import OpenAI, OpenAIEmbeddings
llm = OpenAI()
embeddings = OpenAIEmbeddings()
Set Up Milvus as the Vector Store
Set up a Milvus vector database to store and retrieve your data.
from langchain_milvus.vectorstores import Milvus
vectordb = Milvus(
    embeddings,
    connection_args={"uri": "./milvus_demo.db"},
    # The easiest way is to use Milvus Lite, where everything is stored in a local file.
    # If you have a Milvus server, you can use the server URI such as "http://localhost:19530".
)
retriever = vectordb.as_retriever(search_kwargs=dict(k=1))
Create Memory for the Agent
Set up the memory using the vector retriever.
from langchain.memory import VectorStoreRetrieverMemory
memory = VectorStoreRetrieverMemory(retriever=retriever)
Save Initial Context
Add some initial information to the memory.
about_me = [
    {"input": "My name is Bob.", "output": "Got it!"},
    {"input": "I'm from San Francisco.", "output": "Got it!"},
]
for example in about_me:
    memory.save_context({"input": example["input"]}, {"output": example["output"]})
Define the Prompt Template
Create a prompt template that includes memory.
from langchain.prompts import PromptTemplate
prompt_template = """The following is a friendly conversation between a user and a chatbot. The chatbot is talkative and provides lots of specific details from its context. If the chatbot does not know the answer to a question, it truthfully says it does not know.
Relevant pieces of previous conversation:
{history}
(You do not need to use these pieces of information if not relevant)
Current conversation:
User: {input}
Chatbot:"""
prompt = PromptTemplate(input_variables=["history", "input"], template=prompt_template)
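Under the hood, the template is plain string substitution: at each turn, the retrieved memories fill {history} and the user's message fills {input}. The sketch below uses str.format as a stand-in for PromptTemplate, with an abbreviated template and an illustrative rendering of what the retriever might return.

```python
# Abbreviated version of the article's template; str.format stands in
# for PromptTemplate to show the final string sent to the LLM.
prompt_template = """Relevant pieces of previous conversation:
{history}
(You do not need to use these pieces of information if not relevant)

Current conversation:
User: {input}
Chatbot:"""

filled = prompt_template.format(
    history="input: My name is Bob.\noutput: Got it!",  # illustrative retriever output
    input="What's my name?",
)
print(filled)
```

Seeing the filled-in string makes it clear why the agent can now answer correctly: the relevant memory is injected directly into the prompt before the LLM is called.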
Create a Conversation Chain with Memory
Set up the conversation chain to use the prompt and memory.
from langchain.chains import ConversationChain
conversation_with_memory = ConversationChain(
    llm=llm, prompt=prompt, memory=memory, verbose=True
)
Make Predictions
Finally, ask the agent a question to see how it uses its memory.
answer = conversation_with_memory.predict(input="What's my name?")
print(answer)
The Full Code Example
Combining all the steps, here's the complete code.
from dotenv import load_dotenv
from langchain_openai import OpenAIEmbeddings
from langchain_openai import OpenAI
from langchain.memory import VectorStoreRetrieverMemory
from langchain.chains import ConversationChain
from langchain.prompts import PromptTemplate
from langchain_milvus.vectorstores import Milvus
load_dotenv()
llm = OpenAI()
embeddings = OpenAIEmbeddings()
vectordb = Milvus(
    embeddings,
    connection_args={"uri": "./milvus_demo.db"},
)
retriever = vectordb.as_retriever(search_kwargs=dict(k=1))
memory = VectorStoreRetrieverMemory(retriever=retriever)
about_me = [
    {"input": "My name is Bob.", "output": "Got it!"},
    {"input": "I'm from San Francisco.", "output": "Got it!"},
]
for example in about_me:
    memory.save_context({"input": example["input"]}, {"output": example["output"]})
prompt_template = """The following is a friendly conversation between a user and a chatbot. The chatbot is talkative and provides lots of specific details from its context. If the chatbot does not know the answer to a question, it truthfully says it does not know.
Relevant pieces of previous conversation:
{history}
(You do not need to use these pieces of information if not relevant)
Current conversation:
User: {input}
Chatbot:"""
prompt = PromptTemplate(input_variables=["history", "input"], template=prompt_template)
conversation_with_memory = ConversationChain(
    llm=llm, prompt=prompt, memory=memory, verbose=True
)
answer = conversation_with_memory.predict(input="What's my name?")
print(answer)
Running the Code
Set up your .env file with your OpenAI API key to run the code. Then, execute your Python script. You should see an output where the agent responds to the question, showcasing the enhanced capabilities of the LangChain framework with memory integration.
After running the Python script, you should get a response along these lines:
> Your name is Bob. Did you know that the name Bob is a diminutive form of the name Robert, which means "bright fame" in Germanic languages? It was a popular name during the Middle Ages and has been used by many famous people throughout history. Do you know any famous people named Bob?
Congratulations! You've successfully built a conversational agent with long-term memory using LangChain and Milvus. This example demonstrates how memory can significantly enhance agents' ability to provide accurate and personalized responses. With LangChain and Milvus, you can build even more advanced and capable AI agents tailored to your needs.
Summary
In this article, we've explored the fascinating world of LangChain agents and their potential to transform conversational AI. We started with a basic introduction to LangChain, understanding its role in building advanced, context-aware conversational agents. Then, we explored the practical steps of creating a simple agent using LangChain, highlighting the limitations of agents without memory.
To address these limitations, we demonstrated how to integrate long-term memory into your agent using Milvus Lite, showcasing how this enhancement allows the agent to retain important information and provide more accurate, personalized responses.
Encouragement to Experiment
Now that you have a foundational understanding of LangChain and how to build conversational agents, it's time to experiment! Try building more complex agents, integrating additional tools, and exploring different use cases. LangChain's flexibility and power make it a fantastic framework for pushing the boundaries of what conversational AI can achieve.
For a deeper dive into LangChain and Milvus, check out their official documentation and tutorials.
By exploring these resources, you'll gain a more comprehensive understanding and be better equipped to build advanced, intelligent conversational agents. Happy coding!