Metadata Filtering, Hybrid Search or Agent When Building Your RAG Application
What is RAG?
Retrieval Augmented Generation (RAG) is a technique that enhances LLMs by integrating additional data sources. A typical RAG application involves:
Indexing - a pipeline for ingesting data from a source and indexing it, which usually consists of loading, splitting, and storing the data in Milvus.
Retrieval and generation - At runtime, RAG processes the user's query, fetches relevant data from the index stored in Milvus, and the LLM generates a response based on this enriched context.
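The two stages above can be sketched in plain Python. This is only a toy illustration, not how Milvus or a real embedding model works: the bag-of-words `embed` function, the in-memory list used as the index, and all function names are assumptions made up for this example. In a real application the vectors would come from an embedding model and would be stored in Milvus.

```python
import math
import re
from collections import Counter

def split_text(text, chunk_size=12):
    """Split text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

def embed(text):
    """Toy embedding: term counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(documents):
    """Indexing: load each document, split it, and store (text, vector) rows."""
    index = []
    for doc in documents:
        for chunk in split_text(doc):
            index.append({"text": chunk, "vector": embed(chunk)})
    return index

def retrieve(index, query, top_k=2):
    """Retrieval: return the texts of the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda row: cosine(q, row["vector"]), reverse=True)
    return [row["text"] for row in ranked[:top_k]]

def build_prompt(query, context_chunks):
    """Generation input: the prompt an LLM would receive, enriched with context."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Milvus is an open-source vector database built for similarity search.",
    "Paris is the capital of France and is known for the Eiffel Tower.",
]
index = build_index(docs)
top = retrieve(index, "What is a vector database?", top_k=1)
prompt = build_prompt("What is a vector database?", top)
print(prompt)
```

The final prompt is what would be sent to the LLM; the model call itself is left out because it depends on the provider you use.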
There are different ways to improve your RAG application. In this blog, we will discuss how the Milvus vector database can boost the performance of your RAG apps with its Metadata Filtering, Hybrid Search, and Agent capabilities.
Metadata Filtering
When inserting your data in Milvus, it’s helpful to include metadata about your data. For example, if you work with PDFs, you can insert the page the vectors belong to, the name of the PDF file, the author, etc.
Storing metadata in Milvus allows you to filter out irrelevant data before or during the vector search, making your retrieval faster and more efficient. This approach is particularly useful in a RAG application, as it ensures that you only pass the LLM content that is relevant to the user's query.
Milvus supports full-string metadata matching, making it possible to match strings using prefix, infix, postfix, or even character wildcard searches.
# Prefix example, matches any string starting with “The”.
expression='title like "The%"'
# Infix example, matches any string containing “the” anywhere, including inside a longer word.
expression='title like "%the%"'
# Postfix example, matches any string ending with “Rye”.
expression='title like "%Rye"'
# Single character wildcard example, matches any one single character at a specific position.
expression='title like "Flip_ed"'
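To make the wildcard semantics concrete, here is a small plain-Python emulation of the patterns above. It is not Milvus code: the `like` and `like_to_regex` helpers are hypothetical names, and the sketch assumes case-sensitive matching, with `%` standing for any run of characters and `_` for exactly one character.

```python
import re

def like_to_regex(pattern):
    """Translate a LIKE-style pattern into a regular expression.

    `%` matches any run of characters (including none);
    `_` matches exactly one character; everything else is literal.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def like(value, pattern):
    """Return True when the whole value matches the pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None

titles = ["The Catcher in the Rye", "Another Day", "Flipped", "the end"]

print([t for t in titles if like(t, "The%")])     # prefix
print([t for t in titles if like(t, "%the%")])    # infix (substring, not whole word)
print([t for t in titles if like(t, "%Rye")])     # postfix
print([t for t in titles if like(t, "Flip_ed")])  # single-character wildcard
```

Note that the infix pattern also matches "Another Day", because "the" appears inside "Another".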
It is also possible to use array values, either by exact matches or by checking if any elements in the array match with contains_any().
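As a rough sketch of what a "contains any" check means, the snippet below applies it to rows held in plain Python lists. The `contains_any` helper, the `tags` field, and the sample rows are made up for illustration. The `ARRAY_CONTAINS_ANY(...)` string produced by `contains_any_expr` is an assumption about how such a filter is spelled for array fields, so check the filter expression reference for your Milvus version before relying on it.

```python
import json

def contains_any(array_value, candidates):
    """True when at least one candidate appears in the array field."""
    return any(item in array_value for item in candidates)

def contains_any_expr(field, candidates):
    """Build a filter string (assumed operator name: ARRAY_CONTAINS_ANY)."""
    return f"ARRAY_CONTAINS_ANY({field}, {json.dumps(candidates)})"

rows = [
    {"title": "The Catcher in the Rye", "tags": ["fiction", "classic"]},
    {"title": "Flipped", "tags": ["fiction", "romance"]},
    {"title": "Cosmos", "tags": ["science"]},
]

wanted = ["classic", "science"]
matches = [r["title"] for r in rows if contains_any(r["tags"], wanted)]
expr = contains_any_expr("tags", wanted)

print(matches)
print(expr)
```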
Hybrid Search
Milvus allows up to 10 vector fields in a single collection. This support enables hybrid search, letting users search across multiple vector columns simultaneously. This capability facilitates multimodal search, hybrid sparse and dense search, and hybrid dense and full-text search, providing versatile and flexible search functionality.
Vectors in different columns represent various aspects of data generated by different embedding models with distinct processing methods. The hybrid search results are then combined and reranked using various re-ranking strategies.
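One widely used re-ranking strategy is reciprocal rank fusion (RRF), which merges several ranked result lists using only the position of each item in each list. The sketch below is a plain-Python version of that idea, not the Milvus implementation. The function name, the example IDs, and the constant `k=60` (a conventional default) are assumptions chosen for illustration.

```python
def reciprocal_rank_fusion(result_lists, k=60):
    """Fuse ranked lists of IDs: each hit contributes 1 / (k + rank)."""
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from two vector columns, best match first.
dense_hits = ["a", "b", "c"]
sparse_hits = ["b", "d", "a"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Items that rank well in both lists rise to the top, while items found by only one search still appear further down. A weighted combination of scores is another option when one vector column should count more than the others.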
How Hybrid Search Works in Milvus
Represent multiple perspectives of information. For instance, in e-commerce, product images include front, side, and top views. Different views can be represented with different types or dimensions of vectors.
Utilize various types of vector embeddings. This includes dense embeddings from Transformer-based models like BERT and sparse embeddings from methods like BM25, BGE-M3, and SPLADE.
Support the fusion of multimodal vectors from various unstructured data types such as images, videos, audio, and text files. For example, in criminal investigations, suspects can be represented through biometric modalities such as fingerprints, voiceprints, and facial recognition, aiding in identifying individuals across different modalities.
Support the fusion of vector search and full-text search.
Agents in RAG Applications
Large language models can't take actions themselves; they just output text. Agents are systems that use LLMs as reasoning engines to determine which actions to take and which inputs to pass to them. After an action is executed, its result can be fed back into the LLM to determine whether more actions are needed or whether it is okay to finish.
They can be used to perform actions such as searching the web, browsing your emails, applying corrective RAG to add self-reflection or self-grading on retrieved documents, and many more.
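The decide, act, and observe loop described above can be sketched as follows. Everything here is hypothetical: `scripted_decide` is a hard-coded stand-in for an LLM call, and the two tools are stubs that return fixed values. A framework such as LangGraph would replace this hand-written loop in practice.

```python
def run_agent(question, decide, tools, max_steps=5):
    """Ask the reasoning engine for an action, run it, and feed the result back."""
    observations = []
    for _ in range(max_steps):
        action, argument = decide(question, observations)
        if action == "finish":
            return argument
        result = tools[action](argument)
        observations.append((action, result))
    return None  # no final answer within max_steps

def scripted_decide(question, observations):
    """Stand-in for an LLM: search the index first, fall back to the web."""
    if not observations:
        return ("search_index", question)
    last_action, last_result = observations[-1]
    if last_action == "search_index" and not last_result:
        return ("search_web", question)
    return ("finish", f"Answer based on: {last_result}")

tools = {
    "search_index": lambda q: [],                # pretend the vector store has no match
    "search_web": lambda q: ["fresh document"],  # pretend the web search finds one
}

answer = run_agent("What changed recently?", scripted_decide, tools)
print(answer)
```

The `max_steps` limit is a design choice that keeps the loop from running forever if the reasoning engine never decides to finish.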
Once set up, an agent can add new data to Milvus if the data isn’t stored already, or it can retrieve existing data and pass it to your LLM. This keeps your RAG system up to date. Milvus also makes it easy to update your data if needed with the `upsert()` function. This is why having metadata in your collection is important, as it allows you to upsert only the necessary information.
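Here is one way metadata can limit what gets upserted, sketched with a plain dictionary standing in for a collection. The `hash` metadata field, the `sync` helper, and the row layout are assumptions for this example. With Milvus you would call its `upsert()` function instead of writing to a dictionary.

```python
import hashlib

def content_hash(text):
    """Fingerprint of the text, stored as metadata next to each row."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def upsert(store, row):
    """Insert the row, or overwrite the existing row with the same primary key."""
    store[row["id"]] = row

def sync(store, documents):
    """Upsert only documents that are new or whose content has changed."""
    written = []
    for doc_id, text in documents.items():
        digest = content_hash(text)
        existing = store.get(doc_id)
        if existing is None or existing["hash"] != digest:
            upsert(store, {"id": doc_id, "text": text, "hash": digest})
            written.append(doc_id)
    return written

store = {}
first = sync(store, {"a": "intro", "b": "pricing v1"})
second = sync(store, {"a": "intro", "b": "pricing v2", "c": "faq"})
print(first, second)
```

On the second run only the changed row and the new row are written, so unchanged content does not need to be embedded again.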
Feel free to check out our blog where we showcase how to build a local agentic RAG system with LangGraph, Llama 3, and Milvus.
Conclusion
To conclude, using Metadata Filtering, Hybrid Search, and Agents, all integrated in Milvus, can enhance your RAG application.
Metadata Filtering allows you to enrich your data with additional attributes, enabling precise and efficient searches. Hybrid Search expands the search capabilities by allowing queries across multiple vector columns. Agents take the functionality a step further by automating actions based on the LLM's output.
You can build a more robust and efficient RAG application using these strategies.
If you enjoyed this blog post, consider giving us a star on GitHub and joining our Discord to share your experiences with the community.