Metadata Filtering, Hybrid Search or Agent When Building Your RAG Application


Jul 12, 2024 · 4 min read

What is RAG?

Retrieval Augmented Generation (RAG) is a technique that enhances LLMs by integrating additional data sources. A typical RAG application involves:

Indexing - A pipeline for ingesting data from a source and indexing it, which usually consists of loading, splitting, and storing the data in Milvus.
Retrieval and generation - At runtime, the application takes the user's query, fetches relevant data from the index stored in Milvus, and passes it to the LLM, which generates a response based on this enriched context.
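
The two stages above can be sketched in plain Python. This is only a toy illustration: the in-memory list stands in for a Milvus collection, the bag-of-words `embed` function stands in for a real embedding model, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

def split_text(text, chunk_size=8):
    """Split text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

def embed(text):
    """Toy embedding: word counts (assumption: a real app would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(documents):
    """Indexing stage: load, split, embed, and store each chunk."""
    index = []
    for doc in documents:
        for chunk in split_text(doc):
            index.append({"text": chunk, "vector": embed(chunk)})
    return index

def retrieve(index, query, top_k=1):
    """Retrieval stage: return the texts of the top_k most similar chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda e: cosine(q, e["vector"]), reverse=True)
    return [e["text"] for e in ranked[:top_k]]

def build_prompt(query, context_chunks):
    """Generation stage input: the query enriched with retrieved context."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {query}"

documents = [
    "Milvus is a vector database built for similarity search.",
    "Surfing requires a board and some patience with the waves.",
]
index = build_index(documents)
question = "What is a vector database?"
top_chunks = retrieve(index, question)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

The resulting prompt is what would be sent to the LLM; in a real application the list and the similarity loop would be replaced by inserts into and searches against Milvus.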

There are different ways to improve your RAG application. In this blog, we will discuss how the Milvus vector database can boost the performance of your RAG apps with its Metadata Filtering, Hybrid Search, and Agent capabilities.

Metadata Filtering

When inserting your data into Milvus, it’s helpful to include metadata alongside the vectors. For example, if you work with PDFs, you can store the page number each vector comes from, the name of the PDF file, the author, etc.

Storing metadata in Milvus allows you to filter out irrelevant data, so the search only considers entries that satisfy your conditions. This is particularly useful in a RAG application, as it helps ensure that you only pass the LLM content that is relevant to the user's query.

Milvus supports full-string metadata matching, making it possible to match strings using prefix, infix, postfix, or even character wildcard searches.

    # Prefix example: matches any string starting with "The".
    expression = 'title like "The%"'
    # Infix example: matches any string containing the substring "the" anywhere.
    expression = 'title like "%the%"'
    # Postfix example: matches any string ending with "Rye".
    expression = 'title like "%Rye"'
    # Single-character wildcard example: "_" matches exactly one character at that position.
    expression = 'title like "Flip_ed"'

It is also possible to filter on array fields, either by exact match or by checking whether any element of the array matches one of a set of values with ARRAY_CONTAINS_ANY().
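
To make the matching rules concrete, here is a small plain-Python emulation of what the wildcard patterns and the any-element array check select. It is a sketch for intuition only, not how Milvus evaluates filters; the helper names `like` and `contains_any` are hypothetical, and the sketch assumes case-sensitive matching.

```python
import re

def like_to_regex(pattern):
    """Translate a LIKE pattern: '%' matches any run of characters, '_' exactly one."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def like(value, pattern):
    """Return True if the whole string matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None

def contains_any(values, candidates):
    """Return True if at least one element of `values` is in `candidates`."""
    return any(v in candidates for v in values)

# Hypothetical rows with a string field and an array field.
books = [
    {"title": "The Catcher in the Rye", "tags": ["fiction", "classic"]},
    {"title": "Flipped", "tags": ["romance"]},
    {"title": "Into the Wild", "tags": ["travel", "biography"]},
]

prefix_hits = [b["title"] for b in books if like(b["title"], "The%")]
infix_hits = [b["title"] for b in books if like(b["title"], "%the%")]
wildcard_hits = [b["title"] for b in books if like(b["title"], "Flip_ed")]
tag_hits = [b["title"] for b in books if contains_any(b["tags"], ["classic", "travel"])]
print(prefix_hits, infix_hits, wildcard_hits, tag_hits)
```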

Hybrid Search

Milvus allows up to 10 vector fields in a single collection. This enables hybrid search, which lets users search across multiple vector columns simultaneously. Hybrid search supports multimodal search, combined sparse and dense search, and combined dense and full-text search.

Vectors in different columns can represent different aspects of the same data, generated by different embedding models or processing methods. The results from each column are then combined and reranked using a reranking strategy.
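
One common way to combine ranked results from several vector columns is Reciprocal Rank Fusion (RRF). The sketch below shows the idea in plain Python; it is an illustration under stated assumptions, not Milvus's internal implementation. The function name `rrf_fuse`, the 1-based ranks, and the smoothing constant `k=60` are choices made for this example.

```python
def rrf_fuse(result_lists, k=60):
    """Fuse several ranked lists of IDs with Reciprocal Rank Fusion.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1 (an assumption of this sketch).
    """
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical top-3 hits from a dense-vector column and a sparse-vector column.
dense_hits = ["a", "b", "c"]
sparse_hits = ["b", "d", "a"]

fused = rrf_fuse([dense_hits, sparse_hits])
print(fused)
```

Documents that rank well in both lists ("a" and "b" here) rise to the top, while documents seen by only one column fall behind them.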

How Hybrid Search Works in Milvus

Represent multiple perspectives of information. For instance, in e-commerce, product images include front, side, and top views. Different views can be represented with different types or dimensions of vectors.
Utilize various types of vector embeddings. This includes dense embeddings from models such as BERT and other Transformers, and sparse embeddings from methods such as BM25, BGE-M3, and SPLADE.
Support the fusion of multimodal vectors from various unstructured data types such as images, videos, audio, and text files. For example, in criminal investigations, suspects can be represented through biometric modalities such as fingerprints, voiceprints, and facial recognition, aiding in identifying individuals across different modalities.
Support the fusion of vector search and full-text search.

Agents in RAG Applications

Large language models can't take actions themselves; they only output text. Agents are systems that use LLMs as reasoning engines to determine which actions to take and which inputs to pass to them. After an action is executed, its result can be fed back to the LLM to determine whether more actions are needed or whether it is okay to finish.

They can be used to perform actions such as searching the web, browsing your emails, running Corrective RAG to add self-reflection or self-grading on retrieved documents, and many more.
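
The decide, act, observe loop described above can be sketched in a few lines of plain Python. Everything here is hypothetical: `fake_llm` is a hard-coded stand-in for a real LLM call, and the two tools are stubs rather than real search integrations.

```python
# Hypothetical knowledge base standing in for data already stored in Milvus.
KNOWLEDGE_BASE = {"what is milvus?": "Milvus is a vector database."}

def search_knowledge_base(query):
    """Stub tool: look the query up in the local knowledge base."""
    return KNOWLEDGE_BASE.get(query.lower())

def search_web(query):
    """Stub tool: pretend to search the web."""
    return f"(web result for: {query})"

TOOLS = {"search_knowledge_base": search_knowledge_base, "search_web": search_web}

def fake_llm(question, observations):
    """Stand-in for the LLM: pick the next action from what has been observed so far."""
    if not observations:
        return {"action": "search_knowledge_base", "input": question}
    if observations[-1] is None:  # the last tool call found nothing
        return {"action": "search_web", "input": question}
    return {"action": "finish", "input": observations[-1]}

def run_agent(question, max_steps=5):
    """Loop: ask the LLM for an action, run it, feed the result back."""
    observations, trace = [], []
    for _ in range(max_steps):
        decision = fake_llm(question, observations)
        trace.append(decision["action"])
        if decision["action"] == "finish":
            return decision["input"], trace
        observations.append(TOOLS[decision["action"]](decision["input"]))
    return None, trace  # gave up after max_steps

print(run_agent("What is Milvus?"))
print(run_agent("Who won the match?"))
```

The `max_steps` guard is a design choice for this sketch: it keeps the loop from running forever if the model never decides to finish.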

Once set up, agents can add new data to Milvus if it isn’t stored already, or retrieve existing data and pass it back to your LLM. This allows your RAG system to stay up to date. Milvus also makes it easy to update your data when needed with the `upsert()` function.

This is why having metadata in your collection is important, as it allows you to upsert only the necessary information.
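
As a rough illustration of upsert semantics (insert when the primary key is new, overwrite when it already exists), here is a plain-Python sketch. The dictionary stands in for a collection, and deriving the key from the file name and page number is a hypothetical scheme chosen for this example, not something Milvus prescribes.

```python
def make_key(source, page):
    """Hypothetical primary key built from metadata: file name plus page number."""
    return f"{source}#p{page}"

def upsert(collection, rows):
    """Insert rows whose key is new, overwrite rows whose key already exists."""
    inserted, updated = 0, 0
    for row in rows:
        key = make_key(row["source"], row["page"])
        if key in collection:
            updated += 1
        else:
            inserted += 1
        collection[key] = dict(row)
    return inserted, updated

collection = {}
first = upsert(collection, [
    {"source": "report.pdf", "page": 1, "text": "Draft introduction"},
    {"source": "report.pdf", "page": 2, "text": "Draft results"},
])
# Only page 2 changed, so only that entry needs to be sent again.
second = upsert(collection, [
    {"source": "report.pdf", "page": 2, "text": "Final results"},
])
print(first, second, len(collection))
```

Because the key comes from the metadata, a re-processed page replaces its old entry instead of creating a duplicate.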

Feel free to check out our blog post on how to build a local agentic RAG system with LangGraph, Llama 3, and Milvus.

Conclusion

To conclude, using Metadata Filtering, Hybrid Search, and Agents, all integrated in Milvus, can enhance your RAG application.

Metadata Filtering allows you to enrich your data with additional attributes, enabling precise and efficient searches. Hybrid Search expands the search capabilities by allowing queries across multiple vector columns. Agents take the functionality a step further by automating actions based on the LLM's output.

You can build a more robust and efficient RAG application using these strategies.

If you enjoyed this blog post, consider giving us a star on GitHub and joining our Discord to share your experiences with the community.

Updated on Aug 01, 2024

Stephen Batifol
Stephen Batifol is a Developer Advocate at Zilliz. He previously worked as a Machine Learning Engineer at Wolt, where he was working on the ML Platform and as a Data Scientist at Brevo. Stephen studied Computer Science and Artificial Intelligence. He enjoys dancing and surfing.
