Haystack and Zilliz Cloud Integration
Haystack and Zilliz Cloud integrate to build powerful Retrieval-Augmented Generation (RAG) applications: Haystack contributes an open-source Python framework for LLM-powered pipelines, while Zilliz Cloud contributes a high-performance vector database for efficient document storage, semantic search, and scalable retrieval.
What is Haystack
Haystack is an open-source Python framework by deepset for building custom applications with large language models (LLMs). It lets users construct LLM-powered pipelines for diverse search scenarios, including retrieval-augmented generation, question answering, and semantic document exploration. Its modular architecture allows seamless incorporation of external technologies, and users can query in natural language without learning complex query syntax.
By integrating with Zilliz Cloud (fully managed Milvus), Haystack gains access to a fully managed vector database that provides efficient storage and retrieval of high-dimensional vectors, fast similarity search operations critical for semantic exploration and RAG pipelines, and horizontal scaling capabilities to handle large-scale deployments and growing data volumes.
Benefits of the Haystack + Zilliz Cloud Integration
- Efficient storage and retrieval: Zilliz Cloud manages high-dimensional vectors effectively, particularly beneficial when handling extensive document collections and LLM-generated embeddings in Haystack pipelines.
- Fast similarity search: Zilliz Cloud optimizes similarity search operations critical for semantic exploration and RAG pipelines, significantly accelerating retrieval within Haystack workflows.
- Scalability: Horizontal scaling capabilities allow Haystack to manage large-scale deployments and growing data volumes efficiently with Zilliz Cloud as the vector storage backend.
- Modular pipeline integration: Haystack's modular architecture enables seamless incorporation of Zilliz Cloud as a document store, enhancing overall system efficiency without requiring complex configuration.
How the Integration Works
Haystack provides the pipeline framework for building LLM-powered applications. It offers modular components for document conversion, text splitting, embedding generation, retrieval, prompt building, and text generation — all connectable into end-to-end pipelines for indexing and querying.
Zilliz Cloud serves as the vector database layer through the MilvusDocumentStore, storing and indexing document embeddings for fast similarity search. It provides high-performance retrieval with low latency, enabling Haystack pipelines to find the most relevant documents from large knowledge bases.
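To make that connection step concrete, a small helper might assemble the `connection_args` dictionary that `MilvusDocumentStore` expects. This is a hypothetical sketch: the environment-variable names below are placeholders of our choosing, not an official Zilliz or Haystack convention, and the `uri`/`token` values correspond to the Public Endpoint and API Key shown in the Zilliz Cloud console.

```python
import os


def zilliz_connection_args(
    uri_env: str = "ZILLIZ_CLOUD_URI", token_env: str = "ZILLIZ_CLOUD_API_KEY"
) -> dict:
    """Assemble the connection_args dict later passed to MilvusDocumentStore.

    Hypothetical helper: the environment-variable names are placeholders.
    `uri` is the cluster's Public Endpoint and `token` its API Key,
    both copied from the Zilliz Cloud console.
    """
    uri = os.environ.get(uri_env)
    token = os.environ.get(token_env)
    if not uri or not token:
        raise RuntimeError(f"Set {uri_env} and {token_env} before connecting")
    return {"uri": uri, "token": token}
```

The step-by-step guide below passes these values inline instead; reading them from the environment simply keeps credentials out of source code.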
Together, Haystack and Zilliz Cloud create a complete RAG solution: Haystack's indexing pipeline processes documents — converting, splitting, and embedding them — then stores them in Zilliz Cloud via MilvusDocumentStore. When a query comes in, Haystack's retrieval pipeline uses MilvusEmbeddingRetriever to find relevant documents through vector similarity search, then passes them to the LLM to generate contextually informed responses.
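The query path boils down to: embed the question, rank stored document vectors by similarity, and hand the top hits to the LLM. The following dependency-free toy sketch illustrates only the similarity-search step; the three-dimensional vectors and sample sentences are made-up stand-ins for real embedding-model output, and Zilliz Cloud performs the same nearest-neighbor ranking at scale using ANN indexes.

```python
import math


def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Tiny in-memory "document store" of (text, embedding) pairs. Real embeddings
# have hundreds of dimensions; three are used here just to show the mechanics.
store = [
    ("Da Vinci painted the Mona Lisa.", [0.9, 0.1, 0.0]),
    ("The Warrior drawing is held in the British Museum.", [0.1, 0.9, 0.2]),
    ("Milvus indexes high-dimensional vectors.", [0.0, 0.2, 0.9]),
]


def retrieve(query_embedding, top_k=2):
    # Rank every stored document by similarity to the query vector
    # and keep the top_k best matches.
    ranked = sorted(store, key=lambda d: cosine(query_embedding, d[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


print(retrieve([0.2, 0.8, 0.1], top_k=1))  # the "Warrior" sentence ranks first
```

In the real pipeline, `OpenAITextEmbedder` produces the query vector and `MilvusEmbeddingRetriever` delegates this ranking to Zilliz Cloud.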
Step-by-Step Guide
1. Install Dependencies
Install the required packages:
```shell
pip install --upgrade --quiet pymilvus milvus-haystack markdown-it-py mdit_plain
```
2. Prepare the OpenAI API Key
Prepare the OpenAI API key as an environment variable:
```python
import os

os.environ["OPENAI_API_KEY"] = "sk-***********"
```
3. Prepare the Data
Download online content about Leonardo da Vinci to serve as the private knowledge store for the RAG pipeline:
```python
import os
import urllib.request

url = "https://www.gutenberg.org/cache/epub/7785/pg7785.txt"
file_path = "./davinci.txt"

if not os.path.exists(file_path):
    urllib.request.urlretrieve(url, file_path)
```
4. Create the Indexing Pipeline
Create an indexing pipeline that converts the text into documents, splits them into sentences, embeds them, and writes them to the Milvus document store:
```python
from haystack import Pipeline
from haystack.components.converters import MarkdownToDocument
from haystack.components.embedders import OpenAIDocumentEmbedder, OpenAITextEmbedder
from haystack.components.preprocessors import DocumentSplitter
from haystack.components.writers import DocumentWriter
from haystack.utils import Secret
from milvus_haystack import MilvusDocumentStore
from milvus_haystack.milvus_embedding_retriever import MilvusEmbeddingRetriever

document_store = MilvusDocumentStore(
    connection_args={"uri": "./milvus.db"},
    # connection_args={"uri": "http://localhost:19530"},
    # connection_args={"uri": YOUR_ZILLIZ_CLOUD_URI, "token": Secret.from_env_var("ZILLIZ_CLOUD_API_KEY")},
    drop_old=False,
)
```

For the `connection_args`:
- Setting the `uri` to a local file, e.g. `./milvus.db`, is the most convenient method, as it automatically utilizes Milvus Lite to store all data in that file.
- If you have a large scale of data, you can set up a more performant Milvus server on Docker or Kubernetes and use the server URI, e.g. `http://localhost:19530`.
- If you want to use Zilliz Cloud, the fully managed cloud service for Milvus, adjust the `uri` and `token`, which correspond to the Public Endpoint and API Key in Zilliz Cloud.

```python
indexing_pipeline = Pipeline()
indexing_pipeline.add_component("converter", MarkdownToDocument())
indexing_pipeline.add_component(
    "splitter", DocumentSplitter(split_by="sentence", split_length=2)
)
indexing_pipeline.add_component("embedder", OpenAIDocumentEmbedder())
indexing_pipeline.add_component("writer", DocumentWriter(document_store))
indexing_pipeline.connect("converter", "splitter")
indexing_pipeline.connect("splitter", "embedder")
indexing_pipeline.connect("embedder", "writer")

indexing_pipeline.run({"converter": {"sources": [file_path]}})
print("Number of documents:", document_store.count_documents())
```
5. Create the Retrieval Pipeline
Create a retrieval pipeline that retrieves documents from the Milvus document store using vector similarity search:
```python
question = 'Where is the painting "Warrior" currently stored?'

retrieval_pipeline = Pipeline()
retrieval_pipeline.add_component("embedder", OpenAITextEmbedder())
retrieval_pipeline.add_component(
    "retriever", MilvusEmbeddingRetriever(document_store=document_store, top_k=3)
)
retrieval_pipeline.connect("embedder", "retriever")

retrieval_results = retrieval_pipeline.run({"embedder": {"text": question}})

for doc in retrieval_results["retriever"]["documents"]:
    print(doc.content)
    print("-" * 10)
```
6. Create the RAG Pipeline
Create a RAG pipeline that combines the MilvusEmbeddingRetriever and the OpenAIGenerator to answer the question using the retrieved documents:
```python
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

prompt_template = """Answer the following query based on the provided context. If the context does not include an answer, reply with 'I don't know'.\n
Query: {{query}}
Documents:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Answer:
"""

rag_pipeline = Pipeline()
rag_pipeline.add_component("text_embedder", OpenAITextEmbedder())
rag_pipeline.add_component(
    "retriever", MilvusEmbeddingRetriever(document_store=document_store, top_k=3)
)
rag_pipeline.add_component("prompt_builder", PromptBuilder(template=prompt_template))
rag_pipeline.add_component(
    "generator",
    OpenAIGenerator(
        api_key=Secret.from_token(os.getenv("OPENAI_API_KEY")),
        generation_kwargs={"temperature": 0},
    ),
)
rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "generator")

results = rag_pipeline.run(
    {
        "text_embedder": {"text": question},
        "prompt_builder": {"query": question},
    }
)
print("RAG answer:", results["generator"]["replies"][0])
```
Learn More
- Retrieval-Augmented Generation (RAG) with Milvus and Haystack — Official Milvus tutorial for building RAG with Haystack
- Building a RAG Pipeline with Milvus and Haystack 2.0 — Zilliz tutorial on building efficient RAG pipelines
- milvus-haystack on PyPI — The official Milvus integration package for Haystack
- milvus-haystack GitHub Repository — Source code for the Milvus-Haystack integration
- Haystack Documentation — Official Haystack documentation by deepset