Cohere and Zilliz Cloud Integration
Cohere and Zilliz Cloud integrate to power semantic search and question-answering systems, combining Cohere's multilingual embedding models and NLP capabilities with Zilliz Cloud's high-performance vector database for efficient similarity search across large-scale datasets.
What is Cohere
Cohere provides multilingual language models that let developers create vector embeddings representing the meaning of text as a list of numbers. Built on state-of-the-art natural language processing (NLP), these models support applications such as question-answering systems, product recommendation engines, and semantic retrieval for LLM augmentation.
By integrating with Zilliz Cloud (fully managed Milvus), Cohere's powerful embedding models are paired with a highly scalable vector database that delivers semantic searches — retrieving results based on meaning and context rather than exact matches — with rapid query response times, enabling real-time analysis of unstructured data across industries like healthcare, finance, and e-commerce.
Benefits of the Cohere + Zilliz Cloud Integration
- Advanced semantic search: Cohere's embeddings capture the meaning of text, while Zilliz Cloud retrieves results based on semantic similarity rather than exact keyword matches, improving accuracy and relevance.
- Multilingual support: Cohere's embed-multilingual-v3.0 model generates embeddings across multiple languages, and Zilliz Cloud stores and retrieves them efficiently, enabling cross-language search applications.
- Real-time data analysis: Zilliz Cloud's rapid query response times paired with Cohere's embedding generation enable real-time analysis of unstructured data for quick insights and informed decision-making.
- Scalability for large datasets: Zilliz Cloud is highly scalable and can handle massive amounts of data, making it suitable for large-scale embedding storage and retrieval powered by Cohere models.
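The retrieval idea behind these benefits can be illustrated with a small, self-contained sketch. The vectors and document labels below are illustrative stand-ins for Cohere embeddings, and the brute-force inner-product scan stands in for Zilliz Cloud's indexed search; none of the names here belong to either API.

```python
import numpy as np

# Toy "document embeddings" (rows) -- in the real integration these come from Cohere
doc_vectors = np.array([
    [0.9, 0.1, 0.0],   # doc 0: about antibiotics
    [0.0, 0.8, 0.6],   # doc 1: about dog breeds
    [0.7, 0.2, 0.1],   # doc 2: about disinfectants
])
documents = ['antibiotics', 'dog breeds', 'disinfectants']

def top_k_by_inner_product(query_vector, k=2):
    # Score every document by inner product, then keep the k best --
    # the same metric ('IP') the tutorial below configures in Zilliz Cloud
    scores = doc_vectors @ query_vector
    best = np.argsort(scores)[::-1][:k]
    return [(documents[i], float(scores[i])) for i in best]

query = np.array([1.0, 0.0, 0.0])  # toy "query embedding"
print(top_k_by_inner_product(query))
# → [('antibiotics', 0.9), ('disinfectants', 0.7)]
```

A vector database replaces this linear scan with an approximate index (such as the IVF_FLAT index used later), which is what keeps queries fast at large scale.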
How the Integration Works
Cohere serves as the embedding and NLP layer, converting text into high-dimensional vector embeddings using models like embed-multilingual-v3.0. It provides separate input types for documents (search_document) and queries (search_query) to optimize retrieval accuracy.

Zilliz Cloud serves as the vector database layer, storing and indexing the embeddings generated by Cohere. It provides high-performance similarity search using metrics like inner product (IP), enabling fast retrieval of the most relevant results from large collections.
Together, Cohere and Zilliz Cloud create an end-to-end semantic search solution: text data is embedded using Cohere's models and stored in Zilliz Cloud. When a user submits a query, Cohere embeds the query text, and Zilliz Cloud performs similarity search to find the closest matching documents, enabling applications like question answering, recommendations, and contextual retrieval.
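A quick numpy sketch (illustrative only, not part of either API) shows why the inner product (IP) metric works well here: for unit-length vectors, the inner product and cosine similarity are the same number, so ranking by IP ranks by angular closeness. Whether a given embedding model returns unit-normalized vectors is something to verify in that model's documentation.

```python
import numpy as np

def normalize(v):
    # Scale a vector to unit length
    return v / np.linalg.norm(v)

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = normalize(np.array([3.0, 4.0]))
b = normalize(np.array([1.0, 2.0]))

# For unit-length vectors, the inner product equals cosine similarity,
# so an 'IP' index ranks results exactly as cosine similarity would
inner_product = float(np.dot(a, b))
assert abs(inner_product - cosine_similarity(a, b)) < 1e-12
```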
Step-by-Step Guide
1. Install Required Packages
```shell
pip install pymilvus cohere pandas numpy tqdm
```
2. Load Modules and Set Parameters
```python
import time  # used to pause between insert batches

import cohere
import pandas
import numpy as np
from tqdm import tqdm
from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection, utility

FILE = 'https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json'
COLLECTION_NAME = 'question_answering_db'
DIMENSION = 1024
COUNT = 5000
BATCH_SIZE = 96
MILVUS_HOST = 'localhost'
MILVUS_PORT = '19530'
COHERE_API_KEY = 'replace-this-with-the-cohere-api-key'
```
3. Prepare the Dataset
Use the Stanford Question Answering Dataset (SQuAD) as the truth source for answering questions:
```python
# Download the dataset
dataset = pandas.read_json(FILE)

# Clean up the dataset by grabbing all the question-answer pairs
simplified_records = []
for x in dataset['data']:
    for y in x['paragraphs']:
        for z in y['qas']:
            if len(z['answers']) != 0:
                simplified_records.append({'question': z['question'], 'answer': z['answers'][0]['text']})

# Grab the number of records based on COUNT
simplified_records = pandas.DataFrame.from_records(simplified_records)
simplified_records = simplified_records.sample(n=min(COUNT, len(simplified_records)), random_state=42)

print(len(simplified_records))
```
4. Create a Collection in Milvus
```python
# Connect to the Milvus database
connections.connect(host=MILVUS_HOST, port=MILVUS_PORT)

# Remove the collection if it already exists
if utility.has_collection(COLLECTION_NAME):
    utility.drop_collection(COLLECTION_NAME)

# Create a collection holding the id, original question, answer, and question embedding
fields = [
    FieldSchema(name='id', dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name='original_question', dtype=DataType.VARCHAR, max_length=1000),
    FieldSchema(name='answer', dtype=DataType.VARCHAR, max_length=1000),
    FieldSchema(name='original_question_embedding', dtype=DataType.FLOAT_VECTOR, dim=DIMENSION)
]
schema = CollectionSchema(fields=fields)
collection = Collection(name=COLLECTION_NAME, schema=schema)

# Create an IVF_FLAT index on the embedding field
index_params = {
    'metric_type': 'IP',
    'index_type': 'IVF_FLAT',
    'params': {'nlist': 1024}
}
collection.create_index(field_name='original_question_embedding', index_params=index_params)
collection.load()
```
5. Insert Data with Cohere Embeddings
```python
# Set up a Cohere client
cohere_client = cohere.Client(COHERE_API_KEY)

# Extract embeddings from questions using Cohere
def embed(texts, input_type):
    res = cohere_client.embed(texts, model='embed-multilingual-v3.0', input_type=input_type)
    return res.embeddings

# Insert each question, answer, and question embedding
for batch in tqdm(np.array_split(simplified_records, (COUNT // BATCH_SIZE) + 1)):
    questions = batch['question'].tolist()
    answers = batch['answer'].tolist()
    embeddings = embed(questions, 'search_document')
    data = [
        {
            'original_question': questions[i],
            'answer': answers[i],
            'original_question_embedding': embeddings[i]
        }
        for i in range(len(questions))
    ]
    collection.insert(data=data)
    time.sleep(10)  # pause between batches to stay under API rate limits (requires `import time`)
```
6. Ask Questions and Search
Search the collection for answers by embedding the query with Cohere and performing similarity search in Milvus:
```python
def search(text, top_k=5):
    search_params = {}
    results = collection.search(
        data=embed([text], 'search_query'),
        anns_field='original_question_embedding',
        param=search_params,
        limit=top_k,
        output_fields=['original_question', 'answer']
    )
    distances = results[0].distances
    entities = [x.entity.to_dict()['entity'] for x in results[0]]
    ret = [
        {
            'answer': entity['answer'],
            'distance': distance,
            'original_question': entity['original_question']
        }
        for distance, entity in zip(distances, entities)
    ]
    return ret

search_questions = ['What kills bacteria?', 'What\'s the biggest dog?']
ret = [{'question': x, 'candidates': search(x)} for x in search_questions]
```
Learn More
- Question Answering Using Milvus and Cohere — Official Milvus tutorial for building QA with Cohere
- Build RAG Chatbot with LangChain, Milvus, and Cohere Command R+ — Zilliz RAG tutorial with Cohere
- Scaling Search with Milvus: Handling Massive Datasets with Ease — Zilliz blog on scaling search with Cohere embeddings
- Cohere Documentation — Official Cohere documentation
- Cohere Embed API Reference — Cohere embedding API reference