BAAI / bge-m3
Milvus Integrated
Task: Embedding
Modality: Text
Similarity Metric: Cosine, dot product
License: Mit
Dimensions: 1024
Max Input Tokens: 8192
Price: Free
Introduction to BGE-M3
BGE-M3 is a text embedding model developed by BAAI, tailored for versatility across different languages and tasks. It supports over 100 languages and handles input sizes of up to 8192 tokens. This capability enables BGE-M3 to excel in dense, sparse, and multi-vector retrieval tasks, making it highly effective for document retrieval and question-answering. The model's state-of-the-art performance is demonstrated in benchmarks like MKQA and MLDR. Its architecture includes self-knowledge distillation, enhancing adaptability and accuracy across various datasets, positioning it as a powerful tool for multilingual information retrieval.
How to create embeddings with BGE-M3
There are two ways to generate vector embeddings:
- PyMilvus: the Python SDK for Milvus that seamlessly integrates the
bge-m3
model. - FlagEmbedding library: the Python library
FlagEmbedding
.
Once the vector embeddings are created, they can be stored in Zilliz Cloud (a fully managed vector database service powered by Milvus) and used for semantic similarity search. Here are four key steps:
- Sign up for a Zilliz Cloud account for free.
- Set up a serverless cluster and obtain the Public Endpoint and API Key.
- Create a vector collection and insert your vector embeddings.
- Run a semantic search on the stored embeddings.
Create embeddings via PyMilvus
from pymilvus.model.hybrid import BGEM3EmbeddingFunction
from pymilvus import MilvusClient, DataType, AnnSearchRequest, WeightedRanker
ef = BGEM3EmbeddingFunction()
docs = [
"Artificial intelligence was founded as an academic discipline in 1956.",
"Alan Turing was the first person to conduct substantial research in AI.",
"Born in Maida Vale, London, Turing was raised in southern England."
]
# Generate embeddings for documents
docs_embeddings = ef.encode_documents(docs)
queries = ["When was artificial intelligence founded",
"Where was Alan Turing born?"]
# Generate embeddings for queries
query_embeddings = ef.encode_queries(queries)
client = MilvusClient(
uri=ZILLIZ_PUBLIC_ENDPOINT,
token=ZILLIZ_API_KEY)
schema = client.create_schema(
auto_id=True,
enable_dynamic_field=True,
)
schema.add_field(field_name="pk", datatype=DataType.INT64, is_primary=True)
schema.add_field(
field_name="dense_vector",
datatype=DataType.FLOAT_VECTOR,
dim=ef.dim["dense"],
)
schema.add_field(
field_name="sparse_vector", datatype=DataType.SPARSE_FLOAT_VECTOR
)
index_params = client.prepare_index_params()
index_params.add_index(
field_name="dense_vector", index_type="FLAT", metric_type="IP"
)
index_params.add_index(
field_name="sparse_vector",
index_type="SPARSE_INVERTED_INDEX",
metric_type="IP",
)
COLLECTION = "documents"
if client.has_collection(collection_name=COLLECTION):
client.drop_collection(collection_name=COLLECTION)
client.create_collection(
collection_name=COLLECTION,
schema=schema,
index_params=index_params,
enable_dynamic_field=True,
)
for i in range(len(docs)):
doc, dense_embedding, sparse_embedding = docs[i], docs_embeddings["dense"][i], docs_embeddings["sparse"][[i]]
client.insert(COLLECTION, {"text": doc, "dense_vector": dense_embedding, "sparse_vector": sparse_embedding})
k = 1
req_list = []
dense_embedding = query_embeddings["dense"][0]
sparse_embedding = query_embeddings["sparse"][[0]]
dense_search_param = {
"data": [dense_embedding],
"anns_field": "dense_vector",
"param": {"metric_type": "IP"},
"limit": k * 2,
}
dense_req = AnnSearchRequest(**dense_search_param)
req_list.append(dense_req)
sparse_search_param = {
"data": [sparse_embedding],
"anns_field": "sparse_vector",
"param": {"metric_type": "IP"},
"limit": k * 2,
}
sparse_req = AnnSearchRequest(**sparse_search_param)
req_list.append(sparse_req)
results = client.hybrid_search(
COLLECTION,
req_list,
WeightedRanker(1.0, 1.0),
k,
consistency_level="Strong",
output_fields=["text"]
)
Refer to our PyMilvus Documentation for a detailed step-by-step guide.
Generate vector embeddings via FlagEmbedding Library
from pymilvus import MilvusClient, DataType, AnnSearchRequest, WeightedRanker
from FlagEmbedding import BGEM3FlagModel
model = BGEM3FlagModel('BAAI/bge-m3', use_fp16=False)
docs = [
"Artificial intelligence was founded as an academic discipline in 1956.",
"Alan Turing was the first person to conduct substantial research in AI.",
"Born in Maida Vale, London, Turing was raised in southern England."
]
# Generate embeddings for documents
docs_embeddings = model.encode(docs, return_dense=True, return_sparse=True)
queries = ["When was artificial intelligence founded",
"Where was Alan Turing born?"]
# Generate embeddings for queries
query_embeddings = model.encode(queries, return_dense=True, return_sparse=True)
client = MilvusClient(
uri=ZILLIZ_PUBLIC_ENDPOINT,
token=ZILLIZ_API_KEY)
schema = client.create_schema(
auto_id=True,
enable_dynamic_field=True,
)
schema.add_field(field_name="pk", datatype=DataType.INT64, is_primary=True)
schema.add_field(
field_name="dense_vector",
datatype=DataType.FLOAT_VECTOR,
dim=1024,
)
schema.add_field(
field_name="sparse_vector", datatype=DataType.SPARSE_FLOAT_VECTOR
)
index_params = client.prepare_index_params()
index_params.add_index(
field_name="dense_vector", index_type="FLAT", metric_type="IP"
)
index_params.add_index(
field_name="sparse_vector",
index_type="SPARSE_INVERTED_INDEX",
metric_type="IP",
)
COLLECTION = "documents"
if client.has_collection(collection_name=COLLECTION):
client.drop_collection(collection_name=COLLECTION)
client.create_collection(
collection_name=COLLECTION,
schema=schema,
index_params=index_params,
enable_dynamic_field=True,
)
for i in range(len(docs)):
doc, dense_embedding, sparse_embedding = docs[i], docs_embeddings["dense_vecs"][i], docs_embeddings["lexical_weights"][i]
client.insert(COLLECTION, {"text": doc, "dense_vector": dense_embedding, "sparse_vector": {int(k): sparse_embedding[k] for k in sparse_embedding}})
k = 1
req_list = []
dense_embedding = query_embeddings["dense_vecs"][0]
sparse_embedding = {int(k): query_embeddings["lexical_weights"][0][k] for k in query_embeddings["lexical_weights"][0]}
dense_search_param = {
"data": [dense_embedding],
"anns_field": "dense_vector",
"param": {"metric_type": "IP"},
"limit": k * 2,
}
dense_req = AnnSearchRequest(**dense_search_param)
req_list.append(dense_req)
sparse_search_param = {
"data": [sparse_embedding],
"anns_field": "sparse_vector",
"param": {"metric_type": "IP"},
"limit": k * 2,
}
sparse_req = AnnSearchRequest(**sparse_search_param)
req_list.append(sparse_req)
results = client.hybrid_search(
COLLECTION,
req_list,
WeightedRanker(1.0, 1.0),
k,
consistency_level="Strong",
output_fields=["text"]
)
Further Reading
Seamless AI Workflows
From embeddings to scalable AI search—Zilliz Cloud lets you store, index, and retrieve embeddings with unmatched speed and efficiency.
Try Zilliz Cloud for Free