Conduct a Vector Similarity Search
This topic describes how to search for entities in Zilliz Cloud database.
A vector similarity search in Zilliz Cloud database calculates the distance between the query vector(s) and the vectors in the collection using the specified similarity metric, and returns the most similar results. By specifying a boolean expression that filters on a scalar field or the primary key field, you can perform a hybrid search or even a search with Time Travel.
The following example shows how to perform a vector similarity search on a 2000-row dataset of book ID (primary key), word count (scalar field), and book introduction (vector field), simulating a search for books based on their vectorized introductions. Zilliz Cloud database returns the most similar results according to the query vector and the search parameters you define.
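To make the idea concrete, the following is a minimal pure-Python sketch of what a similarity search computes conceptually: the Euclidean distance from the query to each stored vector, an optional scalar filter, and a sort that keeps the closest results. It is an illustration only and is not how Zilliz Cloud database implements search. The helper names (`l2_distance`, `brute_force_search`), the three sample rows, and the `predicate` argument that stands in for a boolean expression are all assumptions made for this example.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_search(query, rows, limit, predicate=None):
    """Return the `limit` closest rows as (book_id, distance) pairs.

    rows: list of dicts with the illustrative keys book_id, word_count, book_intro.
    predicate: optional function standing in for a boolean filter expression.
    """
    # Apply the scalar filter first, then score only the surviving rows.
    candidates = [r for r in rows if predicate is None or predicate(r)]
    scored = [(r["book_id"], l2_distance(query, r["book_intro"])) for r in candidates]
    scored.sort(key=lambda pair: pair[1])  # smaller distance = more similar
    return scored[:limit]

# Hypothetical sample data mirroring the book schema described above.
books = [
    {"book_id": 1, "word_count": 12000, "book_intro": [0.1, 0.2]},
    {"book_id": 2, "word_count": 8000, "book_intro": [0.9, 0.8]},
    {"book_id": 3, "word_count": 15000, "book_intro": [0.2, 0.2]},
]

# Plain similarity search: the two nearest introductions to the query.
top = brute_force_search([0.1, 0.2], books, limit=2)

# Hybrid-style search: only books with at most 12000 words are considered.
filtered = brute_force_search(
    [0.1, 0.2], books, limit=2, predicate=lambda r: r["word_count"] <= 12000
)
```

With this sample data, the unfiltered search ranks books 1 and 3 first, while the filtered search excludes book 3 and returns books 1 and 2 instead.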
Load collection
All search and query operations within Zilliz Cloud database are executed in memory. Load the collection into memory before conducting a vector similarity search.
# Python (PyMilvus)
from pymilvus import Collection
collection = Collection("book")  # Get an existing collection.
collection.load()

// Node.js
await milvusClient.collectionManager.loadCollection({
  collection_name: "book",
});

// Java
milvusClient.loadCollection(
  LoadCollectionParam.newBuilder()
    .withCollectionName("book")
    .build()
);
Prepare search parameters
Prepare the parameters that suit your search scenario. The following example uses Euclidean distance (L2) as the similarity metric and searches the ten closest clusters built by the IVF_FLAT index (nprobe set to 10).
# Python (PyMilvus)
search_params = {"metric_type": "L2", "params": {"nprobe": 10}}

// Java
final Integer SEARCH_K = 2; // TopK
final String SEARCH_PARAM = "{\"nprobe\":10}"; // Params
Parameter | Description |
---|---|
metric_type | Metric used to measure the similarity of vectors. See Similarity Metrics for more information. |
params | Search parameter(s) specific to the index. See Vector Index for more information. |
Parameter | Description | Options |
---|---|---|
TopK | Number of the most similar results to return. | N/A |
Params | Search parameter(s) specific to the index. | See Vector Index for more information. |
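The effect of nprobe can be illustrated with a small pure-Python sketch of a cluster-based (IVF-style) search: the query is compared with the cluster centroids first, and only the vectors in the nprobe closest clusters are scanned. This is a conceptual sketch under simplifying assumptions, not the actual IVF_FLAT implementation. The function names, the hand-picked centroids, and the sample vectors are hypothetical.

```python
import math

def l2(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ivf_search(query, centroids, clusters, nprobe, limit):
    """Search only the `nprobe` clusters whose centroids are closest to the query.

    centroids: list of centroid vectors.
    clusters: clusters[i] is a list of (id, vector) pairs assigned to centroids[i].
    """
    # Rank clusters by centroid distance and keep the nprobe closest ones.
    order = sorted(range(len(centroids)), key=lambda i: l2(query, centroids[i]))
    probed = order[:nprobe]
    # Compare the query with every vector inside the probed clusters only.
    hits = [(vid, l2(query, vec)) for i in probed for vid, vec in clusters[i]]
    hits.sort(key=lambda h: h[1])
    return hits[:limit]

# Hypothetical index: three clusters with hand-picked centroids.
centroids = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
clusters = [
    [(10, [0.1, 0.1]), (11, [0.45, 0.45])],
    [(20, [0.6, 0.6]), (21, [1.2, 1.1])],
    [(30, [5.0, 5.1])],
]
query = [0.55, 0.55]

narrow = ivf_search(query, centroids, clusters, nprobe=1, limit=2)
wide = ivf_search(query, centroids, clusters, nprobe=2, limit=2)
```

In this toy data, probing a single cluster misses vector 11, which sits near a cluster boundary, while probing two clusters finds it. A larger nprobe generally improves recall at the cost of scanning more vectors.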
Conduct a vector search
Search vectors with Zilliz Cloud database. To search in a specific partition, specify the list of partition names.
Zilliz Cloud database supports setting a consistency level for an individual search or query (currently only in PyMilvus). The consistency level set in a search or query request overrides the one set when the collection was created. In this example, the consistency level of the search request is set to "Strong", meaning Zilliz Cloud database reads the most up-to-date data view at the exact time the search or query request arrives. If you do not specify a consistency level in a search or query, Zilliz Cloud database uses the original consistency level of the collection.
# Python (PyMilvus)
results = collection.search(
    data=[[0.1, 0.2]],
    anns_field="book_intro",
    param=search_params,
    limit=10,
    expr=None,
    consistency_level="Strong"
)

// Java
List<String> search_output_fields = Arrays.asList("book_id");
List<List<Float>> search_vectors = Arrays.asList(Arrays.asList(0.1f, 0.2f));
SearchParam searchParam = SearchParam.newBuilder()
    .withCollectionName("book")
    .withMetricType(MetricType.L2)
    .withOutFields(search_output_fields)
    .withTopK(SEARCH_K)
    .withVectors(search_vectors)
    .withVectorFieldName("book_intro")
    .withParams(SEARCH_PARAM)
    .build();
R<SearchResults> respSearch = milvusClient.search(searchParam);
Parameter | Description |
---|---|
data | Vectors to search with. |
anns_field | Name of the field to search on. |
param | Search parameter(s) specific to the index. See Vector Index for more information. |
limit | Number of the most similar results to return. |
expr | Boolean expression used to filter on attributes. |
partition_names (optional) | List of names of the partitions to search in. |
output_fields (optional) | Names of the fields to return. Vector fields are not supported in the current release. |
timeout (optional) | Duration in seconds to allow for the RPC. When set to None, the client waits until the server responds or an error occurs. |
round_decimal (optional) | Number of decimal places in the returned distances. |
consistency_level (optional) | Consistency level of the search. |
Parameter | Description | Options |
---|---|---|
CollectionName | Name of the collection to search. | N/A |
MetricType | Metric type used for search. | This parameter must be identical to the metric type used for index building. |
OutFields | Names of the fields to return. | Vector fields are not supported in the current release. |
Vectors | Vectors to search with. | N/A |
VectorFieldName | Name of the field to search on. | N/A |
Expr | Boolean expression used to filter on attributes. | N/A |
Check the primary key values of the most similar vectors and their distances.
# Python (PyMilvus)
results[0].ids
results[0].distances

// Java
SearchResultsWrapper wrapperSearch = new SearchResultsWrapper(respSearch.getData().getResults());
System.out.println(wrapperSearch.getIDScore(0));
System.out.println(wrapperSearch.getFieldData("book_id", 0));
When the search is complete, release the collection loaded in Zilliz Cloud database to reduce memory consumption.
# Python (PyMilvus)
collection.release()

// Java
milvusClient.releaseCollection(
    ReleaseCollectionParam.newBuilder()
        .withCollectionName("book")
        .build());
Limits
Feature | Maximum limit |
---|---|
Length of a collection name | 255 characters |
Number of partitions in a collection | 4,096 |
Number of fields in a collection | 256 |
Number of shards in a collection | 256 |
Dimensions of a vector | 32,768 |
Top K | 16,384 |
Target input vectors | 16,384 |
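As a rough illustration, the search-related limits in the table can be checked on the client side before a request is sent. The sketch below is a hypothetical helper, not part of any SDK. It assumes that "Target input vectors" refers to the number of query vectors in a single search request and that "Top K" corresponds to the `limit` argument used earlier.

```python
# Limits copied from the table above.
MAX_TOP_K = 16_384
MAX_QUERY_VECTORS = 16_384  # assumed meaning of "Target input vectors"
MAX_DIMENSIONS = 32_768

def check_search_request(vectors, limit):
    """Return a list of limit violations for a search request (empty if none)."""
    problems = []
    if limit > MAX_TOP_K:
        problems.append(f"limit {limit} exceeds Top K maximum of {MAX_TOP_K}")
    if len(vectors) > MAX_QUERY_VECTORS:
        problems.append(
            f"{len(vectors)} query vectors exceed the maximum of {MAX_QUERY_VECTORS}"
        )
    for i, vector in enumerate(vectors):
        if len(vector) > MAX_DIMENSIONS:
            problems.append(
                f"vector {i} has {len(vector)} dimensions, maximum is {MAX_DIMENSIONS}"
            )
    return problems

ok = check_search_request([[0.1, 0.2]], limit=10)
too_many = check_search_request([[0.1, 0.2]], limit=20_000)
```

An empty list means the request is within the documented maximums. The server remains the authority on what is accepted.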