Conduct a Vector Similarity Search

This topic describes how to run a vector similarity search for entities in Zilliz Cloud database.

A vector similarity search in Zilliz Cloud database calculates the distance between the query vector(s) and the vectors in the collection using the specified similarity metric, and returns the most similar results. By specifying a boolean expression that filters on a scalar field or the primary key field, you can perform a hybrid search or even a search with Time Travel.

The following example shows how to perform a vector similarity search on a 2000-row dataset of book ID (primary key), word count (scalar field), and book introduction (vector field). It simulates searching for books based on their vectorized introductions. Zilliz Cloud database returns the most similar results according to the query vector and the search parameters you define.
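To make the distance calculation concrete, here is a minimal pure-Python sketch of a brute-force search using Euclidean (L2) distance. It only illustrates how results are ranked; it is not how Zilliz Cloud database implements search, and the distance values the service returns may be scaled differently. The `books` data and both function names are hypothetical.

```python
import heapq
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_search(query, vectors, limit):
    """Return the `limit` closest (primary_key, distance) pairs, nearest first.

    `vectors` maps a primary key (for example book_id) to its vector.
    """
    scored = ((key, l2_distance(query, vec)) for key, vec in vectors.items())
    return heapq.nsmallest(limit, scored, key=lambda pair: pair[1])

# Hypothetical toy data: book_id -> 2-dimensional book_intro vector.
books = {
    1: [0.1, 0.2],
    2: [0.9, 0.8],
    3: [0.2, 0.1],
    4: [0.5, 0.5],
}

top = brute_force_search([0.1, 0.2], books, limit=2)
for book_id, distance in top:
    print(book_id, round(distance, 4))
```

A real index avoids comparing the query against every stored vector, which is what the search parameters later in this topic control.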

Load collection

All search and query operations within Zilliz Cloud database are executed in memory. Load the collection to memory before conducting a vector similarity search.

Python:

from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
collection.load()

Node.js:

await milvusClient.collectionManager.loadCollection({
  collection_name: "book",
});

Java:

milvusClient.loadCollection(
  LoadCollectionParam.newBuilder()
          .withCollectionName("book")
          .build()
);

Prepare search parameters

Prepare the parameters that suit your search scenario. The following example uses Euclidean distance (L2) as the similarity metric and retrieves vectors from the ten closest clusters built by the IVF_FLAT index.

Python:

search_params = {"metric_type": "L2", "params": {"nprobe": 10}}

| Parameter | Description |
| --- | --- |
| metric_type | Metrics used to measure similarity of vectors. |
| params | Search parameter(s) specific to the index. |

Java:

final Integer SEARCH_K = 2;                       // TopK
final String SEARCH_PARAM = "{\"nprobe\":10}";    // Params

| Parameter | Description | Options |
| --- | --- | --- |
| TopK | Number of the most similar results to return. | N/A |
| Params | Search parameter(s) specific to the index. | N/A |
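The effect of `nprobe` can be sketched with a simplified, pure-Python model of a cluster-based (IVF-style) index. This is an illustration under assumptions, not the IVF_FLAT implementation: the centroids, clusters, and the `ivf_search` function are all hypothetical. It shows that probing fewer clusters scans fewer vectors but can miss the true nearest neighbor.

```python
import math

def l2(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ivf_search(query, centroids, clusters, nprobe, limit):
    """Search only the `nprobe` clusters whose centroids are closest to `query`.

    centroids: list of centroid vectors, one per cluster.
    clusters:  list (same order) of {primary_key: vector} dicts.
    """
    # Rank cluster indexes by centroid distance and keep the nprobe nearest.
    ranked = sorted(range(len(centroids)), key=lambda i: l2(query, centroids[i]))
    nearest = ranked[:nprobe]
    # Scan only the vectors stored in the selected clusters.
    candidates = [
        (key, l2(query, vec))
        for i in nearest
        for key, vec in clusters[i].items()
    ]
    candidates.sort(key=lambda pair: pair[1])
    return candidates[:limit]

# Hypothetical toy index with three clusters.
centroids = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
clusters = [
    {1: [0.1, 0.2], 2: [0.05, 0.05]},
    {3: [0.9, 0.8], 4: [1.0, 0.9]},
    {5: [0.1, 0.6], 6: [0.0, 0.9]},
]

query = [0.1, 0.45]
narrow = ivf_search(query, centroids, clusters, nprobe=1, limit=1)
wide = ivf_search(query, centroids, clusters, nprobe=2, limit=1)
print(narrow[0][0], wide[0][0])
```

In this toy data the closest vector to the query sits in the second-nearest cluster, so `nprobe=1` returns a different hit than `nprobe=2`. A larger `nprobe` generally trades speed for recall.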

Search vectors with Zilliz Cloud database. To search in a specific partition, specify the list of partition names.

Zilliz Cloud database supports setting the consistency level for an individual search or query (currently only in PyMilvus). The consistency level set in a search or query request overrides the one set when the collection was created. In this example, the consistency level of the search request is set to "Strong", meaning that Zilliz Cloud database reads the most up-to-date data view at the exact time the request arrives. If you do not specify a consistency level in a search or query, Zilliz Cloud database uses the collection's original consistency level.

Python:

results = collection.search(
    data=[[0.1, 0.2]],
    anns_field="book_intro",
    param=search_params,
    limit=10,
    expr=None,
    consistency_level="Strong"
)

| Parameter | Description |
| --- | --- |
| data | Vectors to search with. |
| anns_field | Name of the field to search on. |
| param | Search parameter(s) specific to the index. |
| limit | Number of the most similar results to return. |
| expr | Boolean expression used to filter attributes. |
| partition_names (optional) | List of names of the partitions to search in. |
| output_fields (optional) | Name of the field to return. Vector field is not supported in current release. |
| timeout (optional) | A duration of time in seconds to allow for RPC. When set to None, the client waits until the server responds or an error occurs. |
| round_decimal (optional) | Number of decimal places of the returned distance. |
| consistency_level (optional) | Consistency level of the search. |

Java:

List<String> search_output_fields = Arrays.asList("book_id");
List<List<Float>> search_vectors = Arrays.asList(Arrays.asList(0.1f, 0.2f));

SearchParam searchParam = SearchParam.newBuilder()
        .withCollectionName("book")
        .withMetricType(MetricType.L2)
        .withOutFields(search_output_fields)
        .withTopK(SEARCH_K)
        .withVectors(search_vectors)
        .withVectorFieldName("book_intro")
        .withParams(SEARCH_PARAM)
        .build();
R<SearchResults> respSearch = milvusClient.search(searchParam);

| Parameter | Description | Options |
| --- | --- | --- |
| CollectionName | Name of the collection to search. | N/A |
| MetricType | Metric type used for search. | This parameter must be identical to the metric type used for index building. |
| OutFields | Name of the field to return. | Vector field is not supported in current release. |
| Vectors | Vectors to search with. | N/A |
| VectorFieldName | Name of the field to search on. | N/A |
| Expr | Boolean expression used to filter attributes. | N/A |
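The optional parameters can be combined to run a hybrid search or to restrict the search to certain partitions. The following sketch assembles the keyword arguments for `collection.search()` in a small helper. `build_search_kwargs` is a hypothetical helper, and `word_count` is an assumed name for the word count scalar field; adjust both to your own schema.

```python
def build_search_kwargs(vectors, limit, expr=None,
                        partition_names=None, output_fields=None):
    """Assemble keyword arguments for Collection.search() (hypothetical helper)."""
    kwargs = {
        "data": vectors,
        "anns_field": "book_intro",
        "param": {"metric_type": "L2", "params": {"nprobe": 10}},
        "limit": limit,
        "expr": expr,
    }
    # Only pass optional parameters that were actually provided.
    if partition_names is not None:
        kwargs["partition_names"] = partition_names
    if output_fields is not None:
        kwargs["output_fields"] = output_fields
    return kwargs

kwargs = build_search_kwargs(
    vectors=[[0.1, 0.2]],
    limit=10,
    expr="word_count <= 11000",   # assumed scalar field name
    output_fields=["book_id"],
)
print(kwargs)

# With a loaded collection, the search would then be:
# results = collection.search(**kwargs)
```

Because the boolean expression is passed as `expr`, only entities that satisfy the filter are considered as candidates for the nearest results.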

Check the primary key values of the most similar vectors and their distances.

Python:

results[0].ids
results[0].distances

Java:

SearchResultsWrapper wrapperSearch = new SearchResultsWrapper(respSearch.getData().getResults());
System.out.println(wrapperSearch.getIDScore(0));
System.out.println(wrapperSearch.getFieldData("book_id", 0));
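In Python, the `ids` and `distances` of one query's hits line up by position, so they can be paired for easier reading. The sketch below uses a hypothetical stand-in object (`FakeHits`) with made-up values in place of `results[0]`, so that it runs without a database connection; `pair_ids_with_distances` is likewise an illustrative helper.

```python
from collections import namedtuple

def pair_ids_with_distances(hits):
    """Return [(primary_key, distance), ...] for one query's hits."""
    return list(zip(hits.ids, hits.distances))

# Hypothetical stand-in for results[0]; real values come from the search.
FakeHits = namedtuple("FakeHits", ["ids", "distances"])
sample = FakeHits(ids=[7, 42], distances=[0.0, 0.25])

pairs = pair_ids_with_distances(sample)
for book_id, distance in pairs:
    print(f"book_id={book_id} distance={distance}")
```

With a real search, you would pass `results[0]` instead of `sample`.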

When the search is complete, release the collection loaded in Zilliz Cloud database to reduce memory consumption.

Python:

collection.release()

Java:

milvusClient.releaseCollection(
        ReleaseCollectionParam.newBuilder()
                .withCollectionName("book")
                .build());

Limits

| Feature | Maximum limit |
| --- | --- |
| Length of a collection name | 255 characters |
| Number of partitions in a collection | 4,096 |
| Number of fields in a collection | 256 |
| Number of shards in a collection | 256 |
| Dimensions of a vector | 32,768 |
| Top K | 16,384 |
| Target input vectors | 16,384 |
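A client can check a search request against these limits before sending it. The following is a minimal sketch, assuming that "Target input vectors" refers to the number of query vectors in a single request; the constant names and `check_search_request` are hypothetical and not part of any SDK.

```python
# Limits taken from the table above.
MAX_TOP_K = 16_384
MAX_QUERY_VECTORS = 16_384   # assumed meaning of "Target input vectors"
MAX_DIMENSIONS = 32_768

def check_search_request(vectors, limit):
    """Raise ValueError if a search request exceeds the documented limits."""
    if not vectors:
        raise ValueError("at least one query vector is required")
    if len(vectors) > MAX_QUERY_VECTORS:
        raise ValueError(f"too many query vectors: {len(vectors)} > {MAX_QUERY_VECTORS}")
    dimensions = {len(vector) for vector in vectors}
    if len(dimensions) != 1:
        raise ValueError("all query vectors must have the same dimension")
    dimension = dimensions.pop()
    if dimension > MAX_DIMENSIONS:
        raise ValueError(f"vector dimension {dimension} exceeds {MAX_DIMENSIONS}")
    if not 1 <= limit <= MAX_TOP_K:
        raise ValueError(f"limit must be between 1 and {MAX_TOP_K}, got {limit}")

# Passes silently: one 2-dimensional query vector, ten results requested.
check_search_request([[0.1, 0.2]], limit=10)
```

Validating on the client side gives an early, descriptive error instead of a failed request.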