Collection()

This constructor creates a collection with the specified schema, or gets an existing collection by name.

Invocation

Collection(name, schema=None, using='default', shards_num=2, **kwargs)

Parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| name | Name of the collection | String | True |
| schema | Schema of the collection to create | class schema.CollectionSchema | False |
| using | Milvus connection used to create the collection | String | False |
| shards_num | Shard number of the collection to create. It corresponds to the number of data nodes used to insert data. | INT32 | False |
| kwargs: consistency_level | | String/Integer | False |

A schema specifies the properties of a collection and the fields within. See Schema for more information.

Return

A new collection object created with the specified schema or an existing collection object by name.

Properties

| Property | Description | Type |
| --- | --- | --- |
| name | Name of the collection | String |
| schema | Schema of the collection | class schema.CollectionSchema |
| description | Description of the collection | String |
| is_empty | Boolean value to indicate if the collection is empty | Bool |
| num_entities | Number of entities in the collection | Integer |
| primary_field | Schema of the primary field in the collection | class schema.FieldSchema |
| partitions | List of all partitions in the collection | list[String] |
| indexes | List of all indexes in the collection | list[String] |

Raises

CollectionNotExistException: error if the collection does not exist.

Example

from pymilvus import CollectionSchema, FieldSchema, DataType, Collection
book_id = FieldSchema(
  name="book_id", 
  dtype=DataType.INT64, 
  is_primary=True, 
)
word_count = FieldSchema(
  name="word_count", 
  dtype=DataType.INT64,  
)
book_intro = FieldSchema(
  name="book_intro", 
  dtype=DataType.FLOAT_VECTOR, 
  dim=2
)
schema = CollectionSchema(
  fields=[book_id, word_count, book_intro], 
  description="Test book search"
)
collection_name = "book"
collection = Collection(
    name=collection_name, 
    schema=schema, 
    using='default', 
    shards_num=2,
    consistency_level="Strong"
)
collection.schema
{
  auto_id: False
  description: Test book search
  fields: [{
    name: book_id
    description: 
    type: 5
    is_primary: True
    auto_id: False
  }, {
    name: word_count
    description: 
    type: 5
  }, {
    name: book_intro
    description: 
    type: 101
    params: {'dim': 2}
  }]
}
collection.description
'Test book search'
collection.name
'book'
collection.is_empty
True
collection.primary_field
{
    name: book_id
    description: 
    type: 5
    is_primary: True
    auto_id: False
  }
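
The type values in the printed schema are numeric codes. Judging only from the example above, 5 corresponds to DataType.INT64 and 101 to DataType.FLOAT_VECTOR. As a minimal sketch, the lookup below turns those codes back into readable names. TYPE_NAMES and describe_field are hypothetical names, not part of PyMilvus, and the table covers only the two codes that appear in this example.

```python
# Hypothetical helper, not part of PyMilvus: maps the numeric type codes
# shown in the schema output above back to readable DataType names.
# Only the two codes that appear in this example are included.
TYPE_NAMES = {
    5: "INT64",           # book_id and word_count
    101: "FLOAT_VECTOR",  # book_intro
}


def describe_field(name, type_code):
    """Return a short 'name: TYPE' label for one field of the schema."""
    type_name = TYPE_NAMES.get(type_code, f"unknown({type_code})")
    return f"{name}: {type_name}"


labels = [
    describe_field("book_id", 5),
    describe_field("word_count", 5),
    describe_field("book_intro", 101),
]
print(labels)
```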

create_index()

This method creates an index on a field with the specified index parameters.

Invocation

create_index(field_name, index_params, timeout=None, **kwargs)

Parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| field_name | Name of the field to create index on | String | True |
| index_params | Parameters of the index to create | Dict | True |
| index_name | Name of the index to create | String | False |
| timeout | An optional duration of time in seconds to allow for the RPC. If it is set to None, the client keeps waiting until the server responds or an error occurs. | Float | False |

Return

The newly created index object.

Raises

  • CollectionNotExistException: error if the collection does not exist.
  • ParamError: error if the parameters are invalid.
  • BaseException: error if the specified field does not exist.
  • BaseException: error if the index has been created.

Example

index_params = {
  "metric_type":"L2",
  "index_type":"IVF_FLAT",
  "params":{"nlist":1024}
}
from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
collection.create_index(
  field_name="book_intro", 
  index_params=index_params
)

Note

  • Index names must be unique within a collection, so a collection can contain at most one index named _default_idx_.
  • Calling create_index() repeatedly with the same field_name, index_params, and index_name returns success directly.
  • Indexes can also be built on scalar fields. In this case, index_params can be omitted. Milvus then builds a default dictionary tree index for fields of type VARCHAR, and sorts the field data of other data types in ascending order.
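
Because index_params is a plain dictionary, a missing key is easy to overlook. The sketch below checks for the three keys used in the vector index example above before the dictionary is passed to create_index(). The helper name check_index_params is hypothetical, and the assumption that all three keys are needed is taken from this example only.

```python
# Hypothetical pre-flight check, not part of PyMilvus. It assumes that a
# vector index needs the three keys used in the example above.
REQUIRED_KEYS = ("metric_type", "index_type", "params")


def check_index_params(index_params):
    """Return the list of expected keys missing from index_params."""
    return [key for key in REQUIRED_KEYS if key not in index_params]


index_params = {
    "metric_type": "L2",
    "index_type": "IVF_FLAT",
    "params": {"nlist": 1024},
}

missing = check_index_params(index_params)
if missing:
    raise ValueError(f"index_params is missing keys: {missing}")
```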

create_partition()

This method creates a partition with the specified name.

Invocation

create_partition(partition_name, description="")

Parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| partition_name | Name of the partition to create | String | True |
| description | Description of the partition to create | String | False |

Return

The newly created partition object.

Raises

  • CollectionNotExistException: error if the collection does not exist.
  • BaseException: error if the specified partition does not exist.

Example

from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
collection.create_partition("novel")

delete()

This method deletes entities from a specified collection.

Invocation

delete(expr, partition_name=None, timeout=None, **kwargs)

Parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| expr | Boolean expression that specifies the primary keys of the entities to delete | String | True |
| partition_name | Name of the partition to delete data from | String | False |
| timeout | An optional duration of time in seconds to allow for the RPC. If it is set to None, the client keeps waiting until the server responds or an error occurs. | Float | False |

Return

A MutationResult object.

Properties

| Property | Description | Type |
| --- | --- | --- |
| delete_count | Number of entities to delete | Integer |

Raises

  • RpcError: error if gRPC encounters an error.
  • ParamError: error if the parameters are invalid.
  • BaseException: error if the return result from server is not ok.

Example

expr = "book_id in [0,1]"
from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
collection.delete(expr)
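
The expr string is often assembled from a list of primary keys held in application code. A minimal sketch follows, assuming integer primary keys as in the book_id field. The helper name build_pk_expr is hypothetical and not part of PyMilvus.

```python
def build_pk_expr(field_name, primary_keys):
    """Build an 'in' expression such as 'book_id in [0,1]'.

    Assumes integer primary keys; int() raises for values that cannot be
    converted to an integer, so they never reach the expression string.
    """
    if not primary_keys:
        raise ValueError("primary_keys must not be empty")
    keys = ",".join(str(int(pk)) for pk in primary_keys)
    return f"{field_name} in [{keys}]"


expr = build_pk_expr("book_id", [0, 1])
print(expr)  # book_id in [0,1]
```

The resulting string has the same form as the expr in the example above and can be passed to collection.delete(expr).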

drop_index()

This method drops the index and its corresponding index file in the collection.

Invocation

drop_index(timeout=None, **kwargs)

Parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| timeout | An optional duration of time in seconds to allow for the RPC. If it is set to None, the client keeps waiting until the server responds or an error occurs. | Float | False |

Return

No return.

Raises

  • CollectionNotExistException: error if the collection does not exist.
  • BaseException: error if the index does not exist.

Example

from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
collection.drop_index()

drop_partition()

This method drops a partition and the data within by name in the specified collection.

Invocation

drop_partition(partition_name, timeout=None, **kwargs)

Parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| partition_name | Name of the partition to drop | String | True |
| timeout | An optional duration of time in seconds to allow for the RPC. If it is set to None, the client keeps waiting until the server responds or an error occurs. | Float | False |

Return

No return.

Raises

  • CollectionNotExistException: error if the collection does not exist.
  • BaseException: error if the specified partition does not exist.

Example

from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
collection.drop_partition("novel")

get_replicas()

This method checks the replica information.

Invocation

get_replicas()

Return

Information about the replica groups and their corresponding query nodes and shards.

Raises

Example

from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
collection.load(replica_number=2)    # Load collection as 2 replicas
result = collection.get_replicas()
print(result)

has_index()

This method verifies if a specified index exists.

Invocation

has_index(timeout=None)

Parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| timeout | An optional duration of time in seconds to allow for the RPC. If it is set to None, the client keeps waiting until the server responds or an error occurs. | Float | False |

Return

A boolean value that indicates if the index exists.

Raises

  • CollectionNotExistException: error if the collection does not exist.

Example

from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
collection.has_index()

has_partition()

This method verifies if a partition exists in the specified collection.

Invocation

has_partition(partition_name, timeout=None)

Parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| partition_name | Name of the partition to verify | String | True |
| timeout | An optional duration of time in seconds to allow for the RPC. If it is set to None, the client keeps waiting until the server responds or an error occurs. | Float | False |

Return

A boolean value that indicates if the partition exists.

Raises

  • CollectionNotExistException: error if the collection does not exist.
  • BaseException: error if the specified partition does not exist.

Example

from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
collection.has_partition("novel")

index()

This method gets the index object in the collection.

Invocation

index()

Return

The index object.

Raises

  • CollectionNotExistException: error if the collection does not exist.
  • BaseException: error if the index does not exist.

Example

from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
collection.index()

insert()

This method inserts data into a specified collection.

Invocation

insert(data, partition_name=None, timeout=None, **kwargs)

Parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| data | Data to insert | list-like (list, tuple) | True |
| partition_name | Name of the partition to insert data into | String | False |
| timeout | An optional duration of time in seconds to allow for the RPC. If it is set to None, the client keeps waiting until the server responds or an error occurs. | Float | False |

Return

A MutationResult object.

Properties

| Property | Description | Type |
| --- | --- | --- |
| insert_count | Number of the inserted entities | Integer |
| primary_keys | List of the primary keys of the inserted entities | list[String] |

Raises

  • CollectionNotExistException: error if the collection does not exist.
  • ParamError: error if the parameters are invalid.
  • BaseException: error if the specified partition does not exist.

Example

import random
data = [
  [i for i in range(2000)],
  [i for i in range(10000, 12000)],
  [[random.random() for _ in range(2)] for _ in range(2000)],
]
from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
mr = collection.insert(data)
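
The data in this example is column-based: one inner list per field, in schema order, with every column holding the same number of values. As a minimal sketch, the check below verifies that layout before insert() is called. It assumes the three-field book schema from the Collection() example (dim=2 for book_intro); check_columns is a hypothetical helper, not part of PyMilvus.

```python
import random


def check_columns(columns, vector_index, dim):
    """Return the row count after checking column lengths and vector dim.

    columns      -- list of per-field lists, in schema order
    vector_index -- position of the vector field within columns
    dim          -- expected dimension of each vector
    """
    lengths = {len(column) for column in columns}
    if len(lengths) != 1:
        raise ValueError(f"columns differ in length: {sorted(lengths)}")
    for vector in columns[vector_index]:
        if len(vector) != dim:
            raise ValueError(f"expected dim {dim}, got {len(vector)}")
    return lengths.pop()


data = [
    [i for i in range(2000)],                                    # book_id
    [i for i in range(10000, 12000)],                            # word_count
    [[random.random() for _ in range(2)] for _ in range(2000)],  # book_intro
]
row_count = check_columns(data, vector_index=2, dim=2)
print(row_count)  # 2000
```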

load()

This method loads the specified collection to memory (for search or query).

Invocation

load(partition_names=None, timeout=None, **kwargs)

Parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| partition_names | Name of the partition(s) to load | list[String] | False |
| replica_number | Number of the replica(s) to load | Integer | False |
| timeout | An optional duration of time in seconds to allow for the RPC. If it is set to None, the client keeps waiting until the server responds or an error occurs. | Float | False |
| kwargs: _async | Boolean value that indicates whether to invoke asynchronously | Bool | False |

Return

No return.

Raises

  • CollectionNotExistException: error if the collection does not exist.
  • ParamError: error if the parameters are invalid.
  • BaseException: error if the specified partition does not exist.

Example

from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
collection.load()

partition()

This method gets the specified partition object.

Invocation

partition(partition_name)

Parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| partition_name | Name of the partition to get | String | True |

Return

The specified partition object.

Raises

  • CollectionNotExistException: error if the collection does not exist.
  • BaseException: error if the specified partition does not exist.

Example

from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
collection.partition("novel")

query()

This method conducts a vector query.

Invocation

query(expr, output_fields=None, partition_names=None, timeout=None, **kwargs)

Parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| expr | Boolean expression to filter the data | String | True |
| partition_names | List of names of the partitions to search on. All partitions are searched if it is left empty. | list[String] | False |
| output_fields | List of names of fields to output | list[String] | False |
| timeout | An optional duration of time in seconds to allow for the RPC. If it is set to None, the client keeps waiting until the server responds or an error occurs. | Float | False |
| kwargs: consistency_level | Consistency level used in the search | String/Integer | False |
| kwargs: guarantee_timestamp | Milvus searches on the data view before this timestamp when it is provided. Otherwise, it searches the most updated data view. It can be used only with the Customized consistency level. | Integer | False |
| kwargs: graceful_time | PyMilvus uses the current timestamp minus graceful_time as the guarantee_timestamp for the search. It can be used only with the Bounded consistency level. | Integer | False |
| kwargs: travel_timestamp | Timestamp that is used for Time Travel. Users can specify a timestamp in a search to get results based on a data view at a specified point in time. | Integer | False |

Return

A list that contains all results.

Raises

  • RpcError: error if gRPC encounters an error.
  • ParamError: error if the parameters are invalid.
  • DataTypeNotMatchException: error if wrong type of data is passed to server.
  • BaseException: error if the return result from server is not ok.

Example

from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
res = collection.query(
  expr = "book_id in [2,4,6,8]", 
  output_fields = ["book_id", "book_intro"],
  consistency_level="Strong"
)
sorted_res = sorted(res, key=lambda k: k['book_id'])
sorted_res

release()

This method releases the specified collection from memory.

Invocation

release(timeout=None, **kwargs)

Parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| timeout | An optional duration of time in seconds to allow for the RPC. If it is set to None, the client keeps waiting until the server responds or an error occurs. | Float | False |

Return

No return.

Raises

  • CollectionNotExistException: error if the collection does not exist.
  • BaseException: error if the collection has not been loaded to memory.

Example

from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
collection.release()

search()

This method conducts a vector similarity search.

Invocation

search(data, anns_field, param, limit, expr=None, partition_names=None, output_fields=None, timeout=None, round_decimal=-1, **kwargs)

Parameters

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| data | Data to search with | list[list[Float]] | True |
| anns_field | Name of the vector field to search on | String | True |
| param | Specific search parameter(s) of the index on the vector field | Dict | True |
| limit | Number of nearest records to return | Integer | True |
| expr | Boolean expression to filter the data | String | False |
| partition_names | List of names of the partitions to search on. All partitions are searched if it is left empty. | list[String] | False |
| output_fields | List of names of fields to output | list[String] | False |
| timeout | An optional duration of time in seconds to allow for the RPC. If it is set to None, the client keeps waiting until the server responds or an error occurs. | Float | False |
| round_decimal | Number of decimal places of the returned distance | Integer | False |
| kwargs: _async | Boolean value that indicates whether to invoke asynchronously | Bool | False |
| kwargs: _callback | Function that is invoked after the server responds successfully. It takes effect only if _async is set to True. | Function | False |
| kwargs: consistency_level | Consistency level used in the search | String/Integer | False |
| kwargs: guarantee_timestamp | Milvus searches on the data view before this timestamp when it is provided. Otherwise, it searches the most updated data view. It can be used only with the Customized consistency level. | Integer | False |
| kwargs: graceful_time | PyMilvus uses the current timestamp minus graceful_time as the guarantee_timestamp for the search. It can be used only with the Bounded consistency level. | Integer | False |
| kwargs: travel_timestamp | Timestamp that is used for Time Travel. Users can specify a timestamp in a search to get results based on a data view at a specified point in time. | Integer | False |

Return

A SearchResult object: an iterable, 2D-array-like class whose first dimension is the number of query vectors (nq) and whose second dimension is the number of results per query, as set by limit (topk).

Raises

  • RpcError: error if gRPC encounters an error.
  • ParamError: error if the parameters are invalid.
  • DataTypeNotMatchException: error if wrong type of data is passed to server.
  • BaseException: error if the return result from server is not ok.

Example

search_params = {"metric_type": "L2", "params": {"nprobe": 10}}
from pymilvus import Collection
collection = Collection("book")      # Get an existing collection.
results = collection.search(
    data=[[0.1, 0.2]], 
    anns_field="book_intro", 
    param=search_params, 
    limit=10, 
    expr=None,
    consistency_level="Strong"
)
results[0].ids
results[0].distances
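
To make the shape of the return value concrete, the sketch below runs a brute-force nearest-neighbour search in plain Python and returns one list of (id, distance) hits per query vector, so the outer length is nq and each inner length is at most limit. It only mirrors the 2D layout described under Return; it is not how Milvus executes a search. The function brute_force_search, the sample vectors, and the use of Euclidean distance via math.dist are assumptions made for illustration.

```python
import math


def brute_force_search(queries, vectors, limit):
    """Return, for each query, the `limit` closest (id, distance) pairs.

    vectors maps an entity id to its vector. Distances are plain
    Euclidean distances, sorted in ascending order.
    """
    results = []
    for query in queries:
        hits = sorted(
            ((entity_id, math.dist(query, vector))
             for entity_id, vector in vectors.items()),
            key=lambda hit: hit[1],
        )
        results.append(hits[:limit])
    return results


# Illustrative data only: three 2-dimensional vectors keyed by id.
vectors = {0: [0.0, 0.0], 1: [1.0, 1.0], 2: [0.1, 0.2]}
results = brute_force_search([[0.1, 0.2]], vectors, limit=2)

ids = [entity_id for entity_id, _ in results[0]]
distances = [distance for _, distance in results[0]]
print(len(results), ids)  # 1 [2, 0]
```

Here results[0] plays the role of the hits for the first query vector, in the same way that results[0].ids and results[0].distances are read in the example above.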