Insert Data

This topic describes how to insert data into your vector database on Zilliz Cloud using a client SDK.

Before inserting data into your cloud vector database, you must create a collection in it. A collection in a vector database is equivalent to a table in a relational database.

The following example inserts 2,000 rows of randomly generated data with 2-dimensional vectors. Real applications will likely use vectors of much higher dimension. You can replace the example data with your own.

Create a collection

First, prepare the necessary parameters, including the field schemas, the collection schema, and the collection name.

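# Python (pymilvus): define the field schemas, the collection schema, and the collection name.
from pymilvus import CollectionSchema, FieldSchema, DataType

# Primary key field.
book_id = FieldSchema(
  name="book_id",
  dtype=DataType.INT64,
  is_primary=True,
)
# VARCHAR field; the maximum string length is mandatory.
book_name = FieldSchema(
  name="book_name",
  dtype=DataType.VARCHAR,
  max_length_per_row=200,
)
# Scalar field.
word_count = FieldSchema(
  name="word_count",
  dtype=DataType.INT64,
)
# Vector field; the dimension is mandatory.
book_intro = FieldSchema(
  name="book_intro",
  dtype=DataType.FLOAT_VECTOR,
  dim=2
)
schema = CollectionSchema(
  # book_name is listed so that the schema matches the four columns prepared in "Prepare data".
  fields=[book_id, book_name, word_count, book_intro],
  description="Test book search"
)
collection_name = "book"

// Java (Milvus Java SDK): define the field types and the create-collection request.
// This Java example does not define the book_name field.
import io.milvus.grpc.DataType;
import io.milvus.param.collection.CreateCollectionParam;
import io.milvus.param.collection.FieldType;

// Primary key field.
FieldType fieldType1 = FieldType.newBuilder()
        .withName("book_id")
        .withDataType(DataType.Int64)
        .withPrimaryKey(true)
        .withAutoID(false)
        .build();
// Scalar field.
FieldType fieldType2 = FieldType.newBuilder()
        .withName("word_count")
        .withDataType(DataType.Int64)
        .build();
// Vector field; the dimension is mandatory.
FieldType fieldType3 = FieldType.newBuilder()
        .withName("book_intro")
        .withDataType(DataType.FloatVector)
        .withDimension(2)
        .build();
CreateCollectionParam createCollectionReq = CreateCollectionParam.newBuilder()
        .withCollectionName("book")
        .withDescription("Test book search")
        .withShardsNum(2)
        .addFieldType(fieldType1)
        .addFieldType(fieldType2)
        .addFieldType(fieldType3)
        .build();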

Parameter | Description | Option
FieldSchema | Schema of the fields within the collection to create. Refer to Schema for more information. | N/A
name | Name of the field to create. | N/A
dtype | Data type of the field to create. | For primary key field: DataType.INT64 (numpy.int64), DataType.VARCHAR (VARCHAR). For scalar field: DataType.BOOL (Boolean), DataType.INT64 (numpy.int64), DataType.FLOAT (numpy.float32), DataType.DOUBLE (numpy.double). For vector field: DataType.BINARY_VECTOR (Binary vector), DataType.FLOAT_VECTOR (Float vector).
is_primary (Mandatory for primary key field) | Switch to control if the field is primary key field. | True or False
auto_id (Mandatory for primary key field) | Switch to enable or disable Automatic ID (primary key) allocation. | True or False
max_length_per_row (Mandatory for VARCHAR field) | Maximum length of strings allowed to be inserted. | [1, 65535]
dim (Mandatory for vector field) | Dimension of the vector. | [1, 32768]
description (Optional) | Description of the field. | N/A
CollectionSchema | Schema of the collection to create. Refer to Schema for more information. | N/A
fields | Fields of the collection to create. | N/A
description (Optional) | Description of the collection to create. | N/A
collection_name | Name of the collection to create. | N/A

Parameter | Description | Option
Name | Name of the field to create. | N/A
Description | Description of the field to create. | N/A
DataType | Data type of the field to create. | For primary key field: entity.FieldTypeInt64 (numpy.int64). For scalar field: entity.FieldTypeBool (Boolean), entity.FieldTypeInt64 (numpy.int64), entity.FieldTypeFloat (numpy.float32), entity.FieldTypeDouble (numpy.double). For vector field: entity.FieldTypeBinaryVector (Binary vector), entity.FieldTypeFloatVector (Float vector).
PrimaryKey (Mandatory for primary key field) | Switch to control if the field is primary key field. | True or False
AutoID | Switch to enable or disable Automatic ID (primary key) allocation. | True or False
Dimension (Mandatory for vector field) | Dimension of the vector. | [1, 32768]
CollectionName | Name of the collection to create. | N/A
Description (Optional) | Description of the collection to create. | N/A
ShardsNum | Number of the shards for the collection to create. | [1,256]

Then, create a collection with the Strong consistency level and the schema you specified above.

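# Python (pymilvus): create the collection from the schema defined above.
from pymilvus import Collection

collection = Collection(
    name=collection_name,
    schema=schema,
    using='default',            # server alias
    shards_num=2,
    consistency_level="Strong",
)

// Java: send the request built above.
milvusClient.createCollection(createCollectionReq);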

Parameter | Description | Option
using (optional) | By specifying the server alias here, you can choose in which Zilliz Cloud database you create a collection. | N/A
shards_num (optional) | Number of the shards for the collection to create. | [1,256]
consistency_level (optional) | Consistency level of the collection to create. | Strong, Bounded, Session, Eventually, Customized
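
The ranges and options listed in the parameter tables can be checked on the client before a collection is created. The following is a minimal sketch in plain Python, not part of any SDK: the function name `check_collection_params` and the constants are hypothetical, and the ranges are copied from the tables in this topic.

```python
# Ranges and options copied from the parameter tables in this topic.
# The helper is illustrative only and is not provided by pymilvus.
DIM_RANGE = (1, 32768)
MAX_LENGTH_RANGE = (1, 65535)
SHARDS_RANGE = (1, 256)
CONSISTENCY_LEVELS = {"Strong", "Bounded", "Session", "Eventually", "Customized"}


def check_collection_params(dim, max_length, shards_num, consistency_level):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for label, value, (low, high) in (
        ("dim", dim, DIM_RANGE),
        ("max_length_per_row", max_length, MAX_LENGTH_RANGE),
        ("shards_num", shards_num, SHARDS_RANGE),
    ):
        if not low <= value <= high:
            problems.append(f"{label}={value} is outside [{low}, {high}]")
    if consistency_level not in CONSISTENCY_LEVELS:
        problems.append(f"unknown consistency_level {consistency_level!r}")
    return problems


# The values used in this topic's example pass every check.
print(check_collection_params(dim=2, max_length=200, shards_num=2,
                              consistency_level="Strong"))
```

Returning a list of problems rather than raising on the first one lets a caller report every out-of-range value at once.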

Prepare data

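Prepare the data to insert. The data type of each column must match the corresponding field in the collection schema; otherwise, Zilliz Cloud raises an exception. In the Python example, each inner list holds the values of one field, in the order in which the fields appear in the schema.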

# Python: column-based data, one list per field.
import random

data = [
  list(range(2000)),                                           # book_id
  [str(i) for i in range(2000)],                               # book_name
  list(range(10000, 12000)),                                   # word_count
  [[random.random() for _ in range(2)] for _ in range(2000)],  # book_intro (dim=2)
]

// Java: one list per field.
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

Random ran = new Random();
List<Long> book_id_array = new ArrayList<>();
List<Long> word_count_array = new ArrayList<>();
List<List<Float>> book_intro_array = new ArrayList<>();
for (long i = 0L; i < 2000; ++i) {
    book_id_array.add(i);
    word_count_array.add(i + 10000);
    // Build one 2-dimensional random vector per row.
    List<Float> vector = new ArrayList<>();
    for (int k = 0; k < 2; ++k) {
        vector.add(ran.nextFloat());
    }
    book_intro_array.add(vector);
}
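
Because a type mismatch only surfaces as an exception at insert time, it can help to check the prepared columns locally first. The sketch below is an assumption-laden illustration in plain Python: `EXPECTED` and `find_mismatches` are hypothetical names, and the mapping of INT64 to `int`, VARCHAR to `str`, and FLOAT_VECTOR to a list of `float` is a client-side approximation, not the validation the service itself performs.

```python
import random

# Hypothetical description of the example fields:
# (field name, expected Python type, vector dimension or None for scalars).
EXPECTED = [
    ("book_id", int, None),
    ("book_name", str, None),
    ("word_count", int, None),
    ("book_intro", float, 2),
]


def find_mismatches(columns, expected=EXPECTED):
    """Return a list of problems in column-based data; empty means consistent."""
    if len(columns) != len(expected):
        return [f"expected {len(expected)} columns, got {len(columns)}"]
    problems = []
    row_counts = {len(col) for col in columns}
    if len(row_counts) > 1:
        problems.append(f"columns have different lengths: {sorted(row_counts)}")
    for (name, py_type, dim), col in zip(expected, columns):
        for i, value in enumerate(col):
            if dim is None:
                ok = type(value) is py_type
            else:
                ok = (isinstance(value, list) and len(value) == dim
                      and all(type(x) is py_type for x in value))
            if not ok:
                problems.append(f"{name}[{i}] has an unexpected value: {value!r}")
                break  # one report per column is enough
    return problems


# A small data set with the same layout as the example data.
data = [
    list(range(5)),
    [str(i) for i in range(5)],
    list(range(10000, 10005)),
    [[random.random() for _ in range(2)] for _ in range(5)],
]
# The same data with a string where an integer word count is expected.
bad = [data[0], data[1], ["10000"] + data[2][1:], data[3]]

print(find_mismatches(data))
print(find_mismatches(bad))
```

The check reports at most one problem per column, which keeps the output short when an entire column has the wrong type.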

Insert data into the Zilliz Cloud database

Insert the data into the collection.

# Python (pymilvus): insert the prepared data.
from pymilvus import Collection

collection = Collection("book")      # Get an existing collection.
mr = collection.insert(data)         # mr holds the result returned by insert().

// Java: wrap each column in an InsertParam.Field, then insert.
import io.milvus.param.dml.InsertParam;

List<InsertParam.Field> fields = new ArrayList<>();
fields.add(new InsertParam.Field("book_id", DataType.Int64, book_id_array));
fields.add(new InsertParam.Field("word_count", DataType.Int64, word_count_array));
fields.add(new InsertParam.Field("book_intro", DataType.FloatVector, book_intro_array));

InsertParam insertParam = InsertParam.newBuilder()
  .withCollectionName("book")
  .withPartitionName("novel")   // Optional; the partition must already exist.
  .withFields(fields)
  .build();
milvusClient.insert(insertParam);

Parameter | Description
data | Data to insert into Zilliz Cloud database.
partition_name (optional) | Name of the partition to insert data into.

Parameter | Description
fieldName | Name of the field to insert data in.
DataType | Data type of the field to insert data in.
data | Data to insert into each field.
CollectionName | Name of the collection to insert data into.
PartitionName (optional) | Name of the partition to insert data into.
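
The Python example passes column-based data: one list per field. If your own data is stored as one record per row, it has to be rearranged first. The following is a minimal sketch in plain Python; `FIELD_ORDER` and `rows_to_columns` are hypothetical names, and the field order is assumed to be the one used by the example data in this topic.

```python
# Assumed field order, matching the example data in this topic.
FIELD_ORDER = ["book_id", "book_name", "word_count", "book_intro"]


def rows_to_columns(rows, field_order=FIELD_ORDER):
    """Turn a list of row dicts into one list per field, in the given order."""
    return [[row[name] for row in rows] for name in field_order]


rows = [
    {"book_id": 0, "book_name": "0", "word_count": 10000, "book_intro": [0.1, 0.2]},
    {"book_id": 1, "book_name": "1", "word_count": 10001, "book_intro": [0.3, 0.4]},
]
columns = rows_to_columns(rows)
print(columns)
```

A row that lacks one of the fields raises `KeyError`, which surfaces incomplete records before anything is sent to the database.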

Limits

Feature | Maximum limit
Dimensions of a vector | 32,768