Prepare Schema
A schema is a skeletal structure representing the set of fields shared across all entities in a collection. Before creating a collection, you need to prepare a schema by defining all the fields in a specific order with their names, types, and optional descriptions.
Check your data
In the dataset prepared for this example, each data record has eight attributes. You need to create a field for each attribute. The following table lists the details:
Field name | Type | Dimension / Max length |
---|---|---|
id | INT64 | N/A |
title_vector | FLOAT_VECTOR | 768 |
title | VARCHAR | 512 |
link | VARCHAR | 512 |
reading_time | INT64 | N/A |
publication | VARCHAR | 512 |
claps | INT64 | N/A |
responses | INT64 | N/A |
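Before defining the schema, it can help to confirm that a data record actually matches the table above. The following is a minimal sketch using only the Python standard library. The `EXPECTED` mapping mirrors the table, while the helper name `check_record` and the sample record are hypothetical and not part of the original dataset.

```python
# Expected attributes, mirroring the table above:
# name -> (Python type, dimension or max length, or None if not applicable)
EXPECTED = {
    "id": (int, None),
    "title_vector": (list, 768),
    "title": (str, 512),
    "link": (str, 512),
    "reading_time": (int, None),
    "publication": (str, 512),
    "claps": (int, None),
    "responses": (int, None),
}


def check_record(record):
    """Return a list of problems found in one data record (empty if none)."""
    problems = []
    for name, (py_type, limit) in EXPECTED.items():
        if name not in record:
            problems.append(f"missing attribute: {name}")
            continue
        value = record[name]
        if not isinstance(value, py_type):
            problems.append(f"{name}: expected {py_type.__name__}")
        elif py_type is list and len(value) != limit:
            problems.append(f"{name}: expected {limit} dimensions, got {len(value)}")
        elif py_type is str and len(value) > limit:
            problems.append(f"{name}: longer than {limit} characters")
    # Report attributes that the table does not list.
    extra = set(record) - set(EXPECTED)
    problems.extend(f"unexpected attribute: {name}" for name in sorted(extra))
    return problems


# Hypothetical sample record, for illustration only.
sample = {
    "id": 0,
    "title_vector": [0.0] * 768,
    "title": "Example title",
    "link": "https://example.com/post",
    "reading_time": 5,
    "publication": "Example Publication",
    "claps": 10,
    "responses": 2,
}

print(check_record(sample))
```

A record that passes this check has one value per field in the table, so each attribute maps directly onto one field definition.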
Create fields
The following snippet defines the schema according to the above table.
    from pymilvus import FieldSchema, CollectionSchema, DataType

    fields = [
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
        FieldSchema(name="title", dtype=DataType.VARCHAR, max_length=512),
        FieldSchema(name="title_vector", dtype=DataType.FLOAT_VECTOR, dim=768),
        FieldSchema(name="link", dtype=DataType.VARCHAR, max_length=512),
        FieldSchema(name="reading_time", dtype=DataType.INT64),
        FieldSchema(name="publication", dtype=DataType.VARCHAR, max_length=512),
        FieldSchema(name="claps", dtype=DataType.INT64),
        FieldSchema(name="responses", dtype=DataType.INT64)
    ]
In this example:
- `id` is the primary field. For this field, the parameter `is_primary` is set to `True`.
- `title_vector` is a vector field. The parameter `dim` specifies the vector dimension.
- `title`, `link`, and `publication` are string fields. The parameter `max_length` specifies the maximum number of characters allowed in the string.
- `reading_time`, `claps`, and `responses` are integer fields. No extra parameters need to be set on these fields.
Data types
For your reference, Zilliz Cloud supports the following field data types:
- Binary vector (BINARY_VECTOR)
- Boolean value (BOOLEAN)
- 8-byte floating-point (DOUBLE)
- 4-byte floating-point (FLOAT)
- Float vector (FLOAT_VECTOR)
- 8-bit integer (INT8)
- 32-bit integer (INT32)
- 64-bit integer (INT64)
- Variable character (VARCHAR)
Note that the binary vector and float vector types can be used only for vector fields.
Next steps