Monthly Product Demo: Discover the Power of Zilliz Cloud

About this webinar
Join our monthly demo for a technical overview of Zilliz Cloud, a highly scalable and performant vector database service for AI applications.
Topics covered
- Zilliz Cloud's scalable architecture
- Key features of the developer-friendly UI
- Security best practices and data privacy
- Highlights from recent product releases
This webinar is an excellent opportunity for developers to learn about Zilliz Cloud's capabilities and how it can support their AI projects. Register now to join our community and stay up-to-date with the latest vector database technology.
WEBVTT
1 00:00:03.495 --> 00:00:06.315 My name is Chris Ello and I work here at Zilliz.
2 00:00:06.845 --> 00:00:10.075 Today we are going to do our monthly, uh, cloud demo,
3 00:00:10.535 --> 00:00:11.595 and I encourage everybody
4 00:00:11.735 --> 00:00:15.795 to put your questions in either the chat or the q and a.
5 00:00:16.375 --> 00:00:20.555 Um, and, um, what we'll do is, uh, during this session, if,
6 00:00:20.735 --> 00:00:22.675 if Jay or I, um, see questions,
7 00:00:22.755 --> 00:00:24.555 we'll be answering them as quick as we can.
8 00:00:25.175 --> 00:00:28.355 Uh, but at the end of the demo I'll also open up lines.
9 00:00:28.555 --> 00:00:30.275 'cause I think we have a manageable group of people,
10 00:00:30.295 --> 00:00:32.595 so we can just ask questions directly to Jay.
11 00:00:33.215 --> 00:00:34.835 Uh, and then I'll, um, make sure
12 00:00:34.835 --> 00:00:36.235 that we keep those, uh, recorded.
13 00:00:36.735 --> 00:00:40.035 Um, and, uh, yeah, let's just, uh, get started here.
14 00:00:40.215 --> 00:00:44.915 So I'm just gonna quickly go over, uh, cloud.
15 00:00:45.015 --> 00:00:47.275 And I know that everybody here, um,
16 00:00:47.305 --> 00:00:49.195 must already know a little bit about Milvus
17 00:00:49.255 --> 00:00:51.155 or else you wouldn't be, uh, joining us today.
18 00:00:51.255 --> 00:00:53.835 But basically, you know, Zilliz Cloud is built on top
19 00:00:53.895 --> 00:00:55.955 of our open source project, uh, Milvus.
20 00:00:56.695 --> 00:01:00.715 And, um, you know, I think a lot of people,
21 00:01:00.855 --> 00:01:02.515 or a lot of companies when they, um,
22 00:01:02.515 --> 00:01:03.715 have an open source project,
23 00:01:04.065 --> 00:01:06.315 they simply offer a hosted version
24 00:01:06.575 --> 00:01:09.115 of the open source project, maybe with a little bit of,
25 00:01:09.215 --> 00:01:11.915 you know, kind of extra, you know, billing roles
26 00:01:12.055 --> 00:01:13.475 or maybe there's a little bit of security.
27 00:01:14.055 --> 00:01:16.675 But we decided from the get go that we needed
28 00:01:16.675 --> 00:01:19.075 to be much more than just a hosted version.
29 00:01:20.015 --> 00:01:22.835 And, um, these are kind of three of the, the kind
30 00:01:22.835 --> 00:01:26.195 of the core differences between Milvus and Zilliz.
31 00:01:26.695 --> 00:01:29.475 So the first thing is, even though we have a very performant
32 00:01:29.615 --> 00:01:32.075 search engine under Milvus, so if you go to GitHub,
33 00:01:32.075 --> 00:01:35.675 you might see something called Knowhere, K-N-O-W-H-E-R-E.
34 00:01:36.065 --> 00:01:38.155 That is our search engine in Milvus.
35 00:01:38.935 --> 00:01:43.195 And, um, and we also support, uh, 11 different indexes.
36 00:01:43.375 --> 00:01:45.555 Uh, so, you know, it makes it really useful
37 00:01:45.555 --> 00:01:47.835 because, um, you know, every one
38 00:01:47.835 --> 00:01:50.235 of us is gonna have a very unique, uh, set
39 00:01:50.235 --> 00:01:51.635 of requirements tied to our use case.
40 00:01:51.635 --> 00:01:55.115 So having the ability to pick the index that's gonna fit
41 00:01:55.115 --> 00:01:56.675 with our use cases is really good.
42 00:01:57.335 --> 00:01:59.755 But we decided that when we created Zilliz Cloud,
43 00:01:59.985 --> 00:02:01.035 that wasn't good enough.
44 00:02:01.415 --> 00:02:03.435 We wanted to make sure that we were even more
45 00:02:03.435 --> 00:02:04.515 performant than Milvus.
46 00:02:04.895 --> 00:02:08.315 And also we wanted to make sure that we take the burden off
47 00:02:08.315 --> 00:02:09.915 of your shoulders of picking the index.
48 00:02:09.935 --> 00:02:11.515 So we have something called Auto Index
49 00:02:11.515 --> 00:02:14.155 that Jay will talk about briefly in the demo.
50 00:02:14.935 --> 00:02:18.035 And then also, because you have very unique requirements,
51 00:02:18.425 --> 00:02:21.275 some of your use cases are gonna have, um, you know,
52 00:02:21.365 --> 00:02:24.035 maybe you have a really strict latency requirements,
53 00:02:24.095 --> 00:02:25.155 or you have a lot of people
54 00:02:25.305 --> 00:02:27.275 that are attacking your application.
55 00:02:27.295 --> 00:02:28.355 And so there's a lot of queries
56 00:02:28.355 --> 00:02:29.595 that are hitting the database.
57 00:02:29.905 --> 00:02:32.115 Everybody's gonna have different requirements.
58 00:02:32.815 --> 00:02:35.075 And so we wanna make sure that we can help you
59 00:02:35.135 --> 00:02:38.635 to tune the database to fit what your needs actually are,
60 00:02:38.695 --> 00:02:40.915 and try to do this in a really simple way.
61 00:02:41.695 --> 00:02:43.995 In addition, it's, uh, a cloud native database.
62 00:02:43.995 --> 00:02:45.915 And of course, that seems like a lot of jargon
63 00:02:45.915 --> 00:02:48.155 or what just talks about it, but at the end of the day,
64 00:02:48.335 --> 00:02:51.995 it really is about making sure that, um, we are scalable,
65 00:02:52.215 --> 00:02:54.475 but we allow you to autoscale up and down,
66 00:02:54.615 --> 00:02:56.315 and Jay will go over that as well.
67 00:02:56.935 --> 00:03:00.325 And then finally, um, you know, SaaS applications need
68 00:03:00.325 --> 00:03:02.445 to be, uh, secure and, uh,
69 00:03:02.545 --> 00:03:04.685 and provide, uh, all the, uh, compliance
70 00:03:04.685 --> 00:03:08.005 and regulatory, um, certifications that your, uh,
71 00:03:08.325 --> 00:03:09.485 security teams are asking for.
72 00:03:11.485 --> 00:03:14.665 Um, so Cardinal, as I mentioned, is our search engine.
73 00:03:15.325 --> 00:03:18.785 And, um, I'm gonna actually skip this
74 00:03:18.785 --> 00:03:20.865 and let Jay go into a lot more details here.
75 00:03:21.725 --> 00:03:23.625 Um, and then you can see here, you know,
76 00:03:23.625 --> 00:03:25.665 what's the difference in a little bit more detail
77 00:03:25.665 --> 00:03:28.025 between open source Milvus and Zilliz.
78 00:03:28.445 --> 00:03:31.265 So, uh, of course, you know, I picked all the features, uh,
79 00:03:31.265 --> 00:03:33.025 that are not available in Milvus.
80 00:03:33.025 --> 00:03:35.825 So you, it looks like, oh my God, Zilliz is super wonderful,
81 00:03:35.845 --> 00:03:37.945 but hopefully you can appreciate that.
82 00:03:37.965 --> 00:03:40.425 You know, things like migrations, backup
83 00:03:41.025 --> 00:03:43.705 capacity planning updates, autoscale, you know,
84 00:03:43.705 --> 00:03:46.185 these are the things that we figured that, you know,
85 00:03:46.185 --> 00:03:47.825 let's take that burden off of your shoulder
86 00:03:47.925 --> 00:03:51.025 and put that into the, uh, fully managed, uh, Zilliz Cloud.
87 00:03:52.875 --> 00:03:56.615 Um, we also work really hard to make sure that we maintain,
88 00:03:56.755 --> 00:03:58.055 uh, enterprise readiness.
89 00:03:58.635 --> 00:04:01.335 So we have, um, uh, a lot
90 00:04:01.335 --> 00:04:03.175 of these details on our security page,
91 00:04:03.515 --> 00:04:05.175 but we wanna make sure that, you know,
92 00:04:05.235 --> 00:04:08.015 we look at security from all levels, you know,
93 00:04:08.035 --> 00:04:12.055 all the way from, you know, uh, the, your data to the roles,
94 00:04:12.405 --> 00:04:15.135 even how we interface, uh, with Zilliz Cloud.
95 00:04:16.725 --> 00:04:19.585 And then finally, just as a reminder, you know, we have, um,
96 00:04:19.655 --> 00:04:21.905 basically, uh, three different offerings.
97 00:04:22.005 --> 00:04:24.665 We have Milvus, which comes in three different versions.
98 00:04:24.845 --> 00:04:27.065 Uh, a light version, which is, uh, can,
99 00:04:27.125 --> 00:04:28.265 is actually an embedded version.
100 00:04:28.325 --> 00:04:29.905 So you can put that on something really small
101 00:04:30.005 --> 00:04:31.665 or just throw them to a Jupyter Notebook.
102 00:04:32.285 --> 00:04:35.145 We have a, um, a standalone,
103 00:04:35.145 --> 00:04:36.825 and then we have a fully distributed version.
104 00:04:36.965 --> 00:04:40.425 So the really powerful, uh, um, set of databases
105 00:04:40.445 --> 00:04:41.705 for everyone to get started with.
106 00:04:42.205 --> 00:04:43.905 We also have Zilliz Cloud, which is where
107 00:04:43.905 --> 00:04:45.545 what we're gonna go over today in our demo.
108 00:04:46.045 --> 00:04:48.905 And then we also have, uh, Zilliz Cloud BYOC,
109 00:04:48.915 --> 00:04:53.145 where we have done, where we've separated the, uh, the, uh,
110 00:04:53.145 --> 00:04:54.545 data plane from the control plane.
111 00:04:54.605 --> 00:04:56.265 So this is gonna be for enterprises
112 00:04:56.265 --> 00:04:59.785 that have very stringent, um, security requirements,
113 00:05:00.245 --> 00:05:01.585 but really at the end of the day,
114 00:05:01.925 --> 00:05:03.105 you just have to build once.
115 00:05:03.105 --> 00:05:06.265 We're not gonna make you rebuild, uh, your database, uh,
116 00:05:06.285 --> 00:05:09.105 if you need to migrate them to any of these instances.
117 00:05:09.605 --> 00:05:11.985 But today we're gonna focus on Zilliz Cloud.
118 00:05:12.565 --> 00:05:14.305 So, with that, I'm gonna stop talking
119 00:05:15.125 --> 00:05:17.905 and I'm gonna pass the baton over to Jay.
120 00:05:19.485 --> 00:05:21.995 Thank you so much, Chris. Good morning.
121 00:05:22.145 --> 00:05:25.395 Good evening everyone. I am gonna share my screen.
122 00:05:25.395 --> 00:05:26.515 Gimme one moment please.
123 00:05:30.215 --> 00:05:33.695 Alright, you guys can see my screen right? Cool.
124 00:05:34.075 --> 00:05:36.455 Um, okay, so I'll talk about recall, uh,
125 00:05:36.455 --> 00:05:37.735 tuning recall rate in a second,
126 00:05:37.835 --> 00:05:39.855 but I think I kinda wanna step back a little bit
127 00:05:40.475 --> 00:05:43.695 and talk a little bit about RAG, um,
128 00:05:43.875 --> 00:05:45.855 or retrieval augmented generation.
129 00:05:46.355 --> 00:05:47.495 Uh, just because you know, it,
130 00:05:47.495 --> 00:05:49.775 it is the most popular use case in, in,
131 00:05:49.775 --> 00:05:51.975 in arguably like the current killer app
132 00:05:52.095 --> 00:05:53.095 for vector databases.
133 00:05:53.155 --> 00:05:56.015 So I want to kind of put a lot of the Zilliz features kind
134 00:05:56.015 --> 00:05:58.125 of into that context as, uh, I would presume a lot
135 00:05:58.125 --> 00:05:59.565 of you guys are, are really interested in
136 00:05:59.565 --> 00:06:00.845 that particular use case, right?
137 00:06:01.225 --> 00:06:03.885 So what does RAG allow us to do, right?
138 00:06:03.905 --> 00:06:07.365 It allows us to, you know, include any data
139 00:06:08.225 --> 00:06:11.445 in the context window of modern large language models.
140 00:06:11.745 --> 00:06:13.405 Um, so it, in other words, it allows you
141 00:06:13.405 --> 00:06:15.725 to answer questions right, about topics
142 00:06:15.955 --> 00:06:18.725 that the LLMs were not originally trained on, right?
143 00:06:18.725 --> 00:06:21.245 So this could be internal facing documentation.
144 00:06:21.625 --> 00:06:23.805 If you're a law firm, it could be motions
145 00:06:23.805 --> 00:06:25.325 that you filed in your current litigation.
146 00:06:25.345 --> 00:06:27.405 If you're a doctor, it could be any data on any
147 00:06:27.405 --> 00:06:28.405 of your patients, and kind of, you
148 00:06:28.405 --> 00:06:29.365 know, the list goes on and on, right?
149 00:06:29.705 --> 00:06:32.525 Um, and the way this works is by way of embedding models,
150 00:06:32.525 --> 00:06:35.725 which are these very expensive pre-trained, right?
151 00:06:36.025 --> 00:06:39.285 Uh, spatial like representations of semantic context, right?
152 00:06:39.345 --> 00:06:42.205 In, in space. So in other words, it's like a blob of text
153 00:06:42.205 --> 00:06:43.525 that has some semantic meaning,
154 00:06:44.105 --> 00:06:46.805 and that semantic meaning is represented by some location,
155 00:06:47.145 --> 00:06:49.245 uh, with coordinates that are, are, are measurable, right?
156 00:06:49.705 --> 00:06:53.925 So vector databases measure the distance between one blob
157 00:06:53.925 --> 00:06:55.285 of text to another blob of text,
158 00:06:55.335 --> 00:06:58.685 which gives us some indication of how semantically relevant
159 00:06:58.835 --> 00:07:01.645 that text is to that other piece of text, regardless
160 00:07:01.665 --> 00:07:04.885 of whether there are any exact lexical matches, right?
161 00:07:04.885 --> 00:07:07.325 And so this is a very, uh, this is kind of a departure,
162 00:07:07.505 --> 00:07:09.125 you know, from, from traditional search,
163 00:07:09.275 --> 00:07:12.285 like things like Apache Lucene that have relied heavily on,
164 00:07:12.305 --> 00:07:14.045 on lexical matching for a long time, right?
165 00:07:14.045 --> 00:07:15.605 So you can have like a blob of text
166 00:07:15.755 --> 00:07:18.485 that has no lexical matches to another blob of text,
167 00:07:18.705 --> 00:07:20.205 but could be semantically relevant,
168 00:07:20.345 --> 00:07:23.005 and those would be placed closer together in vector space.
169 00:07:23.005 --> 00:07:25.405 And that's what vector databases allow us to do, is, is, is
170 00:07:25.405 --> 00:07:26.765 to measure those distances, right?
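A minimal sketch of the distance idea described above, using cosine similarity on toy vectors. The three-dimensional "embeddings" and their values are made up for illustration; a real embedding model would produce them, and a vector database may use a different metric.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings; real models emit hundreds of dimensions.
embeddings = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "Steps to recover account access": [0.8, 0.3, 0.1],
    "Quarterly revenue grew 12%": [0.0, 0.2, 0.9],
}

query = embeddings["How do I reset my password?"]

# Rank every text by similarity to the query, most similar first.
ranked = sorted(
    embeddings,
    key=lambda text: cosine_similarity(query, embeddings[text]),
    reverse=True,
)
for text in ranked:
    print(f"{cosine_similarity(query, embeddings[text]):.3f}  {text}")
```

The first two texts share almost no words, yet their vectors point in nearly the same direction, so they rank next to each other.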
171 00:07:27.585 --> 00:07:31.005 So, okay, how does Zilliz help us here, right?
172 00:07:31.865 --> 00:07:34.525 The, the, once we have all of the data, so all
173 00:07:34.525 --> 00:07:36.845 of our patient data or, you know, it's the, the, the, the,
174 00:07:36.845 --> 00:07:38.485 the stuff that we're interested in, in,
175 00:07:38.485 --> 00:07:41.365 in searching over into bite-sized chunks with dense
176 00:07:41.365 --> 00:07:42.325 embeddings attached to them
177 00:07:42.395 --> 00:07:43.605 and loaded into Zilliz, right?
178 00:07:43.905 --> 00:07:47.685 We can send Zilliz a vector query, uh,
179 00:07:47.785 --> 00:07:50.525 and ask it to give us the vectors that are closest
180 00:07:50.785 --> 00:07:52.725 to the one that we just sent, right?
181 00:07:52.755 --> 00:07:55.765 This is where Zilliz does its distance measuring
182 00:07:55.765 --> 00:07:58.085 and performs what's called an approximate nearest
183 00:07:58.285 --> 00:07:59.365 neighbor or ANN.
184 00:07:59.705 --> 00:08:02.285 Uh, and we utilize approximate instead
185 00:08:02.285 --> 00:08:04.965 of exact nearest neighbor, because doing a brute force,
186 00:08:05.185 --> 00:08:07.605 you know, geometric distance measurement on every single
187 00:08:07.635 --> 00:08:10.605 node and every other node attached to it, um, you know, in,
188 00:08:10.625 --> 00:08:11.765 in, in space, uh,
189 00:08:11.765 --> 00:08:13.765 and returning in some reasonable amount of time,
190 00:08:14.245 --> 00:08:15.845 i i is currently infeasible for,
191 00:08:15.945 --> 00:08:17.245 for a lot of use cases, right?
192 00:08:17.245 --> 00:08:19.285 You want, you want to be able to get results back relatively
193 00:08:19.285 --> 00:08:21.885 quickly, you know, even if you say,
194 00:08:22.015 --> 00:08:23.405 don't get all the matches, right?
195 00:08:23.825 --> 00:08:25.685 Uh, you know, it's, it's actually more important
196 00:08:25.705 --> 00:08:28.005 to return in, you know, a hundred milliseconds
197 00:08:28.005 --> 00:08:30.525 or 500 milliseconds, not like in, in, in minutes, right?
198 00:08:30.525 --> 00:08:33.085 Which is un uh, unusable for most of the time.
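To make the exact-versus-approximate trade-off concrete, here is a toy comparison. It swaps in a simple bucket-based (IVF-style) partition as the approximate method; this is an illustration only and not a description of how Zilliz Cloud's index works. The names `n_probe`, `build_buckets`, and the random centroids are assumptions made for the sketch.

```python
import random

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def exact_top_k(query, vectors, k):
    """Brute force: measure the distance to every stored vector."""
    order = sorted(range(len(vectors)), key=lambda i: squared_l2(query, vectors[i]))
    return order[:k]

def build_buckets(vectors, centroids):
    """Assign each vector to its nearest centroid (a toy IVF-style partition)."""
    buckets = {c: [] for c in range(len(centroids))}
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: squared_l2(v, centroids[c]))
        buckets[nearest].append(i)
    return buckets

def approx_top_k(query, vectors, centroids, buckets, k, n_probe):
    """Scan only the n_probe buckets whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: squared_l2(query, centroids[c]))[:n_probe]
    candidates = [i for c in probe for i in buckets[c]]
    candidates.sort(key=lambda i: squared_l2(query, vectors[i]))
    return candidates[:k], len(candidates)

def recall(approx_ids, exact_ids):
    """Fraction of the true nearest neighbors that the approximate search returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(2000)]
centroids = rng.sample(vectors, 16)  # random centroids; real indexes train them
buckets = build_buckets(vectors, centroids)
query = [rng.random() for _ in range(8)]

truth = exact_top_k(query, vectors, k=10)
few_ids, few_scanned = approx_top_k(query, vectors, centroids, buckets, k=10, n_probe=2)
all_ids, all_scanned = approx_top_k(query, vectors, centroids, buckets, k=10, n_probe=16)

print(f"n_probe=2:  scanned {few_scanned} vectors, recall {recall(few_ids, truth):.2f}")
print(f"n_probe=16: scanned {all_scanned} vectors, recall {recall(all_ids, truth):.2f}")
```

Probing fewer buckets scans fewer vectors and returns faster, at the risk of missing some true neighbors; probing every bucket degenerates to brute force.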
199 00:08:33.545 --> 00:08:34.885 Uh, so the way that the ANN
200 00:08:34.885 --> 00:08:37.285 indexes work is, you know, obviously outside the scope
201 00:08:37.285 --> 00:08:40.165 of this webinar, but Zilliz has a very simple abstraction
202 00:08:40.165 --> 00:08:43.365 layer for you, uh, that allows you to make trade-offs
203 00:08:43.595 --> 00:08:46.605 with recall accuracy and query latency.
204 00:08:46.865 --> 00:08:49.965 So, if you remember right, we're approximating, right?
205 00:08:50.185 --> 00:08:53.085 If we got all the neighbors in the search, so we
206 00:08:53.915 --> 00:08:56.845 exposed something called a level parameter in Zilliz.
207 00:08:56.845 --> 00:08:58.245 So I'll, I'll get in that in a second.
208 00:08:58.395 --> 00:09:00.125 It's, it's right here. Um,
209 00:09:01.225 --> 00:09:03.845 and this is incredibly useful depending on
210 00:09:04.135 --> 00:09:07.245 where your vector database sits in your pipeline.
211 00:09:07.545 --> 00:09:09.165 So if it's a user facing pipeline,
212 00:09:09.165 --> 00:09:12.525 you probably wanna tune more towards faster latency just
213 00:09:12.525 --> 00:09:13.685 because your user's probably
214 00:09:13.685 --> 00:09:15.125 waiting on, on you for something.
215 00:09:15.425 --> 00:09:16.885 If, if you're, you know, if you have more
216 00:09:16.885 --> 00:09:19.005 of an analytics pipeline, uh, and,
217 00:09:19.025 --> 00:09:21.245 and you're really interested in getting, you know, all
218 00:09:21.245 --> 00:09:23.805 of the, you know, the best recall possible, you might want
219 00:09:23.805 --> 00:09:25.165 to lean more towards recall, right?
220 00:09:25.185 --> 00:09:26.645 And the balance is up to you.
221 00:09:26.945 --> 00:09:29.925 We provide this through a very simple abstraction, right?
222 00:09:29.945 --> 00:09:32.645 So when you, this is, this is an example of a, of a query
223 00:09:32.675 --> 00:09:34.525 with our Python SDK, uh,
224 00:09:34.625 --> 00:09:36.125 at the top you'll see the query vector.
225 00:09:36.225 --> 00:09:37.765 So that's the vector that we're actually
226 00:09:37.765 --> 00:09:38.845 telling the database.
227 00:09:39.075 --> 00:09:40.965 Give me all the neighbors close to closest
228 00:09:41.105 --> 00:09:42.285 to this particular location.
229 00:09:42.825 --> 00:09:45.165 And we're basically saying, you know, I want, uh,
230 00:09:45.295 --> 00:09:46.325 gimme the closest three.
231 00:09:46.465 --> 00:09:49.245 So this is the top K and the level parameter, which is
232 00:09:49.245 --> 00:09:50.645 what I just mentioned, is this right here.
233 00:09:50.705 --> 00:09:55.205 So this defaults to one, uh, and will go up to 10,
234 00:09:55.785 --> 00:09:57.445 and you can kind of play around with this.
235 00:09:57.465 --> 00:09:58.925 And this is done by query, right?
236 00:09:58.925 --> 00:10:00.325 So you can send one query at one
237 00:10:00.325 --> 00:10:01.885 and one query at 10, one query at five,
238 00:10:01.885 --> 00:10:03.925 and kind of play around with it to see, you know,
239 00:10:04.075 --> 00:10:06.325 what the recall looks like in each of these scenarios
240 00:10:06.825 --> 00:10:09.885 and what the latency looks like in each of those scenarios.
241 00:10:09.885 --> 00:10:11.125 So it's very flexible in that nature.
242 00:10:11.185 --> 00:10:13.005 You don't have to rebuild the entire index every
243 00:10:13.005 --> 00:10:14.125 time you do this exercise.
244 00:10:14.475 --> 00:10:16.485 It's, it's, it's very, um, you know,
245 00:10:16.585 --> 00:10:18.325 on the fly kind of adjustments, right?
246 00:10:18.325 --> 00:10:19.365 Which is, which is very nice.
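The slide itself is not captured in the transcript, so the following is only a sketch of what a per-query level setting could look like with the `pymilvus` `MilvusClient.search` call. The collection name, URI, token, and helper function names are placeholders, and the placement of `level` under `search_params["params"]` is an assumption that should be checked against the current Zilliz Cloud documentation.

```python
def build_search_kwargs(collection_name, query_vector, top_k=3, level=1):
    """Keyword arguments for MilvusClient.search with a per-query level.

    The 1-10 range follows what the talk states; verify it against the docs.
    """
    if not 1 <= level <= 10:
        raise ValueError("level is described in the talk as ranging from 1 to 10")
    return {
        "collection_name": collection_name,
        "data": [query_vector],  # one query vector per request in this sketch
        "limit": top_k,          # the "top K" mentioned in the talk
        "search_params": {"params": {"level": level}},  # assumed key placement
    }

def search_at_level(client, collection_name, query_vector, top_k=3, level=1):
    """Run one search; `client` is expected to expose a MilvusClient-style search()."""
    return client.search(**build_search_kwargs(collection_name, query_vector, top_k, level))

# Usage against a real cluster (placeholders, not runnable as-is):
#
#   from pymilvus import MilvusClient
#   client = MilvusClient(uri="https://<your-cluster-endpoint>", token="<your-api-key>")
#   for level in (1, 5, 10):
#       hits = search_at_level(client, "my_collection", query_vector, top_k=3, level=level)
```

Because the level travels with each request, the same collection can be queried at several settings back to back to compare recall and latency without rebuilding the index.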
247 00:10:19.705 --> 00:10:21.165 Uh, you know, for context, a lot
248 00:10:21.165 --> 00:10:22.725 of other vector database solutions,
249 00:10:23.155 --> 00:10:25.965 they pick a single index most of the time.
250 00:10:26.385 --> 00:10:28.685 Uh, most of the time they'll pick one index, you know,
251 00:10:28.685 --> 00:10:31.925 whether it's HNSW or DiskANN or FAISS,
252 00:10:31.985 --> 00:10:33.805 or you know, any of the other kind of, you know,
253 00:10:33.825 --> 00:10:36.325 off the shelf, uh, index algorithms
254 00:10:36.325 --> 00:10:38.325 that are out there right now, uh,
255 00:10:38.465 --> 00:10:39.925 all have their own trade offs, right?
256 00:10:39.925 --> 00:10:41.405 They all have their, they made their trade off
257 00:10:41.405 --> 00:10:44.005 between recall accuracy and query latency,
258 00:10:44.005 --> 00:10:45.365 and you're just kind of, that's the one you get
259 00:10:45.365 --> 00:10:46.605 and you're just kind of stuck with that.
260 00:10:46.905 --> 00:10:49.645 Um, in Milvus, you're allowed to swap those out, right?
261 00:10:49.645 --> 00:10:52.605 So you can swap them out for something else if you want to,
262 00:10:52.985 --> 00:10:54.845 but it's a, you know, manual process.
263 00:10:55.065 --> 00:10:57.965 You have to know what the characteristics of each, uh,
264 00:10:57.975 --> 00:10:59.325 index algorithm is, right?
265 00:10:59.325 --> 00:11:02.845 So this, we kind of expose to Zilliz Cloud customers,
266 00:11:03.185 --> 00:11:06.965 and again, it's, it's another way that we, we say, look,
267 00:11:06.985 --> 00:11:08.685 here's a managed service that will kind
268 00:11:08.685 --> 00:11:11.165 of abstract away a lot of this like vector complexity
269 00:11:11.225 --> 00:11:15.765 for you, and expose this very nice, easy to understand, uh,
270 00:11:15.855 --> 00:11:19.445 lever that you can pull to kind of change the way, uh, that,
271 00:11:19.445 --> 00:11:20.565 that, that the recall is done.
272 00:11:20.625 --> 00:11:22.565 So, um, again, feel free
273 00:11:22.565 --> 00:11:24.645 to not take any notes on this, you know, we'll,
274 00:11:24.645 --> 00:11:26.445 we can send all this documentation at, at the end
275 00:11:26.445 --> 00:11:28.605 of this call, but I think this is really important
276 00:11:28.605 --> 00:11:29.765 to highlight, uh, just
277 00:11:29.765 --> 00:11:32.405 because we find that a lot of customers find a lot
278 00:11:32.405 --> 00:11:33.645 of use in this, uh,
279 00:11:33.705 --> 00:11:36.525 and it gives us a lot of flexibility for many,
280 00:11:36.525 --> 00:11:38.365 many use cases where, you know, you don't have
281 00:11:38.365 --> 00:11:39.725 to necessarily swap things out.
282 00:11:40.225 --> 00:11:42.765 And again, this is all part of that cardinal, uh,
283 00:11:42.855 --> 00:11:45.565 index engine that Chris, uh, mentioned earlier.
284 00:11:45.755 --> 00:11:48.245 There's a lot of other benefits to Cardinal as well that,
285 00:11:48.245 --> 00:11:49.885 you know, I won't go into too much detail,
286 00:11:49.905 --> 00:11:52.325 but you know, off the top of my head, uh, there's a lot
287 00:11:52.325 --> 00:11:54.485 of interesting things that we're doing with Quantization
288 00:11:54.695 --> 00:11:59.085 where, um, you know, a lot of the vectors have, uh, many,
289 00:11:59.115 --> 00:12:01.605 many, you know, floating point numbers attached to them.
290 00:12:01.985 --> 00:12:04.325 If you truncate them, uh, you know, you'll might,
291 00:12:04.345 --> 00:12:07.685 you might lose maybe two or 3% in recall accuracy,
292 00:12:08.105 --> 00:12:12.485 but, you know, you can save upwards of 30, 40, 50% on, on,
293 00:12:12.505 --> 00:12:14.165 on your compute and storage, right?
294 00:12:14.225 --> 00:12:17.445 So that's a trade off that some folks, you know, wanna make.
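As a rough illustration of the quantization trade-off just described, the sketch below uses simple int8 scalar quantization. This is a stand-in technique chosen for clarity, not a description of what Cardinal does, and the sample vector is made up. It reduces only the raw vector bytes; the overall savings and recall impact in a real system depend on the index and workload.

```python
from array import array

def quantize_int8(vector):
    """Map floats to signed 8-bit ints using one scale factor per vector."""
    scale = max(abs(x) for x in vector) / 127 or 1.0  # fall back to 1.0 for all-zero input
    codes = array("b", (round(x / scale) for x in vector))
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float values from the int8 codes."""
    return [c * scale for c in codes]

vector = [0.12, -0.53, 0.98, 0.04, -0.77, 0.31, 0.66, -0.25]  # toy embedding
original = array("f", vector)  # 32-bit floats, the usual storage format

codes, scale = quantize_int8(vector)
restored = dequantize(codes, scale)

# Rounding to the nearest code bounds the per-value error by half a step.
max_error = max(abs(a - b) for a, b in zip(vector, restored))
original_bytes = len(original) * original.itemsize
quantized_bytes = len(codes) * codes.itemsize

print(f"float32: {original_bytes} bytes, int8: {quantized_bytes} bytes")
print(f"largest per-value error: {max_error:.5f}")
```

Each value loses a little precision, which is where the small recall drop comes from, while the stored vector shrinks to a quarter of its float32 size in this toy case.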
295 00:12:17.585 --> 00:12:20.565 Uh, and, and, and Cardinal looks at your stack
296 00:12:20.565 --> 00:12:22.045 and looks at your pipeline
297 00:12:22.065 --> 00:12:23.805 and makes those necessary adjustments.
298 00:12:23.805 --> 00:12:25.365 So there's a lot of intelligence built into it,
299 00:12:25.365 --> 00:12:26.565 and there's a lot of other tuning in
300 00:12:26.785 --> 00:12:30.685 and, um, you know, uh, adjustments that, that we've made
301 00:12:30.685 --> 00:12:33.645 to kind of make it fit almost every use case, right?
302 00:12:33.985 --> 00:12:35.645 Uh, you know, we also work very closely
303 00:12:35.645 --> 00:12:37.285 with customers if there's, you know, a,
304 00:12:37.405 --> 00:12:38.645 a bleeding edge use case
305 00:12:38.785 --> 00:12:42.045 or something that's, uh, that, that they like to do,
306 00:12:42.045 --> 00:12:43.885 that falls outside of the bounds of
307 00:12:43.885 --> 00:12:45.805 what auto Index is currently capable of,
308 00:12:46.025 --> 00:12:48.085 and we're able to tune that as well and,
309 00:12:48.085 --> 00:12:49.805 and kind of work with you guys to make sure
310 00:12:49.805 --> 00:12:52.005 that Zilliz is performing, uh, you know, and,
311 00:12:52.005 --> 00:12:53.805 and we're making the right technical trade-offs.
312 00:12:53.805 --> 00:12:56.165 Uh, and, and it's while still kind of maintaining this,
313 00:12:56.475 --> 00:12:59.765 this managed service, um, uh, solution, right?
314 00:12:59.865 --> 00:13:02.885 So, uh, so that, that's, that's pretty much it on the, the,
315 00:13:02.985 --> 00:13:04.725 the, the, the level parameter.
316 00:13:04.725 --> 00:13:07.365 I want to move a little bit towards scale.
317 00:13:07.645 --> 00:13:09.525 'cause I think scale is also important to talk about.
318 00:13:09.985 --> 00:13:13.805 Uh, we at Zilliz, um, think
319 00:13:13.865 --> 00:13:15.845 of scale from the very beginning, right?
320 00:13:15.845 --> 00:13:18.925 So the entire database was architected for many,
321 00:13:18.925 --> 00:13:20.725 many billions of vectors, right?
322 00:13:20.725 --> 00:13:23.205 And we do have customers running, you know, north
323 00:13:23.205 --> 00:13:26.445 of 20 billion, 30 billion, 40 billion vector workloads.
324 00:13:26.505 --> 00:13:28.165 Uh, and it is performing very well.
325 00:13:28.625 --> 00:13:31.085 And this is where Zilliz really shines.
326 00:13:31.465 --> 00:13:34.605 Uh, you know, if you're talking about a vector space that's,
327 00:13:34.865 --> 00:13:37.645 you know, sub, uh, 20 million vectors,
328 00:13:37.645 --> 00:13:39.165 maybe even sub 10 million vectors, right?
329 00:13:39.225 --> 00:13:40.325 You could probably pretty much use
330 00:13:40.485 --> 00:13:41.525 anything else that's out there right now.
331 00:13:41.605 --> 00:13:44.525 I think there's a lot of really great, uh, you know,
332 00:13:44.525 --> 00:13:46.485 vector solutions that have popped up recently.
333 00:13:46.565 --> 00:13:49.285 A lot of bolt-ons from existing legacy players that,
334 00:13:49.505 --> 00:13:51.725 you know, want, uh, their customers to be able
335 00:13:51.725 --> 00:13:53.645 to use this capability while, you know,
336 00:13:53.665 --> 00:13:54.845 not migrating away from
337 00:13:54.845 --> 00:13:56.005 what they're already currently using.
338 00:13:56.005 --> 00:13:58.085 It could be SQL database, NoSQL database, uh,
339 00:13:58.085 --> 00:14:00.445 it could be a data warehouse, you know, handful
340 00:14:00.465 --> 00:14:01.685 of other solutions, right?
341 00:14:02.225 --> 00:14:07.135 Um, and it, it's, it's, it's generally okay, you know,
342 00:14:07.135 --> 00:14:09.495 if if you're, if you're looking at, you know, a handful
343 00:14:09.555 --> 00:14:11.295 of millions of vectors, it's, it's fine, right?
344 00:14:11.525 --> 00:14:14.335 When you start getting to hundreds of millions, you know,
345 00:14:14.395 --> 00:14:17.215 and, and up north of a billion vectors, uh, the,
346 00:14:17.215 --> 00:14:19.055 the architecture starts to matter a lot more, right?
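A quick back-of-the-envelope calculation shows why architecture matters more as counts grow. It covers only raw float32 vector data and assumes 768 dimensions, a common embedding size that the talk does not specify; index structures, metadata, and replicas add to these figures.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Bytes needed for the raw vectors alone (float32 by default)."""
    return num_vectors * dim * bytes_per_value

def human(nbytes):
    """Format a byte count using decimal units."""
    for unit in ("B", "KB", "MB", "GB", "TB", "PB"):
        if nbytes < 1000 or unit == "PB":
            return f"{nbytes:.1f} {unit}"
        nbytes /= 1000

DIM = 768  # assumed embedding dimension, not taken from the talk

for count in (10_000_000, 500_000_000, 20_000_000_000):
    print(f"{count:>14,} vectors -> {human(raw_vector_bytes(count, DIM))}")
```

Under these assumptions, 10 million vectors come to roughly 30.7 GB, which a single large machine can hold, while 20 billion come to roughly 61.4 TB, which has to be partitioned and laid out deliberately.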
347 00:14:19.875 --> 00:14:22.215 And those databases were designed
348 00:14:22.515 --> 00:14:24.095 for different use cases, right?
349 00:14:24.095 --> 00:14:26.375 They were, they were designed for a NoSQL use case,
350 00:14:26.375 --> 00:14:28.055 or they were designed for a SQL use case.
351 00:14:28.055 --> 00:14:31.615 They were not designed to store data where you want
352 00:14:31.615 --> 00:14:32.775 to store the vectors
353 00:14:32.775 --> 00:14:35.535 that are closest in vector space together on disk, so
354 00:14:35.535 --> 00:14:37.815 that when you read it in right, you'll get all the vectors
355 00:14:37.815 --> 00:14:40.495 that you need all in one shot instead
356 00:14:40.495 --> 00:14:42.775 of loading in shards from all over the place,
357 00:14:42.775 --> 00:14:45.375 because that's how your, your, your database is architected.
358 00:14:45.375 --> 00:14:47.575 So there's a lot of little things to consider like that.
359 00:14:47.755 --> 00:14:49.895 Uh, when, when choosing a vector database, again,
360 00:14:50.235 --> 00:14:52.685 if you're talking about scale, that's,
361 00:14:52.685 --> 00:14:54.805 that's not gonna reach, you know, the hundreds of millions,
362 00:14:54.805 --> 00:14:55.645 you're probably fine, you know,
363 00:14:55.645 --> 00:14:56.725 with, with a lot of other things.
364 00:14:56.785 --> 00:14:59.165 But if you're really serious about running a production
365 00:14:59.165 --> 00:15:01.885 vector workload and you have a ton of data, uh,
366 00:15:01.885 --> 00:15:04.205 that you need to search across, it might be multi-tenant,
367 00:15:04.205 --> 00:15:05.925 it might not be multi-tenant, uh, you know,
368 00:15:06.125 --> 00:15:07.925 Zilliz really shines in, in this respect.
369 00:15:08.105 --> 00:15:11.565 So, uh, I'll, I'll walk through a little bit about
370 00:15:11.665 --> 00:15:13.965 how we've abstracted this for, for,
371 00:15:13.965 --> 00:15:15.325 for Zilliz Cloud customers.
372 00:15:15.665 --> 00:15:18.285 Uh, you know, essentially we have, uh, a handful of
373 00:15:18.285 --> 00:15:19.765 what we call CU types, right?
374 00:15:19.765 --> 00:15:20.965 So they're compute units.
375 00:15:21.105 --> 00:15:24.445 And you can think of these as instances in EC2 instances in
376 00:15:24.445 --> 00:15:27.685 AWS it's a very, you know, it, uh, similar analogy, uh,
377 00:15:27.825 --> 00:15:31.565 you know, where each CU would constitute some amount
378 00:15:31.565 --> 00:15:32.845 of vector capacity, right?
379 00:15:32.845 --> 00:15:34.125 So if you have, say, uh,
380 00:15:34.125 --> 00:15:37.685 5 million vectors that'll fit on some amount of CU, right?
381 00:15:37.685 --> 00:15:40.925 And we have different types of CU that are optimized
382 00:15:41.105 --> 00:15:42.205 for different use cases.
383 00:15:42.665 --> 00:15:45.605 Uh, the primary two we have are performance and capacity.
384 00:15:46.185 --> 00:15:48.725 So if you need the maximum performance
385 00:15:48.725 --> 00:15:51.285 and your, your latency is the most important thing to you,
386 00:15:51.745 --> 00:15:54.645 we have a performance optimized CU that you can use.
387 00:15:55.185 --> 00:15:58.725 Uh, if, if, uh, you're, you're, you're more interested in
388 00:15:59.365 --> 00:16:00.925 capacity per CU, but you're willing
389 00:16:00.925 --> 00:16:02.445 to sacrifice a little bit of latency
390 00:16:02.445 --> 00:16:06.525 and a little bit of concurrency, we also have a capacity CU
391 00:16:06.525 --> 00:16:08.645 and you get to pick, uh, at, at, at the beginning.
392 00:16:08.945 --> 00:16:11.645 Uh, and it really is based on your workload
393 00:16:11.945 --> 00:16:13.565 and the, the type of application
394 00:16:13.565 --> 00:16:15.445 and ob obviously where it sits in your pipeline.
395 00:16:15.745 --> 00:16:17.445 Uh, and, and our, our team, our,
396 00:16:17.445 --> 00:16:19.885 our solutions architect team is obviously available, uh,
397 00:16:19.885 --> 00:16:21.445 to kind of walk through the differences
398 00:16:21.445 --> 00:16:22.525 between these two products
399 00:16:22.785 --> 00:16:23.845 and make sure that, you know,
400 00:16:23.845 --> 00:16:25.405 you pick the appropriate solution for you.
401 00:16:25.865 --> 00:16:29.085 So, uh, the primary way
402 00:16:29.085 --> 00:16:31.325 that most people do scaling is manual scaling.
403 00:16:31.585 --> 00:16:34.245 Uh, you know, you're able to just pick, uh, and
404 00:16:34.245 --> 00:16:36.005 I'll just look through that.
405 00:16:36.115 --> 00:16:37.405 I'll just show you guys the actual
406 00:16:37.545 --> 00:16:38.805 UI 'cause it'll be helpful to see.
407 00:16:38.825 --> 00:16:41.205 So this is what a Zilliz cluster
408 00:16:41.425 --> 00:16:43.725 looks like inside of Zilliz Cloud.
409 00:16:44.105 --> 00:16:47.165 Uh, there's a handful of things in here, uh, that the,
410 00:16:47.165 --> 00:16:49.085 the main thing that I want to point your attention
411 00:16:49.145 --> 00:16:51.285 to is the CU size, which is right here.
412 00:16:51.585 --> 00:16:53.965 You have two CUs, and right now the capacity is at 2%.
413 00:16:54.005 --> 00:16:55.085 I don't have too much in this cluster.
414 00:16:55.665 --> 00:16:58.365 And the way that most people do, uh,
415 00:16:58.365 --> 00:17:00.285 scaling is they just click on the scale button,
416 00:17:00.465 --> 00:17:03.365 and you're able to see the CU sizes that you're able to add.
417 00:17:03.365 --> 00:17:05.085 And obviously, this list is very, very large.
418 00:17:05.465 --> 00:17:07.605 Um, and it could get even higher.
419 00:17:07.745 --> 00:17:10.565 Uh, if you need us to help you, obviously we can help you.
420 00:17:10.585 --> 00:17:12.445 But, you know, 256 is quite a lot
421 00:17:12.875 --> 00:17:14.645 that sits in the many, many billions.
422 00:17:14.945 --> 00:17:17.485 And, um, you know, we, we do have customers that,
423 00:17:17.515 --> 00:17:18.525 that exceed that, of course.
424 00:17:18.705 --> 00:17:20.565 Uh, and obviously we'll, we'll work with you very closely,
425 00:17:20.945 --> 00:17:22.525 but this is how most people do it.
426 00:17:22.625 --> 00:17:26.485 Uh, and for the most part,
427 00:17:26.505 --> 00:17:29.045 as long as your vector count hasn't been growing at, uh,
428 00:17:29.065 --> 00:17:32.405 you know, an insane rate, uh, every day, uh,
429 00:17:32.405 --> 00:17:33.805 you're generally fine with this approach.
430 00:17:34.225 --> 00:17:37.325 We do have an auto scale feature as well
431 00:17:37.325 --> 00:17:38.485 that you're seeing on the right.
432 00:17:38.485 --> 00:17:39.845 Lemme just move this video outta the way.
433 00:17:40.065 --> 00:17:42.165 Uh, you have this autoscale feature on the right
434 00:17:42.195 --> 00:17:44.565 that allows you to set a threshold
435 00:17:45.025 --> 00:17:46.925 so, a CU capacity threshold.
436 00:17:46.985 --> 00:17:48.725 So, you know, whatever you guys are comfortable
437 00:17:48.725 --> 00:17:52.045 with in terms of, you know, risking the, the possibility
438 00:17:52.045 --> 00:17:54.525 of maybe, you know, elevated query latency.
439 00:17:54.755 --> 00:17:57.445 Some people like to set it all the way at 90%.
440 00:17:57.465 --> 00:17:58.685 We personally don't recommend that,
441 00:17:58.705 --> 00:18:00.525 but, you know, it's something that, that you can do.
442 00:18:00.865 --> 00:18:02.365 Uh, but generally between 70
443 00:18:02.365 --> 00:18:06.725 and 80% is, is a, is a good, safe, happy medium where, uh,
444 00:18:07.025 --> 00:18:10.805 if the, if the capacity of your cluster gets to that level,
445 00:18:10.895 --> 00:18:14.205 it'll automatically scale you up to the next tier, which is,
446 00:18:14.205 --> 00:18:15.725 you know, you'll add two CUs at a time.
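To make the threshold rule concrete, here is a minimal sketch of the decision as it is described in the demo: once CU capacity usage reaches the configured threshold, the cluster grows by two CUs, and it never shrinks on its own. The function name `next_cu_size`, the default 75% threshold, and the 256 CU ceiling are illustrative assumptions drawn from the talk, not Zilliz Cloud's actual implementation or API.

```python
def next_cu_size(current_cu: int, capacity_pct: float,
                 threshold_pct: float = 75.0, step: int = 2,
                 max_cu: int = 256) -> int:
    """Return the CU size after one autoscale check (hypothetical rule)."""
    if not 0 <= capacity_pct <= 100:
        raise ValueError("capacity_pct must be between 0 and 100")
    if capacity_pct < threshold_pct:
        # Below the threshold: leave the cluster as it is.
        return current_cu
    # At or above the threshold: add one step, capped at max_cu.
    # There is deliberately no branch that scales down.
    return min(current_cu + step, max_cu)
```

With a lower threshold the cluster scales earlier and keeps more headroom; with a threshold near 90% it scales later, which is the elevated-latency risk mentioned above.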
447 00:18:16.185 --> 00:18:18.885 And I think that's also important to, to, to mention
448 00:18:18.945 --> 00:18:21.525 as well, just because, uh, you're, you're,
449 00:18:21.525 --> 00:18:23.205 you're never gonna be in a situation where you're,
450 00:18:23.205 --> 00:18:24.765 you're over allocated, right?
451 00:18:24.825 --> 00:18:27.925 So a lot of other solutions out there will force you
452 00:18:27.925 --> 00:18:29.845 to pick, um, the number
453 00:18:29.845 --> 00:18:31.885 of horizontally scaled nodes that you have.
454 00:18:31.905 --> 00:18:34.085 So let's say you have, you know, 10 horizontally scaled
455 00:18:34.085 --> 00:18:37.005 nodes, and, uh, they, they make it very difficult
456 00:18:37.005 --> 00:18:38.045 for you to change that number.
457 00:18:38.105 --> 00:18:40.005 And the reason is the way the index
458 00:18:40.025 --> 00:18:42.045 is built across those nodes.
459 00:18:42.265 --> 00:18:44.765 So, uh, if you want to go up, that's fine,
460 00:18:44.765 --> 00:18:47.405 but you have to, you have to vertically scale all 10, right?
461 00:18:47.405 --> 00:18:48.645 Which is generally not what you want.
462 00:18:48.645 --> 00:18:50.765 You're basically doubling, you're, you're doubling the,
463 00:18:50.765 --> 00:18:52.245 the entire capacity of your cluster,
464 00:18:52.245 --> 00:18:54.445 which is sometimes it might be okay,
465 00:18:54.445 --> 00:18:56.005 but like most of the time, that's not what you want.
466 00:18:56.025 --> 00:18:58.205 You, what you wanna do is you want to add things linearly
467 00:18:58.265 --> 00:18:59.485 as your vector count grows.
468 00:18:59.985 --> 00:19:02.365 And Zilliz allows you to do that very, very nicely.
469 00:19:02.425 --> 00:19:03.525 You just
470 00:19:03.545 --> 00:19:07.285 tack on two CUs at a time, uh, either manually
471 00:19:07.345 --> 00:19:10.205 by yourself or you can use our autoscale feature to do that.
472 00:19:10.625 --> 00:19:14.845 Uh, you can also use our modify cluster endpoint as well.
473 00:19:14.985 --> 00:19:19.765 So, uh, if you want to plug this into some of the automation
474 00:19:19.765 --> 00:19:23.085 that you have, uh, in, in your DevOps pipeline, uh, we,
475 00:19:23.085 --> 00:19:24.125 we do offer that as well.
476 00:19:24.225 --> 00:19:27.365 So you can read all the cluster metrics through the API,
477 00:19:27.385 --> 00:19:31.485 you can see historically where the CU capacity has been and,
478 00:19:31.485 --> 00:19:33.405 and where it might be going based on other things
479 00:19:33.405 --> 00:19:34.645 that are happening within your system.
480 00:19:35.105 --> 00:19:36.405 Uh, and, and you can scale up
481 00:19:36.405 --> 00:19:38.645 or scale down, uh, with, with that endpoint as well.
482 00:19:38.645 --> 00:19:41.605 So we offer a lot of opportunities for, you know,
483 00:19:41.605 --> 00:19:42.645 both manual, auto,
484 00:19:42.905 --> 00:19:44.325 and, you know, plugging into your
485 00:19:44.325 --> 00:19:45.805 DevOps pipeline in terms of scaling.
486 00:19:46.385 --> 00:19:48.925 And also, it's really important to note that, you know, the,
487 00:19:48.945 --> 00:19:50.245 the scaling itself is,
488 00:19:50.505 --> 00:19:52.525 is done in a nice linear fashion, right?
489 00:19:52.745 --> 00:19:54.525 Uh, I'm just making sure I'm okay on time, okay?
490 00:19:54.745 --> 00:19:58.045 Um, and, uh, it, it, it ends up being very flexible
491 00:19:58.045 --> 00:20:00.125 to your use case, and it, it's something that, you know,
492 00:20:00.205 --> 00:20:01.485 a lot of our customers take advantage of
493 00:20:01.485 --> 00:20:02.645 and, and really appreciate.
494 00:20:02.745 --> 00:20:04.765 So, um, that's a little bit about scaling.
495 00:20:05.105 --> 00:20:09.325 Um, uh, I wanna touch a little bit on security as well.
496 00:20:09.625 --> 00:20:13.725 Uh, we get a lot of, uh, enterprise customers that are very,
497 00:20:13.955 --> 00:20:18.045 very curious about our security apparatus, what we do,
498 00:20:18.105 --> 00:20:21.125 and how it fits into their, uh, their policy network.
499 00:20:21.825 --> 00:20:24.285 And the, the primary thing really that, especially
500 00:20:24.505 --> 00:20:26.645 for any managed service is, is making sure
501 00:20:26.675 --> 00:20:30.605 that the managed service is deployed in not only the same
502 00:20:30.605 --> 00:20:33.325 region, but hopefully the same availability zone as the rest
503 00:20:33.325 --> 00:20:35.045 of the microservices that are gonna be able,
504 00:20:35.155 --> 00:20:37.645 that are gonna be calling it on, on a regular basis,
505 00:20:37.745 --> 00:20:39.325 not just for queries, but also for metrics
506 00:20:39.505 --> 00:20:41.485 and, you know, scaling things up and things of that nature.
507 00:20:41.905 --> 00:20:45.605 Uh, so we do support PrivateLink on AWS
508 00:20:45.785 --> 00:20:48.525 and the equivalents on GCP,
509 00:20:48.865 --> 00:20:50.085 uh, and Azure as well.
510 00:20:50.105 --> 00:20:51.765 So you can create a private endpoint,
511 00:20:52.145 --> 00:20:54.765 and that way all of the traffic from your microservices
512 00:20:55.145 --> 00:20:57.285 to your Zilliz installation, uh,
513 00:20:57.355 --> 00:20:58.765 will not go over the open internet.
514 00:20:58.765 --> 00:21:00.805 So you're not gonna have any ingress or egress issues.
515 00:21:01.225 --> 00:21:04.685 Uh, but more importantly, that traffic, it'll stay inside
516 00:21:04.685 --> 00:21:07.245 of your VPC or it'll stay inside of the, the,
517 00:21:07.265 --> 00:21:08.805 the AWS network, right?
518 00:21:08.805 --> 00:21:11.525 So you're not, uh, exposing anything potentially to,
519 00:21:11.525 --> 00:21:12.765 to, to the outside world.
520 00:21:13.265 --> 00:21:16.405 Um, the other thing that we, that we offer
521 00:21:16.405 --> 00:21:19.045 as well is customer managed encryption keys.
522 00:21:19.385 --> 00:21:22.165 Uh, this has also been a really popularly requested feature
523 00:21:22.255 --> 00:21:25.405 where, uh, we can plug into your KMS.
524 00:21:25.405 --> 00:21:27.605 So if it's AWS KMS, we'll plug into that, uh,
525 00:21:27.665 --> 00:21:30.285 you can issue us keys from that KMS
526 00:21:30.545 --> 00:21:32.765 and we'll encrypt, uh, essentially everything.
527 00:21:32.765 --> 00:21:36.565 So we'll encrypt the, the, uh, the, the vector embeddings,
528 00:21:36.625 --> 00:21:39.565 uh, we'll encrypt all of the metadata associated with it.
529 00:21:39.985 --> 00:21:43.645 Um, and at query time, we'll just decrypt on the fly,
530 00:21:44.105 --> 00:21:46.845 run the ANN, and then re-encrypt everything so
531 00:21:46.845 --> 00:21:47.965 that it stays nice
532 00:21:47.965 --> 00:21:51.885 and encrypted at rest, um, in, in, in our, in our VPC.
533 00:21:51.885 --> 00:21:54.245 And obviously, you know, through your KMS,
534 00:21:54.245 --> 00:21:55.765 you can revoke those keys at any time.
535 00:21:56.105 --> 00:21:58.845 And then Zilliz no longer has access to any of your data.
536 00:21:58.845 --> 00:22:01.005 So that's a very popularly requested feature
537 00:22:01.025 --> 00:22:02.325 as well by enterprises.
538 00:22:02.625 --> 00:22:04.885 Uh, and it's something that we worked really hard on, uh,
539 00:22:04.885 --> 00:22:07.085 just to make sure that, you know, uh, the, the folks
540 00:22:07.105 --> 00:22:10.245 who have these very stringent requirements, uh, in terms of,
541 00:22:10.265 --> 00:22:11.845 you know, PII and all the other,
542 00:22:12.005 --> 00:22:13.805 'cause, you know, a lot of this data is, is again,
543 00:22:13.805 --> 00:22:14.725 like I said at the very beginning,
544 00:22:15.405 --> 00:22:16.485 internal to your organization.
545 00:22:16.545 --> 00:22:19.605 So it's very important that, you know, the, the, the CISOs
546 00:22:19.765 --> 00:22:22.325 of those organizations feel comfortable that all that data,
547 00:22:22.705 --> 00:22:24.725 uh, is firmly in their control
548 00:22:24.985 --> 00:22:28.605 and if necessary, could be, uh, revoked at any time,
549 00:22:28.705 --> 00:22:29.925 uh, if, if there is an issue.
550 00:22:30.025 --> 00:22:31.085 So I'll stop there.
551 00:22:31.125 --> 00:22:33.805 I know we have about four minutes left at the end, uh, for,
552 00:22:33.825 --> 00:22:36.325 for, for questions, but, uh, yeah, uh, happy
553 00:22:36.325 --> 00:22:37.325 to take any questions now,
554 00:22:37.325 --> 00:22:38.605 or Chris, I'll pass it back to you.
555 00:22:38.905 --> 00:22:41.245 Do you, what about, um, you talked about scaling up.
556 00:22:41.245 --> 00:22:43.765 What about scaling down, uh, on the previous topic?
557 00:22:44.075 --> 00:22:48.205 Yeah, so scaling down, um, is, is done the same way.
558 00:22:48.345 --> 00:22:50.405 Uh, you can go to this scale button here,
559 00:22:50.405 --> 00:22:51.725 you can scale back down to one CU,
560 00:22:51.845 --> 00:22:52.885 I can just do that right now if you want.
561 00:22:53.265 --> 00:22:56.725 Um, and then the cluster will scale back down to one.
562 00:22:57.065 --> 00:22:58.485 Uh, it's also important to note
563 00:22:58.485 --> 00:23:01.005 that auto scale does not scale back down, right?
564 00:23:01.185 --> 00:23:04.085 So that's the, and the reason why that is, is
565 00:23:04.085 --> 00:23:08.685 because scaling down is inherently a more dangerous
566 00:23:08.685 --> 00:23:10.125 operation than scaling up.
567 00:23:10.465 --> 00:23:12.965 Uh, and there's the opportunity for, uh,
568 00:23:12.965 --> 00:23:14.125 increased query latency.
569 00:23:14.125 --> 00:23:17.245 There's the opportunity for your production application to,
570 00:23:17.425 --> 00:23:19.725 um, experience issues if it is an
571 00:23:20.005 --> 00:23:21.045 integral part of that pipeline.
572 00:23:21.785 --> 00:23:23.605 So that's the decision we made there.
573 00:23:23.745 --> 00:23:26.085 Uh, you know, if, if, if you wanna scale back down,
574 00:23:26.085 --> 00:23:29.685 like I said before in a programmatic fashion, you can use
575 00:23:30.425 --> 00:23:33.085 our query cluster metrics endpoint
576 00:23:33.105 --> 00:23:34.725 and our modify cluster endpoint
577 00:23:34.825 --> 00:23:37.085 to scale back down if you wanna do it programmatically,
578 00:23:37.265 --> 00:23:40.245 or you can do it through the UI that, uh, that, uh, that,
579 00:23:40.245 --> 00:23:41.325 that I just showed you right now.
580 00:23:41.705 --> 00:23:43.885 Um, but yeah, that, that's, that's a really, uh,
581 00:23:43.885 --> 00:23:45.640 common thing that a lot of folks do, just
582 00:23:45.640 --> 00:23:48.565 because they might be running some import job,
583 00:23:48.825 --> 00:23:52.245 or they might be, uh, experimenting with a lot of vectors
584 00:23:52.245 --> 00:23:53.525 and they just remove them all at once
585 00:23:53.525 --> 00:23:55.205 and they, they have this extra capacity.
586 00:23:55.205 --> 00:23:57.125 But again, you can, you can scale it down,
587 00:23:57.125 --> 00:23:58.205 you know, in steps, right?
588 00:23:58.205 --> 00:24:00.605 So you don't have to go all the way down from 32 to 16.
589 00:24:00.605 --> 00:24:02.765 You can go, you know, in steps just to kind of see
590 00:24:03.085 --> 00:24:04.125 what the performance looks like.
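The stepwise scale-down described here can be sketched as a small loop: shrink by one step, check how the cluster behaves, and stop (restoring the previous size) if it looks unhealthy. Everything below is a hypothetical illustration. `apply_size` stands in for whatever call resizes the cluster (for example, a wrapper around the modify cluster endpoint) and `is_healthy` for a check built on the cluster metrics; neither is a real Zilliz Cloud function, and the step size is an assumption.

```python
from typing import Callable, List


def scale_down_in_steps(current_cu: int, target_cu: int, step: int,
                        apply_size: Callable[[int], None],
                        is_healthy: Callable[[int], bool]) -> List[int]:
    """Shrink toward target_cu one step at a time; return the sizes that stuck."""
    if step <= 0:
        raise ValueError("step must be positive")
    if target_cu > current_cu:
        raise ValueError("target_cu must not exceed current_cu")
    kept: List[int] = []
    size = current_cu
    while size > target_cu:
        candidate = max(size - step, target_cu)
        apply_size(candidate)          # hypothetical resize call
        if not is_healthy(candidate):  # e.g. query latency went up too much
            apply_size(size)           # go back to the last good size
            break
        kept.append(candidate)
        size = candidate
    return kept
```

For example, going from 32 toward 16 in steps of 4 with a health check that fails below 24 CUs would settle at 24 rather than dropping straight to 16.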
591 00:24:04.385 --> 00:24:07.245 And again, it's, it's, it's really a, a testament to how
592 00:24:08.005 --> 00:24:09.845 flexible and horizontally scalable Zilliz is.
593 00:24:10.695 --> 00:24:13.235 Oh. So, uh, if anybody has any questions, uh,
594 00:24:13.235 --> 00:24:14.475 feel free to raise your hand.
595 00:24:14.475 --> 00:24:16.835 I'll unmute, uh, you so you can ask,
596 00:24:16.855 --> 00:24:19.115 or if you prefer talking or typing it in, that's fine.
597 00:24:20.015 --> 00:24:23.275 Um, but, uh, Jay, I actually have a couple of questions.
598 00:24:23.415 --> 00:24:26.875 So, um, let's talk a little bit about bringing data in.
599 00:24:27.265 --> 00:24:29.875 What are the ways that we can bring in data into
600 00:24:29.875 --> 00:24:30.955 the database efficiently?
601 00:24:31.825 --> 00:24:36.195 Sure. So we offer a bulk, uh,
602 00:24:36.415 --> 00:24:38.035 insert, uh, API,
603 00:24:38.615 --> 00:24:42.995 and essentially the way it works is you'll define the schema
604 00:24:43.215 --> 00:24:44.635 for your Zilliz database,
605 00:24:45.375 --> 00:24:48.515 and you can send us either Parquet files
606 00:24:48.815 --> 00:24:51.315 or you can send us JSON files that have all
607 00:24:51.315 --> 00:24:52.915 of the data, uh, in them.
608 00:24:53.415 --> 00:24:56.715 And you can just send us, you know, links to those in S3,
609 00:24:56.715 --> 00:24:58.755 you know, whatever object storage, wherever it is,
610 00:24:59.015 --> 00:25:00.315 as long as it's authenticated.
611 00:25:00.615 --> 00:25:02.075 Uh, and then we will handle all
612 00:25:02.075 --> 00:25:03.395 of the import parallelization
613 00:25:03.495 --> 00:25:06.315 and all that, um, uh, on our end, right?
614 00:25:06.975 --> 00:25:08.675 The, the other way to do it is
615 00:25:08.935 --> 00:25:11.755 to just use the upsert endpoint, and that works as well.
616 00:25:11.755 --> 00:25:13.395 If you want to do it in a more iterative fashion,
617 00:25:13.495 --> 00:25:14.595 that's, that's also fine.
618 00:25:14.595 --> 00:25:16.075 We have tons of customers do it that way.
619 00:25:16.575 --> 00:25:20.155 Uh, but if you have, say, let's say, you know, uh,
620 00:25:20.635 --> 00:25:23.435 150 million vectors that you wanna load all at once,
621 00:25:23.695 --> 00:25:25.635 and you have them somewhere, right?
622 00:25:25.635 --> 00:25:28.155 Whether they're in some other database or Snowflake
623 00:25:28.295 --> 00:25:31.195 or wherever, uh, you know, Parquet
624 00:25:31.195 --> 00:25:33.715 and JSON are pretty industry-standard file types.
625 00:25:33.815 --> 00:25:37.595 And, uh, most ETL pipelines are able to, to write to those,
626 00:25:38.135 --> 00:25:39.515 uh, to those formats.
627 00:25:39.515 --> 00:25:40.675 And, and, and we accept both
628 00:25:40.675 --> 00:25:42.315 of those formats in, in a bulk fashion.
629 00:25:42.375 --> 00:25:45.675 So it allows you to kind of, uh, uh, you know, import, uh,
630 00:25:45.675 --> 00:25:47.715 very quickly if you're coming from a, uh,
631 00:25:47.715 --> 00:25:49.355 another vector database solution.
632 00:25:49.415 --> 00:25:51.275 Uh, we have import pipelines specific
633 00:25:51.275 --> 00:25:52.555 to those vector database solutions.
634 00:25:52.735 --> 00:25:54.315 Uh, so if you're coming from Qdrant,
635 00:25:54.315 --> 00:25:57.235 or if you're coming from Pinecone, um, we have,
636 00:25:57.415 --> 00:25:59.155 uh, native support for both of those.
637 00:25:59.155 --> 00:26:00.755 You just give us the API keys to each
638 00:26:00.755 --> 00:26:02.275 of those hosted solutions, uh,
639 00:26:02.275 --> 00:26:03.395 and we're able to suck the data
640 00:26:03.395 --> 00:26:04.715 in directly from that database.
641 00:26:05.055 --> 00:26:08.955 Um, there are a couple trade offs there. Excuse me.
642 00:26:09.285 --> 00:26:10.395 There are a couple trade offs there.
643 00:26:10.455 --> 00:26:12.715 Uh, you know, especially around like the schema.
644 00:26:13.015 --> 00:26:14.515 So, you know, you're, you're kind of stuck
645 00:26:14.515 --> 00:26:15.675 to the schema that you already had.
646 00:26:15.975 --> 00:26:17.075 Um, when you're,
647 00:26:17.075 --> 00:26:19.195 when you're coming from the previous solution,
648 00:26:19.815 --> 00:26:21.115 if you wanna change the schema
649 00:26:21.115 --> 00:26:22.355 or if you wanna change your embedding model,
650 00:26:22.415 --> 00:26:24.315 if there's things that you'd like to change, uh,
651 00:26:24.535 --> 00:26:27.115 you would just have to create those Parquet
652 00:26:27.115 --> 00:26:28.355 and JSON files that I mentioned earlier,
653 00:26:28.615 --> 00:26:29.675 uh, and, and go that way.
654 00:26:29.675 --> 00:26:30.715 And that's, you know, generally
655 00:26:30.715 --> 00:26:33.235 what most people do if they do wanna change their schema.
656 00:26:33.615 --> 00:26:36.395 But you're here to help, right, Jay, uh, in, in making
657 00:26:36.395 --> 00:26:38.035 that, helping them make that decision.
658 00:26:38.545 --> 00:26:40.395 Yeah, of course. I mean, when we work with customers
659 00:26:40.395 --> 00:26:42.915 who are moving from something else, uh, you know, we,
660 00:26:43.025 --> 00:26:45.485 we ask a lot of questions about how they're currently set up
661 00:26:45.485 --> 00:26:48.285 and, you know, we, we have a very, uh, helpful team that's,
662 00:26:48.285 --> 00:26:50.405 that's able to not only recommend solutions,
663 00:26:50.405 --> 00:26:52.725 but also help you guys, uh, make sure that the,
664 00:26:52.825 --> 00:26:54.165 the import process is as smooth
665 00:26:54.165 --> 00:26:55.525 as possible from wherever you're coming from.
666 00:26:55.955 --> 00:26:58.245 Yeah. And so don't forget the, uh, bulk import
667 00:26:58.245 --> 00:26:59.845 or the BulkWriter capability.
668 00:27:00.085 --> 00:27:01.365 I don't know why, but we've had a number
669 00:27:01.445 --> 00:27:02.765 of customers that overlooked it.
670 00:27:03.115 --> 00:27:05.885 They were frustrated with doing it one by one,
671 00:27:05.885 --> 00:27:09.165 and then Jay was like, Hey, we have this other capability.
672 00:27:09.225 --> 00:27:11.845 So, yep. Saves a lot of time.
673 00:27:12.035 --> 00:27:13.165 It's there. It, it's definitely there.
674 00:27:14.385 --> 00:27:17.685 Now let's talk a little bit about, um, uh,
675 00:27:18.165 --> 00:27:19.965 multiple embeddings,
676 00:27:20.125 --> 00:27:22.045 especially when you're trying to do hybrid search.
677 00:27:22.145 --> 00:27:24.045 So, you know, when you talk about, uh,
678 00:27:24.115 --> 00:27:25.925 when we look at a row, you can have more
679 00:27:25.925 --> 00:27:26.965 than just one, right?
680 00:27:26.965 --> 00:27:29.805 Vector embedding. So what is that and why is that important?
681 00:27:29.955 --> 00:27:31.645 Yeah, that's, that's really, uh,
682 00:27:32.125 --> 00:27:33.685 I think something that is pretty unique to us.
683 00:27:33.745 --> 00:27:37.725 So we support four vector embeddings per entry.
684 00:27:38.545 --> 00:27:42.325 So you can have, uh, two dense and one sparse.
685 00:27:42.325 --> 00:27:43.805 You could have four dense and one sparse,
686 00:27:43.825 --> 00:27:45.125 or three dense and one sparse.
687 00:27:45.585 --> 00:27:50.485 Um, and you can run queries both sliced down each
688 00:27:50.485 --> 00:27:51.765 of those vector embeddings.
689 00:27:51.765 --> 00:27:53.085 So there's like one index for each of those.
690 00:27:53.085 --> 00:27:55.285 So if you only want to, say, for example, run the
691 00:27:55.285 --> 00:27:58.845 ANN across the first set of dense vectors,
692 00:27:58.845 --> 00:27:59.965 and then you can run a separate
693 00:27:59.965 --> 00:28:02.085 ANN across the second set of dense vectors.
694 00:28:02.505 --> 00:28:04.685 Uh, that's very helpful when you're testing
695 00:28:05.315 --> 00:28:06.605 different embedding models.
696 00:28:06.865 --> 00:28:08.325 So there's a lot of embedding models now
697 00:28:08.325 --> 00:28:10.205 that are fine tuned, or maybe you're tuning them yourself.
698 00:28:10.505 --> 00:28:12.245 Um, and, and, and you want to see
699 00:28:12.915 --> 00:28:16.725 what those vector spaces look like, uh, for,
700 00:28:16.745 --> 00:28:17.885 for each of those embeddings.
701 00:28:17.965 --> 00:28:19.565 'Cause, you know, you could take the same piece
702 00:28:19.565 --> 00:28:21.165 of text, run it through different embedding models,
703 00:28:21.165 --> 00:28:22.925 and the location could be totally
704 00:28:22.925 --> 00:28:25.445 different based on, uh, what the tuning looks like.
705 00:28:25.545 --> 00:28:29.085 So we see that a lot with customers that want to see,
706 00:28:29.145 --> 00:28:31.445 is it worth it for me to use this fine tuned model?
707 00:28:31.465 --> 00:28:32.925 How much benefit do I get from it?
708 00:28:33.425 --> 00:28:36.085 And you can just send the queries to each of them, you know,
709 00:28:36.105 --> 00:28:38.405 all day long, uh, and, and see what they look like.
710 00:28:38.465 --> 00:28:40.285 So that's a very popular use case.
711 00:28:40.625 --> 00:28:42.685 Um, the other popular use case that we see is
712 00:28:42.685 --> 00:28:43.845 with sparse vectors.
713 00:28:44.265 --> 00:28:48.965 And sparse vectors are primarily used for, uh, blending, uh,
714 00:28:49.225 --> 00:28:51.445 lexical search with semantic search, right?
715 00:28:52.315 --> 00:28:54.415 So, uh, the most popular one is
716 00:28:54.495 --> 00:28:56.295 probably BM25, which has been around for decades.
717 00:28:56.795 --> 00:28:59.455 Uh, but there are newer ones coming up, you know,
718 00:28:59.455 --> 00:29:01.575 there's one called SPLADE that's also very interesting.
719 00:29:01.915 --> 00:29:05.055 Uh, and we allow you to have them sit side
720 00:29:05.075 --> 00:29:06.135 by side, uh,
721 00:29:06.155 --> 00:29:09.655 and you can run a, a, a hybrid search on both of them.
722 00:29:09.835 --> 00:29:13.215 So, you know, this, this is very helpful for, for example,
723 00:29:13.835 --> 00:29:18.055 um, in e-commerce use cases where you have, say,
724 00:29:18.495 --> 00:29:22.135 a SKU or a UPC code that's very unique and,
725 00:29:22.135 --> 00:29:24.015 and you're like a hundred percent sure that
726 00:29:24.015 --> 00:29:25.895 that chunk will have that UPC code in it.
727 00:29:26.275 --> 00:29:29.975 So if you use a sparse vector on that,
728 00:29:30.265 --> 00:29:32.215 it'll combine the dense
729 00:29:32.435 --> 00:29:35.335 and sparse together, so you're combining semantic
730 00:29:35.595 --> 00:29:36.855 and lexical together,
731 00:29:37.115 --> 00:29:39.575 and it'll overweight the ones that have
732 00:29:39.575 --> 00:29:41.205 that exact UPC code in them.
733 00:29:41.205 --> 00:29:43.845 So it's, it basically, it, it, it increases the probability,
734 00:29:43.845 --> 00:29:45.245 it'll push those results to the top,
735 00:29:45.315 --> 00:29:47.405 whereas if you just did it as semantic,
736 00:29:47.905 --> 00:29:48.925 it may or may not be there.
737 00:29:48.925 --> 00:29:51.125 You might need to, you know, jack up the, the top K to kind
738 00:29:51.125 --> 00:29:52.245 of get exactly what you're looking for.
739 00:29:52.505 --> 00:29:55.085 But it allows you to fine tune your queries
740 00:29:55.085 --> 00:29:58.765 for those use cases where, uh, you know, semantic is great
741 00:29:58.945 --> 00:30:00.005 and we want to use it,
742 00:30:00.185 --> 00:30:03.325 but it's, it's also, you know, we we're, we're pretty sure
743 00:30:03.325 --> 00:30:04.325 that it has this in it,
744 00:30:04.345 --> 00:30:05.965 and we want you to do a lexical match
745 00:30:05.965 --> 00:30:07.165 as well, so we support that as well.
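As a rough illustration of why adding a sparse (lexical) signal pushes exact matches to the top, the sketch below fuses per-chunk dense and sparse scores with a simple weighted sum. This is only one common way to combine the two rankings and is not presented as the reranking Zilliz actually applies; the function name `weighted_fusion`, the equal weights, and the sample scores are all made-up assumptions, and both score sets are assumed to be normalized to the range 0 to 1.

```python
from typing import Dict, List, Tuple


def weighted_fusion(dense: Dict[str, float], sparse: Dict[str, float],
                    w_dense: float = 0.5,
                    w_sparse: float = 0.5) -> List[Tuple[str, float]]:
    """Combine dense (semantic) and sparse (lexical) scores per chunk."""
    chunk_ids = set(dense) | set(sparse)
    fused = {
        cid: w_dense * dense.get(cid, 0.0) + w_sparse * sparse.get(cid, 0.0)
        for cid in chunk_ids
    }
    # Highest fused score first; ties broken by chunk id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Semantic search alone ranks chunk_c last of the three.
dense_scores = {"chunk_a": 0.82, "chunk_b": 0.80, "chunk_c": 0.60}
# Only chunk_c contains the exact UPC code, so only it gets a lexical score.
sparse_scores = {"chunk_c": 1.0}

ranked = weighted_fusion(dense_scores, sparse_scores)
```

Here `chunk_c` moves from third place on the dense scores alone to first place after fusion, without having to raise the top K.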
746 00:30:07.955 --> 00:30:10.375 So, I mean, I think there's, there used to be kind
747 00:30:10.375 --> 00:30:11.975 of a hacky way that you could do this, right?
748 00:30:11.995 --> 00:30:13.735 You could have these different embeddings
749 00:30:13.735 --> 00:30:15.695 in, you know, different collections, so mm-hmm.
750 00:30:16.285 --> 00:30:20.015 What, you know, what did we do to make it more useful
751 00:30:20.015 --> 00:30:22.255 besides, you know, doing a hybrid search by putting it,
752 00:30:22.315 --> 00:30:23.815 you know, under one entity.
753 00:30:25.035 --> 00:30:27.615 Uh, so it's, it's, it's part of the way
754 00:30:27.615 --> 00:30:29.535 that our index strategy works.
755 00:30:29.565 --> 00:30:31.535 Like we, we look at each of the, uh,
756 00:30:31.635 --> 00:30:34.495 vector entries individually, um, as opposed
757 00:30:34.515 --> 00:30:37.455 to having a single index for the entire collection,
758 00:30:37.455 --> 00:30:39.535 which is generally how, uh, a lot
759 00:30:39.535 --> 00:30:41.095 of the other vector databases do it.
760 00:30:41.155 --> 00:30:45.655 So we're a database first, index second, I think a lot
761 00:30:45.655 --> 00:30:47.615 of the vector databases out there are index
762 00:30:47.615 --> 00:30:48.775 first, database second.
763 00:30:48.835 --> 00:30:50.215 So that's, that's a very, uh,
764 00:30:50.215 --> 00:30:51.815 important distinction, I think to make.
765 00:30:52.195 --> 00:30:53.455 Uh, and it just comes from our history.
766 00:30:53.475 --> 00:30:56.095 You know, our founder, you know, was at Oracle
767 00:30:56.155 --> 00:30:58.015 for a long time and, uh, you know,
768 00:30:58.015 --> 00:31:01.095 he's very deeply knowledgeable in, in database design, uh,
769 00:31:01.115 --> 00:31:02.535 and, and, and what constitutes a
770 00:31:02.535 --> 00:31:03.655 good database design, right?
771 00:31:03.655 --> 00:31:05.455 So that's a really good foundation for us.
772 00:31:05.755 --> 00:31:07.135 And then we've added, you know,
773 00:31:07.255 --> 00:31:09.575 a world-class vector capability on top of it,
774 00:31:09.745 --> 00:31:11.495 which gives us the best of both worlds, right?
775 00:31:11.495 --> 00:31:13.415 It gives us, you know, a lot of the flexibility
776 00:31:13.415 --> 00:31:15.055 of being an actual database, right?
777 00:31:15.355 --> 00:31:18.815 Um, but at the same time having this, you know, great, um,
778 00:31:19.165 --> 00:31:21.175 very intelligent, constantly changing,
779 00:31:21.435 --> 00:31:25.055 but also, uh, you know, simple for you guys to use, right?
780 00:31:25.115 --> 00:31:26.175 And, and through, through a lot of
781 00:31:26.175 --> 00:31:27.415 the abstractions that we've created.
782 00:31:27.835 --> 00:31:31.055 Uh, and, and you kind of get to utilize a lot of the power
783 00:31:31.055 --> 00:31:32.815 that, that that's under the hood, you know,
784 00:31:32.815 --> 00:31:34.895 while having an interface that's very easy to understand.
785 00:31:35.675 --> 00:31:36.885 Cool. So don't forget,
786 00:31:36.885 --> 00:31:38.765 we'll leave the lines open for a little bit longer.
787 00:31:38.985 --> 00:31:41.965 If you have any questions, pop it in the q and a or the chat
788 00:31:41.965 --> 00:31:43.605 or raise your hand and I'll unmute lines.
789 00:31:44.145 --> 00:31:46.285 Um, but I have one more question for you, Jay.
790 00:31:46.285 --> 00:31:47.925 Well, I always have more than just one,
791 00:31:48.305 --> 00:31:51.325 but if you don't mind if we can talk a little bit about
792 00:31:51.515 --> 00:31:54.565 multi-tenancy and all the different ways
793 00:31:54.675 --> 00:31:56.205 that we can do multi-tenancy.
794 00:31:56.435 --> 00:31:59.165 Sure. And then what are the pros and cons for each?
795 00:31:59.985 --> 00:32:04.925 So, uh, in Zilliz, we, we, we generally recommend
796 00:32:05.155 --> 00:32:09.725 that you use logical isolation for multi-tenancy.
797 00:32:09.825 --> 00:32:12.765 So you put all your vectors in a single, uh, collection,
798 00:32:13.345 --> 00:32:16.285 and you use something called a partition key
799 00:32:16.665 --> 00:32:20.645 to logically isolate vector space, um, amongst your tenants.
800 00:32:20.745 --> 00:32:23.965 So if you have, say, uh, you know, a handful of tenants,
801 00:32:24.305 --> 00:32:27.925 you would basically, uh, when you upsert them,
802 00:32:28.065 --> 00:32:29.845 assign a partition key to all
803 00:32:29.845 --> 00:32:31.485 of those vectors that belong to that tenant.
804 00:32:31.825 --> 00:32:34.525 So that at query time, when you send the query to us,
805 00:32:34.625 --> 00:32:36.165 you can give it that partition key.
806 00:32:36.165 --> 00:32:38.005 And Zilliz will essentially ignore the rest
807 00:32:38.005 --> 00:32:41.045 of the vector space and only perform the
808 00:32:41.045 --> 00:32:43.445 ANN across that tenant's vectors, right?
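The idea of logical isolation can be shown with a toy in-memory collection: every entity carries a tenant identifier (playing the role of the partition key), and a search only scores the entities whose key matches. This is a brute-force sketch for intuition only. It says nothing about how Zilliz stores or indexes partitions, and `search_tenant`, the tuple layout, and the sample data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# (entity_id, tenant_id, vector); tenant_id acts as the partition key.
Entity = Tuple[str, str, Sequence[float]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_tenant(collection: List[Entity], tenant_id: str,
                  query: Sequence[float], top_k: int = 2) -> List[str]:
    """Score only the entities that belong to tenant_id, then take the top_k."""
    scored = [
        (eid, cosine(vec, query))
        for eid, tid, vec in collection
        if tid == tenant_id  # other tenants' vectors are never considered
    ]
    scored.sort(key=lambda pair: -pair[1])
    return [eid for eid, _ in scored[:top_k]]


collection: List[Entity] = [
    ("a1", "tenant_a", [1.0, 0.0]),
    ("a2", "tenant_a", [0.0, 1.0]),
    ("b1", "tenant_b", [1.0, 0.0]),
]
hits = search_tenant(collection, "tenant_a", [0.9, 0.1])
```

Even though `b1` is as close to the query as `a1`, it never appears in `tenant_a`'s results, because all tenants share one collection but each query is scoped by its key.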
809 00:32:43.905 --> 00:32:47.085 Um, the other way to do it is with physical isolation, uh,
810 00:32:47.085 --> 00:32:50.725 where you create one collection per tenant, uh,
811 00:32:50.865 --> 00:32:52.205 and that also works as well.
812 00:32:52.505 --> 00:32:54.765 Uh, there's a lot of customers that prefer that just
813 00:32:54.765 --> 00:32:57.125 because, you know, they may have SLAs with their customers
814 00:32:57.125 --> 00:32:58.845 that say, no, you need physical isolation.
815 00:32:58.865 --> 00:33:00.045 You can't be commingled
816 00:33:00.045 --> 00:33:02.005 with other tenants, which is also fine.
817 00:33:02.065 --> 00:33:03.925 The downside there is, you know, there are,
818 00:33:04.215 --> 00:33:05.645 there are some upper limits on
819 00:33:05.645 --> 00:33:07.405 how many collections you can have in a cluster.
820 00:33:07.785 --> 00:33:09.565 You can obviously have more clusters, which is fine,
821 00:33:09.565 --> 00:33:11.885 you know, a lot of folks use that, uh, to, to get around it.
822 00:33:11.885 --> 00:33:14.525 But with logical isolation, with partition keys, I mean,
823 00:33:14.525 --> 00:33:15.845 you can have upwards,
824 00:33:15.845 --> 00:33:17.725 I mean, you can have millions of tenants, right, in
825 00:33:17.725 --> 00:33:18.925 the same collection.
826 00:33:19.225 --> 00:33:21.725 Um, and then you can send all of the queries to
827 00:33:21.725 --> 00:33:23.685 that same collection, uh, and, and,
828 00:33:23.685 --> 00:33:25.365 and still get exactly what you're looking for.
829 00:33:25.425 --> 00:33:28.045 And more importantly, you know, if you want to run
830 00:33:28.765 --> 00:33:29.885 a global ANN
831 00:33:29.885 --> 00:33:31.965 across all the tenants, you know, maybe say
832 00:33:31.965 --> 00:33:33.205 for analytics use cases,
833 00:33:33.305 --> 00:33:34.845 or, you know, you just, you just wanna see,
834 00:33:34.925 --> 00:33:36.165 I, I wanna see what this looks like.
835 00:33:36.505 --> 00:33:39.965 Um, you would have to, in the, um,
836 00:33:40.585 --> 00:33:43.765 in the one-tenant-per-collection model, you would have
837 00:33:43.765 --> 00:33:45.765 to send, you know, a single query to all
838 00:33:45.765 --> 00:33:46.845 of those collections, right?
839 00:33:47.385 --> 00:33:49.405 And then kind of merge them together versus, you know,
840 00:33:49.405 --> 00:33:50.925 if you had them all in a single collection,
841 00:33:50.925 --> 00:33:52.845 you can just send one ANN and it'll just do all of them.
842 00:33:52.925 --> 00:33:55.045 So those are the options; we support both.
843 00:33:55.145 --> 00:33:57.165 Uh, you know, it's, it's, it's entirely up to you.
844 00:33:57.165 --> 00:33:59.245 And obviously we, we as a team, work with you very closely
845 00:33:59.385 --> 00:34:01.205 to figure out what trade-off makes sense.
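Editor's sketch: in the one-collection-per-tenant model, a global search means querying each collection and merging the results client-side. The snippet below shows only the merge step, under the assumption that every collection uses the same embedding model and distance metric so that scores are comparable. The `merge_top_k` helper and the hit tuples are hypothetical, not part of any Zilliz SDK.

```python
import heapq

def merge_top_k(per_collection_hits, k):
    """Merge per-collection result lists into one global top-k.

    Each hit is a (distance, collection_name, id) tuple; a smaller
    distance means a closer match.
    """
    all_hits = (hit for hits in per_collection_hits for hit in hits)
    return heapq.nsmallest(k, all_hits)

# Pretend these came back from two separate per-tenant collections.
tenant_a = [(0.10, "tenant_a", 7), (0.40, "tenant_a", 2)]
tenant_b = [(0.05, "tenant_b", 9), (0.30, "tenant_b", 4)]

merged = merge_top_k([tenant_a, tenant_b], k=3)
merged_ids = [hit[2] for hit in merged]
```

With N tenant collections this costs N queries plus a merge, whereas the shared-collection model needs a single query.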
846 00:34:01.515 --> 00:34:03.485 Yeah. And typically it looks to me
847 00:34:03.485 --> 00:34:05.325 that people usually start, like, they start
848 00:34:05.325 --> 00:34:07.925 with the database, the collection, and then partition.
849 00:34:07.925 --> 00:34:12.045 But I think, uh, our advice to everybody is, uh, sit down
850 00:34:12.045 --> 00:34:14.845 with your solution architect, really describe your use case,
851 00:34:14.845 --> 00:34:15.845 what you're trying to achieve,
852 00:34:16.265 --> 00:34:18.565 and then they can help you, you know, go down the path
853 00:34:18.565 --> 00:34:20.965 that's gonna make the most sense for you from the get go,
854 00:34:20.965 --> 00:34:22.925 instead of like, kind of having to redo things.
855 00:34:23.745 --> 00:34:26.685 And then, um, also tied to multi-tenancy is
856 00:34:26.685 --> 00:34:29.725 that we have a pretty sophisticated set of, um,
857 00:34:29.775 --> 00:34:31.885 rules associated with RBAC, right?
858 00:34:31.945 --> 00:34:32.945 Jay?
859 00:34:33.575 --> 00:34:38.225 Yeah. So, uh, RBAC is, is, is really around,
860 00:34:38.565 --> 00:34:41.585 um, what your organization looks like
861 00:34:42.045 --> 00:34:44.545 and who is allowed to do what on your side.
862 00:34:44.545 --> 00:34:45.745 So we obviously work with you
863 00:34:45.745 --> 00:34:47.985 to look at your org chart, right?
864 00:34:48.135 --> 00:34:50.785 Make sure that the right folks have access
865 00:34:50.805 --> 00:34:52.865 to the right things, and more importantly,
866 00:34:52.865 --> 00:34:54.225 the right microservices have
867 00:34:54.225 --> 00:34:55.385 access to the right things, right?
868 00:34:55.385 --> 00:34:57.625 If you have a microservice that
869 00:34:57.645 --> 00:35:00.025 should only be reading from certain clusters
870 00:35:00.025 --> 00:35:01.225 and not other clusters, right?
871 00:35:01.225 --> 00:35:04.465 We wanna make sure that, um, that's enforced, um,
872 00:35:04.645 --> 00:35:06.065 at the, at the API level.
873 00:35:06.125 --> 00:35:10.065 So, uh, there, there's a lot of very granular controls that,
874 00:35:10.065 --> 00:35:12.305 that we provide not only for, for users,
875 00:35:12.405 --> 00:35:14.545 but also for, for authentication keys.
876 00:35:15.005 --> 00:35:17.425 Um, and, you know, obviously we work with, with,
877 00:35:17.455 --> 00:35:20.145 with your teams very closely just to make sure that, uh,
878 00:35:20.405 --> 00:35:22.145 you know, we're, we're enforcing everything
879 00:35:22.145 --> 00:35:24.465 and all of the RBAC is set up correctly on your
880 00:35:24.465 --> 00:35:27.265 side so that you feel confident that, you know, if, uh,
881 00:35:27.485 --> 00:35:29.865 you know, if, if a microservice goes rogue, it's,
882 00:35:29.865 --> 00:35:31.705 it's not allowed to see inside
883 00:35:31.705 --> 00:35:33.185 of clusters that it's not supposed to see.
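Editor's sketch: the per-key access control described above can be pictured as a deny-by-default lookup from an API key to the clusters and actions it has been granted. The key names, cluster names, and the `is_allowed` helper are hypothetical; this illustrates the idea only and is not how Zilliz Cloud implements or exposes RBAC.

```python
# Hypothetical grant table: each API key maps to the clusters it may
# touch and the actions it may perform on each of them.
GRANTS = {
    "key-reader-svc": {"cluster-a": {"read"}},
    "key-ingest-svc": {"cluster-a": {"read", "write"}, "cluster-b": {"write"}},
}

def is_allowed(api_key, cluster, action):
    """Deny by default; allow only what has been explicitly granted."""
    return action in GRANTS.get(api_key, {}).get(cluster, set())

# A read-only microservice can read cluster-a...
can_read_a = is_allowed("key-reader-svc", "cluster-a", "read")
# ...but cannot see inside a cluster it was never granted.
can_read_b = is_allowed("key-reader-svc", "cluster-b", "read")
```

Because unknown keys and ungranted clusters fall through to an empty set, a misbehaving service gets a denial rather than accidental access.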
884 00:35:34.335 --> 00:35:35.935 Excellent. Uh, and then if you got,
885 00:35:35.935 --> 00:35:38.405 if everyone's looking at the UI, you can see there's a lot
886 00:35:38.405 --> 00:35:40.085 of other capabilities that Jay didn't go over,
887 00:35:40.085 --> 00:35:41.205 but I think they're pretty simple.
888 00:35:41.505 --> 00:35:43.005 Uh, he talked about backups,
889 00:35:43.005 --> 00:35:46.205 obviously you can set those up. Migrations, uh,
890 00:35:46.385 --> 00:35:49.085 you can also look and see how the jobs are doing.
891 00:35:49.385 --> 00:35:51.925 And then of course, uh, the metrics, you can do any kind
892 00:35:51.925 --> 00:35:54.845 of monitoring, even if you use, uh, other DevOps tools like,
893 00:35:54.905 --> 00:35:57.765 uh, like a Datadog to be able to see, you know, how well,
894 00:35:58.025 --> 00:36:01.125 um, your instances are, uh, utilizing their resources.
895 00:36:02.995 --> 00:36:05.725 Cool. Um, so let me just share the last slide.
896 00:36:06.065 --> 00:36:07.365 So we don't have any questions,
897 00:36:07.425 --> 00:36:10.435 but I do wanna, um, just, let's see.
898 00:36:10.985 --> 00:36:12.395 I'll stop sharing so that you can, thank you.
899 00:36:14.825 --> 00:36:17.075 Just wanna, uh, let everybody know.
900 00:36:17.095 --> 00:36:18.755 So how, how else can you get help from us
901 00:36:18.755 --> 00:36:20.515 besides coming to these webinars?
902 00:36:20.515 --> 00:36:22.115 You can join us on the Discord channel.
903 00:36:22.855 --> 00:36:25.755 Um, we also have, and there's the link,
904 00:36:25.755 --> 00:36:28.235 and I'll, I'll, we'll send all this information after this.
905 00:36:28.295 --> 00:36:32.235 Uh, after, um, today's session, uh, you can also set up a,
906 00:36:32.655 --> 00:36:35.075 uh, 20 minute, uh, private office hours.
907 00:36:35.655 --> 00:36:37.915 Uh, we can do this, you know, 24/7.
908 00:36:38.175 --> 00:36:41.395 So, uh, we're available to make sure that we help in, uh,
909 00:36:41.395 --> 00:36:43.355 both your Milvus and your Zilliz implementations.
910 00:36:43.855 --> 00:36:45.955 Uh, you can also, uh, um,
911 00:36:46.215 --> 00:36:48.915 put your issues in GitHub, uh, as well.
912 00:36:48.915 --> 00:36:50.915 And that's where the entire engineering team is also
913 00:36:50.915 --> 00:36:52.795 available to answer any questions,
914 00:36:53.025 --> 00:36:55.755 whether it's a feature request or you found a bug,
915 00:36:55.775 --> 00:36:57.755 or something's a little bit tricky, just pop
916 00:36:57.755 --> 00:36:59.195 that in there if that's easy for you.
917 00:36:59.895 --> 00:37:01.675 Uh, we also have a little chatbot.
918 00:37:01.675 --> 00:37:04.435 Of course, we should, since we drive a lot
919 00:37:04.435 --> 00:37:06.115 of the chatbots on our docs pages,
920 00:37:06.235 --> 00:37:08.155 I just put a little screen there, screenshot of that.
921 00:37:08.815 --> 00:37:11.915 And then, uh, we can also set up a, uh,
922 00:37:12.155 --> 00:37:14.235 a private Slack channel, which I didn't put in here,
923 00:37:14.255 --> 00:37:15.995 but that's also a possibility if
924 00:37:15.995 --> 00:37:18.915 that's your preferred method for getting any kind of help.
925 00:37:19.815 --> 00:37:21.115 So, and, uh,
926 00:37:21.175 --> 00:37:24.755 we wish you great success in implementing your Zilliz
927 00:37:24.755 --> 00:37:26.475 instance, uh, but don't be a stranger.
928 00:37:26.535 --> 00:37:27.955 Let us know what you're building,
929 00:37:28.375 --> 00:37:30.075 how you're doing, how we can help.
930 00:37:30.415 --> 00:37:31.475 Uh, we're always here
931 00:37:31.495 --> 00:37:33.275 to help; we wanna make you successful.
932 00:37:34.335 --> 00:37:36.315 Jay, any last words of, uh, advice?
933 00:37:37.705 --> 00:37:39.765 No, uh, have fun. Uh, it's, it's a cool product.
934 00:37:40.205 --> 00:37:41.445 You'll, you'll, and, uh,
935 00:37:41.445 --> 00:37:42.525 we're always here to help. Of course.
936 00:37:43.325 --> 00:37:45.405 Excellent. All right. Have a great one everyone.
937 00:37:45.405 --> 00:37:46.565 We'll see you again. Bye.
Meet the Speaker
Join the session for live Q&A with the speaker
Jay Byoun
Solutions Architect, Zilliz
Solutions Architect at Zilliz. Previously Staff Solutions Engineer at Pinecone, with a background in software engineering and blockchain architecture.