Webinar
Monthly Product Demo: Discover the Power of Zilliz Cloud
WEBVTT
1 00:00:03.495 --> 00:00:06.315 My name is Chris Ello and I work here at Zilliz.
2 00:00:06.845 --> 00:00:10.075 Today we are going to do our monthly, uh, cloud demo,
3 00:00:10.535 --> 00:00:11.595 and I encourage everybody
4 00:00:11.735 --> 00:00:15.795 to put your questions in either the chat or the Q&A.
5 00:00:16.375 --> 00:00:20.555 Um, and, um, what we'll do is, uh, during this session, if,
6 00:00:20.735 --> 00:00:22.675 if Jay or I, um, see questions,
7 00:00:22.755 --> 00:00:24.555 we'll be answering them as quick as we can.
8 00:00:25.175 --> 00:00:28.355 Uh, but at the end of the demo I'll also open up lines.
9 00:00:28.555 --> 00:00:30.275 'cause I think we have a manageable group of people,
10 00:00:30.295 --> 00:00:32.595 so we can just ask questions directly to Jay.
11 00:00:33.215 --> 00:00:34.835 Uh, and then I'll, um, make sure
12 00:00:34.835 --> 00:00:36.235 that we keep those, uh, recorded.
13 00:00:36.735 --> 00:00:40.035 Um, and, uh, yeah, let's just, uh, get started here.
14 00:00:40.215 --> 00:00:44.915 So I'm just gonna quickly go over, uh, cloud.
15 00:00:45.015 --> 00:00:47.275 And I know that everybody here, um,
16 00:00:47.305 --> 00:00:49.195 must already know a little bit about Milvus
17 00:00:49.255 --> 00:00:51.155 or else you wouldn't be, uh, joining us today.
18 00:00:51.255 --> 00:00:53.835 But basically, you know, Zilliz cloud is built on top
19 00:00:53.895 --> 00:00:55.955 of our open source project, uh, Milvus.
20 00:00:56.695 --> 00:01:00.715 And, um, you know, I think a lot of people,
21 00:01:00.855 --> 00:01:02.515 or a lot of companies when they, um,
22 00:01:02.515 --> 00:01:03.715 have an open source project,
23 00:01:04.065 --> 00:01:06.315 they simply offer a hosted version
24 00:01:06.575 --> 00:01:09.115 of the open source project, maybe with a little bit of,
25 00:01:09.215 --> 00:01:11.915 you know, kind of extra, you know, billing roles
26 00:01:12.055 --> 00:01:13.475 or maybe there's a little bit of security.
27 00:01:14.055 --> 00:01:16.675 But we decided from the get go that we needed
28 00:01:16.675 --> 00:01:19.075 to be much more than just a hosted version.
29 00:01:20.015 --> 00:01:22.835 And, um, these are kind of three of the, the kind
30 00:01:22.835 --> 00:01:26.195 of the core differences between Milvus and Zilliz.
31 00:01:26.695 --> 00:01:29.475 So the first thing is, even though we have a very performant
32 00:01:29.615 --> 00:01:32.075 search engine under Milvus, so if you go to GitHub,
33 00:01:32.075 --> 00:01:35.675 you might see something called Knowhere, K-N-O-W-H-E-R-E.
34 00:01:36.065 --> 00:01:38.155 That is our search engine in Milvus.
35 00:01:38.935 --> 00:01:43.195 And, um, and we also support, uh, 11 different indexes.
36 00:01:43.375 --> 00:01:45.555 Uh, so, you know, it makes it really useful
37 00:01:45.555 --> 00:01:47.835 because, um, you know, every one
38 00:01:47.835 --> 00:01:50.235 of us is gonna have a very unique, uh, set
39 00:01:50.235 --> 00:01:51.635 of requirements tied to our use case.
40 00:01:51.635 --> 00:01:55.115 So having the ability to pick the index that's gonna fit
41 00:01:55.115 --> 00:01:56.675 with our use cases is really good.
42 00:01:57.335 --> 00:01:59.755 But we decided that when we created Zilliz Cloud,
43 00:01:59.985 --> 00:02:01.035 that wasn't good enough.
44 00:02:01.415 --> 00:02:03.435 We wanted to make sure that we were even more
45 00:02:03.435 --> 00:02:04.515 performant than Milvus.
46 00:02:04.895 --> 00:02:08.315 And also we wanted to make sure that we take the burden off
47 00:02:08.315 --> 00:02:09.915 of your shoulders of picking the index.
48 00:02:09.935 --> 00:02:11.515 So we have something called Auto Index
49 00:02:11.515 --> 00:02:14.155 that Jay will talk about briefly in the demo.
50 00:02:14.935 --> 00:02:18.035 And then also, because you have very unique requirements,
51 00:02:18.425 --> 00:02:21.275 some of your use cases are gonna have, um, you know,
52 00:02:21.365 --> 00:02:24.035 maybe you have really strict latency requirements,
53 00:02:24.095 --> 00:02:25.155 or you have a lot of people
54 00:02:25.305 --> 00:02:27.275 that are attacking your application.
55 00:02:27.295 --> 00:02:28.355 And so there's a lot of queries
56 00:02:28.355 --> 00:02:29.595 that are hitting the database.
57 00:02:29.905 --> 00:02:32.115 Everybody's gonna have different requirements.
58 00:02:32.815 --> 00:02:35.075 And so we wanna make sure that we can help you
59 00:02:35.135 --> 00:02:38.635 to tune the database to fit what your needs actually are,
60 00:02:38.695 --> 00:02:40.915 and try to do this in a really simple way.
61 00:02:41.695 --> 00:02:43.995 In addition, it's, uh, a cloud native database.
62 00:02:43.995 --> 00:02:45.915 And of course, that seems like a lot of jargon
63 00:02:45.915 --> 00:02:48.155 that everybody just talks about, but at the end of the day,
64 00:02:48.335 --> 00:02:51.995 it really is about making sure that, um, we are scalable,
65 00:02:52.215 --> 00:02:54.475 but we allow you to autoscale up and down,
66 00:02:54.615 --> 00:02:56.315 and Jay will go over that as well.
67 00:02:56.935 --> 00:03:00.325 And then finally, um, you know, SaaS applications need
68 00:03:00.325 --> 00:03:02.445 to be, uh, secure and, uh,
69 00:03:02.545 --> 00:03:04.685 and provide, uh, all the, uh, compliance
70 00:03:04.685 --> 00:03:08.005 and regulatory, um, certifications that your, uh,
71 00:03:08.325 --> 00:03:09.485 security teams are asking for.
72 00:03:11.485 --> 00:03:14.665 Um, so Cardinal, as I mentioned, is our search engine.
73 00:03:15.325 --> 00:03:18.785 And, um, I'm gonna actually skip this
74 00:03:18.785 --> 00:03:20.865 and let Jay go into a lot more details here.
75 00:03:21.725 --> 00:03:23.625 Um, and then you can see here, you know,
76 00:03:23.625 --> 00:03:25.665 what's the difference in a little bit more detail
77 00:03:25.665 --> 00:03:28.025 between open source Milvus and Zilliz.
78 00:03:28.445 --> 00:03:31.265 So, uh, of course, you know, I picked all the features, uh,
79 00:03:31.265 --> 00:03:33.025 that are not available in Milvus.
80 00:03:33.025 --> 00:03:35.825 So you, it looks like, oh my God, Zilliz is super wonderful,
81 00:03:35.845 --> 00:03:37.945 but hopefully you can appreciate that.
82 00:03:37.965 --> 00:03:40.425 You know, things like migrations, backup,
83 00:03:41.025 --> 00:03:43.705 capacity planning, updates, autoscale, you know,
84 00:03:43.705 --> 00:03:46.185 these are the things that we figured that, you know,
85 00:03:46.185 --> 00:03:47.825 let's take that burden off of your shoulders
86 00:03:47.925 --> 00:03:51.025 and put that into the, uh, fully managed, uh, Zilliz cloud.
87 00:03:52.875 --> 00:03:56.615 Um, we also work really hard to make sure that we maintain,
88 00:03:56.755 --> 00:03:58.055 uh, enterprise readiness.
89 00:03:58.635 --> 00:04:01.335 So we have, um, uh, a lot
90 00:04:01.335 --> 00:04:03.175 of these details on our security page,
91 00:04:03.515 --> 00:04:05.175 but we wanna make sure that, you know,
92 00:04:05.235 --> 00:04:08.015 we look at security from all levels, you know,
93 00:04:08.035 --> 00:04:12.055 all the way from, you know, uh, the, your data to the roles,
94 00:04:12.405 --> 00:04:15.135 even how we interface, uh, with Zilliz Cloud.
95 00:04:16.725 --> 00:04:19.585 And then finally, just as a reminder, you know, we have, um,
96 00:04:19.655 --> 00:04:21.905 basically, uh, three different offerings.
97 00:04:22.005 --> 00:04:24.665 We have Milvus, which comes in three different versions.
98 00:04:24.845 --> 00:04:27.065 Uh, a Lite version, which, uh,
99 00:04:27.125 --> 00:04:28.265 is actually an embedded version.
100 00:04:28.325 --> 00:04:29.905 So you can put that on something really small
101 00:04:30.005 --> 00:04:31.665 or just throw it into a Jupyter Notebook.
102 00:04:32.285 --> 00:04:35.145 We have a, um, a standalone,
103 00:04:35.145 --> 00:04:36.825 and then we have a fully distributed version.
104 00:04:36.965 --> 00:04:40.425 So the really powerful, uh, um, set of databases
105 00:04:40.445 --> 00:04:41.705 for everyone to get started with.
106 00:04:42.205 --> 00:04:43.905 We also have Zilliz Cloud, which is
107 00:04:43.905 --> 00:04:45.545 what we're gonna go over today in our demo.
108 00:04:46.045 --> 00:04:48.905 And then we also have, uh, Zilliz Cloud BYOC,
109 00:04:48.915 --> 00:04:53.145 where we have done, where we've separated the, uh, the, uh,
110 00:04:53.145 --> 00:04:54.545 data plane from the control plane.
111 00:04:54.605 --> 00:04:56.265 So this is gonna be for enterprises
112 00:04:56.265 --> 00:04:59.785 that have very stringent, um, security requirements,
113 00:05:00.245 --> 00:05:01.585 but really at the end of the day,
114 00:05:01.925 --> 00:05:03.105 you just have to build once.
115 00:05:03.105 --> 00:05:06.265 We're not gonna make you rebuild, uh, your database, uh,
116 00:05:06.285 --> 00:05:09.105 if you need to migrate them to any of these instances.
117 00:05:09.605 --> 00:05:11.985 But today we're gonna focus on Zilliz cloud.
118 00:05:12.565 --> 00:05:14.305 So, with that, I'm gonna stop talking
119 00:05:15.125 --> 00:05:17.905 and I'm gonna pass the baton over to Jay.
120 00:05:19.485 --> 00:05:21.995 Thank you so much, Chris. Good morning.
121 00:05:22.145 --> 00:05:25.395 Good evening everyone. I am gonna share my screen.
122 00:05:25.395 --> 00:05:26.515 Gimme one moment please.
123 00:05:30.215 --> 00:05:33.695 Alright, you guys can see my screen right? Cool.
124 00:05:34.075 --> 00:05:36.455 Um, okay, so I'll talk about recall, uh,
125 00:05:36.455 --> 00:05:37.735 tuning recall rate in a second,
126 00:05:37.835 --> 00:05:39.855 but I think I kinda wanna step back a little bit
127 00:05:40.475 --> 00:05:43.695 and talk a little bit about RAG, um,
128 00:05:43.875 --> 00:05:45.855 or retrieval augmented generation.
129 00:05:46.355 --> 00:05:47.495 Uh, just because you know, it,
130 00:05:47.495 --> 00:05:49.775 it is the most popular use case in, in,
131 00:05:49.775 --> 00:05:51.975 in arguably like the current killer app
132 00:05:52.095 --> 00:05:53.095 for vector databases.
133 00:05:53.155 --> 00:05:56.015 So I want to kind of put a lot of the Zilliz features kind
134 00:05:56.015 --> 00:05:58.125 of into that context as, uh, I would presume a lot
135 00:05:58.125 --> 00:05:59.565 of you guys are, are really interested in
136 00:05:59.565 --> 00:06:00.845 that particular use case, right?
137 00:06:01.225 --> 00:06:03.885 So what does RAG allow us to do, right?
138 00:06:03.905 --> 00:06:07.365 It allows us to, you know, include any data
139 00:06:08.225 --> 00:06:11.445 in the context window of modern large language models.
140 00:06:11.745 --> 00:06:13.405 Um, so it, in other words, it allows you
141 00:06:13.405 --> 00:06:15.725 to answer questions right, about topics
142 00:06:15.955 --> 00:06:18.725 that the LLMs were not originally trained on, right?
143 00:06:18.725 --> 00:06:21.245 So this could be internal facing documentation.
144 00:06:21.625 --> 00:06:23.805 If you're a law firm, it could be motions
145 00:06:23.805 --> 00:06:25.325 that you filed in your current litigation.
146 00:06:25.345 --> 00:06:27.405 If you're a doctor, it could be any data on any
147 00:06:27.405 --> 00:06:28.405 of your patients, and kind of, you
148 00:06:28.405 --> 00:06:29.365 know, the list goes on and on, right?
149 00:06:29.705 --> 00:06:32.525 Um, and the way this works is by way of embedding models,
150 00:06:32.525 --> 00:06:35.725 which are these very expensive pre-trained, right?
151 00:06:36.025 --> 00:06:39.285 Uh, spatial like representations of semantic context, right?
152 00:06:39.345 --> 00:06:42.205 In, in space. So in other words, it's like a blob of text
153 00:06:42.205 --> 00:06:43.525 that has some semantic meaning,
154 00:06:44.105 --> 00:06:46.805 and that semantic meaning is represented by some location,
155 00:06:47.145 --> 00:06:49.245 uh, with coordinates that are, are, are measurable, right?
156 00:06:49.705 --> 00:06:53.925 So vector databases measure the distance between one blob
157 00:06:53.925 --> 00:06:55.285 of text to another blob of text,
158 00:06:55.335 --> 00:06:58.685 which gives us some indication of how semantically relevant
159 00:06:58.835 --> 00:07:01.645 that text is to that other piece of text, regardless
160 00:07:01.665 --> 00:07:04.885 of whether there are any exact lexical matches, right?
161 00:07:04.885 --> 00:07:07.325 And so this is a very, uh, this is kind of a departure,
162 00:07:07.505 --> 00:07:09.125 you know, from, from traditional search,
163 00:07:09.275 --> 00:07:12.285 like things like Apache Lucene that have relied heavily on,
164 00:07:12.305 --> 00:07:14.045 on lexical matching for a long time, right?
165 00:07:14.045 --> 00:07:15.605 So you can have like a blob of text
166 00:07:15.755 --> 00:07:18.485 that has no lexical matches to another blob of text,
167 00:07:18.705 --> 00:07:20.205 but could be semantically relevant,
168 00:07:20.345 --> 00:07:23.005 and those would be placed closer together in vector space.
169 00:07:23.005 --> 00:07:25.405 And that's what vector databases allow us to do, is, is, is
170 00:07:25.405 --> 00:07:26.765 to measure those distances, right?
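To make the distance idea concrete, here is a minimal sketch of one common similarity measure, cosine similarity, in plain Python. The three-dimensional vectors are made-up stand-ins for real embeddings (which typically have hundreds or thousands of dimensions), and the variable names are hypothetical. The talk does not say which metric is used, so treat cosine similarity as an assumption for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up "embeddings": the first two texts share no words but mean
# similar things, so an embedding model would place them close together.
refund_policy = [0.9, 0.1, 0.0]   # "How do I get a refund?"
money_back = [0.8, 0.2, 0.1]      # "Can I have my money back?"
hiking_trails = [0.0, 0.2, 0.9]   # "Best hiking trails nearby"

sim_related = cosine_similarity(refund_policy, money_back)
sim_unrelated = cosine_similarity(refund_policy, hiking_trails)
```

With these toy values the related pair scores much higher than the unrelated pair, even though the two related texts have no lexical overlap.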
171 00:07:27.585 --> 00:07:31.005 So, okay, how does Zilliz help us here, right?
172 00:07:31.865 --> 00:07:34.525 So once we have all of the data, so all
173 00:07:34.525 --> 00:07:36.845 of our patient data or, you know, the
174 00:07:36.845 --> 00:07:38.485 stuff that we're interested in
175 00:07:38.485 --> 00:07:41.365 searching over, split into bite-sized chunks with dense
176 00:07:41.365 --> 00:07:42.325 embeddings attached to them
177 00:07:42.395 --> 00:07:43.605 and loaded into Zilliz, right?
178 00:07:43.905 --> 00:07:47.685 We can send Zilliz a vector query, uh,
179 00:07:47.785 --> 00:07:50.525 and ask it to give us the vectors that are closest
180 00:07:50.785 --> 00:07:52.725 to the one that we just sent, right?
181 00:07:52.755 --> 00:07:55.765 This is where Zilliz does its distance measuring
182 00:07:55.765 --> 00:07:58.085 and performs what's called an approximate nearest
183 00:07:58.285 --> 00:07:59.365 neighbor, or ANN, search.
184 00:07:59.705 --> 00:08:02.285 Uh, and we utilize approximate instead
185 00:08:02.285 --> 00:08:04.965 of exact nearest neighbor, because doing a brute force,
186 00:08:05.185 --> 00:08:07.605 you know, geometric distance measurement on every single
187 00:08:07.635 --> 00:08:10.605 node and every other node attached to it, um, you know, in,
188 00:08:10.625 --> 00:08:11.765 in, in space, uh,
189 00:08:11.765 --> 00:08:13.765 and returning in some reasonable amount of time,
190 00:08:14.245 --> 00:08:15.845 i i is currently infeasible for,
191 00:08:15.945 --> 00:08:17.245 for a lot of use cases, right?
192 00:08:17.245 --> 00:08:19.285 You want, you want to be able to get results back relatively
193 00:08:19.285 --> 00:08:21.885 quickly, you know, even if you say,
194 00:08:22.015 --> 00:08:23.405 don't get all the matches, right?
195 00:08:23.825 --> 00:08:25.685 Uh, you know, it's, it's actually more important
196 00:08:25.705 --> 00:08:28.005 to return in, you know, a hundred milliseconds
197 00:08:28.005 --> 00:08:30.525 or 500 milliseconds, not like in, in, in minutes, right?
198 00:08:30.525 --> 00:08:33.085 Which is, uh, unusable most of the time.
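For comparison, this is roughly what the brute-force, exact approach looks like: a minimal sketch in plain Python, not how Zilliz or Milvus implements search. Every query measures the distance to every stored vector, so the work grows linearly with the collection size and the vector dimension, which is why it stops being practical at large scale. The function name is hypothetical and Euclidean distance is an assumed metric.

```python
import math


def exact_nearest_neighbors(query, vectors, k):
    """Return the indexes of the k stored vectors closest to `query`.

    This scans the whole collection: one distance computation per
    stored vector, followed by a sort of all the results.
    """
    scored = [(math.dist(query, vec), idx) for idx, vec in enumerate(vectors)]
    scored.sort()
    return [idx for _, idx in scored[:k]]


stored = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.1, 0.0]]
closest_two = exact_nearest_neighbors([0.0, 0.0], stored, k=2)
```

An ANN index avoids the full scan by only looking at a subset of candidates, at the cost of occasionally missing a true neighbor.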
199 00:08:33.545 --> 00:08:34.885 Uh, so the way that the ANN
200 00:08:34.885 --> 00:08:37.285 indexes work is, you know, obviously outside the scope
201 00:08:37.285 --> 00:08:40.165 of this webinar, but Zilliz has a very simple abstraction
202 00:08:40.165 --> 00:08:43.365 layer for you, uh, that allows you to make trade-offs
203 00:08:43.595 --> 00:08:46.605 between recall accuracy and query latency.
204 00:08:46.865 --> 00:08:49.965 So, if you remember, right, we're approximating, right?
205 00:08:50.185 --> 00:08:53.085 Recall is whether we got all the neighbors in the search. So we
206 00:08:53.915 --> 00:08:56.845 expose something called a level parameter in Zilliz.
207 00:08:56.845 --> 00:08:58.245 So I'll, I'll get into that in a second.
208 00:08:58.395 --> 00:09:00.125 It's, it's right here. Um,
209 00:09:01.225 --> 00:09:03.845 and this is incredibly useful depending on
210 00:09:04.135 --> 00:09:07.245 where your vector database sits in your pipeline.
211 00:09:07.545 --> 00:09:09.165 So if it's a user facing pipeline,
212 00:09:09.165 --> 00:09:12.525 you probably wanna tune more towards faster latency just
213 00:09:12.525 --> 00:09:13.685 because your user's probably
214 00:09:13.685 --> 00:09:15.125 waiting on, on you for something.
215 00:09:15.425 --> 00:09:16.885 If, if you're, you know, if you have more
216 00:09:16.885 --> 00:09:19.005 of an analytics pipeline, uh, and,
217 00:09:19.025 --> 00:09:21.245 and you're really interested in getting, you know, all
218 00:09:21.245 --> 00:09:23.805 of the, you know, the best recall possible, you might want
219 00:09:23.805 --> 00:09:25.165 to lean more towards recall, right?
220 00:09:25.185 --> 00:09:26.645 And the balance is up to you.
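Since recall is the quantity being traded against latency here, a small sketch of how it is usually measured may help. Recall at k compares the IDs an approximate search returned against the IDs an exact search would have returned for the same query. The function name and the sample IDs are hypothetical.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true nearest neighbors that the approximate search found."""
    exact = set(exact_ids)
    if not exact:
        return 1.0  # nothing to find, so nothing was missed
    return len(exact & set(approx_ids)) / len(exact)


# The exact top 3 were IDs 1, 2 and 3; the approximate search
# returned 1, 2 and 9, so it found two of the three true neighbors.
example_recall = recall_at_k(approx_ids=[1, 2, 9], exact_ids=[1, 2, 3])
```

Averaging this value over a set of representative queries gives a recall figure you can weigh against the measured latency.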
221 00:09:26.945 --> 00:09:29.925 We provide this through a very simple abstraction, right?
222 00:09:29.945 --> 00:09:32.645 So when you, this is, this is an example of a, of a query
223 00:09:32.675 --> 00:09:34.525 with our Python SDK. Uh,
224 00:09:34.625 --> 00:09:36.125 at the top you'll see the query vector.
225 00:09:36.225 --> 00:09:37.765 So that's the vector that we're actually
226 00:09:37.765 --> 00:09:38.845 telling the database.
227 00:09:39.075 --> 00:09:40.965 give me all the neighbors closest
228 00:09:41.105 --> 00:09:42.285 to this particular location.
229 00:09:42.825 --> 00:09:45.165 And we're basically saying, you know, I want, uh,
230 00:09:45.295 --> 00:09:46.325 gimme the closest three.
231 00:09:46.465 --> 00:09:49.245 So this is the top K, and the level parameter, which is
232 00:09:49.245 --> 00:09:50.645 what I just mentioned, is this right here.
233 00:09:50.705 --> 00:09:55.205 So this defaults to one, uh, and will go up to 10,
234 00:09:55.785 --> 00:09:57.445 and you can kind of play around with this.
235 00:09:57.465 --> 00:09:58.925 And this is done by query, right?
236 00:09:58.925 --> 00:10:00.325 So you can send one query at one
237 00:10:00.325 --> 00:10:01.885 and one query at 10, one query at five,
238 00:10:01.885 --> 00:10:03.925 and kind of play around with it to see, you know,
239 00:10:04.075 --> 00:10:06.325 what the recall looks like in each of these scenarios
240 00:10:06.825 --> 00:10:09.885 and what the latency looks like in each of those scenarios.
241 00:10:09.885 --> 00:10:11.125 So it's very flexible in that nature.
242 00:10:11.185 --> 00:10:13.005 You don't have to rebuild the entire index every
243 00:10:13.005 --> 00:10:14.125 time you do this exercise.
244 00:10:14.475 --> 00:10:16.485 It's, it's, it's very, um, you know,
245 00:10:16.585 --> 00:10:18.325 on the fly kind of adjustments, right?
246 00:10:18.325 --> 00:10:19.365 Which is, which is very nice.
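The slide itself is not captured in the transcript, so the following is only a sketch of what a per-query level sweep might look like. It builds the keyword arguments for three searches at levels 1, 5 and 10 without contacting any database. The argument names (`collection_name`, `data`, `limit`, `search_params`) mirror my understanding of the pymilvus `MilvusClient.search` call, and nesting `level` under `params` is likewise an assumption to check against the current Zilliz Cloud documentation. The collection name `docs` and the helper name are hypothetical.

```python
def build_search_kwargs(collection_name, query_vector, top_k=3, level=1):
    """Build keyword arguments for one vector search at a given level.

    The 1..10 range and the default of 1 come from the talk; the
    argument layout is an assumption about the pymilvus client.
    """
    if not 1 <= level <= 10:
        raise ValueError("level must be between 1 and 10")
    return {
        "collection_name": collection_name,
        "data": [list(query_vector)],      # one query vector per request
        "limit": top_k,                    # the "top K" from the talk
        "search_params": {"params": {"level": level}},
    }


query_vector = [0.1, 0.2, 0.3]  # made-up embedding
sweep = [
    build_search_kwargs("docs", query_vector, top_k=3, level=lv)
    for lv in (1, 5, 10)
]

# Assumed usage against a live cluster (not executed here):
# from pymilvus import MilvusClient
# client = MilvusClient(uri="<your-endpoint>", token="<your-api-key>")
# results = [client.search(**kwargs) for kwargs in sweep]
```

Because the level travels with each request, the same index can serve a latency-sensitive query and a recall-sensitive query back to back, which matches the no-rebuild point made above.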
247 00:10:19.705 --> 00:10:21.165 Uh, you know, for context, a lot
248 00:10:21.165 --> 00:10:22.725 of other vector database solutions,
249 00:10:23.155 --> 00:10:25.965 they pick a single index most of the time.
250 00:10:26.385 --> 00:10:28.685 Uh, most of the time they'll pick one index, you know,
251 00:10:28.685 --> 00:10:31.925 whether it's HNSW or DiskANN or FAISS,
252 00:10:31.985 --> 00:10:33.805 or you know, any of the other kind of, you know,
253 00:10:33.825 --> 00:10:36.325 off the shelf, uh, index algorithms
254 00:10:36.325 --> 00:10:38.325 that are out there right now, uh,
255 00:10:38.465 --> 00:10:39.925 all have their own trade offs, right?
256 00:10:39.925 --> 00:10:41.405 They all have their, they made their trade off
257 00:10:41.405 --> 00:10:44.005 between recall accuracy and query latency,
258 00:10:44.005 --> 00:10:45.365 and you're just kind of, that's the one you get
259 00:10:45.365 --> 00:10:46.605 and you're just kind of stuck with that.
260 00:10:46.905 --> 00:10:49.645 Um, in Milvus, you're allowed to swap those out, right?
261 00:10:49.645 --> 00:10:52.605 So you can swap them out for something else if you want to,
262 00:10:52.985 --> 00:10:54.845 but it's a, you know, manual process.
263 00:10:55.065 --> 00:10:57.965 You have to know what the characteristics of each, uh,
264 00:10:57.975 --> 00:10:59.325 index algorithm are, right?
265 00:10:59.325 --> 00:11:02.845 So this, we kind of expose to Zilliz cloud customers,
266 00:11:03.185 --> 00:11:06.965 and again, it's, it's another way that we, we say, look,
267 00:11:06.985 --> 00:11:08.685 here's a managed service that will kind
268 00:11:08.685 --> 00:11:11.165 of abstract away a lot of this like vector complexity
269 00:11:11.225 --> 00:11:15.765 for you, and expose this very nice, easy to understand, uh,
270 00:11:15.855 --> 00:11:19.445 lever that you can pull to kind of change the way, uh, that,
271 00:11:19.445 --> 00:11:20.565 that, that the recall is done.
272 00:11:20.625 --> 00:11:22.565 So, um, again, feel free
273 00:11:22.565 --> 00:11:24.645 to not take any notes on this, you know,
274 00:11:24.645 --> 00:11:26.445 we can send all this documentation at, at the end
275 00:11:26.445 --> 00:11:28.605 of this call, but I think this is really important
276 00:11:28.605 --> 00:11:29.765 to highlight, uh, just
277 00:11:29.765 --> 00:11:32.405 because we find that a lot of customers find a lot
278 00:11:32.405 --> 00:11:33.645 of use in this, uh,
279 00:11:33.705 --> 00:11:36.525 and it gives us a lot of flexibility for many,
280 00:11:36.525 --> 00:11:38.365 many use cases where, you know, you don't have
281 00:11:38.365 --> 00:11:39.725 to necessarily swap things out.
282 00:11:40.225 --> 00:11:42.765 And again, this is all part of that Cardinal, uh,
283 00:11:42.855 --> 00:11:45.565 index engine that Chris, uh, mentioned earlier.
284 00:11:45.755 --> 00:11:48.245 There's a lot of other benefits to Cardinal as well that,
285 00:11:48.245 --> 00:11:49.885 you know, I won't go into too much detail,
286 00:11:49.905 --> 00:11:52.325 but you know, off the top of my head, uh, there's a lot
287 00:11:52.325 --> 00:11:54.485 of interesting things that we're doing with Quantization
288 00:11:54.695 --> 00:11:59.085 where, um, you know, a lot of the vectors have, uh, many,
289 00:11:59.115 --> 00:12:01.605 many, you know, floating point numbers attached to them.
290 00:12:01.985 --> 00:12:04.325 If you truncate them, uh, you know, you,
291 00:12:04.345 --> 00:12:07.685 you might lose maybe two or 3% in recall accuracy,
292 00:12:08.105 --> 00:12:12.485 but, you know, you can save upwards of 30, 40, 50% on, on,
293 00:12:12.505 --> 00:12:14.165 on your compute and storage, right?
294 00:12:14.225 --> 00:12:17.445 So that's a trade off that some folks, you know, wanna make.
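As an illustration of the general idea, here is a minimal sketch of scalar quantization in plain Python: each 32-bit float is mapped to a single byte, so the raw vector payload shrinks to a quarter of its size at the cost of a small rounding error. This is a generic textbook technique chosen for illustration, not a description of what Cardinal does internally, and the function names are hypothetical. The recall and cost figures quoted in the talk are the speaker's; this sketch does not reproduce them.

```python
import struct


def quantize_uint8(vector):
    """Map each float to one of 256 evenly spaced levels between min and max."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # avoid dividing by zero for constant vectors
    codes = bytes(round((x - lo) / scale) for x in vector)
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate float values from the one-byte codes."""
    return [lo + c * scale for c in codes]


vector = [0.12, -0.5, 0.33, 0.9]
codes, lo, scale = quantize_uint8(vector)
restored = dequantize(codes, lo, scale)

float32_bytes = len(struct.pack(f"{len(vector)}f", *vector))  # 4 bytes per value
quantized_bytes = len(codes)                                  # 1 byte per value
max_error = max(abs(a - b) for a, b in zip(vector, restored))
```

The rounding error per value is at most half a quantization step, which is why nearest-neighbor rankings usually change only slightly.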
295 00:12:17.585 --> 00:12:20.565 Uh, and, and, and Cardinal looks at your stack
296 00:12:20.565 --> 00:12:22.045 and looks at your pipeline
297 00:12:22.065 --> 00:12:23.805 and makes those necessary adjustments.
298 00:12:23.805 --> 00:12:25.365 So there's a lot of intelligence built into it,
299 00:12:25.365 --> 00:12:26.565 and there's a lot of other tuning
300 00:12:26.785 --> 00:12:30.685 and, um, you know, uh, adjustments that, that we've made
301 00:12:30.685 --> 00:12:33.645 to kind of make it fit almost every use case, right?
302 00:12:33.985 --> 00:12:35.645 Uh, you know, we also work very closely
303 00:12:35.645 --> 00:12:37.285 with customers if there's, you know, a,
304 00:12:37.405 --> 00:12:38.645 a bleeding edge use case
305 00:12:38.785 --> 00:12:42.045 or something that's, uh, that, that they like to do,
306 00:12:42.045 --> 00:12:43.885 that falls outside of the bounds of
307 00:12:43.885 --> 00:12:45.805 what auto Index is currently capable of,
308 00:12:46.025 --> 00:12:48.085 and we're able to tune that as well and,
309 00:12:48.085 --> 00:12:49.805 and kind of work with you guys to make sure
310 00:12:49.805 --> 00:12:52.005 that Zilliz is performing, uh, you know, and,
311 00:12:52.005 --> 00:12:53.805 and we're making the right technical trade-offs.
312 00:12:53.805 --> 00:12:56.165 Uh, and, and it's while still kind of maintaining this,
313 00:12:56.475 --> 00:12:59.765 this managed service, um, uh, solution, right?
314 00:12:59.865 --> 00:13:02.885 So, uh, so that, that's, that's pretty much it on the, the,
315 00:13:02.985 --> 00:13:04.725 the, the, the level parameter.
316 00:13:04.725 --> 00:13:07.365 I want to move a little bit towards scale.
317 00:13:07.645 --> 00:13:09.525 'cause I think scale is also important to talk about.
318 00:13:09.985 --> 00:13:13.805 Uh, we at Zilliz, um, think
319 00:13:13.865 --> 00:13:15.845 of scale from the very beginning, right?
320 00:13:15.845 --> 00:13:18.925 So the entire database was architected for many,
321 00:13:18.925 --> 00:13:20.725 many billions of vectors, right?
322 00:13:20.725 --> 00:13:23.205 And we do have customers running, you know, north
323 00:13:23.205 --> 00:13:26.445 of 20 billion, 30 billion, 40 billion vector workloads.
324 00:13:26.505 --> 00:13:28.165 Uh, and it is performing very well.
325 00:13:28.625 --> 00:13:31.085 And this is where Zilliz really shines.
326 00:13:31.465 --> 00:13:34.605 Uh, you know, if you're talking about a vector space that's,
327 00:13:34.865 --> 00:13:37.645 you know, sub, uh, 20 million vectors,
328 00:13:37.645 --> 00:13:39.165 maybe even sub 10 million vectors, right?
329 00:13:39.225 --> 00:13:40.325 You could probably pretty much use
330 00:13:40.485 --> 00:13:41.525 anything else that's out there right now.
331 00:13:41.605 --> 00:13:44.525 I think there's a lot of really great, uh, you know,
332 00:13:44.525 --> 00:13:46.485 vector solutions that have popped up recently.
333 00:13:46.565 --> 00:13:49.285 A lot of bolt-ons from existing legacy players that,
334 00:13:49.505 --> 00:13:51.725 you know, want, uh, their customers to be able
335 00:13:51.725 --> 00:13:53.645 to use this capability while, you know,
336 00:13:53.665 --> 00:13:54.845 not migrating away from
337 00:13:54.845 --> 00:13:56.005 what they're already currently using.
338 00:13:56.005 --> 00:13:58.085 It could be a SQL database, a NoSQL database, uh,
339 00:13:58.085 --> 00:14:00.445 it could be a data warehouse, you know, handful
340 00:14:00.465 --> 00:14:01.685 of other solutions, right?
341 00:14:02.225 --> 00:14:07.135 Um, and it, it's, it's, it's generally okay, you know,
342 00:14:07.135 --> 00:14:09.495 if if you're, if you're looking at, you know, a handful
343 00:14:09.555 --> 00:14:11.295 of millions of vectors, it's, it's fine, right?
344 00:14:11.525 --> 00:14:14.335 When you start getting to hundreds of millions, you know,
345 00:14:14.395 --> 00:14:17.215 and, and up north of a billion vectors, uh, the,
346 00:14:17.215 --> 00:14:19.055 the architecture starts to matter a lot more, right?
347 00:14:19.875 --> 00:14:22.215 And those databases were designed
348 00:14:22.515 --> 00:14:24.095 for different use cases, right?
349 00:14:24.095 --> 00:14:26.375 They were, they were designed for a NoSQL use case,
350 00:14:26.375 --> 00:14:28.055 or they were designed for a SQL use case.
351 00:14:28.055 --> 00:14:31.615 They were not designed to store data where you want
352 00:14:31.615 --> 00:14:32.775 to store the vectors
353 00:14:32.775 --> 00:14:35.535 that are closest in vector space together on disk, so
354 00:14:35.535 --> 00:14:37.815 that when you read it in, right, you'll get all the vectors
355 00:14:37.815 --> 00:14:40.495 that you need all in one shot instead
356 00:14:40.495 --> 00:14:42.775 of loading in shards from all over the place,
357 00:14:42.775 --> 00:14:45.375 because that's how your, your, your database is architected.
358 00:14:45.375 --> 00:14:47.575 So there's a lot of little things to consider like that.
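The locality point can be sketched with a toy, IVF-style layout: vectors are grouped by their nearest centroid, and a query reads only the one group whose centroid is closest to it. This is a generic illustration of keeping neighbors together, not a description of how Zilliz actually lays data out on disk. The centroids are hand-picked and all names are hypothetical.

```python
import math


def nearest_centroid(vec, centroids):
    """Index of the centroid closest to `vec`."""
    return min(range(len(centroids)), key=lambda c: math.dist(vec, centroids[c]))


def build_buckets(vectors, centroids):
    """Group vector IDs by nearest centroid, so neighbors are stored together."""
    buckets = {i: [] for i in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        buckets[nearest_centroid(vec, centroids)].append(vid)
    return buckets


def search_one_bucket(query, vectors, centroids, buckets, k):
    """Read a single bucket and rank only the vectors stored in it."""
    candidates = buckets[nearest_centroid(query, centroids)]
    return sorted(candidates, key=lambda vid: math.dist(query, vectors[vid]))[:k]


vectors = [[0.0, 0.0], [0.1, 0.1], [10.0, 10.0], [10.1, 9.9]]
centroids = [[0.0, 0.0], [10.0, 10.0]]
buckets = build_buckets(vectors, centroids)
hits = search_one_bucket([9.9, 10.0], vectors, centroids, buckets, k=1)
```

A store that scatters these four vectors across unrelated shards would have to touch every shard to answer the same query.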
359 00:14:47.755 --> 00:14:49.895 Uh, when, when choosing a vector database, again,
360 00:14:50.235 --> 00:14:52.685 if you're talking about scale, that's,
361 00:14:52.685 --> 00:14:54.805 that's not gonna reach, you know, the hundreds of millions,
362 00:14:54.805 --> 00:14:55.645 you're probably fine, you know,
363 00:14:55.645 --> 00:14:56.725 with, with a lot of other things.
364 00:14:56.785 --> 00:14:59.165 But if you're really serious about running a production
365 00:14:59.165 --> 00:15:01.885 vector workload and you have a ton of data, uh,
366 00:15:01.885 --> 00:15:04.205 that you need to search across, it might be multi-tenant,
367 00:15:04.205 --> 00:15:05.925 it might not be multi-tenant, uh, you know,
368 00:15:06.125 --> 00:15:07.925 Zilliz really shines in this respect.
369 00:15:08.105 --> 00:15:11.565 So, uh, I'll, I'll walk through a little bit about
370 00:15:11.665 --> 00:15:13.965 how we've abstracted this for, for,
371 00:15:13.965 --> 00:15:15.325 for Zilliz cloud customers.
372 00:15:15.665 --> 00:15:18.285 Uh, you know, essentially we have, uh, a handful of
373 00:15:18.285 --> 00:15:19.765 what we call CU types, right?
374 00:15:19.765 --> 00:15:20.965 So they're compute units.
375 00:15:21.105 --> 00:15:24.445 And you can think of these as EC2 instances in
376 00:15:24.445 --> 00:15:27.685 AWS. It's a very, you know, uh, similar analogy, uh,
377 00:15:27.825 --> 00:15:31.565 you know, where each CU would constitute some amount
378 00:15:31.565 --> 00:15:32.845 of vector capacity, right?
379 00:15:32.845 --> 00:15:34.125 So if you have, say, uh,
380 00:15:34.125 --> 00:15:37.685 5 million vectors, that'll fit on some amount of CU, right?
381 00:15:37.685 --> 00:15:40.925 And we have different types of CU that are optimized
382 00:15:41.105 --> 00:15:42.205 for different use cases.
383 00:15:42.665 --> 00:15:45.605 Uh, the primary two we have are performance and capacity.
384 00:15:46.185 --> 00:15:48.725 So if you need the maximum performance
385 00:15:48.725 --> 00:15:51.285 and your, your latency is the most important thing to you,
386 00:15:51.745 --> 00:15:54.645 we have a performance-optimized CU that you can use.
387 00:15:55.185 --> 00:15:58.725 Uh, if, if, uh, you're, you're, you're more interested in
388 00:15:59.365 --> 00:16:00.925 capacity per CU, but you're willing
389 00:16:00.925 --> 00:16:02.445 to sacrifice a little bit of latency
390 00:16:02.445 --> 00:16:06.525 and a little bit of concurrency, we also have a capacity CU
391 00:16:06.525 --> 00:16:08.645 and you get to pick, uh, at, at, at the beginning.
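A back-of-the-envelope sizing calculation for this might look like the sketch below. The per-CU capacities are placeholder numbers invented purely for illustration; the talk does not give real figures, and actual capacity depends on vector dimension and CU type, so take real values from the Zilliz Cloud documentation or its pricing calculator. The function and constant names are hypothetical.

```python
import math

# PLACEHOLDER capacities, NOT real Zilliz numbers: how many vectors
# one CU of each type is assumed to hold in this illustration.
HYPOTHETICAL_VECTORS_PER_CU = {
    "performance": 1_500_000,  # fewer vectors per CU, lower latency
    "capacity": 5_000_000,     # more vectors per CU, somewhat higher latency
}


def estimate_cu(vector_count, cu_type):
    """Round up to the number of CUs needed to hold `vector_count` vectors."""
    per_cu = HYPOTHETICAL_VECTORS_PER_CU[cu_type]
    return max(1, math.ceil(vector_count / per_cu))


needed_capacity = estimate_cu(5_000_000, "capacity")
needed_performance = estimate_cu(5_000_000, "performance")
```

Under these made-up numbers the same 5 million vectors need more performance CUs than capacity CUs, which is the shape of the trade-off described above.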
392 00:16:08.945 --> 00:16:11.645 Uh, and it really is based on your workload
393 00:16:11.945 --> 00:16:13.565 and the, the type of application
394 00:16:13.565 --> 00:16:15.445 and ob obviously where it sits in your pipeline.
395 00:16:15.745 --> 00:16:17.445 Uh, and, and our, our team, our,
396 00:16:17.445 --> 00:16:19.885 our solutions architect team is obviously available, uh,
397 00:16:19.885 --> 00:16:21.445 to kind of walk through the differences
398 00:16:21.445 --> 00:16:22.525 between these two products
399 00:16:22.785 --> 00:16:23.845 and make sure that, you know,
400 00:16:23.845 --> 00:16:25.405 you pick the appropriate solution for you.
401 00:16:25.865 --> 00:16:29.085 So, uh, the main, the, the, the primary way
402 00:16:29.085 --> 00:16:31.325 that most people do scaling is manual scaling.
403 00:16:31.585 --> 00:16:34.245 Uh, you know, you're able to just pick, uh, and,
404 00:16:34.245 --> 00:16:36.005 and I'll just, I'll just look through that.
405 00:16:36.115 --> 00:16:37.405 I'll just give you guys the actual
406 00:16:37.545 --> 00:16:38.805 UI, 'cause it'll, it'll be helpful to see.
407 00:16:38.825 --> 00:16:41.205 So this is what a Zilliz cluster
408 00:16:41.425 --> 00:16:43.725 looks like inside of Zilliz Cloud.
409 00:16:44.105 --> 00:16:47.165 Uh, there's a handful of things in here, uh, that the,
410 00:16:47.165 --> 00:16:49.085 the main thing that I want to point your attention
411 00:16:49.145 --> 00:16:51.285 to is the CU size, which is right here.
412 00:16:51.585 --> 00:16:53.965 You have two CU and right now the capacity is at 2%.
413 00:16:54.005 --> 00:16:55.085 I don't have too much in this cluster.
414 00:16:55.665 --> 00:16:58.365 And the way that most people do, uh,
415 00:16:58.365 --> 00:17:00.285 scaling is they just click on the scale button,
416 00:17:00.465 --> 00:17:03.365 and you're able to see the CU that you're able to add.
417 00:17:03.365 --> 00:17:05.085 And obviously, this list is very, very large.
418 00:17:05.465 --> 00:17:07.605 Um, and it could get even higher.
419 00:17:07.745 --> 00:17:10.565 Uh, if you need us to help you, obviously we can help you.
420 00:17:10.585 --> 00:17:12.445 But, you know, 256 is quite a lot
421 00:17:12.875 --> 00:17:14.645 that sits in the many, many billions.
422 00:17:14.945 --> 00:17:17.485 And, um, you know, we, we do have customers that,
423 00:17:17.515 --> 00:17:18.525 that exceed that, of course.
424 00:17:18.705 --> 00:17:20.565 Uh, and obviously we'll, we'll work with you very closely,
425 00:17:20.945 --> 00:17:22.525 but this is how most people do it.
426 00:17:22.625 --> 00:17:26.485 Uh, and for the most part, as your, your vector count grows,
427 00:17:26.505 --> 00:17:29.045 as long as your vector count hasn't been growing by, uh,
428 00:17:29.065 --> 00:17:32.405 you know, an, an insane rate, uh, every day, uh,
429 00:17:32.405 --> 00:17:33.805 you're generally fine with this approach.
430 00:17:34.225 --> 00:17:37.325 We do have an auto scale feature as well
431 00:17:37.325 --> 00:17:38.485 that you're seeing on the right.
432 00:17:38.485 --> 00:17:39.845 Lemme just move this video outta the way.
433 00:17:40.065 --> 00:17:42.165 Uh, you have this autoscale feature on the right
434 00:17:42.195 --> 00:17:44.565 that allows you to set a threshold
435 00:17:45.025 --> 00:17:46.925 so, a CU capacity threshold.
436 00:17:46.985 --> 00:17:48.725 So, you know, whatever you guys are comfortable
437 00:17:48.725 --> 00:17:52.045 with in terms of, you know, risking the, the possibility
438 00:17:52.045 --> 00:17:54.525 of maybe, you know, elevated query latency.
439 00:17:54.755 --> 00:17:57.445 Some people like to set it all the way at 90%.
440 00:17:57.465 --> 00:17:58.685 We personally don't recommend that,
441 00:17:58.705 --> 00:18:00.525 but, you know, it's something that, that you can do.
442 00:18:00.865 --> 00:18:02.365 Uh, but generally between 70
443 00:18:02.365 --> 00:18:06.725 and 80% is, is a, is a good, safe, happy medium where, uh,
444 00:18:07.025 --> 00:18:10.805 if the, if the capacity of your cluster gets to that level,
445 00:18:10.895 --> 00:18:14.205 it'll automatically scale you up to the next tier, which is,
446 00:18:14.205 --> 00:18:15.725 you know, you'll add two CU at a time.
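
The autoscale rule just described (once CU capacity reaches the threshold, add two CU) can be sketched as a small function. This is only an illustration of the rule as stated in the talk, not Zilliz Cloud's implementation. The function name, the default threshold of 75% (a value inside the 70 to 80% range mentioned), and the `max_cu` cap of 256 (the largest size mentioned in the demo) are assumptions.

```python
def next_cu_size(current_cu: int,
                 capacity_pct: float,
                 threshold_pct: float = 75.0,  # assumed default, inside the 70-80% range from the talk
                 step: int = 2,                # "two CU at a time", per the talk
                 max_cu: int = 256) -> int:    # assumed cap, the largest size mentioned in the demo
    """Return the CU size after one autoscale check (scale-up only)."""
    if capacity_pct >= threshold_pct:
        # Capacity reached the threshold: move up one step, but never past the cap.
        return min(current_cu + step, max_cu)
    # Below the threshold: leave the cluster size unchanged.
    return current_cu
```

For example, a 2 CU cluster at 80% capacity would move to 4 CU, while the same cluster at 50% would stay at 2 CU.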
447 00:18:16.185 --> 00:18:18.885 And I think that's also important to, to, to mention
448 00:18:18.945 --> 00:18:21.525 as well, just because, uh, you're, you're,
449 00:18:21.525 --> 00:18:23.205 you're never gonna be in a situation where you're,
450 00:18:23.205 --> 00:18:24.765 you're over allocated, right?
451 00:18:24.825 --> 00:18:27.925 So a lot of other solutions out there will force you
452 00:18:27.925 --> 00:18:29.845 to pick, um, the number
453 00:18:29.845 --> 00:18:31.885 of horizontally scaled nodes that you have.
454 00:18:31.905 --> 00:18:34.085 So let's say you have, you know, 10 horizontally scaled
455 00:18:34.085 --> 00:18:37.005 nodes, and, uh, they, they make it very difficult
456 00:18:37.005 --> 00:18:38.045 for you to change that number.
457 00:18:38.105 --> 00:18:40.005 And the reason is because the way the index is,
458 00:18:40.025 --> 00:18:42.045 is built across those, those nodes.
459 00:18:42.265 --> 00:18:44.765 So, uh, if you want to go up, that's fine,
460 00:18:44.765 --> 00:18:47.405 but you have to, you have to vertically scale all 10, right?
461 00:18:47.405 --> 00:18:48.645 Which is generally not what you want.
462 00:18:48.645 --> 00:18:50.765 You're basically doubling, you're, you're doubling the,
463 00:18:50.765 --> 00:18:52.245 the entire capacity of your cluster,
464 00:18:52.245 --> 00:18:54.445 which is sometimes it might be okay,
465 00:18:54.445 --> 00:18:56.005 but like most of the time, that's not what you want.
466 00:18:56.025 --> 00:18:58.205 You, what you wanna do is you want to add things linearly
467 00:18:58.265 --> 00:18:59.485 as your vector count grows.
468 00:18:59.985 --> 00:19:02.365 And Zilliz allows you to do that very, very nicely.
469 00:19:02.425 --> 00:19:03.525 You're, you're, you're,
470 00:19:03.545 --> 00:19:07.285 you just tack on two CU at a time, uh, either manually
471 00:19:07.345 --> 00:19:10.205 by yourself or you can use our autoscale feature to do that.
472 00:19:10.625 --> 00:19:14.845 Uh, you can also use our modify cluster endpoint as well.
473 00:19:14.985 --> 00:19:19.765 So, uh, if you want to plug this into some of the automation
474 00:19:19.765 --> 00:19:23.085 that you have, uh, in, in your DevOps pipeline, uh, we,
475 00:19:23.085 --> 00:19:24.125 we do offer that as well.
476 00:19:24.225 --> 00:19:27.365 So you can read all the cluster metrics through the API,
477 00:19:27.385 --> 00:19:31.485 you can see historically where the CU capacity has been and,
478 00:19:31.485 --> 00:19:33.405 and where it might be going based on other things
479 00:19:33.405 --> 00:19:34.645 that are happening within your system.
480 00:19:35.105 --> 00:19:36.405 Uh, and, and you can scale up
481 00:19:36.405 --> 00:19:38.645 or scale down, uh, with, with that endpoint as well.
482 00:19:38.645 --> 00:19:41.605 So we offer a lot of opportunities for, you know,
483 00:19:41.605 --> 00:19:42.645 both manual, auto,
484 00:19:42.905 --> 00:19:44.325 and, you know, plugging into your
485 00:19:44.325 --> 00:19:45.805 DevOps pipeline in terms of scaling.
486 00:19:46.385 --> 00:19:48.925 And also, it's really important to note that, you know, the,
487 00:19:48.945 --> 00:19:50.245 the scaling itself is,
488 00:19:50.505 --> 00:19:52.525 is done in a nice linear fashion, right?
489 00:19:52.745 --> 00:19:54.525 Uh, I'm just making sure I'm okay on time, okay?
490 00:19:54.745 --> 00:19:58.045 Um, and, uh, it, it, it ends up being very flexible
491 00:19:58.045 --> 00:20:00.125 to your use case, and it, it's something that, you know,
492 00:20:00.205 --> 00:20:01.485 a lot of our customers take advantage of
493 00:20:01.485 --> 00:20:02.645 and, and really appreciate.
494 00:20:02.745 --> 00:20:04.765 So, um, that's a little bit about scaling.
495 00:20:05.105 --> 00:20:09.325 Um, uh, I wanna touch a little bit on security as well.
496 00:20:09.625 --> 00:20:13.725 Uh, we get a lot of, uh, enterprise customers that are very,
497 00:20:13.955 --> 00:20:18.045 very curious about our security apparatus, what we do,
498 00:20:18.105 --> 00:20:21.125 and how it fits into their, uh, their policy network.
499 00:20:21.825 --> 00:20:24.285 And the, the primary thing really that, especially
500 00:20:24.505 --> 00:20:26.645 for any managed service is, is making sure
501 00:20:26.675 --> 00:20:30.605 that the managed service is deployed in not only the same
502 00:20:30.605 --> 00:20:33.325 region, but hopefully the same availability zone as the rest
503 00:20:33.325 --> 00:20:35.045 of the microservices that are gonna be able,
504 00:20:35.155 --> 00:20:37.645 that are gonna be calling it on, on a regular basis,
505 00:20:37.745 --> 00:20:39.325 not just for queries, but also for metrics
506 00:20:39.505 --> 00:20:41.485 and, you know, scaling things up and, and, and that nature.
507 00:20:41.905 --> 00:20:45.605 Uh, so we do support PrivateLink on AWS
508 00:20:45.785 --> 00:20:48.525 and the, all of their equivalents on, on GCP,
509 00:20:48.865 --> 00:20:50.085 uh, and Azure as well.
510 00:20:50.105 --> 00:20:51.765 So you can create a private endpoint,
511 00:20:52.145 --> 00:20:54.765 and that way all of the traffic from your microservices
512 00:20:55.145 --> 00:20:57.285 to your Zilliz installation, uh,
513 00:20:57.355 --> 00:20:58.765 will not go over the open internet.
514 00:20:58.765 --> 00:21:00.805 So you're not gonna have any ingress or egress issues.
515 00:21:01.225 --> 00:21:04.685 Uh, but more importantly, that traffic, it'll stay inside
516 00:21:04.685 --> 00:21:07.245 of your VPC or it'll stay inside of the, the,
517 00:21:07.265 --> 00:21:08.805 the AWS network, right?
518 00:21:08.805 --> 00:21:11.525 So you're not, uh, exposing anything potentially to,
519 00:21:11.525 --> 00:21:12.765 to, to the outside world.
520 00:21:13.265 --> 00:21:16.405 Um, the other thing that we, that we offer
521 00:21:16.405 --> 00:21:19.045 as well is customer managed encryption keys.
522 00:21:19.385 --> 00:21:22.165 Uh, this has also been a really popularly requested feature
523 00:21:22.255 --> 00:21:25.405 where, uh, we can plug into your KMS.
524 00:21:25.405 --> 00:21:27.605 So it's AWS KMS, we'll plug into that, uh,
525 00:21:27.665 --> 00:21:30.285 you can issue us keys from that KMS
526 00:21:30.545 --> 00:21:32.765 and we'll encrypt, uh, essentially everything.
527 00:21:32.765 --> 00:21:36.565 So we'll encrypt the, the, uh, the, the vector embeddings,
528 00:21:36.625 --> 00:21:39.565 uh, we'll encrypt all of the metadata associated with it.
529 00:21:39.985 --> 00:21:43.645 Um, and at query time, we'll just decrypt on the fly,
530 00:21:44.105 --> 00:21:46.845 run the ANN, and then re-encrypt everything so
531 00:21:46.845 --> 00:21:47.965 that it stays nice
532 00:21:47.965 --> 00:21:51.885 and encrypted at rest, um, in, in, in our, in our VPC.
533 00:21:51.885 --> 00:21:54.245 And obviously, if, you know, through your KMS,
534 00:21:54.245 --> 00:21:55.765 you can revoke those keys at any time.
535 00:21:56.105 --> 00:21:58.845 And then Zilliz no longer has access to any of your data.
536 00:21:58.845 --> 00:22:01.005 So that's a very popularly requested feature
537 00:22:01.025 --> 00:22:02.325 as well by enterprises.
538 00:22:02.625 --> 00:22:04.885 Uh, and it's something that we worked really hard on, uh,
539 00:22:04.885 --> 00:22:07.085 just to make sure that, you know, uh, the, the folks
540 00:22:07.105 --> 00:22:10.245 who have these very stringent requirements, uh, in terms of,
541 00:22:10.265 --> 00:22:11.845 you know, PII and all the other,
542 00:22:12.005 --> 00:22:13.805 'cause, you know, a lot of this data is, is again,
543 00:22:13.805 --> 00:22:14.725 like I said at the very beginning,
544 00:22:15.405 --> 00:22:16.485 internal to your organization.
545 00:22:16.545 --> 00:22:19.605 So it's very important that, you know, the, the, the CISOs
546 00:22:19.765 --> 00:22:22.325 of those organizations feel comfortable that all that data,
547 00:22:22.705 --> 00:22:24.725 uh, is firmly in their control
548 00:22:24.985 --> 00:22:28.605 and if necessary, could be, uh, revoked at any time,
549 00:22:28.705 --> 00:22:29.925 uh, if, if there is an issue.
550 00:22:30.025 --> 00:22:31.085 So I'll stop there.
551 00:22:31.125 --> 00:22:33.805 I know we have about four minutes left at the end, uh, for,
552 00:22:33.825 --> 00:22:36.325 for, for questions, but, uh, yeah, uh, happy
553 00:22:36.325 --> 00:22:37.325 to take any questions now,
554 00:22:37.325 --> 00:22:38.605 or Chris, I'll pass it back to you.
555 00:22:38.905 --> 00:22:41.245 Do you, what about, um, you talked about scaling up.
556 00:22:41.245 --> 00:22:43.765 What about scaling down, uh, on the previous topic?
557 00:22:44.075 --> 00:22:48.205 Yeah, so scaling down, um, is, is done the same way.
558 00:22:48.345 --> 00:22:50.405 Uh, you can go to this scale button here,
559 00:22:50.405 --> 00:22:51.725 you can scale back down to one CU,
560 00:22:51.845 --> 00:22:52.885 I can just do that right now if you want.
561 00:22:53.265 --> 00:22:56.725 Um, and then the cluster will scale back down to one.
562 00:22:57.065 --> 00:22:58.485 Uh, it's also important to note
563 00:22:58.485 --> 00:23:01.005 that auto scale does not scale back down, right?
564 00:23:01.185 --> 00:23:04.085 So that's the, and the reason why that is, is
565 00:23:04.085 --> 00:23:08.685 because scaling down is inherently a more dangerous
566 00:23:08.685 --> 00:23:10.125 operation than scaling up.
567 00:23:10.465 --> 00:23:12.965 Uh, and there's the opportunity for, uh,
568 00:23:12.965 --> 00:23:14.125 increased query latency.
569 00:23:14.125 --> 00:23:17.245 There's the opportunity for your production application to,
570 00:23:17.425 --> 00:23:19.725 um, experience issues if it is part an
571 00:23:20.005 --> 00:23:21.045 integral part of that pipeline.
572 00:23:21.785 --> 00:23:23.605 So that's the decision we made there.
573 00:23:23.745 --> 00:23:26.085 Uh, you know, if, if, if you wanna scale back down,
574 00:23:26.085 --> 00:23:29.685 like I said before in a programmatic fashion, you can use
575 00:23:30.425 --> 00:23:33.085 our query cluster metrics endpoint
576 00:23:33.105 --> 00:23:34.725 and our modify cluster endpoint
577 00:23:34.825 --> 00:23:37.085 to scale back down if you wanna do it programmatically,
578 00:23:37.265 --> 00:23:40.245 or you can do it through the UI that, uh, that, uh, that,
579 00:23:40.245 --> 00:23:41.325 that I just showed you right now.
580 00:23:41.705 --> 00:23:43.885 Um, but yeah, that, that's, that's a really, uh,
581 00:23:43.885 --> 00:23:45.640 common thing that a lot of folks, folks do just
582 00:23:45.640 --> 00:23:48.565 because they might be running some import job,
583 00:23:48.825 --> 00:23:52.245 or they might be, uh, experimenting with a lot of vectors
584 00:23:52.245 --> 00:23:53.525 and they just remove them all at once
585 00:23:53.525 --> 00:23:55.205 and they, they have this extra capacity.
586 00:23:55.205 --> 00:23:57.125 But again, you can, you can scale it down,
587 00:23:57.125 --> 00:23:58.205 you know, in step, right?
588 00:23:58.205 --> 00:24:00.605 So you don't have to go all the way down from 32 to 16.
589 00:24:00.605 --> 00:24:02.765 You can go, you know, you can go in step just to kind of see
590 00:24:03.085 --> 00:24:04.125 what the performance looks like.
591 00:24:04.385 --> 00:24:07.245 And again, it's, it's, it's really a, a testament to how
592 00:24:08.005 --> 00:24:09.845 flexible and horizontally scalable Zilliz is.
593 00:24:10.695 --> 00:24:13.235 Oh. So, uh, if anybody has any questions, uh,
594 00:24:13.235 --> 00:24:14.475 feel free to raise your hand.
595 00:24:14.475 --> 00:24:16.835 I'll unmute, uh, you so you can ask,
596 00:24:16.855 --> 00:24:19.115 or if you prefer talking or typing it in, that's fine.
597 00:24:20.015 --> 00:24:23.275 Um, but, uh, Jay, I actually have a couple of questions.
598 00:24:23.415 --> 00:24:26.875 So, um, let's talk a little bit about bringing data in.
599 00:24:27.265 --> 00:24:29.875 What are the ways that we can bring in data into
600 00:24:29.875 --> 00:24:30.955 the database efficiently?
601 00:24:31.825 --> 00:24:36.195 Sure. So we offer a bulk, uh,
602 00:24:36.415 --> 00:24:38.035 insert, uh, API,
603 00:24:38.615 --> 00:24:42.995 and essentially the way it works is you'll define the schema
604 00:24:43.215 --> 00:24:44.635 for your Zilliz database,
605 00:24:45.375 --> 00:24:48.515 and you can send us either Parquet files
606 00:24:48.815 --> 00:24:51.315 or you can send us JSON files that have all
607 00:24:51.315 --> 00:24:52.915 of the data, uh, in them.
608 00:24:53.415 --> 00:24:56.715 And you can just send us those, you know, links to those S3,
609 00:24:56.715 --> 00:24:58.755 you know, wherever is object storage, wherever it is,
610 00:24:59.015 --> 00:25:00.315 as long as it's authenticated.
611 00:25:00.615 --> 00:25:02.075 Uh, and then we will handle all
612 00:25:02.075 --> 00:25:03.395 of the import parallelization
613 00:25:03.495 --> 00:25:06.315 and all that, um, uh, on our end, right?
614 00:25:06.975 --> 00:25:08.675 The, the other way to do it is
615 00:25:08.935 --> 00:25:11.755 to just use the upsert endpoint, and that also works as well.
616 00:25:11.755 --> 00:25:13.395 If you want to do it in a more iterative fashion,
617 00:25:13.495 --> 00:25:14.595 that's, that's also fine.
618 00:25:14.595 --> 00:25:16.075 We have tons of customers do it that way.
619 00:25:16.575 --> 00:25:20.155 Uh, but if you have, say, let's say, you know, uh,
620 00:25:20.635 --> 00:25:23.435 150 million vectors that you wanna load all at once,
621 00:25:23.695 --> 00:25:25.635 and you have them somewhere, right?
622 00:25:25.635 --> 00:25:28.155 Whether they're in some other database or Snowflake
623 00:25:28.295 --> 00:25:31.195 or wherever, uh, you know, Parquet
624 00:25:31.195 --> 00:25:33.715 and JSON are pretty industry standard file types.
625 00:25:33.815 --> 00:25:37.595 And, uh, most ETL pipelines are able to, to write to those,
626 00:25:38.135 --> 00:25:39.515 uh, to those formats.
627 00:25:39.515 --> 00:25:40.675 And, and, and we accept both
628 00:25:40.675 --> 00:25:42.315 of those formats in, in a bulk fashion.
629 00:25:42.375 --> 00:25:45.675 So it allows you to kind of, uh, uh, you know, import, uh,
630 00:25:45.675 --> 00:25:47.715 very quickly if you're coming from a, uh,
631 00:25:47.715 --> 00:25:49.355 another vector database solution.
632 00:25:49.415 --> 00:25:51.275 Uh, we have import pipelines specific
633 00:25:51.275 --> 00:25:52.555 to those vector database solutions.
634 00:25:52.735 --> 00:25:54.315 Uh, so if you're coming from Qdrant,
635 00:25:54.315 --> 00:25:57.235 or if you're coming from Pinecone, um, we have,
636 00:25:57.415 --> 00:25:59.155 uh, native support for both of those.
637 00:25:59.155 --> 00:26:00.755 You just give us the API keys to each
638 00:26:00.755 --> 00:26:02.275 of those hosted solutions, uh,
639 00:26:02.275 --> 00:26:03.395 and we're able to suck the data
640 00:26:03.395 --> 00:26:04.715 in directly from that database.
641 00:26:05.055 --> 00:26:08.955 Um, there are a couple trade offs there. Excuse me.
642 00:26:09.285 --> 00:26:10.395 There are a couple trade offs there.
643 00:26:10.455 --> 00:26:12.715 Uh, you know, especially around like the schema.
644 00:26:13.015 --> 00:26:14.515 So, you know, you're, you're kind of stuck
645 00:26:14.515 --> 00:26:15.675 to the schema that you already had.
646 00:26:15.975 --> 00:26:17.075 Um, when you're,
647 00:26:17.075 --> 00:26:19.195 when you're coming from the previous solution,
648 00:26:19.815 --> 00:26:21.115 if you wanna change the schema
649 00:26:21.115 --> 00:26:22.355 or if you wanna change your embedding model,
650 00:26:22.415 --> 00:26:24.315 if there's things that you'd like to change, uh,
651 00:26:24.535 --> 00:26:27.115 you would just have to create those Parquet
652 00:26:27.115 --> 00:26:28.355 and JSON files that I mentioned earlier,
653 00:26:28.615 --> 00:26:29.675 uh, and, and go that way.
654 00:26:29.675 --> 00:26:30.715 And that's, you know, generally
655 00:26:30.715 --> 00:26:33.235 what most people do if they do wanna change their schema,
656 00:26:33.615 --> 00:26:36.395 But you're here to help, right, Jay, uh, in, in making
657 00:26:36.395 --> 00:26:38.035 that, helping them make that decision.
658 00:26:38.545 --> 00:26:40.395 Yeah, of course. I mean, when we work with customers
659 00:26:40.395 --> 00:26:42.915 who are moving from something else, uh, you know, we,
660 00:26:43.025 --> 00:26:45.485 we ask a lot of questions about how they're currently set up
661 00:26:45.485 --> 00:26:48.285 and, you know, we, we have a very, uh, helpful team that's,
662 00:26:48.285 --> 00:26:50.405 that's able to not only recommend solutions,
663 00:26:50.405 --> 00:26:52.725 but also help you guys, uh, make sure that the,
664 00:26:52.825 --> 00:26:54.165 the import process is as smooth
665 00:26:54.165 --> 00:26:55.525 as possible from wherever you're coming from.
666 00:26:55.955 --> 00:26:58.245 Yeah. And so don't forget the, uh, bulk import
667 00:26:58.245 --> 00:26:59.845 or the BulkWriter capability.
668 00:27:00.085 --> 00:27:01.365 I don't know why, but we've had a number
669 00:27:01.445 --> 00:27:02.765 of customers that overlooked it.
670 00:27:03.115 --> 00:27:05.885 They were frustrated with doing it one by one,
671 00:27:05.885 --> 00:27:09.165 and then Jay was like, Hey, we have this other capability.
672 00:27:09.225 --> 00:27:11.845 So, yep. Saves a lot of time.
673 00:27:12.035 --> 00:27:13.165 It's there. It, it's definitely there.
674 00:27:14.385 --> 00:27:17.685 Now let's talk a little bit about, um, uh,
675 00:27:18.165 --> 00:27:19.965 multiple embeddings in,
676 00:27:20.125 --> 00:27:22.045 especially when you're trying to do hybrid search.
677 00:27:22.145 --> 00:27:24.045 So, you know, when you talk about, uh,
678 00:27:24.115 --> 00:27:25.925 when we look at a row, you can have more
679 00:27:25.925 --> 00:27:26.965 than just one, right?
680 00:27:26.965 --> 00:27:29.805 Vector embedding. So what is that and why is that important?
681 00:27:29.955 --> 00:27:31.645 Yeah, that's, that's really, uh,
682 00:27:32.125 --> 00:27:33.685 I think something that is pretty unique to us.
683 00:27:33.745 --> 00:27:37.725 So we support four vector embeddings per entry.
684 00:27:38.545 --> 00:27:42.325 So you can have, uh, two dense and one sparse.
685 00:27:42.325 --> 00:27:43.805 You could have four dense and one sparse,
686 00:27:43.825 --> 00:27:45.125 or three dense and one sparse.
687 00:27:45.585 --> 00:27:50.485 Um, and you can run queries both sliced down each
688 00:27:50.485 --> 00:27:51.765 of those vector embedding.
689 00:27:51.765 --> 00:27:53.085 So there's like one index for each of those.
690 00:27:53.085 --> 00:27:55.285 So if you only want to say, for example, run the ANN
691 00:27:55.285 --> 00:27:58.845 across the first set of dense vectors,
692 00:27:58.845 --> 00:27:59.965 and then you can run a separate ANN
693 00:27:59.965 --> 00:28:02.085 across the second set of dense vectors.
694 00:28:02.505 --> 00:28:04.685 Uh, that's very helpful when you're testing
695 00:28:05.315 --> 00:28:06.605 different embedding models.
696 00:28:06.865 --> 00:28:08.325 So there's a lot of embedding models now
697 00:28:08.325 --> 00:28:10.205 that are fine tuned, or maybe you're tuning them yourself.
698 00:28:10.505 --> 00:28:12.245 Um, and, and, and you want to see
699 00:28:12.915 --> 00:28:16.725 what those vector spaces look like, uh, for,
700 00:28:16.745 --> 00:28:17.885 for each of those embedding.
701 00:28:17.965 --> 00:28:19.565 'cause you know, it might be, you could take the same piece
702 00:28:19.565 --> 00:28:21.165 of text, run it through different embedding models,
703 00:28:21.165 --> 00:28:22.925 and you could be totally, the location could be totally
704 00:28:22.925 --> 00:28:25.445 different based on, uh, what the tuning looks like.
705 00:28:25.545 --> 00:28:29.085 So we see that a lot with customers that want to see,
706 00:28:29.145 --> 00:28:31.445 is it worth it for me to use this fine tuned model?
707 00:28:31.465 --> 00:28:32.925 How much benefit do I get from it?
708 00:28:33.425 --> 00:28:36.085 And you can just send the queries to each of them, you know,
709 00:28:36.105 --> 00:28:38.405 all day long, uh, and, and see what they look like.
710 00:28:38.465 --> 00:28:40.285 So that's a very popular use case.
711 00:28:40.625 --> 00:28:42.685 Um, the other popular use case that we see is
712 00:28:42.685 --> 00:28:43.845 with sparse vectors.
713 00:28:44.265 --> 00:28:48.965 And sparse vectors are primarily used for, uh, blending, uh,
714 00:28:49.225 --> 00:28:51.445 lexical search with semantic search, right?
715 00:28:52.315 --> 00:28:54.415 So there are, uh, the most popular ones,
716 00:28:54.495 --> 00:28:56.295 probably BM25, which has been around for decades.
717 00:28:56.795 --> 00:28:59.455 Uh, but there are newer ones coming up called, you know,
718 00:28:59.455 --> 00:29:01.575 there's one called SPLADE that's also very interesting.
719 00:29:01.915 --> 00:29:05.055 Uh, and it allow, we, we allow you to have them sit side
720 00:29:05.075 --> 00:29:06.135 by side, uh,
721 00:29:06.155 --> 00:29:09.655 and you can run a, a, a hybrid search on both of them.
722 00:29:09.835 --> 00:29:13.215 So, you know, this, this is very helpful for, for example,
723 00:29:13.835 --> 00:29:18.055 um, in e-commerce use cases where you have, say,
724 00:29:18.495 --> 00:29:22.135 a SKU or a UPC code that's very unique and,
725 00:29:22.135 --> 00:29:24.015 and you're like a hundred percent sure that
726 00:29:24.015 --> 00:29:25.895 that chunk will have that UPC code in it.
727 00:29:26.275 --> 00:29:29.975 So if you use a sparse vector on that,
728 00:29:30.265 --> 00:29:32.215 it'll combine the dense
729 00:29:32.435 --> 00:29:35.335 and sparse together, so you're combining semantic
730 00:29:35.595 --> 00:29:36.855 and lexical together,
731 00:29:37.115 --> 00:29:39.575 and it'll overweight the ones that have
732 00:29:39.575 --> 00:29:41.205 that exact UPC code in them.
733 00:29:41.205 --> 00:29:43.845 So it's, it basically, it, it, it increases the probability,
734 00:29:43.845 --> 00:29:45.245 it'll push those results to the top,
735 00:29:45.315 --> 00:29:47.405 whereas if you just did it as semantic,
736 00:29:47.905 --> 00:29:48.925 it may or may not be there.
737 00:29:48.925 --> 00:29:51.125 You might need to, you know, jack up the, the top K to kind
738 00:29:51.125 --> 00:29:52.245 of get exactly what you're looking for.
739 00:29:52.505 --> 00:29:55.085 But it allows you to fine tune your queries
740 00:29:55.085 --> 00:29:58.765 for those use cases where, uh, you know, semantic is great
741 00:29:58.945 --> 00:30:00.005 and we want to use it,
742 00:30:00.185 --> 00:30:03.325 but it's, it's also, you know, we we're, we're pretty sure
743 00:30:03.325 --> 00:30:04.325 that it has this in it,
744 00:30:04.345 --> 00:30:05.965 and we want you to do a lexical match
745 00:30:05.965 --> 00:30:07.165 as well, so we support that as well.
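
To make the "overweight the exact match" idea above concrete, here is a minimal sketch of blending a dense (semantic) score with a sparse (lexical) score per document. It uses a plain weighted sum, which is one common fusion approach and not necessarily how Zilliz combines results. The function name, the weights, and the sample scores are all hypothetical.

```python
def hybrid_rank(dense_scores, sparse_scores,
                dense_weight=0.5, sparse_weight=0.5, top_k=3):
    """Rank document ids by a weighted sum of dense and sparse scores.

    Both inputs map document id -> score; a missing id counts as 0.0.
    """
    ids = set(dense_scores) | set(sparse_scores)
    combined = {
        doc_id: dense_weight * dense_scores.get(doc_id, 0.0)
        + sparse_weight * sparse_scores.get(doc_id, 0.0)
        for doc_id in ids
    }
    # Highest combined score first; ties broken by id for a stable order.
    return sorted(combined, key=lambda doc_id: (-combined[doc_id], doc_id))[:top_k]


# Hypothetical scores: doc_b contains the exact UPC code, so only it gets a sparse hit.
dense = {"doc_a": 0.82, "doc_b": 0.80, "doc_c": 0.40}
sparse = {"doc_b": 1.0}

semantic_only = hybrid_rank(dense, {})   # doc_a ranks first
hybrid = hybrid_rank(dense, sparse)      # doc_b is pushed to the top
```

With semantic scores alone, the exact-match document sits in second place; adding the sparse score moves it to the top without raising the top K.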
746 00:30:07.955 --> 00:30:10.375 So, I mean, I think there's, there used to be kind
747 00:30:10.375 --> 00:30:11.975 of a hacky way that you could do this, right?
748 00:30:11.995 --> 00:30:13.735 You could have these different embeddings
749 00:30:13.735 --> 00:30:15.695 and, you know, different collections, so mm-hmm.
750 00:30:16.285 --> 00:30:20.015 What, you know, what did we do to make it more useful
751 00:30:20.015 --> 00:30:22.255 besides, you know, doing a hybrid search by putting it,
752 00:30:22.315 --> 00:30:23.815 you know, under one entity.
753 00:30:25.035 --> 00:30:27.615 Uh, so it's, it's, it's part of the way
754 00:30:27.615 --> 00:30:29.535 that our index strategy works.
755 00:30:29.565 --> 00:30:31.535 Like we, we look at each of the, uh,
756 00:30:31.635 --> 00:30:34.495 vector entries individually, um, as opposed
757 00:30:34.515 --> 00:30:37.455 to having a single index for the entire collection,
758 00:30:37.455 --> 00:30:39.535 which is generally how, uh, a lot
759 00:30:39.535 --> 00:30:41.095 of the other vector databases do it.
760 00:30:41.155 --> 00:30:45.655 So we're a database first, index second, I think a lot
761 00:30:45.655 --> 00:30:47.615 of the vector databases out there are index
762 00:30:47.615 --> 00:30:48.775 first, database second.
763 00:30:48.835 --> 00:30:50.215 So that's, that's a very, uh,
764 00:30:50.215 --> 00:30:51.815 important distinction, I think to make.
765 00:30:52.195 --> 00:30:53.455 Uh, and it just comes from our history.
766 00:30:53.475 --> 00:30:56.095 You know, our founder, you know, was, was at Oracle
767 00:30:56.155 --> 00:30:58.015 for a long time and, uh, you know,
768 00:30:58.015 --> 00:31:01.095 he's very deeply knowledgeable in, in database design, uh,
769 00:31:01.115 --> 00:31:02.535 and, and, and what constitutes a
770 00:31:02.535 --> 00:31:03.655 good database design, right?
771 00:31:03.655 --> 00:31:05.455 So that's a really good foundation for us.
772 00:31:05.755 --> 00:31:07.135 And then we've added, you know,
773 00:31:07.255 --> 00:31:09.575 a world-class vector capability on top of it,
774 00:31:09.745 --> 00:31:11.495 which gives us the best of both worlds, right?
775 00:31:11.495 --> 00:31:13.415 It gives us, you know, a lot of the flexibility
776 00:31:13.415 --> 00:31:15.055 of being an actual database, right?
777 00:31:15.355 --> 00:31:18.815 Um, but at the same time having this, you know, great, um,
778 00:31:19.165 --> 00:31:21.175 very intelligent, constantly changing,
779 00:31:21.435 --> 00:31:25.055 but also, uh, you know, simple for you guys to use, right?
780 00:31:25.115 --> 00:31:26.175 And, and through, through a lot of
781 00:31:26.175 --> 00:31:27.415 the abstractions that we've created.
782 00:31:27.835 --> 00:31:31.055 Uh, and, and you kind of get to utilize a lot of the power
783 00:31:31.055 --> 00:31:32.815 that, that that's under the hood, you know,
784 00:31:32.815 --> 00:31:34.895 while having an interface that's very easy to understand.
785 00:31:35.675 --> 00:31:36.885 Cool. So don't forget,
786 00:31:36.885 --> 00:31:38.765 we'll leave the lines open for a little bit longer.
787 00:31:38.985 --> 00:31:41.965 If you have any questions, pop it in the q and a or the chat
788 00:31:41.965 --> 00:31:43.605 or raise your hand and I'll unmute lines.
789 00:31:44.145 --> 00:31:46.285 Um, but I have one more question for you, Jay.
790 00:31:46.285 --> 00:31:47.925 Well, I have always more than just one,
791 00:31:48.305 --> 00:31:51.325 but if you don't mind if we can talk a little bit about
792 00:31:51.515 --> 00:31:54.565 multi-tenancy and all the different ways
793 00:31:54.675 --> 00:31:56.205 that we can do multi-tenancy.
794 00:31:56.435 --> 00:31:59.165 Sure. And then what are the pros and cons for each?
795 00:31:59.985 --> 00:32:04.925 So, uh, in Zilliz, we, we, we generally recommend
796 00:32:05.155 --> 00:32:09.725 that you use logical isolation for multi-tenancy.
797 00:32:09.825 --> 00:32:12.765 So you put all your vectors in a single, uh, collection,
798 00:32:13.345 --> 00:32:16.285 and you use something called a partition key
799 00:32:16.665 --> 00:32:20.645 to logically isolate vector space, um, amongst your tenants.
800 00:32:20.745 --> 00:32:23.965 So if you have, say, uh, you know, a handful of tenants,
801 00:32:24.305 --> 00:32:27.925 you would basically, uh, when you upstart them, give,
802 00:32:28.065 --> 00:32:29.845 assign a partition key to all
803 00:32:29.845 --> 00:32:31.485 of those vectors that belong to that tenant.
804 00:32:31.825 --> 00:32:34.525 So that at query time, when you send the query to us,
805 00:32:34.625 --> 00:32:36.165 you can give it that partition key.
806 00:32:36.165 --> 00:32:38.005 And Zilliz will essentially ignore the rest
807 00:32:38.005 --> 00:32:41.045 of the vector space and only perform the ANN
808 00:32:41.045 --> 00:32:43.445 across the, the tenant's vectors, right?
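
The partition-key idea above (restrict the search to one tenant's vectors, then rank by similarity) can be illustrated with a toy in-memory version. This is a brute-force sketch for intuition only; it does not use the Zilliz or Milvus API, and a real system would use an approximate index rather than scanning every row. The row layout and the `tenant` field name are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_tenant(rows, query, tenant, top_k=2):
    """Return ids of the top_k rows most similar to query, within one tenant."""
    # Logical isolation: ignore every row whose partition key is another tenant.
    candidates = [row for row in rows if row["tenant"] == tenant]
    candidates.sort(key=lambda row: cosine(row["vector"], query), reverse=True)
    return [row["id"] for row in candidates[:top_k]]


# Hypothetical collection shared by two tenants.
rows = [
    {"id": 1, "tenant": "acme", "vector": [1.0, 0.0]},
    {"id": 2, "tenant": "acme", "vector": [0.0, 1.0]},
    {"id": 3, "tenant": "globex", "vector": [1.0, 0.0]},
    {"id": 4, "tenant": "acme", "vector": [0.7, 0.7]},
]
acme_hits = search_tenant(rows, [1.0, 0.1], "acme")
globex_hits = search_tenant(rows, [1.0, 0.1], "globex")
```

Row 3 matches the query about as well as row 1, but it never appears in the `acme` results because the tenant filter is applied before ranking.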
809 00:32:43.905 --> 00:32:47.085 Um, the other way to do it is with physical isolation, uh,
810 00:32:47.085 --> 00:32:50.725 where you create one collection per tenant, uh,
811 00:32:50.865 --> 00:32:52.205 and that also works as well.
812 00:32:52.505 --> 00:32:54.765 Uh, there's a lot of customers that prefer that just
813 00:32:54.765 --> 00:32:57.125 because, you know, they may have SLAs with their customers
814 00:32:57.125 --> 00:32:58.845 that say, no, you need physical isolation.
815 00:32:58.865 --> 00:33:00.045 You can't be commingled
816 00:33:00.045 --> 00:33:02.005 with other tenants, which is also fine.
817 00:33:02.065 --> 00:33:03.925 The downside there is, you know, there are,
818 00:33:04.215 --> 00:33:05.645 there are some upper limits on
819 00:33:05.645 --> 00:33:07.405 how many collections you can have in a cluster.
820 00:33:07.785 --> 00:33:09.565 You can obviously have more clusters, which is fine,
821 00:33:09.565 --> 00:33:11.885 you know, a lot of folks use that, uh, to, to get around it.
822 00:33:11.885 --> 00:33:14.525 But with logical isolation, with partition keys, I mean,
823 00:33:14.525 --> 00:33:15.845 you can have upwards,
824 00:33:15.845 --> 00:33:17.725 I mean, you can have millions of tenants, right, in
825 00:33:17.725 --> 00:33:18.925 the same collection.
826 00:33:19.225 --> 00:33:21.725 Um, and then you can send all of the queries to
827 00:33:21.725 --> 00:33:23.685 that same collection, uh, and, and,
828 00:33:23.685 --> 00:33:25.365 and still get exactly what you're looking for.
829 00:33:25.425 --> 00:33:28.045 And more importantly, you know, if you want to run
830 00:33:28.765 --> 00:33:29.885 a global ANN
831 00:33:29.885 --> 00:33:31.965 across all the tenants, you know, maybe say
832 00:33:31.965 --> 00:33:33.205 for analytics use cases,
833 00:33:33.305 --> 00:33:34.845 or, you know, you just, you just wanna see,
834 00:33:34.925 --> 00:33:36.165 I, I wanna see what this looks like.
835 00:33:36.505 --> 00:33:39.965 Um, you would have to, in the, um,
836 00:33:40.585 --> 00:33:43.765 in the one-tenant-per-collection model, you would have
837 00:33:43.765 --> 00:33:45.765 to send, you know, a single query to all
838 00:33:45.765 --> 00:33:46.845 of those collections, right?
839 00:33:47.385 --> 00:33:49.405 And then kind of merge them together versus, you know,
840 00:33:49.405 --> 00:33:50.925 if you had them all in a single collection,
841 00:33:50.925 --> 00:33:52.845 you can just send one ANN and it'll just do all of them.
842 00:33:52.925 --> 00:33:55.045 So those are, those are the... we, we support both.
843 00:33:55.145 --> 00:33:57.165 Uh, you know, it's, it's, it's entirely up to you.
844 00:33:57.165 --> 00:33:59.245 And obviously we, we as a team, work with you very closely
845 00:33:59.385 --> 00:34:01.205 to, to figure out what trade-off makes sense.
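The two multi-tenancy models Jay describes can be sketched with a toy, in-memory model. This is only an illustration, not the Zilliz Cloud or Milvus API: the names `SharedCollection`, `search_rows`, and `fanout_search` are hypothetical, and an exact brute-force nearest-neighbor scan stands in for a real ANN index. It shows a partition key restricting a search to one tenant's vectors, and the fan-out-and-merge step that a global search needs when each tenant has its own collection.

```python
import heapq
import math
from collections import defaultdict


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search_rows(rows, query, limit):
    """Exact top-k scan over (id, vector) rows; a stand-in for an ANN index."""
    return heapq.nsmallest(limit, ((l2(query, vec), row_id) for row_id, vec in rows))


class SharedCollection:
    """Logical isolation: one collection, every row tagged with a partition key."""

    def __init__(self):
        self._rows_by_key = defaultdict(list)  # partition key -> [(id, vector)]

    def upsert(self, tenant, row_id, vector):
        rows = self._rows_by_key[tenant]
        rows[:] = [r for r in rows if r[0] != row_id]  # replace an existing id
        rows.append((row_id, vector))

    def search(self, query, limit, tenant=None):
        if tenant is not None:
            # Partition key given: ignore the rest of the vector space.
            candidates = self._rows_by_key.get(tenant, [])
        else:
            # No key: one query covers every tenant (the "global" case).
            candidates = [r for rows in self._rows_by_key.values() for r in rows]
        return search_rows(candidates, query, limit)


def fanout_search(collections, query, limit):
    """Physical isolation: query each tenant's collection, then merge the hits."""
    partial = [search_rows(rows, query, limit) for rows in collections.values()]
    return heapq.nsmallest(limit, (hit for hits in partial for hit in hits))


# Example data: two hypothetical tenants with two vectors each.
shared = SharedCollection()
per_tenant = {"acme": [], "globex": []}
for tenant, row_id, vec in [
    ("acme", "a1", [0.0, 0.0]),
    ("acme", "a2", [1.0, 1.0]),
    ("globex", "g1", [0.1, 0.0]),
    ("globex", "g2", [5.0, 5.0]),
]:
    shared.upsert(tenant, row_id, vec)
    per_tenant[tenant].append((row_id, vec))

tenant_hits = shared.search([0.0, 0.0], limit=2, tenant="acme")
global_hits = shared.search([0.0, 0.0], limit=2)
merged_hits = fanout_search(per_tenant, [0.0, 0.0], limit=2)
```

In this sketch the global search over the shared collection and the fan-out search return the same hits, but the fan-out version issues one query per tenant and needs a client-side merge, which is the extra work described above for the one-collection-per-tenant model.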
846 00:34:01.515 --> 00:34:03.485 Yeah. And typically it looks to me
847 00:34:03.485 --> 00:34:05.325 that people usually start, like, they start
848 00:34:05.325 --> 00:34:07.925 with the database, the collection, and then partition.
849 00:34:07.925 --> 00:34:12.045 But I think, uh, our advice to everybody is, uh, sit down
850 00:34:12.045 --> 00:34:14.845 with your solution architect, really describe your use case,
851 00:34:14.845 --> 00:34:15.845 what you're trying to achieve,
852 00:34:16.265 --> 00:34:18.565 and then they can help you, you know, go down the path
853 00:34:18.565 --> 00:34:20.965 that's gonna make the most sense for you from the get go,
854 00:34:20.965 --> 00:34:22.925 instead of like, kind of having to redo things.
855 00:34:23.745 --> 00:34:26.685 And then, um, also tied to multi-tenancy is
856 00:34:26.685 --> 00:34:29.725 that we have a pretty sophisticated set of, um,
857 00:34:29.775 --> 00:34:31.885 rules associated with RBAC, right?
858 00:34:31.945 --> 00:34:32.945 Jay?
859 00:34:33.575 --> 00:34:38.225 Yeah. So, uh, RBAC is, is, is really around,
860 00:34:38.565 --> 00:34:41.585 um, what your organization looks like
861 00:34:42.045 --> 00:34:44.545 and who is allowed to do what on your side.
862 00:34:44.545 --> 00:34:45.745 So we obviously work with you
863 00:34:45.745 --> 00:34:47.985 to look at your org chart, right?
864 00:34:48.135 --> 00:34:50.785 Make sure that the right folks have access
865 00:34:50.805 --> 00:34:52.865 to the right things, and more importantly,
866 00:34:52.865 --> 00:34:54.225 the right microservices have
867 00:34:54.225 --> 00:34:55.385 access to the right things, right?
868 00:34:55.385 --> 00:34:57.625 If you have a microservice that is,
869 00:34:57.645 --> 00:35:00.025 should only be reading from certain clusters
870 00:35:00.025 --> 00:35:01.225 and not other clusters, right?
871 00:35:01.225 --> 00:35:04.465 We wanna make sure that, um, that, that that's enforced, um,
872 00:35:04.645 --> 00:35:06.065 at the, at the API level.
873 00:35:06.125 --> 00:35:10.065 So, uh, there, there's a lot of very granular controls that,
874 00:35:10.065 --> 00:35:12.305 that we provide not only for, for users,
875 00:35:12.405 --> 00:35:14.545 but also for, for authentication keys.
876 00:35:15.005 --> 00:35:17.425 Um, and, you know, obviously we work with, with,
877 00:35:17.455 --> 00:35:20.145 with your teams very closely just to make sure that, uh,
878 00:35:20.405 --> 00:35:22.145 you know, we're, we're enforcing everything
879 00:35:22.145 --> 00:35:24.465 and all of the, all of the RBAC is set up correctly on your
880 00:35:24.465 --> 00:35:27.265 side so that you feel confident that, you know, if, uh,
881 00:35:27.485 --> 00:35:29.865 you know, if, if a microservice goes rogue, it's,
882 00:35:29.865 --> 00:35:31.705 it's not allowed to see inside
883 00:35:31.705 --> 00:35:33.185 of clusters that it's not supposed to see.
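As a rough sketch of the idea Jay describes (an API key bound to a role, with the check enforced on every call), the following toy model may help. It is not the Zilliz Cloud RBAC API: `Role`, `Gateway`, `AccessDenied`, the key strings, and the cluster names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


class AccessDenied(Exception):
    """Raised when a key is unknown or its role lacks the requested grant."""


@dataclass(frozen=True)
class Role:
    name: str
    grants: frozenset  # set of (cluster, action) pairs this role may perform


class Gateway:
    """Toy API front door: every request is checked against the key's role."""

    def __init__(self):
        self._role_by_key = {}

    def register_key(self, api_key, role):
        self._role_by_key[api_key] = role

    def authorize(self, api_key, cluster, action):
        role = self._role_by_key.get(api_key)
        if role is None or (cluster, action) not in role.grants:
            raise AccessDenied(f"{action} on {cluster} is not permitted")
        return True


# A hypothetical read-only microservice that may only read from one cluster.
reader = Role("search-service", frozenset({("cluster-a", "read")}))
gateway = Gateway()
gateway.register_key("key-search-service", reader)
```

With this setup, `gateway.authorize("key-search-service", "cluster-a", "read")` succeeds, while a write to `cluster-a` or any access to another cluster raises `AccessDenied`, so a misbehaving service cannot see into clusters it was never granted.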
884 00:35:34.335 --> 00:35:35.935 Excellent. Uh, and then if you got,
885 00:35:35.935 --> 00:35:38.405 if everyone's looking at the UI, you can see there's a lot
886 00:35:38.405 --> 00:35:40.085 of other capabilities that Jay didn't go over,
887 00:35:40.085 --> 00:35:41.205 but I think they're pretty simple.
888 00:35:41.505 --> 00:35:43.005 Uh, he talked about backups;
889 00:35:43.005 --> 00:35:46.205 obviously you can set those up. Migrations, uh,
890 00:35:46.385 --> 00:35:49.085 you can also look and see how the jobs are doing.
891 00:35:49.385 --> 00:35:51.925 And then of course, uh, the metrics, you can do any kind
892 00:35:51.925 --> 00:35:54.845 of monitoring, even if you use, uh, other DevOps tools like,
893 00:35:54.905 --> 00:35:57.765 uh, like a Datadog to be able to see, you know, how well,
894 00:35:58.025 --> 00:36:01.125 um, your instances are, uh, utilizing their resources.
895 00:36:02.995 --> 00:36:05.725 Cool. Um, so let me just share the last slide.
896 00:36:06.065 --> 00:36:07.365 So we don't have any questions,
897 00:36:07.425 --> 00:36:10.435 but I do wanna, um, just, let's see.
898 00:36:10.985 --> 00:36:12.395 I'll stop sharing so that you can. Thank you.
899 00:36:14.825 --> 00:36:17.075 Just wanna, uh, let everybody know.
900 00:36:17.095 --> 00:36:18.755 So how, how else can you get help from us
901 00:36:18.755 --> 00:36:20.515 besides coming to these webinars?
902 00:36:20.515 --> 00:36:22.115 You can join us on the Discord channel.
903 00:36:22.855 --> 00:36:25.755 Um, we also have, and there's the link,
904 00:36:25.755 --> 00:36:28.235 and I'll, I'll, we'll send all this information after this.
905 00:36:28.295 --> 00:36:32.235 Uh, after, um, today's session, uh, you can also set up a,
906 00:36:32.655 --> 00:36:35.075 uh, 20 minute, uh, private office hours.
907 00:36:35.655 --> 00:36:37.915 Uh, we can do this, you know, 24/7.
908 00:36:38.175 --> 00:36:41.395 So, uh, we're available to make sure that we help in, uh,
909 00:36:41.395 --> 00:36:43.355 both your Milvus and your Zilliz implementations.
910 00:36:43.855 --> 00:36:45.955 Uh, we can also, uh, um,
911 00:36:46.215 --> 00:36:48.915 put in your issues in GitHub, uh, as well.
912 00:36:48.915 --> 00:36:50.915 And that's where the entire engineering team is also
913 00:36:50.915 --> 00:36:52.795 available to answer any questions,
914 00:36:53.025 --> 00:36:55.755 whether it's a feature request or you found a bug,
915 00:36:55.775 --> 00:36:57.755 or you, something's a little bit tricky, just pop
916 00:36:57.755 --> 00:36:59.195 that in there if that's easy for you.
917 00:36:59.895 --> 00:37:01.675 Uh, we also have a little chatbot.
918 00:37:01.675 --> 00:37:04.435 Of course, we should, since we drive a lot
919 00:37:04.435 --> 00:37:06.115 of the chatbots on our docs pages,
920 00:37:06.235 --> 00:37:08.155 I just put a little screen there, screenshot of that.
921 00:37:08.815 --> 00:37:11.915 And then, uh, we can also set up a, uh,
922 00:37:12.155 --> 00:37:14.235 a private Slack channel, which I didn't put in here,
923 00:37:14.255 --> 00:37:15.995 but that's also a possibility if
924 00:37:15.995 --> 00:37:18.915 that's your preferred method for getting any kind of help.
925 00:37:19.815 --> 00:37:21.115 So, and, uh,
926 00:37:21.175 --> 00:37:24.755 we wish you great success in implementing your Zilliz
927 00:37:24.755 --> 00:37:26.475 instance, uh, but don't be a stranger.
928 00:37:26.535 --> 00:37:27.955 Let us know how you're, what you're building,
929 00:37:28.375 --> 00:37:30.075 how you're doing, how we can help.
930 00:37:30.415 --> 00:37:31.475 Uh, we're always here
931 00:37:31.495 --> 00:37:33.275 to help, and we wanna make you successful.
932 00:37:34.335 --> 00:37:36.315 Jay, any last words of, uh, advice?
933 00:37:37.705 --> 00:37:39.765 No, uh, have fun. Uh, it's, it's a cool product.
934 00:37:40.205 --> 00:37:41.445 You'll, you'll, and, uh,
935 00:37:41.445 --> 00:37:42.525 we're always here to help. Of course.
936 00:37:43.325 --> 00:37:45.405 Excellent. All right. Have a great one everyone.
937 00:37:45.405 --> 00:37:46.565 We'll see you again. Bye.