Webinar
Discover the Latest Features in Zilliz Cloud at our Fall Live Demo Event
Join the Webinar
About the Session
Hey there! A heads-up that the Zilliz Cloud Fall Live Demo is happening soon. This event is a must-attend if you want to stay up to date: a technical deep dive into our latest advancements. You'll see the best practices and new features firsthand and learn from our product experts how to incorporate them into your projects.
Topics Covered:
- Live demonstration of new features: Range Search, Cosine Similarity, Upsert, and more
- A tour of Zilliz Cloud's interface
- Best practices for optimizing your Zilliz Cloud setup
- Q&A session to address your technical questions
I'm going to get into the architecture of Milvus a little bit. We're going to talk through what makes a distributed-system database special and what it helps you do. Then I'm going to talk a little bit about Zilliz Cloud, which is a fully managed version of Milvus that also has some hardware optimizations built in. I'll advertise Zilliz Cloud to you a little, talking about who our customers are and what they're working with, and I'll leave some time at the end for you to ask any questions you want.
So let me get started here. Let's share the screen. Here we go.
Okay, can everyone see this? Everyone can see this? Okay. It looks like we've got a small audience here today, so I guess this gives us a chance to do a bit more of an interactive session. If you could, I'd love to have you drop in the chat whether you're a software engineer or a data scientist or whatever your role is, so that I can angle this toward something that will be helpful for you.
Okay. So, software engineer. Good. Good. Okay.
Okay, software engineers. Okay, good. So it looks like we have a pretty strong engineering audience. That's great, because it lets me talk about Milvus and the infrastructure of Milvus — the stuff that I really like.
Oh, we have someone in sales too. Okay, good. So you're in sales. Everyone is just messaging the hosts and panelists — that's why I can see all of your messages.
And that's why I'm reading them out loud, because I know you can't see them. Ah, LOL — yeah, I can see everything you're typing. Okay, cool.
So I'll talk a little bit about the sales side of things, but at the beginning we're going to focus on some of the architecture and engineering stuff. Okay, product and engineering. Cool.
So yeah, I'll cover product, but we're going to start with the engineering stuff — this is what I think is the most interesting part about Milvus. There's no introduction slide about me here — usually there is — so I'll cover that right now as well. My name is Yujian Tang.
I am a developer advocate here at Zilliz. I come from a software engineering background. Prior to being at Zilliz, I ran a natural language processing API that I created myself, and I drove about 2.5 million at Amazon in slightly under a year.
I started working at IBM when I was 16, and I've published two first-author papers on machine learning. So yeah — a very heavy machine learning background. And I think that what we're doing in vector databases is going to be one of the most important pieces of the new LLM paradigm. So let's start with why we're building Milvus like this.
Why does it matter that Milvus is a distributed-system vector database? Right now, the way I understand how people see vector databases from the outside is that they look at them and think it's all the same thing — it's just one blob, it's a vector database. But in reality, the implementations of vector databases are very different at the functionality level. All vector databases do similarity search. But Milvus is a vector database built with one goal in mind.
And that goal is: how do we handle insane scale? How do we handle ridiculous scale? Alibaba has 5 trillion vectors, and our goal is to be able to handle that kind of scale within the next few years. The reason we believe handling scale is going to be the most important problem to solve in this space is that vectors are inherently a way to represent unstructured data. That means vectors are going to be used to represent more than 80% of the data out there. I'm sure you've seen the Gartner numbers — all of this market research saying 80% of the data out there is unstructured. So there's four times more data out there that isn't being analyzed or used right now than data that is being used. We're literally only touching 20% of the data we estimate to be out there.
And we're not even analyzing and using all of that. Actually, this would be great to get some input from the audience: if you work at a company, does your company leverage all of its data, or is a lot of it just sitting there? Let me know — I'm going to keep talking, but I'll comment on your answers as I see them. My guess is that your company probably doesn't use all the data out there. So let's talk a little bit about Milvus's architecture and why this design lets us build something that's going to scale. This is a high-level diagram of what Milvus's architecture looks like.
You're going to start with an SDK. We have two core SDKs. 'Can you clarify the scalability target — is that for Milvus or Zilliz?' Right, so I'll go into this a little later. Milvus is the open-source, commercially available option — you can use it for free.
And our goal is for Milvus to be able to support whatever scale needs to be supported — trillions of vectors; we want Milvus to easily support that. Zilliz is our commercial offering of Milvus; it is our cloud-managed, fully managed option. And I always tell people: my role here at Zilliz is to get people to use Milvus — to get people to use vector databases and teach people about Milvus.
Because Milvus is really — at least in my opinion — extremely innovative. The team that built it was really, really thoughtful in how they approached this, and if you've seen the architecture of SQL or NoSQL databases, you'll understand what I mean when I say this is an innovative architecture. But Zilliz is obviously our commercial offering; we are a business, so we would like to move people to Zilliz.
And I tell people: you should use Milvus until you don't want to handle the DevOps anymore. Once you don't want to handle the DevOps, you should go to Zilliz, because for most companies — is handling the DevOps really a core part of your business? Is it mission critical for you? If it is, then handle the DevOps yourself; we provide Milvus as a commercially free option that lets you do that. But if handling DevOps is not mission critical to you, if it doesn't generate revenue for you, then you should use Zilliz, because it's probably going to be cheaper: you'll pay for fewer DevOps engineers and lower DevOps costs, because vectors are a very computationally intensive data type, and working with them you have to be able to optimize your EC2 instances and other things like that.
And if you don't want to do that, that's where Zilliz really comes in. H Kraus asks: 'How does the Zilliz service integrate with on-prem or cloud platforms like AWS?' So we are on AWS Marketplace — you can get Zilliz directly on AWS Marketplace. For on-prem, there is one thing — I actually don't know if this is out yet; this is kind of embarrassing.
I think this is out: we offer a service that lets you maintain the data plane while we only touch the control plane. So we never touch your data; we never see your data. This is something I know a lot of security-oriented companies are very interested in and very much want.
So this is something we are offering — it may be available now, or it may not be available to the public yet, but it is definitely available for enterprises. If the on-prem option matters to you, I would suggest getting in touch with the team here and talking about it in more detail. Okay, so let's talk about the architecture. Milvus is the core. The core design decisions revolve around how we build a system that allows flexible scaling, makes it easy for you to avoid things like data migrations, and reduces the amount of computationally intensive work you have to do with the system.
And this is different from the design decisions that go into a SQL or NoSQL database, because SQL databases, for example, require more complex operations across different tables. That's because the meaning behind the data in SQL databases is spread across those tables — you have relational data, which is why they're called relational databases. They match IDs, and they match attributes to those IDs, and that's how they determine the meaning behind the data in the database. A vector database is very different, because vectors encode nearly all the semantic meaning you need. You can add metadata to it, and you should — it will enrich your data and make the search experience better for you.
Milvus allows filtering. The primary thing here is that you do not have to connect data across tables; the complexity of queries is limited to row-specific operations, which means we only need ACID at the row level. And the way we model this, so that this is possible, is that we model the entirety of the data operations in Milvus as a pub/sub service: you're publishing data through Kafka, through Pulsar, through some sort of stream.
And then the data node subscribes to that and reads it, and that's how the data gets in. That actually allows decoupling of basically every function in the system, as long as it's not within the same functionality. The way we do this is with a set of worker nodes — you'll see them at the bottom here. These worker nodes are stateless.
They're just nodes — essentially CPUs that you spin up whenever you want. On top of that, the way we work with these worker nodes is through a coordinator service, and the coordinator service essentially tells the nodes what to do. And to store the data, we have object storage such as MinIO, S3, Blob Storage — whatever it is that you need.
The reason this is so important is that this architecture, this design, lets us orchestrate a true separation of concerns. When you're building a search system — something focused on searching at scale — you really have three separate concerns. One: how do I query my data? Two: how do I get my data into storage? Three: how do I access my data? So we separate these into query, data, and index, and these functionalities need to be decoupled from each other because they're going to be used differently. The query node is going to be used to query — to find your data in the system and grab it — and grabbing your data has nothing to do with putting your data into the system.
So the data node and the query node have to be decoupled. The index node also has to be decoupled, because creating a way to access your data has nothing to do with whether data is coming in, and it has nothing to do with whether you're asking for specific data. So all three of these concerns need to be separate from each other.
And that's why we have these decoupled functionalities. On top of that, we have this concept of letting you write while keeping your data consistent. One of the most difficult problems to deal with in a system that is constantly updating is data consistency: how do I make sure my data is the most recent data, or at least acceptably recent? And one of the issues with building something that is not a distributed system — with building a single-instance scale-up — is that you run into hardware limits. For example — I'm not going to name names, but there are competitor companies that do this — let's say you're using one of those, you're on a single instance, you're scaling up, and suddenly you're hitting your 128 gigabytes of RAM.
I don't even know how big these can really scale up to, but let's say you're at 128 or 256 gigabytes of RAM. What happens then? What happens when you need to scale past that? You need to do a data migration. So we abstract this out completely and say there's no reason you should have to do that, because we think the main issue to solve is scale. We want to make it easy for you to put your data in, keep it there, and keep growing the amount of data you have — without having to do things like re-indexing or data migrations. When you're doing data migrations, you have to worry about data consistency between your new data and your old data.
Where is the new data that comes in getting written to — the old store or the new store? And where am I pulling data from — the old data or the new data? When should I pull which? So data migrations — as a software engineer at Amazon, I've seen this kind of stuff — are not something I want to do anyway. This is one of the things we had in mind when we designed the system, and that's exactly why we have these stateless worker nodes that just store your data in object storage and are able to grab read-only versions and work with them. It's also why we have these index nodes that build indexes over segments. Index nodes build an index on your data every time there's a 512-megabyte segment.
Every time you put 512 megabytes into the system — and 512 megabytes is just the default, not what you have to use; if you want 256 megabytes or one gigabyte, that's entirely up to you. It's a configurable aspect of the system; this is just what we suggest.
And not only is this good because you don't have to re-index — every time you're adding data, all you're doing is creating new indexes; you don't have to completely take down an index and build a bigger one. One of the issues with single-instance scale-ups is, let's say you hit a hundred gigabytes of data, and maybe that's the size limit of whatever storage you're on, and you want to add more data. Now you've got your data in two different places, and you've got to find a way to index it. And indexing is a very computationally expensive task, because you actually have to go through and do all these vector calculations.
And vectors are very long series of numbers. Typically we see vectors of hundreds or even a thousand or two thousand dimensions — OpenAI's vector embeddings are 1,536 dimensions. So as you work with this, you're going to be working with very complex data. It looks like there are a lot of questions coming in right now, so let me answer those; if there are more questions about the architecture, I'll answer those too, and then I'll go on.
So let's see. 'Where does an LLM integrate with this architecture? What is the API for calling into Milvus for something like a similarity search?' So, this architecture is the database system architecture. The LLM is really part of the RAG stuff — retrieval augmented generation. And while that is one use case for vector databases that is very popular right now, it is not the only use case. For example, our original biggest customers were using us for product recommendation.
And product recommendation has nothing to do with LLMs; it has nothing to do with RAG. Another use case: there are companies using us for DNA and molecule search — I'm not entirely sure how that works, but they're doing something with molecule structures, and that has nothing to do with LLMs either. So we're not touching the non-database-system aspects, because we believe the database system itself is complex enough that we should focus on providing the best database system. And for similarity search — well, we have the Milvus SDK that you can use.
Or you can go to zilliz.com, which has a lot of different examples of how you can use the API. But at its core, it's the same as connecting to any server. If you've ever connected to some sort of server, you know there's a host and a port, you can turn that host and port into a URI, and then typically you pass some sort of API token. For us, that API token is a username and password.
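To make that concrete, here's a minimal connection sketch with the Python SDK, PyMilvus — the URI, username, and password are placeholders, and the exact connection arguments can vary a bit between versions:

```python
# Minimal connection sketch with PyMilvus (placeholder credentials).
from pymilvus import connections, utility

connections.connect(
    alias="default",
    uri="https://your-cluster-endpoint:19530",  # host + port rolled into a URI
    user="your_username",                        # Zilliz Cloud accepts user/password
    password="your_password",                    # (or an API key) as the token
)

# Quick sanity check that the connection works.
print(utility.get_server_version())
```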
So you can interact with the API for Milvus the exact same way you would interact with the API for any server-based system, because Milvus is a server-side vector database — it sits on a server. And the reason for this is, once again, because we believe in scale. 'Does Milvus store all of its data in Delta Lake format?' No, I believe it is actually Parquet, but I believe you can also change that format if you would like; I think Parquet is the default.
But that's a good question, because I will take this back and see if there are other ways to store the data as well. Ooh, answering live: 'It looks like there are three types of worker nodes — query, data, index. Can all of these scale out to multiple machines?' Yes, all of these can scale out to multiple machines. Each node is essentially one machine.
And the way we actually do scale-out is sharding. You can have multiple shards if you would like — one, two, three, however many. And then we have something called a shard delegator, or shard leader, which aggregates all the shards and keeps track of where all your data is located.
So you can have one query node, and the shard delegator sits on the query node — the query node typically holds the shard. Individual shards also have access to segments, and segments are those 512-megabyte chunks that get indexed. So the shards are what control scalability, and you can put shards onto multiple nodes.
And it's not one-to-one — you can do multiple if you would like, with replication if you need it for high availability or whatever it is you want to do; it depends on your use case. But the answer is yes, all three of these are independently scalable; they do not need to scale with each other, they are decoupled. Okay: 'Can you still query data that has not been indexed yet?' That's a great question.
And the answer is yes, you can. We have this concept in Milvus called a growing segment. A growing segment is data that comes in, goes to the message store, and gets timestamped there. And that's how consistency works.
We work with this idea of delta consistency. There's strong consistency and there's eventual consistency, right? Eventual consistency is essentially a delta of infinity, and strong consistency is a delta of zero. So we timestamp all the data so we know how close to query time the data needs to be. As this data comes in, it's in a growing segment, and the query node and the data node keep track of that growing segment.
When you query, the query node has access to your growing data, and depending on the size of that growing segment, we'll either already have created an index for it — once it hits 10% of the segment size, so in the default example around 50 megabytes, we create an index on it so it's faster to search — or before that, it's just brute-force search through those vectors. So the answer is yes, you can do that.
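If you want to control that staleness window yourself, the consistency level is exposed in the Python SDK. Here's a small sketch — the collection name, field name, and query vector are placeholders, and the exact option names can vary by version:

```python
# Sketch: choosing a consistency level per collection and per search in PyMilvus.
from pymilvus import Collection

# "Strong" waits for the newest data, "Bounded" tolerates a small staleness window,
# and "Eventually" is the loosest (and usually the fastest).
collection = Collection("demo_collection", consistency_level="Bounded")

results = collection.search(
    data=[[0.1] * 768],                                    # one placeholder 768-dim query vector
    anns_field="embedding",
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    limit=5,
    consistency_level="Strong",                            # can be overridden per request
)
```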
'Is this something we have to worry about if we ran Milvus in our own data centers?' I need you to expand on that a little bit more — I'm not sure what this is referring to. 'What is the engine that runs Milvus? Is it Java-based?' Oh — Milvus is written in Go. 'Delta Lake is Parquet with the addition of a transaction log.' Oh, that might be right. We do have a lot of logging, and we do essentially structure the entire system as a pub/sub log service.
Okay, cool, looks like we got through all the questions there. If you would like to re-ask that question about whether it's something you'd have to worry about, I can answer it, but if there are no other questions about the architecture, I'm going to move on to the next slide. If there are, we could probably spend a few hours diving into this — I spent an hour on Monday diving into literally just the sharding and segment stuff. Okay, so here's the more advertising kind of content, which is about Zilliz.
Zilliz Cloud is the fully managed Milvus service. We put some AI/ML toolkits on there — this is where the question about LLMs kind of makes sense. We try to make it easy to pipe in your data or your embeddings and things like that, and we have some partner companies that do this as well. And we have Milvus, which sits in the middle.
This is essentially the Zilliz Cloud offering: fully managed Milvus. And in addition to Milvus being fully managed, we are also optimizing the hardware capabilities. You can do this yourself, but obviously we put a lot of people onto it and we have a lot of expertise in doing it. And we just recently — well, not that recently; it depends on what you think is recent.
Around March of this year we began our NVIDIA integration, and we finished it up earlier this year — I'm not exactly sure when — and now Milvus can run on hardware-accelerated GPUs. One of the reasons we did this is, once again, that vectors are computationally expensive data types to work with. We're working with hundreds of numbers, and there are three basic ways to compare vectors. One is Euclidean, or the L2 norm: the square root of x squared plus y squared. In our implementation we drop the square root, because the rank order is the same whether or not you take it.
The other ways you can do this are the inner product — a dot product, where you multiply the numbers pairwise and add them up — and cosine similarity, where you measure the angle between the vectors. All of these require you to run compute over the vectors, so you need something that can run that compute efficiently, effectively, and accurately. And that's why we did the GPU integrations.
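Just to make those three comparisons concrete, here's a tiny NumPy sketch — my own illustration, not Milvus internals — showing L2 with and without the square root (the ranking doesn't change), inner product, and cosine similarity:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 1.0, 4.0])

# Euclidean (L2) distance, and the squared version that can be ranked by instead.
l2 = np.sqrt(np.sum((a - b) ** 2))
l2_squared = np.sum((a - b) ** 2)   # same rank order as l2, no sqrt needed

# Inner product: multiply element-wise, then sum.
ip = np.dot(a, b)

# Cosine similarity: inner product of the normalized vectors (the angle between them).
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(l2, l2_squared, ip, cosine)
```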
Other things that we provide with Zilliz on top of Milvus include multi-cloud support: you can use AWS, you can use GCP, you can use Azure, and I think you can use Alibaba Cloud as well. And — oh, the self-hosted option. Yes. Okay.
So this is on the slide. The self-hosted option is: if you are an enterprise and you have a lot of data and you don't want to manage it, you come to us and say, hey, we want you to manage our vector database, but we don't want to expose the data to you. That's what I was talking about earlier with the control plane / data plane separation. And next year we'll have a truly on-prem deployment, but we do not have that at the current moment.
We also have a lot of security and compliance initiatives. I think they're on here somewhere — yep, here they are.
SOC 2 Type 2, ISO 27001, role-based access control, data encryption, built-in failover, three nines of uptime — those kinds of things. We're probably working toward four nines, five nines of uptime. Many of you are software engineers, so you understand the uptime stuff. Okay: 'Is there an article on Zilliz that explains how you choose between Euclidean, dot product, and cosine distance?' I will link some YouTube videos of me talking about the — whoa, whoa, whoa, what just happened? What is going on? What do you guys see right now? I don't know what just happened to my laptop. Sorry, I'm having technical difficulties.
What is going on? Okay — I will link some YouTube videos on these fundamentals from Zilliz, and hopefully those will be helpful. I'll put them in the chat for everybody. Here we go. These are very quick videos where I talk about: what is L2? What is dot product? What is cosine? What are the different advantages of each? And I provide a visual that lets you see roughly what they look like.
'Does Zilliz have comparative information for Milvus compared to other vector DBs for things like architecture, features, scale, and performance?' We do have things for scale and performance. For architecture — it's hard enough to get people to document their own code; it's very, very difficult to get people to document other people's code. So we do not have that documentation for Chroma, Weaviate, or Qdrant, and obviously nobody can see Pinecone's code, which is part of the reason why I don't think Pinecone is going to be a truly competitive solution for us: we believe in the open-source ecosystem. We are very much about the open-source ethos.
We believe that open-source software makes things better, and Pinecone is closed source — so, I mean, I'm just not a fan. But anyway: features, scale, performance — we do have something for that. Let me find it. Features are hard, but for scale and performance, here you go.
Okay, done. It's the "and so much more than Milvus" slide — you see the slides? Okay, good, good. Okay, cool.
Yeah, I was having technical difficulties, so thank you for verifying. All right: 'Pinecone says that adding too much metadata can hinder retrieval performance. Can you talk about that for Milvus?' So — Milvus. Okay, this is actually a really cool point about the way Milvus handles metadata. Milvus lets you choose which metadata fields you want back.
So if there are metadata fields you specifically want back, you can request them; if you don't need them, you don't have to get them — you don't need all the metadata for every query you're going to do. There's a very high likelihood that you're going to do many different types of querying, and because of that you don't need to get the same metadata back every time. One of the cool things about the way Milvus handles metadata is that you can filter on it.
And there are many, many ways to filter on metadata — we have a page somewhere that lists all of the expressions. The way metadata filtering is done in Milvus — and I think this is absolutely brilliant — is that it's actually implemented with a bitmask. What that means is that yes, you add some time for metadata filtering, but that added time is near constant — it's tiny compared to the time it actually takes to do the computationally expensive vector search. So that's how we handle metadata, and that's why Milvus is able to scale along those lines without much of a performance difference as you add more and more metadata.
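As a rough illustration, here's what a metadata-filtered query could look like in the Python SDK — the collection name, field names, and expression are all placeholders:

```python
# Sketch: scalar (metadata) filtering with a boolean expression on a placeholder schema.
from pymilvus import Collection

collection = Collection("products")
collection.load()

rows = collection.query(
    expr='category == "pokemon_cards" and price < 100',  # metadata filter expression
    output_fields=["title", "price"],                     # only the fields you want back
    limit=10,
)
print(rows)
```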
Other features of Milvus — uh, of Zilliz — include auto-indexing. If you've heard of FAISS — Facebook AI Similarity Search — we've got one of the people who wrote a ton of the code for FAISS. FAISS is a framework for efficiently building indexes and querying over them, and it's probably somewhere under the hood here. On Zilliz we auto-index your data, so you don't have to choose the index or think about the index type. With Milvus, you should think about the index type, and it's actually quite important that you use the right one, because different use cases require different index types.
For example, if you're doing product recommendations, you may be fine with not getting great recall. You may be fine with saying, hey, I want a hundred things back, and as long as 30 or 40 of them are actually among the closest in the real data, that's good enough. The reason product recommendations can tolerate that is that when you build a product recommendation engine, it's very likely you have products you're selling, and people who want to buy things are going to click through — they'll click through multiple items, because they can't find exactly what they need, but they find things that are similar. I do this on Amazon all the time.
I'm scrolling on Amazon, doing my online shopping, and I'm like, ah, okay, this is kind of close to what I want, but it's not exactly what I want — let me go see what's similar. Now, Amazon doesn't use a vector database, but the idea is the same — eBay uses us, Walmart uses us. So let's say you're on eBay and you want a shiny Charizard, but what you see is a shiny Blastoise or something.
Well, let me see what's similar here, right? And then maybe, of the collection you get back, you get a bunch of Pokemon cards that are all similar to Blastoise, and maybe none of them are Charizard, but you'll click through and say, hey, this one's Blaine's — oh, this is also a fire type — oh cool, the next one has a Charizard, and you'll see it and you'll get it. So for product recommendation, people don't really care that much. But maybe you're working with DNA and molecule structures, and in that case you probably want a hundred percent recall — you want to make sure that everything you get back is actually the closest.
You want the actual accurate structures. But in that case, there are other things you don't care about as much: you care about throughput, but maybe you don't care about consistency levels — you can do eventual consistency. So there's a lot of customization that goes into building and using your vector database, and it really, really depends on your use case.
And that's why Milvus has all of these different features built in, and why on Zilliz the auto-index will look at your data and build the index for you based on what the best index is for that data. On top of these things, we have a bunch of integrations — OpenAI, Cohere, et cetera. We support data migration — remember earlier I was talking about data migration: you've got new data, you've got old data, you've got to make sure they're consistent. So we support those consistency levels, we support you doing the data migration, and we have on-demand scaling.
So you don't have to worry about on-demand load balancing and scaling. You don't have to worry about things like: let's say you're eBay, and tomorrow is eBay Day — I don't know if that's a real thing, but Amazon has their Prime Day, right? — and you're expecting ten times the traffic, so you need to scale up. You don't need to do that with Zilliz; you don't need to have your DevOps engineers do that.
You don't have to have your DevOps engineers build a prediction system. You can just say, okay, we're on Zilliz, and Zilliz will scale it up automatically for you. On top of that, there's a lot of monitoring and logging. We'll alert you about your usage.
We'll tell you things like: have you used this collection? Is it sitting empty? Are you using it a lot? Is it going over your desired budget? Et cetera. 'Your architecture slide showed DDL, which means schema definition. How do you design the schema for a vector database such as Milvus?' Wow, what a great question. So — I think about schema design a little differently than the way we as a company promote it.
But I think you should define your schema based on the metadata you need. You need an ID field and you need an embeddings field: the embeddings field so you can compare the vector embeddings, and the ID field so you can retrieve specific vectors. And then we recently introduced this feature called dynamic schema, which allows you to essentially just throw in JSON documents — as long as you have an ID and an embedding field, we don't care what the other fields are. The reason we do this is that we want to allow flexibility in the data you put in there, because — as much as I think all data schemas should be enforced — in reality, that's not how people use things.
You can tell people all you want, "use it this way," but they're not going to listen to you. So you have to account for that, work around it, and give people the tools they need to use it the way they want to use it. In the end, we have this dynamic schema thing because users want to use it differently than the way we imagined they would. It allows for flexibility across use cases, more customization, a broader set of use cases, and a better user experience.
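Here's a rough sketch of that kind of minimal schema in the Python SDK — an ID field, an embedding field, and dynamic fields enabled — with placeholder names and an arbitrary dimension:

```python
# Sketch: a minimal collection schema with an ID field, an embedding field,
# and dynamic fields enabled so extra JSON keys just ride along as metadata.
from pymilvus import Collection, CollectionSchema, FieldSchema, DataType

fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=768),
]

schema = CollectionSchema(fields=fields, enable_dynamic_field=True)
collection = Collection(name="demo_collection", schema=schema)
```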
'Is there a publicly available product roadmap for Milvus?' Christie, I don't know if you know this one — if you do, please drop it in, because I don't. 'How do you compact or delete obsolete or updated data in Milvus?' Great question. So right now, this is why we have the ID field, basically: you can grab your data, delete it, and re-insert it.
Actually, we just added this thing called upsert — that's one of the new features in Zilliz now, so you can just upsert your data directly. But the way deletes were working before is that we would do a soft delete. All data is stored as bits, right? So what we do is we have a bit at the beginning, and in order to delete, we do a soft delete — we just flip the bit.
This makes data recovery easier, and then later on we can do a hard delete down the road. Once you do a hard delete, there's a good chance that the segments you have — the originally 512-megabyte segments — are no longer 512 megabytes, and some of them may be small. So after those deletes, every so often — and I believe this is up to user configuration with Milvus; with Zilliz it's automated for you — we combine those segments and re-index just the small segments at a time.
This is still a computationally expensive task, but it's much less expensive than re-indexing your entire dataset. As you delete more and more data, you get sparser and sparser segments, and at a certain point — I think it's maybe 10 or 20% — we say, hey, we're going to combine these segments and re-index them. So at a certain point they will get re-indexed, but initially the delete is a bit flip, and only later an actual overwrite.
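For reference, a minimal sketch of what those operations look like from the Python SDK — collection and field names are placeholders, upsert assumes a recent Milvus/Zilliz release, and the primary key has to be user-provided (not auto-generated) for upsert to make sense:

```python
# Sketch: delete by ID with a boolean expression, then upsert rows (placeholder data).
from pymilvus import Collection

collection = Collection("demo_collection")

# Under the hood this is a soft delete: matching rows are marked deleted,
# and compaction cleans them up later.
collection.delete(expr="id in [1, 2, 3]")

# Upsert: insert the row if the ID is new, replace it if the ID already exists.
collection.upsert([
    {"id": 4, "embedding": [0.1] * 768, "title": "replacement row"},
])
```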
'What does Auth0 —' hmm, let me see. Wait, I'm going forward; I meant to go backwards. Auth0, LDAP... let me see where this one is. Is it in the other deck? Is it in here? Ah — yes, okay. Thank you; I did not realize that it was about this slide. I don't believe that it's Auth0 — I believe this is called OAuth. I'm not entirely sure what this is, to be honest.
This is part of the Zilliz Cloud stuff. I will get back to you on that — I'll find the answer for you. I'm much more familiar with the Milvus side of things than the Zilliz side, as you might be able to tell. But essentially, Zilliz will handle a lot of the authorization stuff for you.
And I believe this is related to the role-based access control kind of stuff: when you're building these systems, you need to make sure the correct authorizations are in place, and that's what this is for. Auth0 is popular, yes — I know of Auth0, but I don't believe this is Auth0. It could be; we do allow you to sign up with GitHub and Google and so on. So, the way that Zilliz works — this is the Zilliz architecture.
This is essentially how we're also able to decouple the control plane and the data plane. You have your private cloud, your VPC, and we can do VPC tunneling — actually, I think we can do VPC tunneling; it's definitely available to enterprises, though it may not be available to the general public yet. But this right here is basically saying: we're going to take this Kubernetes agent, and Kubernetes is going to handle what's going on with our proxies — the load balancing and all that stuff.
And then we're going to use that to run Milvus — we use Kubernetes Helm charts, whatever, to orchestrate Milvus. Kubernetes is your orchestrator, Milvus is your engine. And if you have a VPC with whatever you have in there that needs to happen, we can do the peering and you can run that in your VPC. More questions — 'unless they meant an LDAP IdP, an identity provider.' Yeah, I don't know.
I really just don't know the answer to this question. I will find the answer and get back to you about it. Ah — what just happened? Oh my god, more technical difficulties. Okay, here we go. So yeah, these are the ways you can use Zilliz.
There's a starter tier — this is a serverless tier, and it's free: free for as long as we exist as a company. At least, that's the current guideline for it.
It allows you two collections of up to 500K 768-dimensional vectors, or 256K 1,536-dimensional vectors. The reason we set it up this way is that it's easier for people to grok the number and size of vectors than something like half a CU or one CU. We don't provide uptime SLAs, it's a single availability zone, it's GCP only, and we don't promise anything for security.
And the main support you get on the starter tier — the serverless, free tier — is community support: we have a Discord channel for that, basically. Then you have the standard tier — the dedicated tier. This is the production-ready one.
For this we have cost-optimized, storage-optimized, and performance-optimized options — different tiers, basically — and they all have different costs and different use cases. You can put however much data you need on here, and it comes with the basic security stuff: data encryption, role-based access control, SOC 2, and so on.
You get email support, and up to two technical people from your company can come to us with questions. And then we have the dedicated enterprise tier, which is for when vector databases are mission critical for your company — when your use of similarity search, of being able to compare unstructured data, is critical to driving revenue. Here we have the same kinds of performance options again, but we also provide extra SLAs: three nines of uptime, multiple availability zones, VPC private link, data backups, and up to four contacts.
And your organization can use multiple clusters — multiple types of clusters — if you need to. 'What is the overhead to move from a serverless to a dedicated cluster?' I don't think there is any — you should literally not experience any disruption in service doing that. 'Are the HTTP metrics usable by visualization tools like Grafana, Tableau, and D3?' Yes — in fact, this morning I got a message that Tableau has just been integrated into Zilliz Cloud.
So the answer is yes, you can do this, and Grafana is also available. 'Can you manage your Milvus database using IaC like Terraform, or a CLI?' Yes — well, wait. Terraform, I'm not entirely sure.
I'm not terribly familiar with Terraform or Ansible, but with the Kubernetes stuff, definitely. The way I manage my own (non-Milvus Lite) database is through Docker Compose, and that's just because I'm not working with something incredibly huge — I mostly use it for proofs of concept, MVPs, that kind of stuff. But Helm charts are how I see a lot of enterprises like Salesforce and AT&T using Milvus.
So there are three types of CUs on Zilliz Cloud. One is optimized for cost, which basically means you don't need the best performance, but you need it to work — and it should still beat the performance of the others. Well, not "should": it does, if you look at the benchmarking. You can always bring your own data and test it yourself, but if you look at the benchmarking, it beats the performance of pretty much most of the other vector databases out there. One cost-optimized CU holds 3.75 million 1,024-dimensional vectors; storage-optimized is a similar idea. But when you get to performance-optimized, you actually get less storage, because it's more compute-heavy — it's meant for lower latency and better throughput. Oh, there are more questions.
'Somebody just has to create a Terraform provider that can find and manage things like Milvus. Maybe Zilliz is working on one.' Christie Satchi, I don't know if you know the answer to this one, but I don't. It could be — we are interested in the space, so we will definitely be moving toward allowing more flexible usage and trying to build the best user experience possible.
'Pulumi is the other popular IaC product.' Okay, thank you — I hadn't heard of that one. 'If Milvus has an admin CLI, you can do your own scripting that way.' I think that does actually exist.
There is a CLI — I mean, you can interact with Milvus through gRPC or through REST, and I believe you can do that with the CLI as well. I don't see why that would be an issue, as long as you can send all of the parameters that you need. Did I cover the rest of this? Performance-optimized is a lot faster than either of the others.
I think these are all the same otherwise. Oh, here — example use cases. This is probably good for you to know. Data labeling, deduplication, clustering, outlier detection — essentially, if you want data cleaning and data analysis, cost-optimized is the way to go. Storage-optimized is more for when you want to compare many, many things — you have a lot of data, but you don't need to do a lot of comparisons.
Or you only need to do small comparisons, but what you really, really care about is being able to store a huge amount of data — that's when you would use storage-optimized. And then performance-optimized is for when you want more real time, more speed, more throughput, all that kind of thing: gen AI, recommendation systems, chatbots, fraud detection — real-time, performance-critical use cases.
So, as I was saying earlier: we are not just for RAG. RAG is just something that's very popular right now — LLMs have brought a lot of attention to vector databases — but it's actually not the only use case, and I don't even think it's going to be the biggest one, because things like drug discovery, product recommendations, inventory management, and fraud detection are huge. Banks need this — I had my credit card stolen a couple of weeks ago. Things like this are why it's important to have the ability to compare unstructured data. It's not just something you feed into LLMs; it's something you want to work with because you're doing a similarity search.
'Relational databases can have indexes on commonly accessed columns for acceleration. Is there something similar to this for vector databases?' I'm going to need you to expand on this question — do you mean something like caching? We do have a tool called GPTCache that does semantic caching; it's a semantic cache at scale. 'What's the query language for Milvus?' The query language for Milvus is — well, it's not SQL.
Essentially, you provide a vector and it does a vector search, and if you need to do filters, you provide an expression and it does an expression search. I think I might have an example of this pulled up; I'm not sure.
Where is my Visual Studio Code? Basic usage — okay, so I'm going to stop sharing this and share my VS Code instead. I believe this shows it — yes, okay.
So this is what a raw search in Milvus looks like. I'll zoom in so you can see it. Essentially, you send in some embeddings, you tell it which field to run the approximate nearest neighbor search on, and you give it some parameters for the search. In this case we're giving it L2, because that's what we stored it with. And then — this part is specific to the fact that we stored this using IVF, the inverted file index, which creates a list of cluster centroids.
And this is just telling it how many centroids we want to search. Limit is the number of results we want back. And then output fields — this is where I was talking about the metadata: if you need this metadata, you pull it back; if you don't care about it, you don't. So in this example I'm only pulling back these two metadata fields, but these are what the actual fields look like.
So I'm storing a ton of metadata and only pulling back two fields. And then there's an expression here — a filtering expression like I was talking about — and this is how you filter. You can think of the expression fields as kind of like SQL: there's the same greater-than, equal-to kind of stuff. 'Search language is API by SDK' — okay, I don't know what that means.
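To give a rough idea of what that raw search looks like in the Python SDK — this is a placeholder reconstruction, not the exact code from the demo, with made-up collection and field names:

```python
# Sketch: a raw vector search with an IVF index, a metric type, a filter expression,
# and a small set of output fields (all names and values are placeholders).
from pymilvus import Collection

collection = Collection("demo_collection")
collection.load()

results = collection.search(
    data=[[0.1] * 768],                                     # the query embedding(s)
    anns_field="embedding",                                 # which vector field to search
    param={"metric_type": "L2", "params": {"nprobe": 16}},  # L2 metric; nprobe = centroids to probe
    limit=10,                                               # number of results to return
    expr='category == "pokemon_cards"',                     # metadata filter expression
    output_fields=["title", "price"],                       # only the metadata you want back
)

for hit in results[0]:
    print(hit.id, hit.distance, hit.entity.get("title"))
```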
'Can you expand on a use case such as the credit card being stolen?' Dude, no — my credit card was literally stolen. I'm not using this as an example use case; it just was stolen. 'Why not use an LSTM model to identify fraudulent transactions? Where does the vector store come in?' So LSTMs by themselves can't identify fraudulent transactions.
LSTMs are meant for sequence-to-sequence transformations, and a vector store is able to do similarity search. So the way a fraudulent transaction would be detected, I guess, is that you have information about the user and you vectorize it. Every time there's a transaction, there's a lot of information about that transaction, so you vectorize that information and compare it to the other transactions for that user, and possibly also to other things that user is doing.
So, for example, if most of my spending is in Seattle and then suddenly I'm buying something in Ohio — red flag, right? And you can see that in the vector distance. Things like that are basically how fraud detection is done: you see something unusual, out of the ordinary, and it gets flagged. The way this was being done before was actually through something called collaborative filtering — well, collaborative filtering was for product recommendations, but similarly, you could do something like this with anomaly detection. Yes, it's vector distance, not an AI model.
We're not using AI to calculate that — it's actually much simpler and much quicker to use vector distance. Now, you do have to use a machine learning model to vectorize your data, but beyond that, it's simply the vector distance; we don't run it through some sort of model. Okay, wait — where's my screen? I'm going to stop screen sharing, because I don't know where my screen is anymore.
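As a toy illustration of that distance-based check — entirely my own sketch with made-up numbers, not anyone's production fraud system — you can flag a transaction whose embedding sits unusually far from a user's recent transaction embeddings:

```python
import numpy as np

# Made-up embeddings: 50 recent transactions for a user, plus one new transaction
# that has been deliberately shifted so it looks unusual.
recent = np.random.rand(50, 32)
new_txn = np.random.rand(32) + 3.0

# Distance from the new transaction to the centroid of recent behavior.
centroid = recent.mean(axis=0)
distance = np.linalg.norm(new_txn - centroid)

# Flag it if it falls far outside the typical spread of recent transactions.
typical = np.linalg.norm(recent - centroid, axis=1)
threshold = typical.mean() + 3 * typical.std()
print("suspicious" if distance > threshold else "looks normal")
```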
Okay, were there any questions here? There are some — but we're at time, so I'll answer the last few, and if you have other questions, my name's in the corner; feel free to reach out to me on LinkedIn. Or — Christie, can you post the Discord link? — feel free to reach out on Discord. This one's done.
'I'm guessing that because the schema and usage for vector structures is so different from databases like SQL, there's no practical way to create something like an index for relational systems.' So I talked about this earlier — we do do indexing. The index is the way we access and compare your vector data. It's not so much caching; it's much more that you have 6 billion vectors you want to search over, but you don't want to do 6 billion comparisons, so you create an index over them. Examples of indexes include IVF, the inverted file index, where you create a bunch of centroids.
The way that works is you first search just the centroids, and based on the closest centroid you can pick out which cluster is likely to hold the closest vectors. So you reduce the vector space you have to search by a lot, and you reduce the number of computations you have to do. Other examples of indexes include HNSW — Hierarchical Navigable Small World — where essentially every layer is a sparser and sparser graph, and you search from the top graph all the way down until you get to the bottom.
And in this graph, we encode all these distances so you don't have to do as many of the calculations. So we do have indexes, and in fact indexes are almost required for vector databases, because without them you're just doing flat comparisons, and even the most computationally optimized systems will still be slow at that. So there is an index; it just doesn't work the same way it does in relational systems.
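Here's a small sketch of building those index types with the Python SDK — the collection and field names are placeholders, and the parameter values are just illustrative defaults:

```python
# Sketch: creating an IVF_FLAT or HNSW index on a vector field (placeholder names).
from pymilvus import Collection

collection = Collection("demo_collection")

# IVF: cluster the vectors into nlist buckets around centroids;
# at query time, nprobe controls how many buckets get searched.
collection.create_index(
    field_name="embedding",
    index_params={"index_type": "IVF_FLAT", "metric_type": "L2", "params": {"nlist": 1024}},
)

# HNSW alternative: a layered graph; M and efConstruction trade build cost for recall.
# collection.create_index(
#     field_name="embedding",
#     index_params={"index_type": "HNSW", "metric_type": "L2",
#                   "params": {"M": 16, "efConstruction": 200}},
# )
```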
Essentially, it's not just an efficient way to access your data — it's pretty much the only way you can really work with it at scale. Okay, thank you guys, and H Kraus, I believe we've met before.
So I hope to see you at one of our Seattle events again soon. Thank you, and have a good day.
Meet the Speaker
Join the session for live Q&A with the speaker
Yujian Tang
Developer Advocate at Zilliz
Yujian Tang is a Developer Advocate at Zilliz. He has a background as a software engineer working on AutoML at Amazon. Yujian studied Computer Science, Statistics, and Neuroscience with research papers published to conferences including IEEE Big Data. He enjoys drinking bubble tea, spending time with family, and being near water.