Webinar
What is the future of vector search? A fireside chat with HNSW author Yury Malkov
Join the Webinar
Loading...
About the Session
The Hierarchical Navigable Small World (HNSW) is one of the most popular graph-based indices for nearest neighbor search. Join us for a fireside chat with Yury Malkov, as we discuss how he ended up developing this critical method of high-dimensional data analysis. We’ll explore how he got started in computer science and where he sees the future of vector search and LLMs. You’ll even get a chance to ask Yury questions at the end of the chat, you won’t want to miss it!
Topics covered:
- Why HNSW and other graph indexes are great tools for vector search
- What is the future of vector search?
- Thoughts on the proliferation and increasing importance of LLMs as it relates to vector data
All right. It is the top of the hour. Uh, good morning, good afternoon,and good evening everyone. Thank you so much for joining us for today's session. What is the future of Vector search? A fireside chat with H N S W,author Yuri Malka.
I'm Emily kk, and I'm a member of the team here at Zillows. I will cover just a few housekeeping items,and then we'll get right into the session. First. This webinar is being recorded, so if you have to drop off at any point,you will get access to the on-demand version within a few days,hopefully sooner. Um, if you have any questions,feel free to paste them into the q and a tool, um, at the bottom of your screen.
Um, you can also use the chat window. It's a little harder to keep track for us,so if you could use that QA tool would really help us out. Um,and then we've got some really great upcoming events, um,that my screens have disappeared. Oh, there we go. Um,we've got some great upcoming events,so be sure to check those out at zillows.
com/event. Um,we will send you some more information on that. Um,and then if you haven't already, um, joined our community, uh,we would love to carry the conversation on from then. And without further ado, let's get into today's session. I am pleased to introduce, um, today's session again,what is the future of Vector search,a fireside chat with h and s w author Yuri Malkoff, um, and our guest speaker,um, Yuri is a distinguished engineer adverse innovation,and is currently developing crucial recommender systems that serve over ahundred million users.
Has published research boasting 2000 citations SPAN cv,LLMs reus,and Similarity Search featuring widely recognized open source solutions like Hand S W and learnable triangulation. Yuri's professional interests encompass constructing highly scalable contentmachine learning systems. In addition to Yuri,we are also joined by my colleague Frank Lu, uh,ML architect and our director of ops here at Zillows. Welcome, URI Frank. Thank you for the introduction, Emily.
Um, and yeah,thank you everybody for coming to this session today. We'll,we'll do things a little bit differently. Um,there won't be any formal presentation or anything of that, of that sort,but we will just be having an open chat, uh, sort of about,a bit about LLMs, uh, a bit about vector databases,and of course about vector search and H H S W as well. So, Yuri,thanks for agreeing to come on. Um, I want to just chat, you know, chat openly,right? But before we really dive into some of those questions about, you know,about machine learning, uh, about, you know, your background, your work,I wanna ask a bit about how you got to where you are right now.
Right?How did you, um, get to, you know, being in Vector search?How did you go from being, you know,having a background in laser physics to doing what you're doing now?Hey, Frank, uh, yeah, thank you for the introduction. Uh, yeah, I got, uh, education in laser physics. So, uh,when I did PhD, I was working on, uh,experimental f plus second laser physics. So,yeah, and there, like, it helped, helped me to shape professionally. So I worked with very smart people, uh,in Russian academia.
And, uh, yeah, one,one of my advisors, uh, later became, uh,the head of Russian academia. So he reported to Putin at some point. Uh, yeah, it was upon time. Uh, also like in some sense there is a similarity between,uh, LA experimental laser physics and machine learning. So in both of them,you build pipelines though in, uh, laser physics,you build like, uh, physical pipelines.
So you assemble,assemble the pipeline that transported the laser beam and adddiagnostics, transport, and so on. Uh,yeah, so it was fun time, uh, yeah. Uh, and, uh, as, uh, to how I,I transitioned to vector search. So, uh, there was no,no like strict, uh, transition. So even before I started working on laser physics, I signed a contract,uh, with, uh, uh, on part-time work,a startup in Nira that was called me labs.
Uh, so,and I worked there for I think like six or seven years while,uh, doing my masters and PhD and that company,it was pretty ambitious, so especially like given the time. So they started at 2007, uh,and they wanted to build a highly scalable distributedsimilarity search engines, uh, with targeting three based,uh, three based entities. So they said, wow, like there is a,there are, uh, internet of things, so when they,they can be represented as three of, uh, uh,of three of properties. So like, you have a,like a G V G player or like a net net player network player have a chili visa,G T V. So like, and they can connect,so they need to find each other effectively.
Uh,so, and that work ended up, uh,in publication of N S W, uh,which is a predecessor of H N S W. So this work, uh,has done together with Alexander Panari, uh, Andrea Log, and,uh, and, uh, yeah, that happened before,uh, I got PhD in laser physics,and also about that point of time the company ceases to exist. So, and maybe like right now it's pretty obvious like why there,there was not that much interest 10 years ago,like in vector search. Uh, yeah,but yeah,Wasn't that much. Yeah, I mean, I don't know.
I think even,I think there was definitely some interest, but it was, it was, it was,it was certainly very fledgling, um, not like what we see today, you know,with all the hype on lms, with all the hype around recommender systems, um,with any type of searching across all these different types of modalities. But yeah, sort of diving a little bit deeper into that, you know, sort of, I,I guess,I suppose I understand to a certain extent how having a background in laserphysics, you know, rebuilding a, all of these pipelines,how that can be similar to building infrastructure for ml, uh,or even building ML models themselves as well. Um, but I'm still curious,like, I mean, you,I think you have to be one smart person to start in laser physics and then stillbe able to do, to create data structures, you know, to create, quite frankly,I think one, one of the data structures that's used pretty much universally,uh, in all vector databases and really all,you know, a lot of production vector search systems. I think that's really,really impressive personally. Um, but yeah, anyway, I mean, how,how do you think,do you think sort of h and s w and do you think graph based indexes,do you think those are going to be, you know,do you think the future is going to be graph based, or do you think that,you know, we'll come up with some better solution, uh, for doing vector search?Something that gives us better accuracy, better recall, uh, or,or better query performance?Uh, well, like that depends on, like,what do you understand by graph based indices? So, uh, fair enough,like graphs are n natural form of representation of relatrelations between items.
Okay. So, uh,and if you look at LLMs, uh,those are also can be seen as, uh, graph neural networks. It's just the paragraph is, uh,fully connected and, uh, like, uh,there are many methods to speed up lms, uh, and, uh,most of them, no, no, no, but a lot of them,they're based on like specifying the graph so that you don't do attention toeach of the element, uh, in, in, in, in, in your input. And that is, uh, the same as gns, uh, but, uh, with like,not in, with especially crafted graph. It might be a dynamic graph,it might be a sad graph, uh, but still, like that's a,yeah, that's a form of representation.
So I, I don't think like graph,uh, nearest neighbors church is going anywhere that is gonnabe there for sure. Uh, so there,there are other methods that gonna compliment that, but, uh, yeah. So you need to define the, the relations somehow. Very nice graph are the form. Fair enough, fair enough.
How did you come by?I'd also like to hear in addition to your own origin story,and thank you for that earlier, by the way. I'd also like to hear a little bit about the, sort of the origin story of, of,of H and s w and this, the origin story of,of how you came about to develop this particularalgorithm. Uh, sort of dive a little bit deeper into,into understanding that as well. Uh, as many details as you can share there,that'd be great. Okay, sure.
Yeah, as I said, so the,that work stemmed from the previous work done in the startup,uh, which is nsw, so that's also a graph algorithm. And, uh,that graph algorithm is, uh, designed to be distributed,so it has, uh, like random entry point. So it's supposed to scale, uh,with addition of new node. So existing industries should be scalable. Uh, yeah.
And that'll have some issues in which we knew. And, uh, yeah, when I got my PhD in laser physics, I understood that,uh, well the career, uh, like to advance in career,I was need to move abroad,so I didn't want to move abroad at that point of time. Uh,so like I was looking for other alternative. And at the same time,like before, a little bit before I got PhD, uh, uh,reach out to me, uh, saying that w uh,the previous algorithm was performing like really well and newly developed,and then benchmark, so it was on top right at high recall. Uh,so I was a bit surprised there is that much interest in n w.
Yeah. Uh, but, uh, if there is interest, uh, so like,I thought that makes sense to like revive this work. Uh,and first I tried to publish, uh,like a net network physics paper, uh, on, uh, the,like, the origin of nav ability in, uh,like natural networks. So it's kind of like a long, uh, old, uh,theme of research. And it was published in like natural scale journals,uh, in the early two thousands.
So I also like wrote a paper. So like we come up with a simple explanation of why natural lang natural,uh, networks are navigable. Uh, but that was, uh,rejected in nature. So it seems that the interest in network science kind of died out by that pointof time. So, and after a few rejections from other journals,it was published in plus one, which is like an, a good journal,but still not nature.
And, uh, one,like I, I, like, I was working on that paper, so I compare, uh,the s w to existing models of navigatable networks,which were advertised, um, research and, uh,s w performed the best, uh, compared to those. Uh, but, uh,like that, uh, uh, like it still,like it has poly algorithmic scality on, uh, dimension,like a small dimensionality and, uh, well struggled at low dimensional data,right? Uh, and, uh, but the, the other,they've struggled more and that hinted the reason why, like, they struggle. So, Hmm. Yeah. The problem,the problem is N S W and other solutions for, uh,navigatable networks is that they, they are kind of like airport travel.
So you,you need to travel from point A to B,so you need to first to get to the airport by public transit, by car. Then,like, if you, probably you don't have a big airport close by,then you travel from your airport to the bigger airport,then to another bigger airport, then to a smaller airport,and then also like travel by. And that,that is a well-known model of navigatable networks. And the issue is when you go to the hub, like the big airport, it,it is very connected. So like,to make this work has to be like really well connected,uh, and that scales as, uh,logarithmic with the number of elements in the, in the network and,and double leading to, and, uh,well after you realize this, so it's easier.
So you need to, well,first you don't need to start with random element. So if you don't need the distributed search, so you cut it,you can start directly from the top element. So that, that's always,and that helped that, that, that improved the performance. But the second thing,uh, is, uh, you, you then need to avoid, uh,doing like all long connections for, uh,for the hubs because you start from the hubs. And for that, uh,like we know that n w was kind of like similar to skip list,so we never published like the, the similarity.
But, uh, internally, we,like, when we explained why NSW was, uh, navigatable, we said it's like,it's similar to skip list. Like skip list is a good model for msw. Uh,but, uh, yeah, it never matter much until this point. And Skip Lift has a natural, uh, division of, uh, links by length. So if you say, okay, I'm gonna use, uh,the ski police division in w then I, I'm solving the,uh, scalability, and that worked.
So the performance on low dimensional data,like improved by many orders of magnitude compared to W andmade algorithm performing well on most of the data set. And, uh,the last point was, uh, that, like,it still didn't perform well on one data set. It was a very low dimensional data set. And, uh, yeah,after discussions with, uh, the by itself, so like,we thought that maybe it worth fixing,so we added a heuristic for the neighbor selection, uh,which we knew from the spatial approximation tree work,uh, done by some good people. And, uh,so that, that made the transition to one dimensional skip list, exact,so that has a nice property.
Uh, yeah. And later we, like,I found out that the same heuristics was used even earlier,works like work by area mount, uh, and like,I have to find it through Google search, like school, school or search,when you search for relative neighborhood graph and the graph es. So like,I'll find all, like, I saw this paper, but I didn't know that I actually use it. Yeah. And yeah, that's, that is basically it.
So, uh, yeah,we did test, so we, uh, integrated with, uh, N W N,uh, which had, uh, already integration with Python and, uh, and,and Benchmark. So when we took part in, in and Benchmark. Yeah. And you know, I think for context, I think NMS live, uh,and Hsw still remains one of the top performing sort of vector searchlibraries on n n benchmarks, um, which I think is super impressive. And thank you for that context as well.
I think it's really interesting to hear about how really you were able to meshthe, the, the concept of skip lists and, and,and sort of navigable small worlds,and then how you also sort of some of the context around how some of this workwas used in the past as well. So definitely appreciate that. But I think either way, I think it's really,really interesting to hear about H and s c W's origin story and how that came tobe as an algorithm for Vector search. I just also talk very,very briefly about your thoughts on, um, you know,some of the newer methods as well, you know, scan disk,n n uh, obviously, you know, disc, K n n also being graph based. Uh,I'm not, not as familiar with Disc k n n as I should be, uh, as familiar, uh,in terms of, you know, the inner workings,especially as it relates to how sort of some of the individual components arestored on pages in M VM e.
Uh,but I'd like to get your thoughts on those as well. Right. Um, you know,as really, you know,foremost expert in vector search and what are your thoughts on disk? And then,um, about scan some of the newer span, you know, some of the newer, uh,sort of vector search algorithms. And, and definitely, you know, we've,I think we've seen a, an ex sort of, uh,we've seen interest in this area coming, coming up very quickly as well. Hmm.
Well, uh, for disc n so my, my,my my my understanding that the main contribution of disc K n n is, uh,is like co-location of data that they used on,on, on the disc. So from other, other points, I am like,not sure that those are, those make a big difference, but,uh, yeah, this, this is like a legit method to do disc search. So there are alternatives. So like, uh, so I think, uh,another work from Microsoft also, they,they have a different solution from disk for, for, for disk. And, uh, that solution, I don't know, like, I actually have probably prefer the,the other solution, like, I mean, uh,because it makes some other things easier, like removing the elements.
Uh, but yeah. And, uh, so my understanding is that, uh,so that like disc canan, they also have, uh, have disc canan,and now they have fresh disc canan. They kind of like converging,so like closer to H N S W actually. But in terms of, uh, updates and freshnessfor, for scan, uh, for scan and other types of, uh,like, uh, ization, uh,algorithms, so those are orthogonal to graph so they can be used together. So, and I think there are some solutions that are using them together.
Oh, yeah, yeah. You're,Can you hear me now a little bit better? Sorry about that. Yeah,I think it's quite, it's quite interesting to see the explosion of interest. There's been in and around vector search. Um, you know,there's, and I think it ties in very, very well.
So I know we have,we actually have one, one of the members, um, of the audience currently,they actually ask a question about, about graph based techniques. I think it ties in very well with what we're talking about. So I think we're gonna just, we can just answer that right,right now rather than in the q and a session that we have at the very end,which is, um, in the current landscape of graph based techniques,the dominant trends revolve around the processes of pruning and incrementalbuilding. So specifically as it relates to disk and n run, in your perspective,what do you consider to be the next crucial factor for constructing high qualitygraphs with efficient navigability? Pretty general question, but, uh,I thought it would be great to answer that one right now,especially as we're talking about graph based networks and disco,not graph based networks. Graph based index says N K N N.
Mm-hmm. Well, I'm, uh,like not totally sure what would be the best answer, uh, for the,uh, question. So I think that, uh,currently existing methods for construc constructing the graphs are, uh,pretty developed. And there is, well, there are some improvements,but those are a bit extra. So they're not gonna give you two x or more in terms of performance.
Um, maybe I'm missing something that might be possible,but, uh, from my understanding,so people know how to build the graphs so that they aregood in a good state. Okay, fair enough. Yeah, appreciate that context as well. Um,I wanna move on a little bit, right? I wanna sort of talk a little bit about,um, so we talked a little bit about vector search and in particular,h and s w briefly about, you know, scan span, um, you know,scan N as well. Uh, I wanna talk a little bit about, uh,what seems to be the most popular topic, I think in everybody's mind these days,which is I'll talk about LLMs, um,or more specifically auto regressive lms, right? Uh,I really hate to call them,I really hate to call them LLMs because I think there's a lot of language modelsthat are large that are not, you know, they don't,they're not trained to do text generation.
Uh, but,you know, just starting very, very broadly, uh, we can dive into some,some interesting topics as it as, as they come up. But what are some interesting problems that you think we should solve inLMS and AI ML in general? And we'll start with this on,we can dive into some individual components, uh, as it comes in, um, as it,as it comes in, uh, later on. Uh, well, there are many,I think there are quite a few challenges for LMS and, uh, yeah, I'm not,uh, I don't have on cyber information from the big companies how they do. Maybe they have some secret sauce. So the, my,my understanding that the biggest, uh,issues are, are with scalability,so how to run, uh,very knowledgeable models so they have lots of weights, uh,without paying too much money and well running and training them.
So, so, yeah. That's,uh, I cannot hear you. Sorry. It's, it's mic keeps going off, but I apologize for that. I definitely second you there.
And I think there's a lot of,there's a lot of talk about context length, uh, you know,philanthropics recent a hundred K context, length model,and I really question if you,You know, I think a lot of, a lot of folks,I think they underestimate the amount of time once you have,once you really use that, that amount of context,how long it's gonna take to generate, uh, you know, to, to,to really do completion across all 100 K tokens, right? Um,and to really be able to process and parse all of that. I think that's one of those things. So diving a little bit,sort of deeper into what you were talking about, um, on, uh,when it comes to scalability and when it comes to deployment as well,and do you, but, but, sorry, I,I mean Yeah, go ahead. Sorry, sorry to interrupt you in the middle there. No, no, no worries.
Uh, well, yeah, uh, like ING context,left length is, uh, pretty important. And, uh, well,the easiest way to scale is to like,make your solution support, uh,bigger context length. So, and as far as I know, the, uh,techniques, uh, well to,to make it a bit more scalable, at least in training. Yeah. Uh,like this alibi, uh,like positional en coding, uh, but still, like,if you don't apply spar in one form or another,so the method require, uh, like ncore complexity in the end,like both, in, like both in compute and memory, uh,which is a limiting factor for many applications.
Yeah, yeah, yeah. Totally. Yeah, there are, mm-hmm. Yeah. Yeah, I think, uh, yeah, and I think,I think from what I understand, I'm not not sure about GBT four, but G three,G three is a, uh, it's a combination of sort of, uh,both sparse sparse layers, sparse attention layers,as well as dense attention layers, right? So I think there's,there's something to be said about it that probably an extension, uh,for a lot of the current models out there is very, very similar.
So I don't know. We'll see. We'll see. But, um,that's just my 2 cents there. Uh, uh, an interesting question, you know,that I have, and something, something that I wanted to talk about as well, is,um, do you think, um,do you think LLMs or do you think auto aggressive language models,do you think they're self-aware, or do you think that they're emergent?I know they're, I know there's been some talk about that recently, um,uh, sort of especially on the path to, to what folks like to call agi,but what are your thoughts there?Mm-hmm.
Well, I, I don't think they are self aware or emergent. So actually ask this question to charge a PT and chat it with ita bit. So like,they don't have a budget so they can, uh,so they don't have a, of suffering and concept of suffering is important. So like, it's like humans are taught to, like, they're like the,the main goal of humans is predict the futureVery. Yeah.
If they don't predict the future well,so they suffer because their expectations, they don't match with reality. Uh, but, uh, for rail lamps, uh, they don't have it,so they trained with back propagation. So the back propagation doesn't sound like, uh, like a painful process. So you can say that, uh, after training you have a new,like a new entity. So yeah,doesn't like, well, it can be in some sense self-aware if you add like,the data about it, but, uh,like it doesn't have memory of interaction, so it'll forget.
Yeah. Yeah. I think it's interesting that you, that you mentioned that as well,because when it comes to being self-aware, when it comes to being emergence,there's two, there's a couple of key factors. And the first is,I think we always define emergence or being self-aware as related to, you know,a human concept. And I think it's very, very difficult to say that we as humans,when we learn that we do back propagation inside of our, you know, inside our,in, inside our brain, basically.
I don't, I don't think that is a,a very natural way of looking at things. Uh,so I question whether LMS truly are self-aware where they're trulyemergent. I dunno, it's an interesting question. Um, probably not,probably not one that I can answer, but, uh,one that I figured we could discuss about anyway. Uh,so yeah, in terms of,in terms of sort of sticking on the topic of LMS and ai, ML in general,you know, what are some, you know, we also talked earlier about, hey, you know,if you wanna put these in production,you need ways to be able to scale them to be able to do well.
What are some scalability techniques that you're aware of that are helpful forLLMs, um, beyond just using vector databases, right?Beyond using vector databases, uh, to store, you know,documents or document chunks in retrieving the most relevant ones related to aquery?Uh, well, uh, there are many parts based methods, so,which, uh,like avoid doing attention to all of the items. So there are like all the works like reformer. Yeah. I think there are,there were Google works that use some random, uh,random attention, like, I mean, attach attention pattern. Yeah.
So there are also rance, uh, which are very,very various sparse. Um, yeah. So there are many solutions and there are like tons of papers,uh, which like do that, that, that, that is, yeah,that is in the context of increasing the context one, right? Uh,yeah. But, uh, I don't really know what, what is used in like the,like the open AI and others. So,so when I chatted with some generative AI companies and ask them ifthey use it, they say, oh, no, we don't use it.
So we use something very simple. So I was surprised. Fair enough. Okay. And, uh, yeah, and the, like, CNNs are also, those are,they don't have this issue with, uh, complexity.
And I think, like, there are, there are, like, I, I,I probably expect to see something like CNNs with pullingand, uh, uh, yeah, with,with pulling and evolution. So like when you have different scales can, can, uh,be like mainstream in some future. Interesting. Okay. Interesting.
Interesting. So we,so essentially you're saying like, we reduce, um,okay. All right. Interesting. So sort of reducing,I'm not really sure how that would work from the context of Yeah,but that's an interesting way of thinking about it.
Yeah. As you sort of reduce dimensions spatially,you'll be able to get better performance. Uh,Well, yeah. So the, there are like in, in computer vision, so there is a,I think a very interesting concept of, uh, stack power glass models. So it's like stack, stack, stack, I remember,I remember that.
Yeah. So then basically you have the flow of information, uh,like the information flows at different levels of, of,uh, like special levels. And that, that, that I think is a very, uh,like strong approach. Uh,so like something sim si similar can work for Transformers,and I saw papers that use something, something, so, but I already know what's,what's used in production. Fair enough.
Enough. Okay. All right. Yeah, yeah, definitely an open problem. Uh, definitely an open problem.
And, you know,I think using Vector database to retrieval is, is one of them,definitely one of the most common ones right now. And, uh,but I think there still is some work to be done in, you know,once you do retrieve those most relevant documents, uh, or later on,different modalities, images, video as well,once you do retrieve the most relevant pieces of data,being able to run that through an l l m very, very efficiently and get a very,very fast response. I think it's something that's still, yeah,still needs to be done, but, uh, but yeah. Um,yeah. When it comes to LLMs, you know, what do you think?Do you think LLMs and, and Vector search,are these going to be going hand in hand? Uh, are these going to be,are these going to be, you know,going at the hip basically for a lot of applications moving forward?Or do you think we're going to, you know, figure out a way where we canimprove, let's say, you know, improve the performance for long context for very,very long context models, um, and we'll use Vector database more for,um, you know, storing different storing conversations between,between different sessions? Uh, I guess what I'm saying is, you know,do you think that these are going to be really, really,you know, going hand in hand moving forward? Um,or are we going to see sort of a shift in this space?Well, yeah, I think they're very complimentary to each other.
So, uh,vector search is more like a tool that can be used by people whoalso use large language models. So like, uh, like why,why do we even need, like, those language models? So we,we want to have, uh,text and other data represented as vectors. Mm-hmm. So they, and uh, that, that makes it comparable. Uh,you can infer other like, numerical values out, out,out of it.
And, uh, yeah, if you have vectors,so you have concept of similar, right? And, uh, you need, you need,you need tools, uh, to scale it to large silos. So maybe, yeah. Uh,like vector databases might not be needed that much for a small scale, right?Right. So if you don't have much data, right? But, uh, like for like,serious applications, so that quickly gonna be a bottleneck. So when you have, you have too much data, um,especially if you want to filter them.
So it's gonna be a bottleneck. And it is bottom, like, like in recommender systems right now. So due to scale, I mean, you have to, you have to use, uh,approximate nearest neighbor search. Yeah. So to make it work.
And, uh,it's probably gonna be used and l lamps as well,and the in inside the lamps. So it also can be used probably like, uh,in, in the retrieval argumented approaches. So when you don't just use,so when you retrieve, uh, like not user data, but even,uh, like your training data. So you don't want to store all information from your training set, uh,in your model, so that, that is not very efficient. Uh,so yeah, you can put it, uh,couple it with your model as well.
So it's just like a tool. So when you are making your new LM or you're making a product together withlmm, so you want to make, uh,vector like finding nearest neighbors faster, right? So, um,yeah, you, it, it it's very handy if you have tools to do that. Yeah. Yeah. And on the topic of sort of recommender systems as well, um,I like to dive a little bit deeper into that too.
I think, you know, there,for a lot of folks who are new to Vector databases, uh, who have been,you know,around vector databases such as Melva or Zillows around the time of lms,around the time of chat, pt, you know, of, of Bard, Claude, uh, you know,vector databases actually have actually been around for quite a while. Um,you know, we in particular, we've been working, uh, we've been working on Novis,uh, and you know, for,I would say the biggest use case prior to LMS has been around recommendedsystems and has been, you know,the capability to do really content awarerecommendation. Right. And sort of coming back to this topic of Rex uri, you know,what are, I guess we can start broadly as well,I'd love to get your your thoughts on, hey,what are some of the biggest challenges associated with building recommendersystems? And it doesn't have to be a related to vector search, just in general. Well, uh, like the biggest, so I, I like, I think there is a landscape of ml,so am I, so there is, uh, like computer vision, there is, uh,natural language processing, and they are kind of like conversed.
Uh,they use the same architecture, so they train on similar data. So it seems like images also help with text. Uh,but the recommender system is, uh, kinda like different. And, uh,I think the major issue is recommender systems is that recommender systems arealmost never open source. So, uh, they work with,uh, the company data and, uh, to make a good recommendation.
So you have to support really, like,like a heterogeneous set of features which are crafted by somepeople specifically for your product. And, uh, that makes,uh, preor recommender systems and, uh,like having strong baseline and recommender systems complicated. So every time you go to like a new company right now, and, uh,you have a task of like, build a recommender system. So there are like things you need to do,and it's kind of like starting from scratch. And, uh,yeah, I, I, I, I believe that at some point of time,so people will come up with a generalizable solution solutions for,uh, recommender systems.
Uh, so that like,well, and there are, there are definitely startups. So for instance,uh, there is kumo ai, right? Uh, like the one that, um,has yep. Some con connections to, uh,so there are other, but like, the goal is,so use like a typical user of the, uh,recommender systems like the customer. Uh,so they will provide some data in a well known format. You have to instruct the user really well, what is needed,and then it can come up with some solution.
So that solves the, uh, the problem by the user. And the user have to be like,really well at describing what they exactly want. So maybe, yeah,right now, LMS gonna help to like, to gap to,to, to reduce the gap so they can like, uh,uh, yeah, so like my understanding that like, uh,recommender systems are gonna be changing and, uh, yeah,right now, like if you,if you're a new company and you want to do recommender systems,so you don't have, uh, too much money on, uh, like hiring expensive engineers,like hiring them from meta Google, like, um,other well paying companies. So it's like,probably like not, not that many options. Uh,so I think that's a also a good market.
And, uh, pi nearest neighbor search is,is, uh, like a vector search is, uh,like critical component in many other systems,especially at big systems. So when you need to, to,to scale two millions and more, and, uh, you want to save money, then yeah,you need to use vector,Yeah. Vector databases. Yeah. You, well, uh, yeah, you hear it here, folks.
Um, yeah, I guess, uh,recommended being in recommendation is, is a, uh,is well,can be a pretty big boon or can be a pretty big money maker for anybody who'sthere, but, um, but yeah, I think I have a tendency to agree with you,which is that we're going to see some shifts in how people do recommendationlater on. Uh,I don't think it's going to be purely about vector searchfrom, from my opinion. So I think there, there definitely needs to be a lot of,uh, it's definitely going to be, you know,mixed between vectors as well as any existing metadata that you have. So if I think of, for example, a YouTube video, um,or I think of any other video platform, understanding the content of that video,but understanding also, hey, maybe who are some of the, you know,if it's a video, you know, if it's a clip from a TV show, uh,maybe understanding some of the people who are in this video,some of the actors who are in this video, um,maybe understanding some of the actions that are happening in the video as wellthrough these very, very specific tags, uh, that could be pretty important. Um,so I d I I, I do, I do agree with you though, and especially in the LM era,I think, uh, you know,we're seeing a lot of changes in sort of these traditional, um,platforms, so we'll see.
We'll see. Uh,in your experience when building these recommended systems, what is,what are some of the most challenging things that you've come up, you know,that you've encountered, uh, that you've seen? Well,Yeah, the, like, it's, I think it's a general, uh, general,uh, answer. So the biggest issues with, uh,recommended system is data is having correct datathat, that, that doesn't have any bugs and having correct targets. So if you, if you, if you have, uh, good data,if you have like a especially structured data, so the, then the joints,the joints are like a big issue with recommender systems. So the,the problem becomes very simple.
So you just train the model and, uh,yeah, put it in production. But, uh,usually there are some issues with data. So the, or the, the,the target is not properly defined. So for instance, like you,you are doing a ranking objective, uh, like you're,you want to rank something, some items, but, uh,you do a point wise loss instead, instead of that. And sometimes,especially like if you don't account for, uh, positional bias, so like,maybe like you are, you are even distilling the previous model that you trained.
So, so there are many, many issues. Uh, so, and I think engineers mostly spend, uh,time working with the data, making sure that the data is good,that the objectives that are optimizing make sense, and, uh, then,uh, well, the, the, the models. So, and right now, maybe like it's, uh,a pretty obvious thing. So, uh, like, uh,there is a progress in user understanding. So, and, uh,now most of the user understanding is just a transformer, uh,applied to user history.
So it's also okay, like converging with,uh, computer vision and, andYeah. So when it comes to,I suppose, I think you made an interesting point there,which is that oftentimes when it comes to recommended sims, what you,especially if you're looking to do vector search or semantic,semantically represent any, you know, any of the items that are in your,that are in your database, uh,a lot of the time is spent on doing a lot of data wrangling and working on themodel and making sure that you actually have the,that the function that you're actually optimizing for is what you really,really, truly want. And I think it's,and we probably spend a little bit too much time, uh,probably all of us when building these systems, trying to figure out, okay,what is the best way to, you know,try to figure out more engineering problems rather than really understand thedata. So thank you for that. Really, thank you for that bit of advice.
I think very helpful as well, sort of coming up in the last 15 minutes here. Um, and, uh, you know, first of all, thanks for everybody. Thanks for everyone here for, for sort of sticking with us. Uh, I wanna,I wanna sort of talk, uh, I just have one last question, um, that I really,really dive. I'd like to dive a little bit deeper into, uh, and get,get your perspective on it as well, which is,what do you think is the future of Vector Search and what do you think is thefuture of Vector databases?Well, I think they're gonna be improved and become like,very useful tools that people gonna use.
It's like you have and you have now, it's a very good tool. You just have vector database, so, which, which is easy to use,um, right, yeah. When, whenever people feel to use it, they just use it. It's symbols. Yeah.
Do you think we're gonna, do you think we're gonna still continue to see,well, I mean, I hope, I hope so, but do you think, uh,do you think we're gonna see sort of new methods for vector search,new vector search algorithms really coming out much more frequently, um,or do you think most folks are gonna stick their guns and we're really gonnawork more on improving the embedding generation techniques? Um,maybe improving, you know, uh, in better models and models for retrieval,um, the ability to re-ranking, so on and so forth?Well, uh, I think that, uh, like nearest neighbor church,like vector search is a pretty like, refined field, so there are like,fair enough, yeah, there are papers,like there are good papers from the seventies,so there is like area amount, which is like nineties. So, uh,I can, there, there is definitely, like,I believe there's going to be improvement. And, uh,right now the interest in nearest neighbors are Spike,so there will be more work, that's for sure. And there will be more papers. Uh,but I, I, I don't think there are gonna be like a giant breakthrough in terms ofperformance.
There are going to be incremental gains. Fair enough. Uh,maybe like it is also will make easier for users to use. So there'll be like, uh, usability improvements,which might make them like, might, might make a big difference for,for the users. Uh, but, uh, yeah,most of the progress probably gonna be in representations, so FairEnough.
Getting better representations. Fair enough. Yeah. Um, yeah, uh, that really brings us to the,to the end of this, this back and forth that we have. I know we went probably more into broader questions,talked about some vector search, uh, talked about hsw,some of the history behind it.
Uh,we didn't really need to dive too deep into the technology there. Um, but, uh,I'd like to sort of open the floor up to some questions, you know,if there's any, um, I see we already have two here,and if there's any other questions from the audience as well, you know,I'd love to take them. Uh, I definitely will take them as well. You know,I'd love to take 'em here. Uh, the first one is, uh,do you have any comments on using encryption for a sub graph controlled accessor vector search workarounds for graph that use encryption to enforce privacy?So this question seems to be more about, um, graph,graph based indexes and security, uh,graph based indexes and, and privacy.
Well, I'm not very familiar with, uh, privacy. So like what, what is defined as privacy? For, for, for,for for graph, so for graph search. So like it's, uh, when you search, so yeah,I, I'm not sure like I understand like what, what,what does definition of privacy in that case. Yeah, my, I guess my guess my, you know, if I were to, if I were to,my understanding the question is more around, um, if I have, let's say,user roles or if I have, you know, I have a very,very large graph of vectors and I wanna be able to, you know, each,each vector is going to be associated with a either a potential role or it'sgoing to be associated with, you know, something along those lines, right? So,mm-hmm. Uh, okay.
I'm not 100%, and again,I could be wrong, but I'm not 100% how that relates to, uh,how that relates to encryption, but that is my guess to what,what this particular, uh, question is related to. Mm-hmm. Mm-hmm. Yeah. Okay.
Well, yeah, maybe like if a U user, uh,wants to, to have, like, control over the data that is shared with,uh, an index, so there might be a sub graph and there are works,there are papers that, uh,they have like a big graph and then they have an isolated sub graph,and that still works to some extent. So if, uh,you can encrypt everything, they can have a,like a key that I can and the decrypt all the links and data,like for a given user that, or, or store the data on users' device,then yeah, maybe that, that might be a, a solution for that. Frank, I just wanted to jump in. There's a clarification in the chat. Um,the q and a question on privacy of my question was about roles and access todata of a department within an organization, for example.
Okay, got it. Yeah, thank you for that context, Emily. I appreciate that. Um,yeah,but I think that's pretty much in line with what we were talking about a little,what we were talking about and answering this question as well. Uh, the,in particular, um, I think the ability to storevectors as well as metadata or vectors as well as scaler fields, um,and be able to co-locate them and be able to do approximate news,neighbor search, but filtered, uh, I think is is,is probably the important component that you're looking for there.
Um,and you know, that's a avail that, that,that can be easily done with something like H and s w, um,to be able to co-locate that with, with metadata fields. So, great question,by the way. Thank you for that one. Uh,the follow up to sort of the chat about h and s w, uh,there's a question here about, uh, uh,does a small world characteristic of h and s w arise fromthe hierarchical long range edges alone,or is it also influenced by the early edges established during the incrementalbuilding process?Mm-hmm. Well, the early,the early edges are pr uh,you like inside the heuristic of H S W,so it's safe to say that they,uh, oh, they don't play a big role, so they're from,for a different reason to make it more, uh, memory flat,memory friendly.
But, uh, yeah, during this,like this process,it doesn't hurt the performance list like significantly. So it's safe to say that like on load dimension data,the algorithmic scalability is due to the long rangeedges alone. GotIt. Okay. Yeah, thank you for that question.
Um,and there's a follow up to that question as well, which is,what are the primary constraints that you believe the current graph a n methodsare encountering?Uh, well, there is a constraint that, uh,the distance calculation,like at least an s w implementation channels, w so the memory,sorry, the distance implementation, uh, might be slow. So like,that's the easiest thing to, to solve. So if you have a, so because, uh,hs w it sees the data as distances between the elementsand, and the query, uh,ev every time you have a comparison that calculates like the, the,like,if you have vectors like that is like L two or kaine or innerproduct, and that might be expensive. So yeah, I think that is, yeah,one of the biggest limitations. But, uh,you can compress the vectors like in face, so there is, uh,uh, like vector compression together with, uh, H N S W,and that can speed up things.
Um,yeah, that's, uh, like can, uh,also, there might be some other constraints, uh,but, uh, I, I, I'm, I'm not sure,like maybe if there is a clarification,so what kind of constraints, uh, are, are meant,then maybe I could answer it better. Yeah. Uh, yeah, if we can, if we can get a clarification for that,that'd be great. But my best guess as to constraints probably, you know,I think, I think you, you put it very well. So the distance calculation,distance computation is one of those aspects, but, um,I think maybe another one of those could be related to, let's say memory.
Uh,you know, how can we re reduce the,how can we reduce the footprint of the index, um, of the Emory Index by bit? Um,especially because oftentimes on,oftentimes as indexes are actually larger than the,than than the data set itself. If you're not doing any compression of the vectors,you're not doing any quantization. Um, if you're not doing any,uh, you know, if you don't, you don't have any sort of, uh,any fine quantizer to, to, to sort of lead on, lead into that. Uh,so I guess one of the challenges could be around that, right?So how do we really reduce the memory footprint and how do we make, um,you know, graphs, indexes, especially ones that store not just a sub graph,but the entire dataset and memory, how do we make that, uh, more efficient? And,uh, yeah, I, I don't know. Um, I'm not sure, but, um, you know,hybrid indexes are one solution.
Um, yeah, you're a, you know,as you mentioned earlier, as you mentioned as well, you know,figuring out a way for us to be able to do more efficient computation, uh,is gonna be another solution as well. So, um, oh, the follow up,the clarification here is, uh,related to footprint indexing time in large dataset sizes. Mm-hmm. Mm-hmm. Mm-hmm.
Yeah. Well, yeah, yeah, yeah, there is a,definitely is a, a constraint on memory. So you have to at least few links per element, and that means you,you need to have like at least 10 bytes as minimum, or,or maybe more. So. And, uh, if you want to avoid that,you have to cluster the elements and, uh,so you don't store the individual elements you store like a blob.
Yeah. So you can work around that, but, uh,it's gonna reduce the speed recall, uh,trade off. So yeah,We'll reduce, but yeah. Yeah,great question, by the way, thank you for that one. And our last question here,I know we're coming up at the top of the hour, um, and there's a,there's a bit of housekeeping at the very end as well.
Is what is your opinion on dense plus sparse vector search?Well, it's, uh, like, uh, yeah,I think it's a very good for church. Like, I mean, church. FairEnough. Yeah. So that is very relevant and, uh, like, uh,like as going back to, uh, graph ink.
So graph ink, uh,they supported out out the box because, uh,they work with distance. So if you define dense plus part distance,so it's gonna work and, and, and the must slip. So there are sparse distances and, uh,the in index works with spar distances,so there might be some hacks how to make it more efficient in your case,like in, if you're talking, say with, uh, BM 25. So BM 25 is a heat based method. Uh,so if you add it as, uh, inter points for the graph with your, uh,sparse plus dance metric, so let's gonna converge pretty quickly.
So, but, uh, yeah, that, that, that is, uh, like, and, and, and, uh, yeah,s w to make it work, uh, with, uh, sparse synthesis. So there,there are some hacks with inter points. Yeah, well look forward to seeing, uh,look forward to seeing potentially that support being in there for H S W live aswell. Um, but that is the final question that we have, uh,in the q and a session. Uh, you know, I'm, if,if there are any follow up questions, uh,if there's any questions that folks didn't get a chance to ask Yuri for,for this session, feel free to follow up with us.
Um,and we'll try to get to those. But Emily, back to you. Thank you, Frank. Um, Frank, Yuri, what a great conversation. Um,I know that I learned a lot.
I know that I probably have more questions than I have answers right now. Uh,hopefully all of you enjoyed the session as well. Um,so thank you so much for joining us. We hope to catch you on a future, um,Zillow's webinar. You can find those at zillows.
com/event. Uh,we hope to see you in the community. Thank you both. Thank you. And thank, thank you for the questions.
Meet the Speaker
Join the session for live Q&A with the speaker
Yury Malkov
Distinguished Software Engineer at VerSe Innovation
Yury Malkov, a Distinguished Engineer at VerSe Innovation, is currently developing crucial recommender systems that serve 100M+ users. His published research, boasting 2,000+ citations, spans CV, LLMs, RecSys, and Similarity Search, featuring widely recognized open-source solutions like HNSW and Learnable Triangulation. Yury's professional interests encompass constructing highly scalable content machine learning systems.