Webinar
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use Vector Database for Your GenAI Apps
What will you learn?
Join us for an introduction to Milvus Lite, a vector database that can run on notebooks and laptops, shares the same API as Milvus, and integrates with every popular GenAI framework. This webinar is perfect for developers seeking an easy-to-use, well-integrated vector database for their GenAI apps.
Topics Covered
- Milvus Lite Overview: The thought process behind crafting a lightweight yet powerful version for local use.
- Design Principles: Key features and architectural choices that ensure a smooth developer experience.
- Quick Start Guide: Tips to get Milvus Lite up and running on your machine.
- Live RAG Demo: Watch a live RAG app being built with Milvus Lite.
Attend to discover how Milvus Lite can fit into your GenAI app development journey, and start building right away.
I'm pleased to introduce our guest speaker for today's session, Introducing Milvus Lite: Jiang Chen, Head of AI Platform and Ecosystem at Zilliz, with years of experience in data infrastructure and information retrieval. Jiang previously served as a tech lead and product manager for search indexing at Google. He holds a master's degree in computer science from the University of Michigan. Welcome, Jiang. The stage is yours.
Hello, welcome everybody. Let me do a quick introduction of myself. As Stefan introduced, I currently head ecosystem and AI platforms at Zilliz. In the past few months, in addition to the ecosystem integrations, I've also been leading the Milvus Lite project.
Today I'm very excited to introduce this new offering of Milvus to you: a lightweight vector database that you can run as a Python library in your AI application. It's super easy to install, super easy to get started with, and I hope you will like it. So, without further ado, let's review a bit of background on why we want to build a vector database, and specifically an easy-to-use, easy-to-install vector database that's perfect for AI developers getting started with their AI development stack. The story of vector retrieval comes from the advancement of deep learning and neural networks.
Traditionally, search was done by terms and tokens, which is what inverted indexes and stacks like Elasticsearch were used for. The idea is pretty much that we break the documents apart into tokens and keywords, we also generate keywords from the query, and then we match the keywords from the query to the documents to finish the search. But this mechanical way of doing search results in some undesirable behavior: even when a document is talking about the same idea as the query, it cannot be retrieved because the keywords don't exactly match. For example, "what's your age" looks very different from "how old are you."
They don't share many common keywords, but they actually have the same semantics and the same meaning. This is where deep learning and neural networks come into play, bringing the idea of probability into search. With a deep neural network, we can encode and represent documents as vectors that capture a sense of the document's meaning, and the same is done with queries. When we match the two representations of the query and a document, if they are close in the latent space, in other words if the distance between the vectors is shorter compared to the other documents for the same query, then there's a high likelihood that they share the same semantic meaning. Around this idea, many techniques, machine learning models, and data infrastructures have been built.
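As a minimal illustration of this idea, here is a toy sketch with made-up 4-dimensional vectors standing in for real embeddings (which typically have hundreds of dimensions):

```python
# Toy sketch: two texts are "semantically close" when their embedding
# vectors are close, e.g. have high cosine similarity.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up 4-dimensional "embeddings" for illustration only.
query_vec = np.array([0.9, 0.1, 0.0, 0.4])  # "how old are you"
doc_close = np.array([0.8, 0.2, 0.1, 0.5])  # "what's your age" (similar meaning)
doc_far   = np.array([0.0, 0.9, 0.8, 0.1])  # unrelated document

print(cosine_similarity(query_vec, doc_close))  # higher score: likely relevant
print(cosine_similarity(query_vec, doc_far))    # lower score: likely irrelevant
```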
Some of you may know embedding models, in architectures such as BERT or the Transformer more generally, and the vector database is the piece of infrastructure introduced for this: with these great models we can generate embedding vectors, and we still need something to store and efficiently retrieve those vectors. That's what a vector database does. On top of this technology, another technical paradigm has emerged in the last 12 months and become very popular among AI applications and developers: Retrieval-Augmented Generation (RAG).
The idea is to use the embedding model and vector retrieval together with a large language model to improve the LLM's capability on a specific domain of knowledge. For example, suppose you are a company with internal data that you would not like to share with the external community, or that you don't want used as training data for a large language model. In that case, you can still use this data to build chatbots that provide a great question-answering experience to users with vector retrieval. The way it works is that we first index the knowledge base as vectors and store them in the vector database, as shown in the bottom left. With these vectors, we have essentially compiled a knowledge base that can be efficiently searched and retrieved from.
Then, when a user asks a specific question, say a very specific question about index types in Milvus that the large language model probably can't answer because it was not trained on the Milvus documentation, the query is converted into a query vector with the same embedding model we used for the documents. The query vector is searched in the vector database through approximate nearest neighbor search to find the nearest vectors, which very likely have a similar meaning or related semantics to the query, so it's very likely those documents will answer the question. Once we retrieve the top-K documents from the vector database, we compose a prompt with the user question, the retrieved docs, and some instructions, and send it to a large language model. The large language model is smart enough to do some reasoning and summarization, and eventually provide a high-quality answer that matches the user's expectation.
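To make that flow concrete, here is a minimal sketch of the RAG loop just described. It is not the webinar's code: it assumes a pymilvus `client` and embedding function `embedding_fn` like the ones shown later in the demo, plus an OpenAI-style chat API; the model name and prompt wording are illustrative.

```python
# Minimal RAG loop sketch: embed the question, retrieve top-K docs from
# Milvus, compose a prompt, and ask an LLM.
from openai import OpenAI

question = "How many index types does Milvus support?"

# 1) Encode the question with the same embedding model used for the docs.
query_vec = embedding_fn.encode_queries([question])

# 2) Approximate nearest neighbor search for the top-K related chunks.
hits = client.search(
    collection_name="demo_collection",
    data=query_vec,
    limit=3,
    output_fields=["text"],
)
context = "\n".join(hit["entity"]["text"] for hit in hits[0])

# 3) Compose a prompt from the question plus retrieved docs, then ask the LLM.
llm = OpenAI()
resp = llm.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(resp.choices[0].message.content)
```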
In the example above, it would answer that, as of last month, Milvus supported 11 index types, instead of just making something up. So with this background, why do we build Milvus Lite, given that Milvus is already doing great in this domain? The reason is that, despite Milvus being loved by many developers, we noticed it's not that easy to get started with. For example, right now the easiest way to use Milvus is through a Docker deployment.
To use Docker, you need to download some scripts and run them, and even though there aren't many commands, it's still kind of scary to some developers who are only familiar with the Python and pip way of developing applications. It also requires you to install Docker on your host or development machine. What if we had something way easier? Say, if I don't care about scalability at this moment, can I just use something simpler? Well, the answer is yes, because Milvus Lite is designed for this. The idea is that instead of downloading Docker and a bunch of scripts, you just need a Python environment and pip install pymilvus, and that will do everything for you. It's pretty much just a Python library, which you can import like the example shown below: from pymilvus import MilvusClient, and then you instantiate the client with a local file, just like how you use SQLite.
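In code, the quick start the speaker describes looks roughly like this (a sketch; the file name matches the demo shown later):

```python
# Quick start sketch: `pip install pymilvus` provides both the client and
# the Milvus Lite "server".
from pymilvus import MilvusClient

# Passing a local file path starts Milvus Lite and persists data to that
# file, much like SQLite.
client = MilvusClient("milvus_demo.db")
```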
With this local file, it not only loads the data you subsequently insert into memory, but also materializes and persists it on disk as the file specified here, in this case called milvus_demo.db. In this way it provides a similar experience to Milvus deployed on Docker or even Kubernetes, and it's lightweight and efficient enough to run on your laptop, in Jupyter notebooks, and on edge devices like smart home devices and mobile. The way it works is that we strip off all the heavy-lifting components of Milvus that are designed for scalability on distributed systems. If you look at the earlier architecture, there are query nodes, there are data nodes, there's Kafka as a message queue; a lot of great machinery for handling high throughput and high traffic, designed for horizontal scalability.
Just by throwing in more machines, it becomes more scalable; it can sustain you up to tens of billions of vectors and thousands of QPS of search traffic. However, that's not necessary for most AI applications, which are at a very early stage. Simply put: I don't have much data, I don't even have a million vectors, or I don't want to think about scalability at this moment. My first priority is to move fast, or I just want to learn about the AI technical stack.
I just want to learn how to build RAG, rather than having to install, say, Docker or Kubernetes as a prerequisite; that's too much to ask. So, to solve that problem, we strip off all the heavy-lifting components so that it's lean and small enough to fit on your local laptop, or even on machines with very limited computing resources, like edge devices. What we keep is just the very basic, very minimal vector indexing and retrieval, plus persistent storage. Let me walk you through from left to right. On the left I have my user application, say an AI chatbot.
In the chatbot I import MilvusClient, so that I have a client to talk to the core of Milvus Lite, which in this case is the server-side code. The client can insert, upsert, and delete data records through Milvus Lite. It first talks to a parser that parses the query. The reason we need a parser is that the client doesn't only express semantics like "insert this vector for me"; with the metadata filtering feature, a query may specify "find the closest, say, top 10 vectors to this vector, but also following this filter expression," such as: I want the subject to be history.
I want the version to be 2.4.0 rather than 2.3.0, things like that. So there's an expression that comes with the query. The parser is mostly used for parsing that expression and converting it into an internal representation of the query intention, which we call an execution plan. The execution plan is passed to the core for further processing. On the other hand, we also want to materialize the data insertions and deletions, so that next time you restart the program, your state won't be lost.
The state will be reloaded from the local persistent file you specify in the MilvusClient instantiation. There's a storage converter that converts the insertion and deletion operations into file operations and materializes the data in the local file. Once that's done, we're ready to do the real vector search, or to update the index to reflect the recent inserts and deletions. The core talks to the in-memory index, which is the state we keep in memory, and updates the index so that vectors are added or removed. If it's a search, it uses the index to run whatever search algorithm you specify, like HNSW or IVF, depending on whether you care about that.
If you don't care, you can specify nothing, and AUTOINDEX will choose for you. If you do care about the algorithm, you can specify which index or algorithm to use. Once everything is done, the result is passed back to the parser and eventually back to the MilvusClient, and your user application gets the vectors back, or finishes its operation on the vector storage. With this design, you can now use Milvus pretty much everywhere.
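For the index choice just mentioned, here is a hedged sketch of explicitly picking an algorithm and metric with the MilvusClient index API, rather than relying on AUTOINDEX. It assumes a `client` like the one created in the quick start, and note that Milvus Lite supports fewer index types than a full Milvus server, so treat this as server-oriented:

```python
# Sketch: explicitly choosing an index and distance metric.
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="vector",
    index_type="HNSW",    # e.g. HNSW or IVF_FLAT; omit everything for AUTOINDEX
    metric_type="COSINE",
    params={"M": 16, "efConstruction": 200},
)
client.create_collection(
    collection_name="demo_collection",
    dimension=768,
    index_params=index_params,
)
```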
In your GenAI application, you write the Milvus-related client code once: you instantiate the client, you do insertion and search. That code works with any kind of Milvus deployment, whether it's Milvus Lite running on devices with very limited computation resources, or Milvus running on Docker, Kubernetes, or even a cloud like Zilliz Cloud, which shares the same API as Milvus. The only change you need is to switch the endpoint in the MilvusClient, because once you've set up a new server, you point the client at the server endpoint while establishing it.
But that's mostly it. For the data, we also thought about how to move data conveniently from one deployment to another. Milvus now supports a data import feature: you specify a file in a format such as JSON, and you can load the data into a new collection on any Milvus deployment automatically with just an API call. We're also developing convenient command-line tools to help you do this, so you won't even need to write a few lines of code for the API call. So that's the flexibility on the deployment side.
We've also integrated Milvus into pretty much every popular AI development toolkit, including well-known ones like LlamaIndex and LangChain. Beyond that, we integrate with a lot of embedding model providers, like Voyage AI, Jina AI, and OpenAI, which provide embedding models as a service. We've also integrated with BentoML, which does model inference hosting; they have both open source and cloud offerings.
We also work with other projects built on very novel concepts. For example, Ragas streamlines the process of RAG quality evaluation: after you develop a vanilla RAG app, you usually care about its quality and want to run experiments and measure it. With Ragas, you can get quantitative numbers out of your RAG implementation, so that when you change the implementation or do some optimization, it gives you a different number and can tell you whether the change made things better or worse, and by how much. We've also integrated with MemGPT, which works on agents with memory.
A vector database plus embedding models is a great way to encode and store the memory of agents, so that they remember what users have told them in the past. The Milvus class is integrated into MemGPT, so in an agent implemented with MemGPT you can use Milvus to store the state, and you share the benefit of Milvus's flexible deployment modes. We also have data-source integrations that help you build data processing pipelines, moving data from one source to another and doing the embedding and other conversions for you, and many more projects besides. I'll pause here for some questions, and after that I'll show you the documentation and some live code examples of Milvus Lite.
Any questions from the audience? No question directly so far. Okay, someone was just very impressed by and happy about Milvus Lite in general and how easy it is to use. Okay, then I think we can move on to the next chapter. For detailed information about Milvus Lite and Milvus, you can go to the milvus.io website; we've just redesigned it to surface a lot more information. As I said before, the Milvus Lite experience is now just a pip install away.
Once you pip install, you can use the same API shared between Milvus on Docker or Kubernetes and Milvus Lite, for collection creation, data insertion, search, and more, all of which are quite intuitive. We also put up pointers to instructions for the different types of deployment. For Milvus Lite it's straightforward, just pip install; for Docker and Kubernetes it's a bit more complex, and you can visit the documentation website for the details. You can also find helpful scripts in the Install Milvus section, which covers Milvus Lite and Docker.
There are different modes of Docker available, and also Kubernetes, which is a bit more complex: you need to set up the operators or use Helm to do the deployment. We also linked the integration documents with a bunch of our partners. This is not a complete list; there are many more in the documentation.
Last but not least, we also run an in-person meetup called the Unstructured Data Meetup, which happens in many cities around the world, from San Francisco and the South Bay to Berlin and Seattle, and maybe New York in the upcoming months. You're welcome to reserve a seat at an upcoming meetup and talk with us in person. So, without further ado, let's check out the documentation, which has a lot of great coding examples to show you how to use Milvus Lite in action. I'll start with the quick start here. These documentation pages are also designed as notebooks, so you can run them directly in addition to copy-pasting the commands into your local development environment.
Beyond the basics, we also have deep dives on different concepts in Milvus, such as consistency and multi-tenancy, which are concepts mostly relevant to the server, as well as how to use different kinds of index to speed up your search performance, and detailed explanations of schema and data model design. We won't cover those here because each is another big topic, but if you really want to design a very efficient system that can do very complex operations, learning about schema design is very important. In addition to that, we have very comprehensive integration documents.
As you can see here, we've integrated with many partners in this list. Take LangChain, for example: for anyone working on retrieval-augmented generation, LangChain and LlamaIndex are probably the best-known and most used frameworks to help you design data pipelines for indexing, data processing, and retrieval, and some even provide agent capabilities. If we have time, and I think we will, I'll do a deeper dive on how to use LangChain to implement advanced concepts in RAG that improve the quality of question answering, but the basics are here.
Okay, let's jump back to the quick start and run this in action. As I mentioned before, the first step is pip install. This command installs not only the client part of pymilvus, but also the server part of Milvus Lite. I put "server" in quotes because, as mentioned, it's lightweight: under the hood it uses gRPC to communicate between the client and the server, but not over a network; it goes through a local file.
That local file is different from the local file for persisting the data: it's a temporary local file used pretty much as a bulletin board to exchange information between the MilvusClient and the Milvus Lite server component. For convenience, we include all of the components in pymilvus. If you have a Milvus server, then pretty much all you want pymilvus for is the client. But if you also want to use Milvus Lite, pymilvus gives you the server part as well, so you don't need to install another Python package just to have an easier way of using Milvus.
So with just pip install, you can import MilvusClient, which is the abstraction we use to talk with Milvus. You can instantiate MilvusClient in a few different ways. In this case, because we want Milvus Lite, we just specify a file name: the client will be instantiated with Milvus Lite, and it will create the file for you. If you want to talk to a server you deployed before, you need to specify the Milvus server endpoint and token. If it's a Milvus server on Docker, the endpoint is very likely localhost with a port, and the token, depending on whether you have set a password, will be the username and password.
If you are using Zilliz Cloud, it will be the network endpoint of your Zilliz Cloud vector database cluster, with an API key as the token. So it's still the same API, just with slightly different flavors. Let me run this now.
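For reference, the three flavors of instantiation just described look roughly like this (a sketch; endpoints and credentials are placeholders):

```python
from pymilvus import MilvusClient

# 1) Milvus Lite: a local file, created for you if it doesn't exist.
client = MilvusClient("milvus_demo.db")

# 2) A Milvus server on Docker/Kubernetes: endpoint plus username:password token.
client = MilvusClient(uri="http://localhost:19530", token="username:password")

# 3) Zilliz Cloud: your cluster endpoint plus an API key as the token.
client = MilvusClient(uri="https://<your-cluster>.zillizcloud.com", token="<api-key>")
```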
With this, we've created a new database, and right now it doesn't have any collections, so we need to create one. The minimum you need to specify is the dimension: how many dimensions a vector has. Beyond that, you can specify many more parameters. If you want to design the schema yourself, you create a schema object and specify it here; for example, if I want to specify what kind of labels I have alongside the vector, I'd define that in the schema.
However, if you have no idea what kind of labels you'll have at this moment, or you just want to omit them, use anything, and not worry about the performance implications, you can skip the schema. By defining a schema, you can define indexes on specific scalar (non-vector) fields, which enables more efficient metadata filtering. But if you don't have metadata, say you just have thousands or tens of thousands of vectors, a very small scale of data, then you don't have to specify a schema, and that's what we're doing here.
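The minimal collection creation from the quick start, as a sketch:

```python
# Only the vector dimension is required; it must match the output
# dimension of your embedding model.
client.create_collection(
    collection_name="demo_collection",
    dimension=768,
)
```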
What this creates is a vector database schema with only an ID, a vector, and any labels you want to put there; we'll see that in action in a bit. But before that, remember that to do vector search, we need vectors in the first place, so we need to install a package for embedding models that can convert, say, text into vectors. We provide a convenient wrapper called model. This sub-package wraps a lot of model providers, like the ones we mentioned before, OpenAI, Voyage AI, Jina AI, and also open source embedding models like BGE and sentence-transformers, which covers many open source embedding models.
With it, we can convert text into vector embeddings with just a function call. In this case I put three example text strings as documents. What we do here is import the model package from pymilvus and instantiate a new embedding function. For now we don't really care which embedding model to use, but once you're familiar with this, you'll probably have your own preference, because models have different quality implications, and providers also develop so-called vertical embedding models: some are good for legal content, some for medical content or finance.
Feel free to choose whichever suits you, or models suitable for different languages. For demonstration purposes, let's just use a small model, paraphrase-albert-small-v2, which is only about 50 megabytes, so it downloads within a few seconds and we can use it to embed. Now we have three document pieces, all talking about either AI or the person who founded the concept of artificial intelligence, and we use embedding_function.encode_documents with a list of documents to generate the vectors.
Now we have a list of vectors, each one mapping to a document element of the input list. The reason we say encode_documents, rather than just encode, is that due to the design of some embedding models, documents and queries are encoded slightly differently; some add a slightly different salt to the encoding process. So we provide two convenience functions, one called encode_documents and another called encode_queries, so you don't need to think about this; just make sure you use encode_documents for document strings and encode_queries for query strings. For models where there's no such difference, we provide a plain encode function.
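A sketch of the embedding step described above, using the pymilvus model sub-package (this requires `pip install "pymilvus[model]"`; the three document strings follow the Milvus quick start):

```python
from pymilvus import model

# The default embedding function downloads paraphrase-albert-small-v2
# (~50 MB) on first use and outputs 768-dimensional vectors.
embedding_fn = model.DefaultEmbeddingFunction()

docs = [
    "Artificial intelligence was founded as an academic discipline in 1956.",
    "Alan Turing was the first person to conduct substantial research in AI.",
    "Born in Maida Vale, London, Turing was raised in southern England.",
]

vectors = embedding_fn.encode_documents(docs)  # one vector per document
print(embedding_fn.dim, len(vectors[0]))       # both should be 768
```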
If you print the dimension of both the embedding model and the vectors it generates, they should match the dimension you specified above, which is 768. Here we can see that after downloading the embedding model, it prints out a dimension that matches what we specified in the collection's schema. Otherwise it won't work, because what you insert has to match what you define in the schema. We also construct the whole data object, which we will insert into the vector database in one single operation.
In addition to the vector we just generated, each record has an id, which is used as the primary key, and also the text, so that when we retrieve the vector we can carry the text along with it and don't need to look it up somewhere else, like setting up a separate SQL store just for this. And just for fun, I also added a label called subject that describes what subject the text is about. For example, if I want to run a query, but only within the subject of history instead of, say, biology, I can specify that filter in Milvus's query language, which I'll show a bit later. Now we have the data object, which has three entities mapping to the three string objects.
Each entity has four fields: id, vector, text, and subject. If you couldn't download the model for some reason, like a network issue in your local environment, you could also generate fake vectors just to finish the demo, but that's not as interesting, so we won't cover that part. Now we have the real vectors, which represent the semantics of the text, and each of them has an interesting label called subject.
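A sketch of the entities and the insert call covered in this step (ids, text, and subject as in the demo; `docs` and `vectors` come from the embedding sketch above):

```python
# Each entity carries the primary key, the vector, the raw text, and a label.
data = [
    {"id": i, "vector": vectors[i], "text": docs[i], "subject": "history"}
    for i in range(len(docs))
]

res = client.insert(collection_name="demo_collection", data=data)
print(res)  # e.g. {'insert_count': 3, 'ids': [0, 1, 2]}
```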
We're ready to insert them into the vector database. You don't need to set up any server at this point, because you already have Milvus Lite covered. With this intuitive API, client.insert, you specify the collection name we created before and the data object, and it inserts three elements, with IDs from 0 to 2. Now we're ready to do semantic search. Let's make up a query, like "Who is Alan Turing?"
As I said before, we need to encode it into a query vector. For convenience, encode_queries accepts a list of queries so that you can run multiple queries at the same time; but for this case, let's do just one, so we have a list with a single query, and we generate a query vector list of size one. Now we're ready to search. Let's see how this search request is composed. First we specify the collection name, the same collection we created and inserted the data into.
The data field accepts a list of query vectors, which here has just one vector, and we need to specify the limit. This is interesting: the limit is equivalent to the K in top-K, meaning, among all of the vectors stored in this collection (in this case, three), how many of the closest vectors do I want to get back. In this case, let's specify two.
We don't want all of the vectors; that wouldn't be interesting. Besides that, you can also specify the output fields: if you don't just want the vectors but also, say, the raw data the vector was converted from, or some labels, you can list them in output_fields so that they are carried along with the vectors in the result. Now let's run this. You can see the result contains a lot of interesting information.
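The search request just described, as a sketch:

```python
# Encode the question with encode_queries, then fetch the top-2 neighbors.
query_vectors = embedding_fn.encode_queries(["Who is Alan Turing?"])

res = client.search(
    collection_name="demo_collection",
    data=query_vectors,                 # a list containing one query vector
    limit=2,                            # the K in top-K
    output_fields=["text", "subject"],  # carried along with each hit
)
print(res)
```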
First, it contains a data object, which is a list of results. Each result specifies the id and the distance. The distance is like the similarity score between the target vector, in this case the vector for the sentence about where Turing was born, and the query vector for "Who is Alan Turing?", and it depends on which index algorithm and distance metric you used. In this case we didn't specify, so we used the default.
But you can define them yourself, for example: my index algorithm is HNSW, and my vector distance is defined by cosine similarity or IP (inner product) similarity. Depending on which one you define, the distance is calculated in different ways. In this case, the distance actually describes how similar the two vectors are.
With other distance metrics, though, the distance could literally mean how far apart the two vectors are, which affects whether results are sorted ascending or descending. Either way, the results are ranked by the semantic similarity between query and document. In this case, the most semantically close one is the sentence about where Turing was born, and the second is the one about Alan Turing pioneering the concept of AI. And because we specified the output fields to be text and subject, in addition to the vector itself it also outputs the text and subject in the result.
Okay, let me see if there are any questions from the audience. Yes, there's one about parameterizing the query. I don't know if you've seen it in the chat; it's in the chat directly from Alice. I can read it if you want.
It's asking if you can parameterize the query vectors, and also if it's possible to iterate over each record, similar to SQL Server or MySQL, to get the answers. It's a follow-up to the question above. I see. So the result is actually just a list of entities stored in the database.
If you want to iterate through the result, you can just write a for loop over it. But I wonder if the question is about something else: say I want to search for many items and paginate, so that I can fetch the first 100 and then another 100. Is that the question? They just added a comment: no, they want to iterate over the query vectors, so search for a list of items, or a database of items.
Ah, I see, and maybe put them into a JSON doc. Okay, so you have two ways to do multiple queries. One way is that you specify a list with one single query at a time and run the search repeatedly.
Say you want to run 10 queries; you could just do 10 searches, which is not that efficient. Alternatively, you can specify the data to be a list of 10 query vectors, and the 10 queries will be performed at the same time in one single API call. And there's another question as well: why does a greater distance indicate a better match in this notebook? Ah, that's a good question. That depends on the distance metric.
If I remember correctly, the distance here is defined as cosine similarity, and for cosine similarity, the greater it is, the better. But for other metrics, and I may be misremembering here, I think for IP the shorter the distance, the better. So it really depends on which distance metric is used. Thank you. And we have a follow-up to the previous question as well.
I don't know if you read it. It's asking: is there a 10-item limit? Do we have to put things into batches to iterate if we have more than 10, basically iterating over multiples of ten? Yeah, there is a limit. I'm not sure if it's 10; it's either 10 or 100.
But there is a limit, and with it, if you have, say, a thousand queries, I'm not sure you can fit a thousand queries into one single search request. So yes, you'd need to iterate through, say, 100 at a time, cutting them into 10 batches. Hopefully that answers the question. Thank you.
Those were the only questions so far. Seems like that answered it. Yes. Okay, thanks a lot for the questions.
Let's move on to a more interesting topic. So, as I mentioned before, we can define different subjects for different documents, and then during search we may want only the results related to one single subject rather than everything. Okay, let's insert more documents with a different subject. I made up three more documents, which in this case are related to biology, and also somewhat related to machine learning and AI.
For the IDs, we want to increment them, because we don't want different content under the same duplicate ID. So after we encode the documents into vectors, we compose the data object here, incrementing the IDs to skip the ones we used before (0, 1, 2). We also specify the vector field, text field, and subject, to match the schema of the vectors we inserted earlier. For each whole record, technically we call it an entity; the data is a list of entities, just for the terminology.
Now we can insert the data again; at this point we have six entities in the vector database. And we can do a search by specifying a filter. The only different part is the filter: we use the same collection and encode the queries the same way.
You can specify multiple queries here; I can show you that in a bit. Again, we specify a limit of two, because we just want the two documents closest to "tell me AI related information" within the subject of biology, and we only return text and subject, since I'm not interested in the other fields.
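A sketch of the filtered search just described: the same search call, plus a boolean filter expression on the subject label:

```python
query_vectors = embedding_fn.encode_queries(["tell me AI related information"])

res = client.search(
    collection_name="demo_collection",
    data=query_vectors,
    filter='subject == "biology"',      # only search within this subject
    limit=2,
    output_fields=["text", "subject"],
)
print(res)
```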
Let's run this and see what happens. Now we retrieved two different results. One is something about biology and AI; the other is also about biology. As you can see, first of all, everything comes back with the subject biology rather than, say, the history subject.
Secondly, both of the top two results are related to AI. I intentionally included one, about DDR1 being involved in cancers and fibrosis, that is not related to AI, and it didn't show up in the semantic search result. If I remove the filter, I believe the result will be different; let's see. Oh, I think I made a mistake here.
I shouldn't have inserted the data again, at least not with the same IDs; otherwise there will be duplicate data. But it's okay, you can still see the point: now the result shows content related to both the history subject and the biology subject, rather than the clean filtered result from before.
Okay, I think that's all for basic semantic search. Very quickly: if you want to do some other operations, say you don't want to search, you just want to get something from the database, like "give me everything about history," you can use a different function called query, which returns the results matching a specific filter without any consideration of vector similarity. You can also directly retrieve entities by their IDs.
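Hedged sketches of the query and get calls described here, plus the delete and drop calls covered just below:

```python
# Scalar query: filter only, no vector similarity involved.
res = client.query(
    collection_name="demo_collection",
    filter='subject == "history"',
    output_fields=["text", "subject"],
)

# Direct retrieval by primary key.
res = client.get(collection_name="demo_collection", ids=[0, 1])

# Delete by filter (deleting by ids=... also works).
client.delete(collection_name="demo_collection", filter='subject == "biology"')

# Drop the whole collection when you no longer need it.
client.drop_collection(collection_name="demo_collection")
```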
After all this, if you want to delete some data, for example everything about biology, you can use the delete function, specifying the collection and a filter. And after you terminate the program, the next time you want to load the existing data back into memory, you just instantiate the client with the same file name, and all the existing data will be loaded. You can also drop the collection if you no longer want it. So, after you finish developing with Milvus Lite and you're happy with it, maybe your user base grows from, say, 10,000 to 500 million users.
Good for you! In that case, you can move to a more scalable version: Milvus deployed on Docker or Kubernetes. At that scale you probably need Kubernetes, a gigantic cluster for vector search. You just need to substitute the URI with your network endpoint and token, or, if you have a self-deployed server, specify your username and password, and boom, all the other code works against the Milvus server.
Okay, that's everything I wanted to show in this webinar. I think we probably have some more time for questions. Yes, thank you very much. People are quite impressed.
They're also really happy with the filtering and the other features we have. Do people have some questions? Maybe people are just happy? Yeah, thanks for the support. Maybe lastly, let me quickly go over the documentation we have. We recently added many integration documents for our favorite, lovely partners, including well-known names like LangChain and LlamaIndex, and you can use Milvus Lite with them too.
For LangChain, you just need to specify the local file name in the LangChain vector store API, and then you have a vector store backed by Milvus Lite. You can also use it with many model providers, from OpenAI to Voyage AI, and we show examples of how to combine them to build your semantic search, RAG, or image search. Also, very recently I noticed a popular project called DSPy; we have an integration with DSPy too, where you can use the Milvus abstraction called MilvusRM, which works with both Milvus Lite and the Milvus server on Docker and Kubernetes.
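As an aside, here is a hedged sketch of what pointing LangChain's Milvus vector store at a Milvus Lite file can look like. It assumes the langchain-milvus and langchain-openai packages; the details may differ slightly from the docs shown in the webinar:

```python
from langchain_milvus import Milvus
from langchain_openai import OpenAIEmbeddings

# A local file URI in connection_args means the store is backed by Milvus Lite.
vector_store = Milvus(
    embedding_function=OpenAIEmbeddings(),
    connection_args={"uri": "./milvus_demo.db"},
)

vector_store.add_texts(["Milvus Lite runs as a local Python library."])
docs = vector_store.similarity_search("What is Milvus Lite?", k=1)
```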
Oh, thank you. Yes, I'm just answering questions directly from people. They're more about the meetups. So I think we stream the meetups in San Francisco, right? They're directly streamed.
So yes, otherwise they're recorded, and it's always on YouTube. Yes, for the meetups we do have recordings and streaming. If you want to check out the content of past meetups, please visit our YouTube channel; we have all the great recordings there. For the upcoming meetup,
you can RSVP here, and it will have live streaming as well. Yeah. And yes, the recording will be shared, and the code as well.
I mean, the code is already available directly in the documentation, so if you want the code, you can just go there as well. But I think that's it. Thank you very much for the webinar and for showing Milvus Lite. So yeah, everyone, check out the website milvus.io, also check out GitHub directly, and go pip install pymilvus.
Go crazy on your machine, break your laptop even. Yes, please try it out and let us know how it works for you. And if you can give us a star on GitHub, that would be a lovely addition.
We've also announced the launch of Milvus Lite on Product Hunt, and today, let me refresh it, it's ranking in 17th place. If you can give it an upvote there (I included a link in the chat), that would be awesome. Thank you so much for your support. Thank you very much. Thank you, everyone.
Have a lovely morning, afternoon, or evening, wherever you are, and see you at the next one. Bye.
Meet the Speaker
Jiang Chen
Head of Ecosystem and Developer Relations
Jiang is currently Head of Ecosystem and Developer Relations at Zilliz. He has years of experience in data infrastructure and cloud security. Before joining Zilliz, he served as a tech lead and product manager at Google, where he led the development of web-scale semantic understanding and search indexing that powers innovative search products such as short-video search. He has extensive industry experience handling massive unstructured data and multimedia content retrieval. He has also worked on cloud authorization systems and research into data privacy technologies. Jiang holds a Master's degree in Computer Science from the University of Michigan.