Today I'm pleased to introduce the session, which is "AI breaks privacy, and how PrivateGPT fixes it." We have Daniel with us today, who is the co-founder of Zylon, a truly private AI workspace that offers a self-contained AI solution running within a company's infrastructure, ensuring data ownership and compliance. In previous phases of his career he worked mostly on consumer-oriented products, spanning mobile devices, IoT, robotics, and 3D printing. Later on he assumed a leadership role in the Amazon Business Delivery Experience organization, where he successfully formed and guided software engineering teams, steering the creation and delivery of impactful products on a global scale. He also has an impressive academic background.
He has a bachelor's and master's degree in telecommunication engineering, as well as a PhD in proactive context-aware recommender systems, which allowed him to contribute to the ML domain with more than 20 research papers. Welcome Daniel, the stage is yours now. Thank you, Stefan. Thank you to the team for having me here and giving me the chance to talk to your whole community. And thank you for the introduction.
So yeah, let's jump directly into the slides I have prepared for today. Let me check that everything works well. You can see my screen properly, right? Perfect. Cool.
So, as Stefan mentioned, I'm Daniel, co-founder and co-CEO of Zylon and of PrivateGPT. Today we are going to talk about how AI breaks privacy, which might surprise you a little, taking into account that we are the owners of an open source AI project and of a product that is AI-native as well. You might be wondering why I am claiming this, but it's actually the reason why Zylon and PrivateGPT were created. So let's go into that.
I'm not going to go over my background; Stefan already covered it pretty well. It's pretty diverse: I have been a product engineer all my life, but I also have a really technical background. So we can discuss a lot of things in the question part.
So what is PrivateGPT? Let's start with the project that brings me here today. PrivateGPT is an open source project that allows you to have a local, on-premise AI: basically, to control everything and run an LLM offline. By doing that, you can leverage your private knowledge space, your documents, and talk to your documents in a completely private way, because you can execute PrivateGPT on your own laptop. It's completely private; no data leak is possible. The project was launched in May 2023.
Since then it has been on GitHub's worldwide trending list several times. As of today it's basically the first open source project for privacy in AI, a reference: more than 53K GitHub stars, more than 7K forks, and a really healthy community, which, by the way, I really appreciate. The work you do on a weekly basis, guys, keeps the project super alive, so thank you for that. And the Discord is always a really good place to be. So this is PrivateGPT, and PrivateGPT was the seed for our commercial product, the one we have built on top of the PrivateGPT technology, which is Zylon.
Zylon is a truly private AI workspace. We will understand today why I use the word "truly," but, in a nutshell, Zylon is an all-in-one AI workspace that leverages the company or organization's knowledge space to allow knowledge workers to write, synthesize data, and create and co-create with partners, along with the LLM and Zylon's intelligent agents. It's fully prepared for an AI experience around projects and collaboration, so it's designed for the professional world, with role and permission management and everything like that.
All of that is wrapped in a really UX-friendly interface, and it's entirely deployable: all the layers, from the LLM, the embedding model, and the vector database to the backend and frontend. It's a software package that can be entirely deployed within the company infrastructure, making it 100% private. And again, this is something we will understand better today, because when we talk about privacy, especially in the new AI era, there is an interesting thing: if you go out and read about privacy in AI, basically everyone claims to be private. But you need to dig a little deeper to understand what is below the surface when they say they are private, because there are different levels of privacy, different granularities of privacy, and this is what we are going to review today.
I want to give you a complete overview of the different architectures, the different ways of building AI applications for the B2B environment, so that the next time you read "we are fully private," you will understand which level of privacy they are actually covering. So let's jump directly into the different ways of understanding why this is a problem in AI. The problem that all these products, including Zylon, want to solve is that AI breaks privacy, and it breaks privacy in a simple way: at the end of the day, when you provide your data to AI platforms, you are basically opening your heart to those platforms. And we have read several news stories about, for example, OpenAI using the data we put into ChatGPT to train their models.
And later on, people doing reverse engineering to extract such data. So the training data can end up being accessed through certain techniques, and if we provide our data to this kind of platform, we are putting it at risk. Basically, you are losing data sovereignty. When you think about companies that operate under strong regulation over the data they control, like healthcare, public government, defense and military, finance, or legal, compliance is an important factor.
And when you, as an engineer or a product person at your company, want to integrate AI services, you need to think twice about the legal implications of doing so, because you might be breaking your compliance by using some tools. So it's important to understand the level of privacy you need for your industry; by doing so, it will be easier to pick the proper solution. Is this something we should be concerned about as of today? Yes. One year ago, the AI environment was completely different.
The adoption of AI is brutal as of today. Here are some numbers from a study by Microsoft and LinkedIn, shared a couple of months ago: 75% of knowledge workers are already using AI. And you could be thinking, OK, so 75% of these workers have already adopted some solution that is compliant with their company. Sadly, that's not the reality. We encounter this every day.
When we talk to our clients at Zylon, what the majority of these companies have done, because of the privacy risk, is ban the usage of some of these tools, especially the ones that are completely free or open, like, for example, ChatGPT for B2C. And the problem is that when you ban that kind of tool, employees usually create their own personal accounts behind the curtains and keep using them for company purposes, which makes the problem even worse, because now you have no control at all over how your company data is being used by your employees. Companies are finding this on a daily basis, and it is a real problem.
So let's try to understand what solutions are out there, how you can assess their level of privacy, and which solution is most suitable for your environment. Because again, as I said before, everyone claims that their product is private, so let's dive a little deeper to understand what privacy level they actually comply with. The first one is what I call compliance SaaS, software as a service: products that have all their infrastructure in a cloud architecture. That cloud architecture might belong to them or to others, in the sense that they can run their core business in a cloud managed by them, but then use different third parties to implement different features of their AI application.
The most interesting thing about these kinds of products is that they are usually SOC 2, GDPR, or HIPAA compliant, but this is not enough, because at the end of the day these certifications usually refer to how you manage data in a procedural way. When you are using third parties, you lose part of your control, because you might be SOC 2 compliant, but your third parties might not be. And even if your third parties are SOC 2 or GDPR compliant, you cannot control them: if a third party has a data breach or a security issue, you cannot fix it, because you are not the owner of that third party. So your product is at risk because of others.
And this is what, at the end of the day, makes you lose control over the data you are giving to these applications. As you can see in the graphic, your private data is in red. You are providing your data in raw form to the AI SaaS provider, and this provider is copying your data and distributing it to other third parties for different purposes: embeddings, ingestion, the vector database, or even inference. All of this generates copies of your data, and you lose control of that data.
In addition to this, you need to trust all these third parties, and you need to trust the company providing the AI service: that they are not using your data, for example, to train their own models or systems. Of course, at the end of the day, this is a matter of trust. You can, for sure, trust them, and if you're OK with that, that's fine; I'm not saying these are bad options. I'm just saying that if you are tied to a specific regulation or compliance regime because of the data you're using (again: healthcare, legal, defense, public government), you might not be able to use this kind of solution; it might even be illegal to do so, because you need full control of your data.
So this is a privacy level you need to be aware of when you use these kinds of services, and you can think of several of them in the industry. Let's move to the next one, which is an interesting evolution of this SaaS, cloud-based approach: adding a data anonymization layer to the architecture. Basically, when you provide your data to a service, if you put a man in the middle that is in charge of anonymizing your data, that can solve part of the problem we were discussing before. For example, think about a legal contract that you cannot disclose or share with others because of regulations; maybe what you can do is anonymize the data within the document, the legal contract, by replacing names, numbers, and dates with pseudonyms or placeholders.
By doing that, you are anonymizing the document. And you can see the change here: your data in red, which is at risk when you share it, and your data in blue, which is anonymized. From that point on, you follow the same pattern you saw in the previous architecture: your data is going to be copied all around across these services, but it will be anonymized. So these third-party services are probably not going to be able to use your data to train their systems, because your data is anonymized; they lose that kind of contextual information.
But, and here is the problem, you are achieving a greater level of privacy at some cost. First of all, you need to run this anonymization on-premise, within your company. That's not a big problem, but it is something you need to consider. You also need to consider the loss of quality, which can happen because, by anonymizing the data, you might have less rich contextual data to provide to the LLM.
And because of that, the LLM might produce worse results when executing inferences. So it's important to understand the trade-off you are making to achieve more privacy through anonymization: you will be more protected, but the results generated by the LLM might be worse, or not as rich as when you provide the full context. This is a trade-off you need to balance and understand when you go for this kind of solution. And then there is an obvious risk: if the anonymization method, the anonymization process, somehow fails, your data is completely exposed.
And we go back to the previous model, with your data being copied by different third parties. So keep this trade-off in mind, because at the end of the day, the level of privacy you want to achieve is a matter of understanding the trade-off you want to go for. So let's go to the next one: local execution. With local execution, what you achieve is a solution in which the whole system runs on a controlled, isolated device.
Let's use PrivateGPT as the example; we will talk more about PrivateGPT later. If you have a personal device, like a laptop or a personal server at home that is only used by you, you can run all the AI-related logic, all the inference, and build all the layers of a ChatGPT-like application on this personal device. Nothing is going to go outside; it's going to be 100% private. You can actually even disconnect it from the internet, because an LLM doesn't need an internet connection
once it has been set up. This can be an interesting solution, but, as you might already be thinking, it is not going to scale, because you need a really powerful device to have a really good experience, or to run AI at scale. It's a solution that gives you the maximum level of privacy, but sacrifices all the scalability and collaboration between peers in an environment, because at the end of the day that AI system runs on a single machine. Again, you need to evaluate the trade-offs between going in one direction or the other; in this case it's full isolation, full privacy.
As you can imagine, in this case your data is never copied anywhere else, because it remains on your device: your server, your laptop, or your smartphone. Let's move to the next one: in-house development. And as you might be thinking, and actually here I can give you a lot of real examples, plenty of people have leveraged PrivateGPT to create their own in-house development.
And this is completely valid, because at the end of the day, if you create from scratch all the technology needed to have an AI system within your company, taking care of the entire end-to-end development, that gives you full privacy, but at a cost that might not be feasible for the majority of companies. You will need to invest a significant amount of time and money, and you will need a great team focused on building all the required technology and all the required features around it, because it's not just building the AI layers: you will also need a backend and a frontend, and you will need to develop your business logic around all the AI use cases, all the pipelines. It is not just setting up an LLM to run simple inferences; you will need to develop a whole software product.
And the implication of that is a really, really expensive project which, in my experience, for those who want a system at scale, is only feasible for really large enterprises, because it requires a huge amount of time, money, and focus from the company. The next option is literally the previous one, but instead of doing it yourself, you pick a product that does it for you. And this is literally what we have done at Zylon: it's the same as developing it yourself, but the whole platform is provided by someone else.
This is what we do at Zylon, and we are the only ones packaging all these layers into a single package that can be deployed in the infrastructure of our clients, a company or an organization. So at the end of the day, as you can see, your private data is always protected. It's 100% private; there is no possibility of a data leak, because your data remains within your infrastructure, and because the whole system, in this case Zylon or your AI product, lives and runs within your infrastructure.
So you can customize your isolation level: you can deploy the system within your private cloud, or you can deploy it on bare metal in a data center, and you can even disconnect the data center from the internet, the same as we saw in the PrivateGPT case. As you can see, this is kind of a PrivateGPT at real scale, at enterprise scale. The only drawback of this approach (because, as I said, it gives you full privacy with all the features at hand) is that you will need your own infrastructure. You will need to manage your own private cloud or data center, because the only way to ensure 100% privacy is to run it yourself.
That way, the company providing the service, for example Zylon, won't even have access to your data. It will be completely managed by you, and by doing that you ensure complete, 100% privacy. So these are the different levels we can find when going for privacy in B2B applications. From this point on, let's review what you can do with PrivateGPT, and then we will explore different setups to reach those different levels I just showed you.
And I will, for sure, use real examples of the components and models you will need to connect to PrivateGPT. So let's start with PrivateGPT, going a little deeper into what it is, not just from a functional, user-facing point of view, but from an architectural point of view. PrivateGPT, as I mentioned before, is a whole framework for developing context-aware applications with privacy in mind. The way PrivateGPT works is that we have leveraged a lot of open source components from the community to create a complete framework that allows you to abstract yourself from the difficulties related to developing AI projects. PrivateGPT exposes a set of APIs for RAG primitives and for recipes; I will explain later what those are.
We have built all of that on top of LlamaIndex, which is another open source project from the community. We also use a lot of pieces from the open source and enterprise ecosystem that you can leverage. For example, for the inference part you can use Ollama, or llama.cpp, or even Triton from NVIDIA; that is up to you, because at the end of the day PrivateGPT is fully configurable. Just by changing a YAML configuration file, you can decide which inference server to use. Using the same method, you can leverage different large language models: you could be using Llama, or you could be using Mistral.
So it's really up to you. Actually, if you would like to use a proprietary model, like Claude or OpenAI, you could do so, because PrivateGPT exposes a set of APIs based on the standard proposed by OpenAI more than a year ago. It's as easy as disconnecting from your current OpenAI account, pointing at PrivateGPT, and setting up a different LLM below, and it will work straight away. Of course, you can select your own embedding model, and of course you can select your vector database. One of the vector databases that is fully integrated with PrivateGPT is Milvus, so that's something you can just use out of the box.
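To give a feel for that configurability, here is a small sketch of the kind of settings file involved. The keys below are illustrative, modeled on the shape of PrivateGPT's settings.yaml; the exact schema varies between versions, so check the project docs before using it:

```yaml
# Illustrative sketch of a PrivateGPT-style settings file. Key names follow
# the shape of the project's settings.yaml but may differ between versions;
# treat this as a shape, not a copy-paste config.
llm:
  mode: ollama            # or: llamacpp, openai, sagemaker, ...
embedding:
  mode: ollama
ollama:
  llm_model: mistral      # any model your inference server can serve
  embedding_model: nomic-embed-text
  api_base: http://localhost:11434
vectorstore:
  database: milvus        # the Milvus integration mentioned above
```

Swapping the inference server, the model, or the vector database is then a matter of editing these values rather than changing code.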
So this is the general architecture of PrivateGPT. Let's go into a bit more detail: how to build on top of PrivateGPT. As I mentioned, we have our REST API, which provides a couple of different sets of APIs.
One is what we call the primitives API. These are the building blocks you will need to create a context-aware app, following, as I said before, and extending the OpenAI API standard. We have separated these APIs into levels. The high-level API is for those who don't want to go into the more obscure details of how to deal with LLMs, embeddings, and so on. The high-level API is basically a ready-to-use RAG pipeline that you can use to ingest your own documents; it takes care of parsing, splitting, creating the metadata for the different chunks, and handling the embedding and storage.
And then you also have an API for chat and completions, which gives you the chance to create a full chat experience for talking to your documents. It takes care of the contextual retrieval, the prompt engineering, and all the response generation. Pretty easy: just a simple REST API that allows you to do that.
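As a rough illustration of that high-level flow over HTTP, the sketch below ingests a document and then asks a question against it. The endpoint paths, the default port 8001, and the use_context flag are my reading of the PrivateGPT docs; treat them as assumptions and verify against the API reference of the version you deploy:

```python
# Hedged sketch of PrivateGPT's high-level API over plain HTTP.
# Paths, port, and fields are assumptions based on the project docs.
import requests

BASE = "http://localhost:8001"

# 1) Ingest a document: parsing, chunking, embedding, and storage
#    all happen server-side.
with open("contract.pdf", "rb") as f:
    r = requests.post(f"{BASE}/v1/ingest/file", files={"file": f})
    r.raise_for_status()

# 2) Ask a question grounded in the ingested documents, using the
#    OpenAI-style chat schema that the API extends.
r = requests.post(
    f"{BASE}/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Summarize the termination clause."}],
        "use_context": True,  # retrieve relevant chunks before answering
    },
)
print(r.json()["choices"][0]["message"]["content"])
```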
If you want to go into more detail and customize which components you integrate into PrivateGPT, you can create your own RAG pipeline, and that is the low-level API. It exposes capabilities to play around with the embedding generation: you can create your own logic for embedding and for ingesting documents, and also your own logic for the contextual retrieval of chunks. So you can decide your strategy both for creating the chunks and for populating them with metadata or other attributes you want to play around with. And most recently, something we released a month ago, there is the recipes API. The recipes API is an out-of-the-box set of AI-native use cases that lets you abstract completely, at a higher level, from the specific features I just mentioned one by one.
Instead, you go for a complete use case. In this case, we released the summarize use case a few weeks ago. It's basically a use case that allows you to summarize one document, or a set of documents, into a single piece of text. It takes care of generating the different nodes in a tree-like fashion, moving the context window along, to work out how to summarize the final document or set of documents. Because, as you may imagine, depending on the LLM you are using, your context window is not going to be big enough to just put the whole document there.
Imagine you want to summarize a big book, or several books: you might need techniques that create partial summaries and aggregate them until you generate the final summary. This is what we do with recipes: we build the whole use case, and by using the summarize API you get this out of the box.
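For illustration, a call to the summarize recipe might look like the sketch below. The /v1/summarize path and the body fields are my reading of the recipes docs at the time of the talk, so double-check them against the current API reference:

```python
# Hedged sketch of the summarize recipe over HTTP. Path and field names
# are assumptions from the docs; verify before relying on them.
import requests

with open("big_book.txt", encoding="utf-8") as f:
    text = f.read()

r = requests.post(
    "http://localhost:8001/v1/summarize",
    json={"text": text, "stream": False},
)
r.raise_for_status()
print(r.json())  # the aggregated, final summary
```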
So this is the API. And you might be thinking: do I really need to go through raw HTTP methods to work with this? No. We have other options that are even easier for the community to use. First, we have a Python SDK, which is something the ML community finds pretty convenient. We released the SDK at the beginning of this year and published it as a PyPI library, so you can install it and use it, and it maps the same REST API I mentioned before.
So it's going to be super simple and straightforward to use. And if you want to move in a different direction, for example to develop your own web application, we have our TypeScript SDK, with exactly the same philosophy as the Python SDK. It's simple to use, fully mapping the REST API; we have published it as an npm library that you can install, with both a web component and a Node.js component, and it has a React adapter built in.
With this, as you can imagine, you can create your own application completely from scratch by leveraging the REST API; but if you already have an application in Python or in any web technology, you can use the TypeScript SDK or the Python SDK, and it's going to accelerate your development time a lot. For those who don't want to play around with development, or who just want to test things in a really fast, really local way without developing anything, who just want their own personal LLM at hand on their machine, I would totally recommend using the Gradio UI client. It's available out of the box when you install and deploy PrivateGPT on your machine, and it has the look and feel you can see here: a chat-based interface that allows you to select which LLM and which inference server you want to run.
In this case, for example, we are using Ollama, and the model is Mistral. Then you can query; you can search files with semantic search. You can also chat directly against the knowledge base the LLM was trained with, without incorporating any contextual information from your documents. But of course, the interesting thing here is to incorporate your documents. You also have a way of managing the ingested files, removing them, and managing your library.
And this is pretty useful for those who don't want to complicate things too much with the technical details. So this is, again, an out-of-the-box way of using PrivateGPT on your computer. OK, so with this you have an overview of how PrivateGPT works and what features PrivateGPT offers from a developer or development perspective. So now what I want to do is mix what I just told you with the different levels of privacy I mentioned before, and see how PrivateGPT can cover some of them.
The extreme case of PrivateGPT covering the most private level is Zylon, because at Zylon we have literally spent the last year developing a whole product on top of PrivateGPT. But we will see how to evolve this little by little through a couple of cases. So let's go first with the local execution I mentioned before. Local execution, as I said, is the most private way of interacting with an LLM, because all your data remains on your computer, or on your server if it's a machine you have at home, and it's up to you whether to even connect it to the internet. The moment you install everything we see in the picture, you have complete control over the data in your local setup.
You could even disconnect it from the internet. So what's a common architecture that we recommend from PrivateGPT for exploring this kind of local execution? Our recommendation would be to use Ollama as the inference server. Ollama simplifies a lot of things: how to configure your GPU, how to integrate the LLM. Ollama already has pre-built integrations with different LLMs, for example Mistral or Llama 3.
So you are going to have a lot of options there, and it's going to be pretty easy; it has a desktop installer, which is super easy to use. That would be my recommendation for those wanting to just try a local execution and leverage the power of LLMs in a completely private way. Then, for the model, as of today I would go for Llama 3.1
8B, but it's up to you, guys. Just yesterday or the day before, Mistral released a 12-billion-parameter multimodal model that works with both text and images, so go for whatever you prefer. Then, for the vector database, you could use, for example, Milvus Lite, which is a really lightweight way of having your own vector database without complicating the installation too much; it's pretty straightforward to install, and the documentation is pretty neat. And for the embedding model, you can select from a really big pool of embedding models; we don't have a single recommendation in this case, it's up to you, guys. Again, you can go to Hugging Face and check the embedding models.
It will also depend on the languages you want to support. If you want to support English, or Spanish, or French, or Arabic, you have a lot of interesting considerations for deciding the purpose of your AI system, taking into account the embedding model you select, because at the end of the day the embedding model is responsible for transforming your data into the embeddings that the LLM is going to play with later on. On top of all that, PrivateGPT wraps everything together, and, as I said before, the Gradio UI client is the easiest way to go. But if you want to play around with your own Python client, or web client, that's up to you.
It's super easy; you can even manage it from the console. As for the infrastructure needed, because this is a common question we receive: you can do this with a commercial laptop, but you need to understand that you need at least a minimum amount of power, especially regarding the GPU, and also some RAM, to put the models to work. Our recommendation, and actually the one I'm using, is a MacBook Pro M3. I know it's not a cheap laptop, but it's not a super expensive one either; with it, everything works pretty well, with good quality and good response time.
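Before wiring PrivateGPT in, you can sanity-check the local stack with Ollama's own Python client. A minimal sketch, assuming the Ollama server is running on your machine and you have already pulled a model (for example with `ollama pull llama3.1`):

```python
# Minimal local sanity check using the official Ollama Python client
# (pip install ollama). Assumes the Ollama server is running locally
# and the model has been pulled, e.g. `ollama pull llama3.1`.
import ollama

reply = ollama.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": "In one sentence: what is RAG?"}],
)
# Everything ran on this machine; no data left it.
print(reply["message"]["content"])
```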
Now, if you move to a different execution, let's say this time a cloud execution, because you want your PrivateGPT instance to be accessible from anywhere, I would recommend a setup like this. Basically, you will need a cloud instance on which you deploy the same services I mentioned before.
For sure, you can split this across different cloud instances, but the simplest architecture is to have Ollama and Milvus on one instance. In this case I would go for Milvus standalone, which already ships as a Docker image, so it's going to be pretty easy to install on your instance if you have Docker there, and we totally recommend having Docker there. Actually, we have our own PrivateGPT Docker image, so you can put that on this cloud instance as well.
And also, for the sake of exploring other ways of using PrivateGPT, you could have a Node.js server hosting a web client built with the TypeScript web SDK, to have PrivateGPT in a website flavor. By doing that, you will be able to access it at the same time from different devices, and remotely from your laptop, your smartphone, or whatever you have. And the interesting thing here is that, again, as I said before, you can break this into pieces: you can put the Node.js server and the web client on a different server, just to serve all the web parts, and keep the rest of the more AI-related components on a different instance. But for the sake of simplicity, let's keep it this way.
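As a very rough sketch of that single-instance layout, a Compose file could look like the following. The PrivateGPT and web client image names are placeholders (build them from the repo's Dockerfiles or use whatever image you publish), and note that Milvus standalone normally has additional dependencies (etcd, MinIO) that its official Compose file covers:

```yaml
# Hypothetical docker-compose sketch for the single-instance cloud setup.
# "my-org/..." image names are placeholders; Milvus standalone normally
# also needs etcd and MinIO (see its official compose file).
services:
  ollama:
    image: ollama/ollama               # inference server, port 11434
    ports: ["11434:11434"]
  milvus:
    image: milvusdb/milvus             # vector database (standalone mode)
    command: ["milvus", "run", "standalone"]
  privategpt:
    image: my-org/private-gpt:latest   # placeholder: build from the repo
    environment:
      PGPT_PROFILES: docker            # which settings profile to load (illustrative)
    ports: ["8001:8001"]
    depends_on: [ollama, milvus]
  webclient:
    image: my-org/pgpt-web:latest      # placeholder Node.js web client
    ports: ["80:3000"]
    depends_on: [privategpt]
```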
But one thing that could be really interesting in order to scale up (because, at the end, the constraint we discussed for local execution was the power of your laptop or local machine) is to leverage a cloud-based provider of models. And I'm using these specific words because you are not using an API-token-based service from, for example, OpenAI, which, by the way, you could. What I'm proposing here is to deploy your own models in, for example, Amazon SageMaker. And I'm saying Amazon SageMaker, but you have other options; we are used to working with Amazon for obvious reasons: I was there for several years.
In any case, the point is that you can serve the embedding model and the LLM through Amazon SageMaker. By doing that, you are the one controlling everything, so no one is able to spy on your data, because you are serving your own models and you control the cloud infrastructure. So privacy remains; that's the important thing here, remember that. So with your AWS cloud infra, you will have one cloud instance running the services I mentioned before,
and a SageMaker endpoint providing the LLM and the embeddings, connected to PrivateGPT for the whole AI operation. This is a nice setup that will allow you to scale.
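To show what "serving your own model" means in practice, here is a hedged sketch of invoking a self-hosted SageMaker endpoint with boto3. The endpoint name and the JSON payload shape are hypothetical, since they depend on the model container you deploy (PrivateGPT's SageMaker mode wires this up for you through its settings); the point is that the model runs in your own account:

```python
# Hedged sketch: calling a self-hosted LLM endpoint on Amazon SageMaker.
# The endpoint name and payload schema are hypothetical; they depend on
# the model container you deployed. What matters is that the model runs
# in YOUR account, so prompts never leave infrastructure you control.
import json
import boto3

runtime = boto3.client("sagemaker-runtime", region_name="eu-west-1")

response = runtime.invoke_endpoint(
    EndpointName="my-private-llm",  # placeholder endpoint name
    ContentType="application/json",
    Body=json.dumps({"inputs": "Classify this support ticket: ..."}),
)
print(json.loads(response["Body"].read()))
```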
And of course, if you ask me, "what would be the next step, Daniel?", the next step is clearly obvious: you need to scale this up. And to do that, you might need to include things like load balancers, and different inference services in order to handle concurrent requests. You will also need to manage pipelines of requests from the users if you really want to scale this up for many users. And that will be up to you. To be honest, I haven't proposed a specific architecture here, because, as you saw (let me go back in the slides), this is a snapshot, a really simplified one, of what we have at Zylon.
At Zylon, we have our GPU management and inference server. On top of that, we have our LLM, our vector database, the ingestion service, and the OCR service for managing image-based documents. On top of that, we have Zylon GPT, which is, at the end of the day, PrivateGPT on steroids: something we have been developing for concurrency, for caching, for real-time execution, and for more complex data pipelines; something that can work at scale for a bigger company or organization. And on top of that, we have all the APIs of the Zylon platform.
And from that point on, it's up to you what kind of product you want to be, because that will be defined by your use cases and by the value proposition you want to provide to your customers. That will determine the business logic living in the backend, and it will define the user experience you build at the UI layer, the customer-facing layer. So, as you can see, this is the way you can scale up from PrivateGPT. And this is literally what we did.
We started with PrivateGPT one year ago, and we scaled up, horizontally and vertically, in complexity, because you need to take a lot of things into account when you want to serve hundreds or thousands of users. So let me close up now; I want to save some time for questions. The main conclusion today is: keep in mind that when someone tells you "I have an AI product that is private," that can mean different things. We have seen today that there are at least five levels of privacy you need to take into account, or that you can try to reach.
And the important thing here is to understand the trade-off between privacy and the required effort or investment you want to put in. Because if you are a company or organization living in a really regulated world, with hard compliance, dealing with sensitive data, it's important for you to be aware of what happens when you go for simple SaaS services that leverage a lot of third parties: your data will be completely out of your control if something happens. So for those kinds of scenarios, it's important to understand the granularity of the privacy you need. Of course, if you are a standalone developer at home who wants to play around with LLMs, or you want to let your colleagues collaborate in a small environment, I would totally recommend going for the cloud execution, the cloud architecture I just presented, because it's easy to build and pretty nice to play around with. Again: be aware of the privacy level and the trade-off, and strike the balance between privacy and the investment and effort it requires.
And with that, I'm closing. Thank you, everyone, for being here. And just a kind reminder about my background: we have two projects in our company. We have PrivateGPT, which is the open source project that probably the majority of you know, and we have Zylon, which is the commercial solution, the on-premise platform, the all-in-one workspace for AI. If you have any interest in both, or in one of them, please reach out to me, either on Discord or by email. It will be a pleasure for me to answer your questions.
So thank you for your time, guys. Thank you very much. Thank you, it was very interesting. Actually, I'll just follow up: we already have one question in the chat from Justin, and it's related to Zylon.
I'll just read it out loud: how does it work? Do you need to send your data to Zylon, or is Zylon acting on your data first in your infrastructure? How does that work? Yeah, that's a really good question. No, you don't need to send anything to Zylon, because at the end of the day Zylon runs within your company, within your private infrastructure. You can think about Zylon as old-style software that you buy in a boxed copy: the moment you install it, it's yours. It's running in your infrastructure.
Actually, we don't want to deal with your data, because that's part of the privacy level we offer. You are the one controlling your data, 100%, and it always stays in your infrastructure. Cool, thank you. And we have another question.
So, it's using OpenAI embeddings, in the case of OpenAI. Does that mean the data is going to go to OpenAI and be stored there in any way? I'm not sure which part this question refers to, but if it refers to Zylon: not at all, we don't use any third parties at all. OK, yes, it's not about Zylon. So if you go for, for example, these setups, no, it's not going to OpenAI.
You don't need OpenAI at all for applying embedding models. In the local execution, for example, everything is on your machine, so you can disconnect it from the internet and it's going to work, because the embedding model is running on your laptop. Same philosophy as, actually, I like to make this reflection: PrivateGPT was the first project able to run the entire infrastructure of a ChatGPT-like application on your laptop, completely private, even without an internet connection. Zylon does the same at the enterprise level.
So you don't need any third parties at all. And if you need anything, it is to download the models, which are open source; that's something you can download and use without relying on any third party. And yeah, there's a follow-up question: so you can set it up locally for any company or organization, and you only need the infra and the services, right? Exactly, yeah.
OK. That goes for both PrivateGPT and Zylon. Of course, PrivateGPT is a more basic product, more oriented to developers building on top of it; Zylon is out of the box, ready to go. Cool, thank you.
And actually, I have a question as well, which is: how does it work for multimodal in general? Does it work locally? How does it work? I mean, the answer would be: is your infrastructure powerful enough to run these kinds of models? If the answer is yes, PrivateGPT will work. Mm-hmm. It's up to you, of course, to modify the interface, the user interface I mean, to support uploading images and understanding the images the model returns to you. And the same goes for Zylon. At Zylon, actually, it depends on the client we interact with: some clients prefer to stay with the small models, because it's a good balance between cost and efficiency.
Some of them prefer to go for bigger models and multimodal models. So it's up to you; at the end of the day it's fully configurable, so it's easy for us to just connect one or the other. OK, cool.
Thank you. And we have another question as well: in PrivateGPT, when it's on cloud infrastructure, can multiple users use it at the same time and have separate chats? That's a really good question. No. If you use the out-of-the-box PrivateGPT, PrivateGPT doesn't understand users or sessions. This is something we haven't provided in PrivateGPT, because it's part of your use case.
It depends on the use case or the scenario you want to cover with PrivateGPT whether you will need users or not, so it's up to you to build the whole sessions and user layer. In Zylon, for sure, we have the whole user layer: we have roles, we have permissions, we have the granularity to decide who can access what. That's part of the work we have been doing in parallel to keep evolving PrivateGPT. And by the way, you could do it if you run several PrivateGPT instances in parallel.
That's up to you. They can actually reuse the same inference server; it's just having different PrivateGPT instances from a client-facing point of view. OK. And yeah, someone checked your documentation.
The documentation says that the RAG pipeline is based on LlamaIndex. Why did you choose LlamaIndex, and can the same be done with LangChain? Yeah, this is a really nice question. To be honest, we started with, sorry, with LangChain; we were in close contact with Harrison at that time. But at some point we decided to move to LlamaIndex, because the kinds of use cases we wanted to cover with PrivateGPT and Zylon were easier to cover with LlamaIndex.
And since then we have kept it that way. Actually, we collaborate a lot with the team from LlamaIndex. It's something we decided a year ago, or maybe less than a year ago: we started with LangChain, and we moved to LlamaIndex. It was a design decision driven by the use cases we wanted to cover.
By that time (I'm not talking about today, but by that time), LlamaIndex was providing us better support for the use cases we wanted to cover, and that was the decision; no specific reason beyond that. Oh, thank you. And I have a follow-up question to this one.
Have you tried out Llama Agents? Not yet. OK. This is something we are actually discussing, along with LlamaIndex workflows. The workflows are another thing we are already using internally, but not the agents for now. OK.
But it's something that is on our backlog. OK. Cool. Cool. OK.
Do we have any more questions? Otherwise, I have a follow-up one, which is: you have more than 50K stars. Mm-hmm. How do people contribute? How can they contribute to PrivateGPT if they want to? I mean, we have a roadmap published on the GitHub page, so you can go there and pick one of the different project lines we have, or you can propose a new one. We are pretty flexible.
I can tell you that over the last quarter we merged more than a dozen pull requests from people outside our team. We are still the main contributors, which I guess is normal, but we are basically collaborating every day. Yesterday, for example, I talked to my frontend engineer, who had been supporting a guy asking questions about how to use Remix to deploy PrivateGPT.
And he provided him a full POC to start from. So we are pretty connected. I would say: either go to the GitHub page and pick something that is there, or go to the Discord channel and say, "hey, I have this idea, I want to work on this," and we can give you advice, or maybe tell you, "hey, we are already working on that."
"Let's collaborate and do it together." We have a pretty healthy community. Actually, that's exactly what we did with Milvus: the Milvus folks came to us and said, "hey, we want to integrate Milvus into PrivateGPT," and we collaborated together over a week, and the integration was pretty smooth. Yeah. Cool.
And a follow-up comment: someone ran a track a few months ago and wanted to build with PrivateGPT, and apparently you have really good information and documentation, so kudos for that. It's really hard. Thank you.
As someone who works in the open source world, it's very hard to have good documentation, so thank you for that; a lot of effort was invested in it. Yeah, it's difficult to keep things up to date. Yeah.
Cool. Any more questions from people? Otherwise we can wrap up. And yes, OK, we have one: "I'm a student at the moment, not a pro at coding, but I'm interested in the idea of AI, and especially locally running AI. Is Zylon or PrivateGPT better for entering the scene as a beginner?" I mean, the obvious answer here is that PrivateGPT should be your choice,
because PrivateGPT is the open source project. As Stefan was saying, you have all the documentation there to play around with different setups: the ones I explained today and the ones you can find in the documentation. Go for that, for sure. We have a lot of students and researchers leveraging PrivateGPT for their own projects. It's a really nice project to play around with because (and I didn't go into much detail on this, but you will see it in the documentation) we have put a lot of software-engineering mindset into developing it.
So we have a lot of components you can customize, and it's really easy to configure: you have different YAML files to configure things in a pretty neat way without actually coding anything. Cool, thank you very much.
Yeah, so I think we'll wrap up here. The recording will be shared, and the slides will be shared as well with the recording, so no worries, everyone, you'll have everything. Thank you again, Daniel, for this amazing presentation.
My pleasure. Really cool and very interesting. My pleasure. Thank you everyone for attending, and see you at the next one. Thank you, guys.
Bye-bye.