ASIMOV: Enterprise RAG at Dialog Axiata PLC
Webinar
About the session
The presentation will delve into the ASIMOV project, a novel initiative that leverages Retrieval-Augmented Generation (RAG) to provide precise, domain-specific assistance to telecommunications engineers and technicians. The session will focus on the unique capabilities of Milvus DB, the chosen vector database for the project, and its advantages over other vector databases.
Topics Covered:
- Introduction to the ASIMOV Project: An overview of the project, its objectives, and the role of RAG and Milvus DB in achieving these goals.
- Why RAG?: A discussion on why RAG was chosen for the project, its advantages over traditional LLMs, and how it improves the efficiency of retrieving domain-specific knowledge.
- Milvus DB vs. Chroma DB and Pinecone: A comparative analysis highlighting the advantages of Milvus DB over Chroma DB and Pinecone, focusing on aspects like performance, developer-friendly features, and on-premises storage.
- Optimization of Similarity Search: Insights into how the reorganization of metadata attributes into distinct columns in Milvus DB enhances the similarity search functionality of the RAG system.
- Implications for Telecommunications Engineering: A discussion on how the features of ASIMOV can solve common challenges in telecommunications engineering, streamline configurations and troubleshooting processes, and serve as a training resource for budding engineers.
Attending this session will give you a deeper understanding of the potential of RAG and Milvus DB in telecommunications engineering. You will learn how to address common challenges in the field and enhance the efficiency of your operations. The session will equip you with the knowledge to make informed decisions about the choice of vector databases and how best to use them for your use cases.
Today I'm pleased to introduce today's session, ASIMOV: Enterprise RAG at Dialog Axiata, Sri Lanka's leading telecommunications provider, and our guest speakers, Dumindu, Aditha, and Nandula. They will talk about a novel initiative that leverages Retrieval-Augmented Generation, usually known as RAG, to provide precise, domain-specific assistance to telecommunications engineers and technicians. Dumindu holds a Bachelor of Science in Electronic and Telecommunication Engineering from General Sir John Kotelawala Defence University, which is quite a mouthful, and a Master's in Big Data Analytics from Robert Gordon University in the UK. He started his career with Dialog in 2014 as an intern and then transitioned to the permanent team in 2016 as a Senior Executive in Product Development. Currently, he's serving as the Lead Engineer in Network Analytics and Automation.
His expertise spans data science, AI and machine learning, big data analytics, cloud engineering, and networking. Aditha, also a graduate of General Sir John Kotelawala Defence University, holds a Bachelor of Science in Computer Science.
He began his journey with Dialog as an intern in 2021 and has since advanced to the role of Senior Executive in Network Analytics and Automation. He's also a specialist in data science and AI and machine learning, with a pronounced proficiency in data engineering. And finally, we have Nandula, an alumnus of General Sir John Kotelawala Defence University as well, where he earned his Bachelor of Science in Electronic and Telecommunication Engineering. He started his career at Dialog as an intern in 2019 and progressed to Senior Executive in Product Development.
He's currently serving as a Senior Data Scientist in Network Analytics and Automation, and his areas of specialization include data science, AI, machine learning, and geographic information systems. Outside of work, he volunteers in public relations and communications roles. Welcome everyone, and the stage is yours.
You can share your screen. Thank you very much, Steven. Let me share my screen. I hope everyone can see it. Yes.
Yeah, so thank you very much. I won't go long into introductions, as Steven has already done that for us. So welcome everyone, thank you for attending this session, and thank you to Zilliz for inviting us to conduct it. I'll get straight into it, starting with a bit about what we hope to cover today.
We'll start off with a brief introduction to who Dialog Axiata is, as well as to the ASIMOV project itself. Then we'll do a deep dive into ASIMOV's architecture and functionality, talk specifically about why we selected Milvus as the vector database of choice for this product, and cover the future roadmap we have planned for it. So, a quick overview of who we are: at Dialog Axiata, we are part of the Axiata Group Berhad.
We are currently Sri Lanka's largest connectivity provider, serving over 17 million customers island-wide, and we provide services in mobile telephony, fixed telephony, digital television, international services, digital finance, and enterprise services. Dialog was incorporated in 1993, so we've had over 30 years of service, and throughout those years we've spearheaded technology firsts in South Asia and Sri Lanka, including South Asia's first 2G, first 3G, first 4G, and first 5G trial network, which is soon to be commercially deployed. We are also Sri Lanka's largest foreign direct investor, with investments totaling 3.25 billion US dollars.
Dialog has also won multiple accolades throughout the years, including six Global Mobile Awards. We've been the telecommunications brand of the year for 13 years running and Sri Lanka's most valuable brand for five years running, and as of the latest Opensignal country report, we have the best overall experience, coverage, and consistency across the board. So that's just a brief introduction to who we are.
And that's a brief introduction to Dialog. Before I go into what ASIMOV is, for the benefit of those of you who are new to the concept of RAG, or Retrieval-Augmented Generation, here's a quick breakdown of what RAG is. The first essential part of RAG is embedding knowledge that exists in the form of text data into an n-dimensional vector space, basically a mathematical representation of that text data. The data is arranged in that vector space such that more closely related pieces of text fall close to each other.
In the next step, we embed an input, which in our case would be a question, with the exact same embedding model we embedded the initial knowledge with. Once that question is also vectorized, we can retrieve the most relevant chunks of information, and those retrieved chunks can then be passed into a generative model, which in our case is a transformer model, a GPT, to generate the output we need. So that's a very brief overview of what RAG, or Retrieval-Augmented Generation, is; a minimal sketch of the flow follows below.
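To make that flow concrete, here is a minimal, self-contained sketch of the embed-retrieve-generate loop just described. It is illustrative only, not the ASIMOV code: the embedding model, the toy knowledge base, and the brute-force cosine search (which a vector database performs at scale) are all assumptions.

```python
# Minimal retrieve-then-generate sketch; illustrative, not the ASIMOV implementation.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model could be used

knowledge = [
    "A radio antenna converts electrical signals into radio waves and back.",
    "5G NR supports both sub-6 GHz and mmWave frequency bands.",
]
doc_vectors = embedder.encode(knowledge)  # embed the knowledge base once, up front

def retrieve(question: str, k: int = 1) -> list[str]:
    """Embed the question with the SAME model, then return the k nearest chunks."""
    q = embedder.encode([question])[0]
    # Cosine similarity against every stored vector; a vector DB does this at scale.
    sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [knowledge[i] for i in np.argsort(sims)[::-1][:k]]

context = retrieve("What does an antenna do?")
# The retrieved chunks are then passed to a generative model (a GPT) as context.
```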
Now, to introduce ASIMOV and our motivations for building this product: we're all from Dialog's Group Technology, which is Dialog's engineering arm, and our engineering staff tackle a very vast area, segregated by various spheres of expertise. The telco engineering domain consists of access network, core network, telecommunication infrastructure, and power and energy, so it's a very wide area. The amount of information and documentation out there, which can be either general information or very product- or system-specific information, is used on a daily basis throughout the course of our staff's work.
What we aim to do with ASIMOV is, first, to provide a centralized and secure knowledge base, so that all the resources and documentation are stored in one secure place, with compartmentalized access: each person gets access to that knowledge based on the sensitivity levels they're cleared for, on a need-to-know basis. Second, we want to reduce the time taken and enhance efficiency in day-to-day tasks, essentially reducing the amount of time staff spend referencing different sources of information so they can find quick and easy answers to the queries that arise in their daily work. That is basically our objective with ASIMOV, and a bit about our intentions and motivations in building it.
Next, to dive into the meat of ASIMOV and go behind the scenes, I would like to invite my colleague to the stage. Okay, thanks for the intro to what ASIMOV is. As was just mentioned, ASIMOV has a lot of functionality and caters to the entire engineering staff at Dialog. In order to deliver that, we need to understand how it works. ASIMOV comprises four main components: the front end, the agent, the vector database, and the foundation model.
They are interconnected by the agent; the agent is the heart of the system. All of these are built on top of very well-known frameworks: the front end is built on Streamlit, the core agent is built on LangChain, and Milvus is our vector database.
We use OpenAI's GPT models as our foundation models. Going a step further into how the whole thing works, the agent comprises the retrieval chain. For the initial flavor of ASIMOV, we used the standard conversational retrieval chain developed by the LangChain community. We built on top of it and added our own tweaks and features, and one thing we did was expand on the chain's filter function.
We used the metadata filtering feature: we added several metadata fields to our Milvus collection, and from those we gave users the ability to filter down to particular types of documents, which improves the answers and responses the system gives back. Another factor we can tune is the K value, which here is the number of chunks retrieved per query. This is critical because when we increase the number of chunks, the responses given back by the LLM become more nuanced and refined; a sketch of both knobs follows below.
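As a rough sketch of those two knobs, the metadata filter and the K value, here is what a conversational retrieval chain over Milvus can look like in LangChain. The collection name, field names (`domain`, `doc_type`), connection details, and deployment name are hypothetical stand-ins, not ASIMOV's actual schema.

```python
# Sketch of metadata filtering plus the K value in a LangChain retrieval chain.
from langchain.chains import ConversationalRetrievalChain
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Milvus
from langchain_openai import AzureChatOpenAI

store = Milvus(
    embedding_function=HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2"),
    collection_name="asimov_docs",  # hypothetical collection name
    connection_args={"host": "localhost", "port": "19530"},
)

retriever = store.as_retriever(
    search_kwargs={
        "k": 6,  # the "K value": how many chunks the LLM sees per question
        # Metadata filter over scalar fields; field names are illustrative.
        "expr": 'domain == "access_network" and doc_type == "technical"',
    }
)

chain = ConversationalRetrievalChain.from_llm(
    llm=AzureChatOpenAI(azure_deployment="gpt-4o"),  # deployment name is a placeholder
    retriever=retriever,
)
```

A call then looks like `chain.invoke({"question": "...", "chat_history": []})`, with the chat history accumulated across turns.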
All of this, though, needs to be done in a certain way, according to certain guidelines, and that is handled through the prompt template. With a fairly simple prompt template (the one shown here is not trivial, more of a medium or average-level prompt), the system will give a reasonably good output.
But if you have a well-defined prompt that gives precise guidelines on how to respond and how to analyze the chunks, the LLM gives a much richer, more refined output. We developed a special flavor of ASIMOV, a custom build we'll explain later, that gave a very refined output. Here you can see that we have incorporated the question, the chat history, and the context. The question is simply what the user asks of the system.
Every time the user asks a question, say, "What is a radio antenna?", that is where the question goes. The chat history is all the previous exchanges within the session that the user has accumulated, so it grows as the user continues chatting with the bot. It can grow quite large, although there are certain limits.
The context is the chunks of data retrieved from the vector database; I'll come back to this later. All of these pieces are fed into the prompt template and handed to the LLM; a minimal template along those lines is sketched below.
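For illustration, a minimal template wiring those three pieces together might look like the following; the wording is a hedged approximation, not the production ASIMOV template.

```python
from langchain.prompts import PromptTemplate

# Illustrative wording only; the real ASIMOV template is more detailed.
template = """You are an assistant for telecommunications engineers.
Answer ONLY from the context below. If the answer is not there, say so.

Chat history:
{chat_history}

Context (retrieved chunks):
{context}

Question: {question}
Answer:"""

prompt = PromptTemplate(
    input_variables=["chat_history", "context", "question"],
    template=template,
)
```

With LangChain's conversational chain, a custom answer prompt like this can typically be supplied via `combine_docs_chain_kwargs={"prompt": prompt}`.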
Now for the brains of the system: the foundation models, or the LLMs. In our pre-release, we went with GPT-3.5, which had a context window of 16,000 tokens. That was good enough for the pre-release, the initial version of ASIMOV, where we had very small chunks of around 1,024 tokens; small, but sufficient. When we switched to GPT-4 for our launch version, with its 128K-token context window, the responses improved, and since GPT-4 is more advanced, the responses it gave out were more well-defined and refined. But we had a problem with GPT-4: responses took a relatively long time to get back to the user, with some prompts taking up to two minutes.
That had an impact on our user experience, so as soon as GPT-4o came out, we ran a trial and switched over. The user experience is really good now: on average we get around 20 seconds per response. Another thing we did was go with the Azure OpenAI Service. The main reason is the privacy clause: Azure ensures that the prompts we send to the OpenAI service are not recorded or used to retrain their models.
This is a big concern for us, because we always want to keep the chunks and documents confidential; most of the documents stored in our vector database are company confidential, sometimes division confidential, so we wanted to keep everything safe. It's also another reason behind our choice of vector database, which we'll go through in depth later.
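For reference, pointing the OpenAI SDK at an Azure OpenAI deployment looks roughly like this; the endpoint, key, API version, and deployment name are placeholders.

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder endpoint
    api_key="<azure-api-key>",
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="gpt-4o",  # the Azure *deployment* name, not the raw model id
    messages=[{"role": "user", "content": "What is a radio antenna?"}],
)
print(response.choices[0].message.content)
```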
Okay. Another factor we had to think about was the embedding model. In general, we went with all-MiniLM, which is one of the generally well-performing models; it has been performing well for a long time.
We used it for the general-purpose documents, and it gave us really good responses across the technical documents, the instruction documents, and even business documents, so it was good enough for us. But we saw that when we changed the embedding model to certain other models, the performance changed: the responses became more refined, because the vector space changed and the model was able to pick up more attributes of the text.
So we wanted the ability to switch between embedding models, and between the vector spaces they produce; that was another thing we really needed. To add to this, we saw that when we increased the size of the chunks and the overlap between chunks, the responses became more refined. We experimented with this a lot and settled on a general norm that limits chunk sizes and overlaps to certain amounts; a chunking sketch follows below.
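A hedged sketch of those two levers, chunk size/overlap and a swappable embedding model. The numbers and model name are illustrative starting points, not ASIMOV's tuned values, and this splitter counts characters rather than tokens.

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings

# Illustrative values; the ~1,024-token chunks mentioned above would need a token-based splitter.
splitter = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=128)
chunks = splitter.split_text(open("technical_manual.txt").read())

# Keeping the embedder behind one variable makes the model swappable per collection.
embedding = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectors = embedding.embed_documents(chunks)
```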
Next, the front end. As I mentioned, it's built on top of Streamlit, which is a very easy framework to work with. We did have to set up workarounds for certain things, like integrating Azure AD into the Streamlit framework, but it worked for us. And the Streamlit front end is not just a chat interface: we gave the chatbot additional functionality, like the filters I mentioned earlier, so users can select domains, subdomains, and so on, and do all sorts of filtering.
That gives the user a lot of functionality; that's the metadata filtering. We also gave users the ability to see the most relevant chunk, the most relevant text for their query, which is a really big plus point for front-end users. On top of that, we gave them the ability to view the sources of the most relevant chunks.
So if they want to go to the source document and review it, with the relevant pages, we give them that functionality as well. Now, the overall process: the user enters a question on the front end, it goes to the LangChain agent, and there, as mentioned earlier, the question is converted into the vector space, the nearest K vectors are retrieved from the vector database, in this case Milvus, and everything is sent along to the Azure OpenAI service. From that, we get the response. That's the general flow for ASIMOV; a toy front-end sketch follows.
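A toy Streamlit front end along those lines might look like the following; the filter values and the `ask_asimov` backend call are hypothetical stand-ins for the real internal pieces.

```python
import streamlit as st

st.title("ASIMOV")  # filter options below are made up, not the real domain list
domain = st.sidebar.selectbox("Domain", ["Access Network", "Core Network", "Power & Energy"])
subdomain = st.sidebar.selectbox("Sub-domain", ["RAN", "Transmission", "Packet Core"])

question = st.chat_input("Ask a question")
if question:
    st.chat_message("user").write(question)
    # ask_asimov() is a hypothetical wrapper around the retrieval chain shown earlier.
    answer, sources = "(answer)", []  # answer, sources = ask_asimov(question, domain, subdomain)
    with st.chat_message("assistant"):
        st.write(answer)
        with st.expander("Most relevant chunks"):  # lets users inspect the retrieved text
            for chunk in sources:
                st.write(chunk)
```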
That said, we are in the process of building a custom chain that gives us more freedom to change the flow a bit, because we have some interesting stuff coming down the pipeline; we'll go through that later on as well. Another thing we wanted to do is give users the ability, the freedom, to upload their own documents and query them. Normally, the technical staff keep receiving updated documents, new technical manuals, 3GPP releases for example, so they always need the ability to load these in and get the latest information. For this, we developed a separate front end where users upload data against specific vendors, domains, and subdomains, and can then query the relevant information from those specific documents as well.
This gives the users a lot of freedom. The general flow for the document-upload front end is: the document is loaded, a separate ETL module in the backend takes it in, breaks the document into separate chunks, runs them through the embedding model, and loads them into the vector database.
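Sketched under assumptions (the file name, metadata fields, and collection name are illustrative), that load-chunk-embed-insert flow could look like:

```python
# Upload-and-ingest sketch: load -> chunk -> attach metadata -> embed -> insert.
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Milvus

docs = PyPDFLoader("uploaded_manual.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=128).split_documents(docs)

# Attach metadata so the chat front end can filter on it later; fields are illustrative.
for chunk in chunks:
    chunk.metadata.update({"domain": "access_network", "vendor": "vendor_x"})

Milvus.from_documents(
    chunks,
    HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2"),
    collection_name="asimov_docs",
    connection_args={"host": "localhost", "port": "19530"},
)
```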
So that's the general, in-depth architecture flow of ASIMOV. I think my colleague will now take us through why we went with Milvus; we have some really good reasons. Over to you. Great, thank you, and thank you everyone for being here. We knew going into this project that we would require a scalable and quite flexible set of solutions to implement our idea. It started off small, but we knew it could become a large-scale project, because it soon caught the attention of higher management and everyone who heard about it.
Keeping all that in mind, we were searching for some specific requirements, especially around scaling the database and everything related to it. Milvus came with a lot of features, and I'll take you through some of the ones that caught our eye. Milvus supports multiple embeddings; that's a main feature. As explained before, we've tried out all-MiniLM and several other embeddings, and Milvus supports all of that. It also includes all the necessary APIs and SDKs, with support for the Python pymilvus connectors.
The transition to hosting it on one of our on-prem servers and getting it built up and running was quite convenient compared with some of the other available databases. Milvus provides flexible hosting capabilities, both on-prem and cloud. Our specific requirement was that we do not share our data with the outside, so we needed to have the database on one of our on-prem servers, and Milvus was a good contender for that, along with other databases such as Chroma and Weaviate. I'll get into more detail on that shortly; a minimal connection sketch is below.
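For reference, connecting to a self-hosted Milvus from Python takes a couple of lines; the host and port below are placeholders.

```python
from pymilvus import connections, utility

connections.connect(alias="default", host="milvus.internal.example", port="19530")
print(utility.list_collections())  # quick sanity check that the server is reachable
```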
Milvus also supports dynamic node allocation. What that does is... Hey, you crashed. ...No, you crashed again. Okay, now we can see you again. Oh, sorry, I think I got disconnected.
Sorry, everyone, a technical problem. You're back. Sorry about that. Can I continue from where I was? Yes.
You were mentioning Weaviate and Chroma DB and everything, and then you were starting a new point, so you can pick up again from there. Alright. So, Milvus was a good contender among the on-prem-deployable vector databases, and its other key features include a distributed architecture and partitioning. What dynamic node allocation does is handle growth in the workload, that is, the size of the databases, the API calls, the access points, everything, dynamically as the system scales up. It's quite smooth, the load is balanced, and there's very little manual intervention involved; Milvus does it all for you.
There's another key feature included, which is partitioning. With partitioning, we are able to minimize memory use and increase the efficiency of memory utilization. We can partition based on our specific use case, and what partitions do is ensure the entire dataset does not need to be retrieved: only a specific partition is accessed and loaded into memory. In our use case, we've partitioned by vendor and by some of the filters you saw in the earlier UI slides, so only the relevant partition is accessed; a sketch follows below.
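A minimal pymilvus sketch of that pattern; the collection, vector field, and partition names are illustrative, and the query vector is a stand-in for an embedded question.

```python
from pymilvus import Collection

col = Collection("asimov_docs")          # assumes the collection already exists
if not col.has_partition("vendor_x"):
    col.create_partition("vendor_x")
col.load(partition_names=["vendor_x"])   # only this slice is pulled into memory

query_vector = [0.0] * 384               # stand-in for an embedded question
hits = col.search(
    data=[query_vector],
    anns_field="embedding",              # vector field name is illustrative
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    limit=10,
    partition_names=["vendor_x"],        # the search is confined to the partition
)
```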
Those are some of the features that caught our eye, but the key reasons we chose Milvus over the other contenders were these specific points. First and foremost, it's open source and it has great community support. I mean, we got invited to this webinar, which is saying something, and we were contacted on social media by the Milvus team when we were trying to get some of our questions answered. So there's great community support, including the online forums and all of that. The other feature was on-prem deployment.
This was one of our key requirements: we needed to keep our data within our organization and not hand it over to other organizations. That came with certain limitations, because vector databases were still evolving toward a good, stable level, and there were lots of competitors out there performing really well. We mostly tested against Chroma DB, and upon testing we found that Milvus had higher queries per second, and its data backup and migration support was much better. It performed much better overall, with lower latencies and higher precision when querying data.
Another key feature is hybrid search, which involves querying on data other than the embedded vectors; a sketch follows below. As you saw in the UI earlier, we can filter down to specific sets of documents, or apply certain filters, and what those filters do is direct the retrieval toward a specific set of documents, so we can provide more fine-tuned input to the model. The other important factor was scaling: we knew we would need to grow the datasets, the number of users, and everything around that, and dynamic node allocation in Milvus has given us pretty much everything we need to handle the load.
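A hedged sketch of that kind of filtered (hybrid) search with pymilvus; the boolean expression and field names are illustrative, with `sensitivity` standing in for the compartmentalized-access idea mentioned earlier.

```python
from pymilvus import Collection

col = Collection("asimov_docs")
query_vector = [0.0] * 384  # stand-in for an embedded question
hits = col.search(
    data=[query_vector],
    anns_field="embedding",
    param={"metric_type": "L2", "params": {"nprobe": 10}},
    limit=10,
    # Scalar filter evaluated alongside vector similarity; fields are illustrative.
    expr='domain == "core_network" and sensitivity <= 2',
    output_fields=["source", "page"],
)
```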
Okay, I think that sums up why we chose Milvus over the competitors out there. Thank you.
Yeah, thank you. So basically, that was a breakdown of why we selected Milvus as our vector database for this project. Next, I'll go through some of what's in store for ASIMOV. One of the things we are currently working on is expanding into further domains. While we were working with the technical staff in engineering, we also saw the potential to expand this into other use cases throughout the company. One of the most recent examples: we were able to develop a custom flavor of ASIMOV for our corporate planning team.
That helped them in preparing certain documents for Dialog's annual general meeting, which happened recently. There are use cases like that which show real potential, so expanding into further domains, like corporate planning and regulatory, is in our sights. And speaking of that, we'll also create a repository of prompt templates, one per use case.
That will help us tailor the experience and improve the LLM's responses for each specific use case, so establishing that repository of prompt templates is something we have in the pipeline. Also, our technical documentation contains a lot of graphs, diagrams, and the like, so we will be leveraging GPT-4o's multimodal capabilities for much richer document loading: we'll have GPT-4o provide descriptions of image-based information, so that we have much richer context in the documents we load into ASIMOV.
That is another thing we are looking at. We also want to enable chat-with-data capabilities. This is not necessarily RAG, but it's an area we saw has potential: we'll provide connectivity with our internal network configuration and performance databases, as well as crowdsourced intelligence databases such as those provided by Opensignal, so that we have much greater functionality in what ASIMOV can provide our engineers. So that's a quick look at what's ahead; if we have any questions, we'd be glad to take them.
Over to you, Steven. Ooh, thank you very much for the presentation; it was really cool to hear about it. If anyone attending has a question, feel free to write it in the chat. Otherwise, I have some questions myself.
I can maybe start with those, and then we'll see. The first one I have is: what scale are you at right now? How many people are using your RAG system? At the moment it's just the engineering staff, the core engineering staff, which comprises the planning team; they're the ones who actually use it a lot. That's around 50-odd people using it. And as mentioned, we worked really closely with these teams to develop the prompt templates and improve the responses.
So they had a hand in refining the output, and that got them quite interested in the system, so they're using it on a daily basis. Okay, cool.
Maybe a follow-up question to that: can you share, I don't know if you can, roughly how many vectors you have in Milvus at the moment? I think that would give us a good idea. Maybe you can't share, I don't know. We can give the number of vectors, but we can't give details on the content. Ah, no worries. I can comment on that: we have 1.7 million right now, but it's increasing; each day we are including more and more documents across different domains.
So, technical and corporate documents, and we are planning to include images and other documents requiring multimodal handling as another dimension. It's just going to keep going up from there, but right now we're at around 1.72 million. Okay.
Okay, cool, thank you. And then we have a question: does Milvus support high availability? I don't know if you guys want to answer this one; otherwise, I can. But yes, we do support it.
I'll take this one. We do support high availability, yes. And we also have a cloud offering, which has high availability as well,
at 99.99-ish percent. So, yes. And then I have another question, maybe related more to the future. At the moment, people are talking about RAG, but there's also what we call advanced RAG. Is that something you're thinking of checking out? Maybe agents, or splitting questions: say your user asks one very long question, so you might want to split it into multiple ones.
Is that something you're looking into at the moment? Yeah; generally, when it comes to our users, they have a lot of follow-up questions, so the chat history function helped us a lot in improving the answers. But as you mentioned, advanced RAG: yesterday we came across a new paper called RAFT; I think Nandula can explain a bit about that. Yeah, so something we were looking into yesterday was retrieval-augmented fine-tuning.
Something we want to do over time is implement user feedback, so that eventually we'll have a good collection of responses along with how good and how relevant those responses were for the users. What we can then do is go for a hybrid: RAG plus a fine-tuned model, where we use that output, the feedback we get from users, to fine-tune GPT and basically make it more accurate.
And RAFT, the paper I was mentioning, stands for retrieval-augmented fine-tuning. It's a fairly new concept that involves fine-tuning the LLM with relevant documents as well as distractor documents: basically, we teach the LLM, when chunks are provided to it, to take in only the relevant information. That is also something interesting we were looking at yesterday, and things like that we hope to implement in the future.
Yeah, okay, cool, thank you. To add to that: with RAFT, I think we'll be able to reduce the reliance on the embedding model, because that's a critical part of the RAG system.
Right now you need a top-notch embedding model to get top responses, but with RAFT we can rely less on the embedding model and hand that complexity, that intelligence, over to the LLM, and from that we can get a better response. Oh, yeah; embedding models are very crucial and usually make a big difference in your results.
Okay, cool, let me check... I don't think we have any more questions. So in that case, I'm going to say thank you to you; thank you for the presentation, it was very informative.
The recording will be shared online, so no worries if you missed part of it. Feel free to check out our webpage as well, milvus.io, and check us out on GitHub directly. And thank you again to Dialog for writing the blog post in the first place.
That's how we saw them in the first place, you know, sharing their knowledge. So yeah, I'll see you next time; have a lovely morning, day, or evening. Thank you again, guys, for giving a talk here.
Have a lovely evening on your side, because it's getting late. Thank you. Thank you, Steven. Thank you.
Meet the Speaker
Join the session for live Q&A with the speaker
Nandula Asel Karunasingha
Senior Data Scientist
Nandula Asel Karunasingha is an alumnus of General Sir John Kotelawala Defence University, where he earned his Bachelor of Science in Electronic and Telecommunication Engineering. He started his career at Dialog as an intern in 2019, progressing to Senior Executive in Product Development. Currently, Asel serves as a Senior Data Scientist in Network Analytics and Automation. His areas of specialization include Data Science, AI/ML, and Geographic Information Systems. Outside of work, he volunteers in Public Relations and Communications roles.
Aditha Iddamalgoda
Senior Executive
Aditha Iddamalgoda is a graduate of General Sir John Kotelawala Defence University, holding a Bachelor of Science in Computer Science. He began his journey with Dialog as an intern in 2021 and has since advanced to the role of Senior Executive in Network Analytics and Automation. Aditha is a specialist in Data Science and AI/ML, with a pronounced proficiency in Data Engineering.
Dumindu Ranasinghearachchi
Lead Engineer in Network Analytics and Automation at Dialog.
Dumindu Ranasinghearachchi holds a Bachelor of Science in Electronic and Telecommunication Engineering from General Sir John Kotelawala Defence University and a Master's in Big Data Analytics from Robert Gordon University, UK. Dumindu embarked on his career with Dialog in 2014 as an intern and transitioned to the permanent team in 2016 as a Senior Executive in Product Development. Currently, he serves as the Lead Engineer in Network Analytics and Automation. His expertise spans Data Science, AI/ML, Big Data Analytics, Cloud Engineering, and Networking.