Community
LoRA Explained: Low-Rank Adaptation for Fine-Tuning LLMs
LoRA (Low-Rank Adaptation) is a technique for efficiently fine-tuning LLMs by introducing low-rank trainable weight matrices into specific model layers.
Community
How Inkeep and Milvus Built a RAG-driven AI Assistant for Smarter Interaction
Robert Tran, the Co-founder and CTO of Inkeep, shared how Inkeep and Zilliz built an AI-powered assistant for their documentation site.
Community
Understanding HNSWlib: A Graph-based Library for Fast Approximate Nearest Neighbor Search
HNSWlib is an open-source C++ and Python library that implements the HNSW algorithm for fast approximate nearest neighbor search.
Community
Best Practices in Implementing Retrieval-Augmented Generation (RAG) Applications
In this article, we explore the components of a RAG application and discuss the best-performing approaches for each one.
Community
The Evolution of Multi-Agent Systems: From Early Neural Networks to Modern Distributed Learning (Methodological)
In this article, we'll explore the evolution of MAS from a methodological or approach-based perspective.
Community
The Evolution of Multi-Agent Systems: From Early Neural Networks to Modern Distributed Learning (Algorithmic)
In this article, we'll trace the evolution of MAS from an algorithmic perspective, from its early days to the most recent developments.
Paper Reading
Efficient Memory Management for Large Language Model Serving with PagedAttention
PagedAttention and vLLM solve important challenges in serving LLMs, particularly the high cost and inefficient use of GPU memory during inference.
Community
Deep Residual Learning for Image Recognition
Deep residual learning solves the degradation problem, allowing us to train deeper neural networks while still potentially improving their performance.
Engineering
Up to 50x Cost Savings for Building GenAI Apps Using Zilliz Cloud Serverless
Zilliz Cloud Serverless allows users to store, index, and query massive amounts of vectors at a fraction of the cost while maintaining competitive performance.