
Engineering
Building RAG with Self-Deployed Milvus Vector Database and Snowpark Container Services
With Snowflake's Snowpark Container Services (SPCS), users can now run Milvus within the Snowflake ecosystem, allowing them to easily interact with Milvus using data stored in Snowflake.

Community
LLaVA: Advancing Vision-Language Models Through Visual Instruction Tuning
LLaVA is a multimodal model that combines text-based LLMs with visual processing capabilities through visual instruction tuning.

Community
3 Key Patterns to Building Multimodal RAG: A Comprehensive Guide
The three multimodal RAG patterns are grounding all modalities into a primary modality, embedding them into a unified vector space, and employing hybrid retrieval with raw data access.

Community
Mixture-of-Agents (MoA): How Collective Intelligence Elevates LLM Performance
Mixture-of-Agents (MoA) is a framework where multiple specialized LLMs, or "agents," collaborate to solve tasks by leveraging their unique strengths.

Community
Milvus on GPUs with NVIDIA RAPIDS cuVS
GPU-accelerated vector search with NVIDIA's cuVS library and the CAGRA algorithm is highly beneficial for optimizing AI app performance in production.

Community
Building an End-to-End GenAI App with Ruby and Milvus
LangChain.rb spares full-stack developers the hassle of switching to another programming language when they want to leverage LLMs in their web applications.

Engineering
Introduction to LLM Customization
This article discusses several options for customizing LLMs to enhance their performance on specific tasks.

Engineering
Up to 50x Cost Savings for Building GenAI Apps Using Zilliz Cloud Serverless
Zilliz Cloud Serverless allows users to store, index, and query massive amounts of vectors at a fraction of the cost while maintaining competitive performance.

Community
Matryoshka Representation Learning Explained: The Method Behind OpenAI’s Efficient Text Embeddings
Matryoshka Representation Learning (MRL) is a method for generating hierarchical, nested embeddings that capture information at multiple levels of abstraction.