Engineering
Building RAG with Milvus, vLLM, and Llama 3.1
vLLM is a fast and easy-to-use library for LLM inference and serving. We’ll share how to build a high-performance RAG pipeline with vLLM, Milvus, and Llama 3.1.
Engineering
Building a Multimodal Product Recommender Demo Using Milvus and Streamlit
A step-by-step guide to building and running a multimodal product recommendation system with Milvus, Streamlit, MagicLens, and GPT-4o.
Engineering
Setting up Milvus on Amazon EKS
This blog provides step-by-step guidance on deploying a Milvus cluster using Amazon EKS and other AWS services.
Engineering
Exploring Three Key Strategies for Building Efficient Retrieval Augmented Generation (RAG)
Three key strategies to get the most out of RAG: smart text chunking, iterating on different embedding models, and experimenting with different LLMs.
Engineering
Clearing Up Misconceptions about Data Insertion Speed in Milvus
Around 97% of the "Milvus insert" time in LangChain or LlamaIndex is spent on embedding generation, while only about 3% is spent on the actual database insertion step.
Engineering
🚀 What’s New with Metadata Filtering in Milvus v2.4.3
Milvus introduces powerful string metadata matching! You can now match strings using prefix, postfix, infix, and even fuzzy searches.
Product
How to Connect to Milvus Lite Using LangChain and LlamaIndex
Milvus Lite is now the default method for third-party connectors like LangChain and LlamaIndex to connect to Milvus, the popular open-source vector database.
Engineering
Choosing the Right Embedding Model for Your Data
This blog covers some popular embedding models used in RAG applications and how to choose the right one for your data.
Engineering
Running Llama 3, Mixtral, and GPT-4o
This article shows a few ways to run some of the hottest contenders in the space: Llama 3 from Meta, Mixtral from Mistral, and the recently announced GPT-4o from OpenAI.