Embeddings represent the semantic fingerprint of data, and how you store them determines both retrieval speed and accuracy. In LangGraph projects, developers usually create one collection per data domain—such as documents, tool outputs, or conversation history—so each node queries only the relevant subset. Within Milvus, each collection contains vector fields and optional metadata (timestamp, agent ID, tags) for hybrid filtering.
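As a concrete illustration, here is a minimal sketch of one such per-domain collection using the pymilvus ORM API. The dimension (768), the local connection settings, and field names like `agent_id` and `tags` are assumptions for the example, not requirements of LangGraph or Milvus.

```python
from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection

# Assumes a local Milvus instance on the default port.
connections.connect(host="localhost", port="19530")

fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=768),
    FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=8192),
    # Scalar metadata used for hybrid filtering alongside vector similarity.
    FieldSchema(name="timestamp", dtype=DataType.INT64),
    FieldSchema(name="agent_id", dtype=DataType.VARCHAR, max_length=64),
    FieldSchema(name="tags", dtype=DataType.VARCHAR, max_length=256),
]

# One collection per data domain, e.g. "documents", "tool_outputs", "chat_history".
documents = Collection(
    name="documents",
    schema=CollectionSchema(fields, description="Document embeddings for retrieval nodes"),
)
```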
Choosing the right index is critical. HNSW delivers high-recall, low-latency search for interactive agent queries, while IVF variants and DiskANN scale better for large or memory-constrained offline stores. Milvus lets you experiment with index types and distance metrics (cosine, L2, inner product) by rebuilding the index in place, without re-ingesting the underlying data. Embeddings can also be added incrementally as new nodes generate information, keeping the knowledge base continuously updated.
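Continuing the sketch above, the following shows one way to attach an HNSW index and insert new rows incrementally. The COSINE metric requires Milvus 2.3 or later (on older versions, normalize vectors and use IP), and the index parameters are illustrative starting points rather than tuned values.

```python
# Build an HNSW index on the vector field of the "documents" collection.
documents.create_index(
    field_name="embedding",
    index_params={
        "index_type": "HNSW",        # high recall for interactive agent queries
        "metric_type": "COSINE",     # or "L2" / "IP", depending on the embedding model
        "params": {"M": 16, "efConstruction": 200},
    },
)
documents.load()

# Insert new embeddings as LangGraph nodes produce them; existing data is not re-indexed
# from scratch. Columns follow the schema order, skipping the auto_id primary key.
documents.insert([
    [[0.01] * 768],                      # embedding (placeholder vector from your model)
    ["Quarterly report summary..."],     # text
    [1718000000],                        # timestamp
    ["researcher"],                      # agent_id
    ["finance"],                         # tags
])
documents.flush()
```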
Versioning embeddings is also important. By storing metadata such as model ID or schema version, developers ensure compatibility when upgrading embedding models. This disciplined storage strategy makes LangGraph projects maintainable and reproducible, reducing drift and improving long-term retrieval fidelity.
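One way to act on that versioning metadata is to filter searches by model ID. The sketch below assumes the schema also carries an `embedding_model` VARCHAR field (not shown in the earlier schema) populated at insert time; the model name in the filter is a hypothetical example. Filtering on it ensures query vectors are only compared against vectors produced by the same model.

```python
# Embed the query with the *current* model, then restrict the search to
# vectors written by that same model version.
query_vec = [0.01] * 768  # placeholder query embedding

hits = documents.search(
    data=[query_vec],
    anns_field="embedding",
    param={"metric_type": "COSINE", "params": {"ef": 64}},
    limit=5,
    expr='embedding_model == "text-embedding-3-small"',  # hypothetical model ID
    output_fields=["text", "agent_id"],
)

for hit in hits[0]:
    print(hit.distance, hit.entity.get("text"))
```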
