Lexical search and vector retrieval differ primarily in how they represent and compare text. Lexical search treats documents as bags of words, focusing on the literal presence and frequency of query terms. It relies on ranking functions such as BM25 or TF-IDF, which weight each term by how often it occurs in a document (term frequency) and how rare it is across the collection (inverse document frequency). In this model, the meaning of words is not considered, only their statistical distribution within and across documents. As a result, lexical search excels at exact matching and transparency but struggles when users phrase a query with synonyms or paraphrases.
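To make the term-weighting idea concrete, here is a minimal BM25 sketch in plain Python. It is an illustration under stated assumptions rather than how any particular engine implements it: the whitespace tokenizer, the `k1` and `b` defaults, the non-negative IDF variant, and the three sample documents are all choices made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with a basic BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # a term that is absent contributes nothing
            # Non-negative IDF variant: rarer terms get larger weights.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1 and is length-normalized via b.
            denom = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


# Hypothetical sample documents for illustration only.
docs = [
    "open source machine learning platforms",
    "AI development tools for engineers",
    "a recipe for sourdough bread",
]
scores = bm25_scores("machine learning platforms", docs)
```

In this sketch only the first document receives a non-zero score. The second document is on a related topic but shares no literal term with the query, which is the synonym gap described above.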
Vector retrieval, in contrast, represents text as high-dimensional numerical vectors (embeddings) generated by machine learning models. These embeddings capture semantic meaning, so two texts with similar intent, like “AI development tools” and “machine learning platforms,” typically have vectors that lie close together in the embedding space even though they share few words. Vector retrieval scores candidates with similarity or distance measures such as cosine similarity or Euclidean distance, enabling concept-based rather than word-based search. This allows systems to return relevant results even when the query terms do not appear in the documents.
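The two measures named above can be sketched in a few lines of Python. The three-dimensional vectors below are made up by hand for illustration; they are not the output of a real embedding model, whose vectors usually have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    # Cosine of the angle between two vectors; closer to 1.0 means more similar.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    # Straight-line distance between two vectors; smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hand-made toy vectors (assumption), standing in for real embeddings.
toy_embeddings = {
    "AI development tools": [0.9, 0.8, 0.1],
    "machine learning platforms": [0.8, 0.9, 0.2],
    "sourdough bread recipe": [0.1, 0.2, 0.9],
}

query_vec = toy_embeddings["AI development tools"]
related_vec = toy_embeddings["machine learning platforms"]
unrelated_vec = toy_embeddings["sourdough bread recipe"]

sim_related = cosine_similarity(query_vec, related_vec)
sim_unrelated = cosine_similarity(query_vec, unrelated_vec)
dist_related = euclidean_distance(query_vec, related_vec)
dist_unrelated = euclidean_distance(query_vec, unrelated_vec)
```

With these toy values, the two related phrases score higher on cosine similarity and lower on Euclidean distance than the unrelated one, even though the phrases share no words.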
The two methods are complementary rather than competing. Lexical search ensures literal precision, which is important for factual or structured queries, while vector retrieval adds flexibility and semantic depth. In hybrid systems powered by Milvus, developers often use lexical search for initial candidate selection and vector search for re-ranking. This way, they combine the precision of keyword matching with the contextual understanding of embeddings. The difference ultimately lies in focus: lexical search looks for shared words, while vector retrieval looks for shared meaning.
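The two-stage pattern described above can be sketched as follows. This is a toy illustration in plain Python and does not use the Milvus API: the helper names `lexical_candidates` and `rerank`, the term-overlap scoring (a stand-in for BM25), the sample documents, and the hand-made vectors are all assumptions for this example.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def lexical_candidates(query, docs, top_k):
    """Stage 1: keep the top_k documents that share the most query terms."""
    q_terms = set(query.lower().split())
    scored = []
    for doc_id, doc in docs.items():
        overlap = len(q_terms & set(doc["text"].lower().split()))
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def rerank(query_vec, candidate_ids, docs):
    """Stage 2: reorder the candidates by vector similarity to the query."""
    return sorted(
        candidate_ids,
        key=lambda doc_id: cosine_similarity(query_vec, docs[doc_id]["vec"]),
        reverse=True,
    )


# Hypothetical corpus with hand-made vectors (not real embeddings).
docs = {
    "d1": {"text": "learning to bake bread with machine mixers", "vec": [0.1, 0.2, 0.9]},
    "d2": {"text": "machine learning platforms compared", "vec": [0.8, 0.9, 0.2]},
    "d3": {"text": "AI development tools", "vec": [0.9, 0.8, 0.1]},
}

query_text = "machine learning tools"
query_vec = [0.85, 0.85, 0.15]  # assumed embedding of the query

candidates = lexical_candidates(query_text, docs, top_k=2)
ranked = rerank(query_vec, candidates, docs)
```

Here the lexical stage admits `d1` and `d2` because each shares two query terms, and the vector stage then moves `d2` ahead of `d1` because its vector is closer to the query. Note that `d3` never reaches the re-ranker with `top_k=2`, which shows why the size of the candidate set matters in this design.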
