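A dense vector in information retrieval (IR) is a fixed-length numerical representation of data (such as text, images, or other content) in which most or all components are non-zero. The individual dimensions are learned latent factors and usually have no human-readable interpretation on their own; the meaning is carried by the vector as a whole. Sparse vectors, by contrast, typically have one dimension per vocabulary term and are zero in almost every position. Dense vectors are compact by comparison, commonly a few hundred to a few thousand dimensions.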
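To make the contrast concrete, here is a minimal sketch that builds a sparse bag-of-words vector over a toy vocabulary and sets it beside a dense vector. The vocabulary, the helper names `sparse_bow` and `zero_fraction`, and the dense values are all invented for illustration; a real dense vector would come from a trained model.

```python
# Toy vocabulary (an assumption); real vocabularies hold tens of thousands of terms.
vocabulary = ["cat", "dog", "car", "engine", "pet", "road", "wheel", "fur"]


def sparse_bow(text):
    """Return a bag-of-words count vector with one dimension per vocabulary term."""
    tokens = text.lower().split()
    return [tokens.count(term) for term in vocabulary]


def zero_fraction(vector):
    """Return the fraction of components that are exactly zero."""
    return sum(1 for value in vector if value == 0) / len(vector)


sparse = sparse_bow("the cat is a pet")

# Hypothetical 4-dimensional dense embedding; the values are made up.
dense = [0.12, -0.48, 0.33, 0.07]

print(sparse, zero_fraction(sparse))
print(dense, zero_fraction(dense))
```

The sparse vector grows with the vocabulary and stays mostly zero, while the dense vector keeps a small fixed length regardless of vocabulary size.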
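Dense vectors are commonly used in neural IR systems, where each document or query is embedded into a shared vector space. Static word-embedding methods such as word2vec and GloVe assign one vector per word, so a document or query vector is usually obtained by pooling (for example, averaging) its word vectors. Transformer models instead produce context-dependent representations and can encode a whole passage directly. In both cases the vectors capture semantic information, so that texts with related meanings lie close together, enabling more accurate matching between queries and documents.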
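One simple way to turn word-level embeddings such as word2vec or GloVe into a single query or document vector is to average the vectors of the words present. The sketch below assumes a tiny hand-written embedding table; `word_vectors` and `embed` are hypothetical names, and the 3-dimensional values are invented rather than taken from any trained model.

```python
# Hypothetical 3-dimensional word vectors, invented for illustration.
word_vectors = {
    "cheap": [0.9, 0.1, 0.0],
    "flights": [0.1, 0.8, 0.2],
    "hotel": [0.0, 0.3, 0.9],
}


def embed(text):
    """Average the vectors of known words; words missing from the table are skipped."""
    vectors = [word_vectors[t] for t in text.lower().split() if t in word_vectors]
    if not vectors:
        raise ValueError("no known words in text")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


query_vector = embed("cheap flights to Rome")
print(query_vector)
```

Averaging ignores word order and context, which is one reason transformer encoders are often preferred when that information matters.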
Dense vector representations are advantageous because they enable the comparison of data based on semantic similarity, typically measured with cosine similarity or a dot product, rather than just keyword matching. For example, in semantic search, two documents with similar meanings can have similar dense vector representations, even if they do not share the same words. This lets a system retrieve relevant documents that keyword matching would miss because the query and the document use different vocabulary.
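As a rough illustration of matching by meaning rather than by shared words, the sketch below ranks two documents against a query using cosine similarity. The embeddings are hypothetical 3-dimensional values chosen by hand to mimic what a trained model might produce; `cosine_similarity`, `query`, and `documents` are illustrative names.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Hypothetical embedding of the query "how to fix a flat tire".
query = [0.8, 0.1, 0.3]

# Hypothetical document embeddings; neither title shares a content word with the query.
documents = {
    "repairing a punctured wheel": [0.7, 0.2, 0.4],
    "baking sourdough bread": [0.1, 0.9, 0.1],
}

# Rank documents from most to least similar to the query.
ranked = sorted(
    documents,
    key=lambda title: cosine_similarity(query, documents[title]),
    reverse=True,
)
print(ranked)
```

With these made-up values the tire-repair document ranks first even though a keyword matcher would find no overlapping content terms with the query.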