What is “recall” in the context of vector search results, and how is recall typically calculated when evaluating an ANN algorithm against ground-truth neighbors?

What is recall in vector search results? Recall in vector search measures how well an Approximate Nearest Neighbor (ANN) algorithm retrieves the true nearest neighbors compared with an exact (ground-truth) search. On high-dimensional data, ANN algorithms trade accuracy for speed by approximating the search, so they can miss some true neighbors. Recall quantifies the fraction of ground-truth neighbors that the ANN method returns. For example, if an exact search identifies 100 neighbors for a query and the ANN retrieves 80 of them, the recall is 80/100 = 0.8, or 80%.

How is recall calculated? Recall is computed by comparing the ANN’s output against ground truth generated by a brute-force exact search over the same dataset. For a given query, let the ground-truth set contain the top k nearest neighbors. If the ANN also returns k results, recall is the number of items that appear in both sets divided by k, a measure often written recall@k: recall@k = (number of shared items) / k. For instance, if the ground truth has 10 neighbors and the ANN retrieves 7 of them, recall is 7/10 = 0.7. This is repeated for every query in the evaluation set, and the average across queries is reported.
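
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names recall_at_k and mean_recall are hypothetical, neighbors are assumed to be identified by integer ids, and each ground-truth list is assumed to hold at least k entries ordered from nearest to farthest.

```python
from typing import Sequence


def recall_at_k(ann_ids: Sequence[int], truth_ids: Sequence[int], k: int) -> float:
    """Fraction of the top-k ground-truth neighbors found in the ANN's top-k results.

    Assumes truth_ids contains at least k ids (hypothetical input format).
    """
    truth = set(truth_ids[:k])
    shared = truth & set(ann_ids[:k])
    return len(shared) / k


def mean_recall(
    ann_results: Sequence[Sequence[int]],
    ground_truth: Sequence[Sequence[int]],
    k: int,
) -> float:
    """Average recall@k over all queries (one result list per query)."""
    scores = [recall_at_k(a, t, k) for a, t in zip(ann_results, ground_truth)]
    return sum(scores) / len(scores)


# Ground truth has 10 neighbors (ids 1..10); the ANN returns 7 of them plus 3 others.
truth = list(range(1, 11))
ann = [1, 2, 3, 4, 5, 6, 7, 11, 12, 13]
print(recall_at_k(ann, truth, k=10))  # 0.7
```

Sets are used because recall only asks whether each ground-truth neighbor was returned, not where it was ranked.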

Practical considerations and examples: Ground-truth sets are typically precomputed using exact methods, which are slow but accurate. ANN evaluation often uses a fixed k (e.g., top 100) so that results are comparable across runs. For example, in image retrieval, if an ANN finds 90 of the 100 true matches for a query image, its recall is 90%. If the ANN returns more or fewer results than k, the denominator remains the size of the ground-truth set. If the ANN returns 200 results containing 95 of the ground-truth 100, recall is 95/100 = 0.95; if it returns fewer than k results, every ground-truth neighbor it omits counts as a miss. High recall indicates that the ANN closely approximates exact results, but it often comes at the cost of higher computational overhead.
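
As a rough sketch of these two points, the Python below precomputes ground truth with a full scan and then scores a result list whose length differs from k. The helper names exact_top_k and recall_vs_truth are hypothetical, Euclidean distance is an assumption (the metric is not specified here), and a linear scan like this is only practical for small datasets.

```python
import math
from typing import List, Sequence


def exact_top_k(base: Sequence[Sequence[float]], query: Sequence[float], k: int) -> List[int]:
    """Brute-force ground truth: ids of the k base vectors nearest to the query.

    Uses Euclidean distance (an assumption) and scans every base vector.
    """
    order = sorted(range(len(base)), key=lambda i: math.dist(base[i], query))
    return order[:k]


def recall_vs_truth(returned_ids: Sequence[int], truth_ids: Sequence[int]) -> float:
    """Shared ids divided by the ground-truth size, however many ids were returned."""
    truth = set(truth_ids)
    return len(truth & set(returned_ids)) / len(truth)


# Ground truth for a tiny 2-D dataset.
base = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
print(exact_top_k(base, (0.9, 0.0), k=2))  # [1, 0]

# 200 returned ids that contain 95 of the 100 ground-truth ids.
truth = list(range(100))
returned = list(range(95)) + list(range(1000, 1105))
print(recall_vs_truth(returned, truth))  # 0.95
```

Because the denominator is fixed at the ground-truth size, returning extra candidates can only raise this score, which is one reason evaluations usually fix the number of results at k.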
