Zilliz Triumphed in Billion-Scale ANN Search Challenge of NeurIPS 2021

By Zilliz on Jan 21, 2022


On December 6, 2021, NeurIPS, one of the world’s top academic AI conferences, announced the results of its first Approximate Nearest Neighbor (ANN) Search Challenge. The Zilliz research team won first place in the Disk-based ANN Search track with its Disk Performance Optimization Algorithm, advancing the state of ANN search on billion-scale datasets.

Email screenshot.

The emergence of neural networks allows massive amounts of unstructured data, such as speech, images, and videos, to be embedded as vectors, making ANN search key to understanding this data. Organized by experts and scholars from Microsoft Research, Facebook AI Research, Carnegie Mellon University, Yandex, and other influential organizations, the first ANN Search Challenge attracted participants from Tsinghua University, Nanjing University, Intel, NVIDIA, Kuaishou Technology, and more. A total of six billion-scale datasets were used in the challenge, four of which were released by Facebook, Microsoft Turing, Microsoft Bing, and Yandex specifically for this event.

Block-based ANN (BBAnn), the disk-based ANN search solution developed by the Zilliz research team, ranked first in the Disk-based ANN Search track of the challenge. Its strongest result came on the SimSearchNet++ dataset released by Facebook. This dataset models the task of accurately detecting subtle changes in images, which requires retrieving all vectors within a certain radius of the query vector. The task is made harder by the fact that the number of results to return for each query is not known in advance. According to the test results, Zilliz’s solution retrieved 88.573% of all relevant results in the dataset, far beyond the baseline of 16.274%, marking a major advance in billion-scale ANN search.
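To make the range-search task concrete, the following is a minimal illustrative sketch in Python. It is not BBAnn and not the challenge's evaluation code. The function names, the toy data, the radius, and the "approximate" search that scans only half of the vectors are all assumptions made for illustration. It shows why the result count varies per query and how a "fraction of relevant results retrieved" figure can be computed against an exact search.

```python
import math
import random


def range_search(vectors, query, radius, candidates=None):
    """Return the indices of all vectors within `radius` of `query`.

    `candidates` (hypothetical parameter) restricts the scan to a subset
    of indices, standing in for an approximate index that does not look
    at every vector. With candidates=None the search is exact.
    """
    if candidates is None:
        candidates = range(len(vectors))
    return {i for i in candidates if math.dist(vectors[i], query) <= radius}


def fraction_retrieved(found, relevant):
    """Share of the truly relevant results that the search returned."""
    if not relevant:
        return 1.0
    return len(found & relevant) / len(relevant)


# Toy data: 1,000 random 8-dimensional vectors (the challenge used billions).
random.seed(0)
vectors = [[random.random() for _ in range(8)] for _ in range(1000)]
query = vectors[0]
radius = 0.6  # arbitrary illustrative radius

# Exact search: the ground truth. Its size is not known ahead of time.
exact = range_search(vectors, query, radius)

# Stand-in for an approximate search: scan only the even-numbered vectors.
approx = range_search(vectors, query, radius, candidates=range(0, len(vectors), 2))

print(f"relevant results: {len(exact)}")
print(f"fraction retrieved: {fraction_retrieved(approx, exact):.3f}")
```

Unlike a search that always returns a fixed number of neighbors, the size of `exact` here depends entirely on the data and the radius, so an approximate method cannot simply stop after a preset number of hits.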

Going forward, Zilliz will work to bring this research into Milvus, an open source vector database, to serve users across different application scenarios. Milvus is a graduated project of the LF AI & Data Foundation. It can manage large volumes of unstructured data and has a wide range of applications, including new drug discovery, recommender systems, and chatbots. Zilliz will continue to invest in unlocking the hidden value of massive unstructured data for enterprises through open source and cloud-native solutions.