TiDB vs Aerospike Choosing the Right Vector Database for Your AI Apps
What is a Vector Database?
Before we compare TiDB and Aerospike, let's first explore the concept of vector databases.
A vector database is specifically designed to store and query high-dimensional vectors, which are numerical representations of unstructured data. These vectors encode complex information, such as the semantic meaning of text, the visual features of images, or product attributes. By enabling efficient similarity searches, vector databases play a pivotal role in AI applications, allowing for more advanced data analysis and retrieval.
Common use cases for vector databases include e-commerce product recommendations, content discovery platforms, anomaly detection in cybersecurity, medical image analysis, and natural language processing (NLP) tasks. They also play a crucial role in Retrieval Augmented Generation (RAG), a technique that enhances the performance of large language models (LLMs) by providing external knowledge to reduce issues like AI hallucinations.
There are many types of vector databases available in the market, including:
- Purpose-built vector databases such as Milvus, Zilliz Cloud (fully managed Milvus)
- Vector search libraries such as Faiss and Annoy.
- Lightweight vector databases such as Chroma and Milvus Lite.
- Traditional databases with vector search add-ons capable of performing small-scale vector searches.
TiDB is a traditional database and Aerospike is also a distributed, scalable NoSQL database. Both have vector search as an add-on.This post compares their vector search capabilities.
TiDB: Overview and Core Technology
TiDB, developed by PingCAP, is an open-source, distributed SQL database that offers hybrid transactional and analytical processing (HTAP) capabilities. It is MySQL-compatible, making it easy to adopt for teams already familiar with the MySQL ecosystem. TiDB's distributed SQL architecture provides horizontal scalability like NoSQL databases while retaining the relational model of SQL databases, making it highly flexible for handling both transactional and analytical workloads.
One of TiDB's core strengths is its HTAP architecture, which allows it to process transactional (OLTP) and analytical (OLAP) workloads in a single database, reducing the need for separate systems. Additionally, TiDB's MySQL compatibility makes it easy to integrate into existing environments that rely on MySQL without significant changes to the application code. The database also features auto-sharding, automatically distributing data across nodes to improve read and write performance while maintaining strong consistency.
TiDB supports vector search through integration with external libraries and plugins, enabling efficient management and querying of vectorized data. This feature, combined with TiDB's HTAP architecture, makes it a versatile option for businesses needing vector search capabilities alongside transactional and analytical workloads. The distributed architecture of TiDB allows it to handle large-scale vector queries once the necessary configurations are in place.
While including vector search functionalities in TiDB requires additional configuration, the system's SQL compatibility allows developers to combine vector search with traditional relational queries. This flexibility makes TiDB suitable for complex applications that require both vector search and relational database capabilities, offering a comprehensive solution for diverse data management needs.
Aerospike: Overview and Core Technology
Aerospike is a NoSQL database for high-performance real-time applications. It has added support for vector indexing and searching so it’s suitable for vector database use cases. The vector capability is called Aerospike Vector Search (AVS) and is in Preview. You can request early access from Aerospike.
AVS only supports Hierarchical Navigable Small World (HNSW) indexes for vector search. When updates or inserts are made in AVS, record data including the vector is written to the Aerospike Database (ASDB) and is immediately visible. For indexing, each record must have at least one vector in the specified vector field of an index. You can have multiple vectors and indexes for a single record, so you can search for the same data in different ways. Aerospike recommends assigning upserted records to a specific set so you can monitor and operate on them.
AVS has a unique way of building the index, it’s concurrent across all AVS nodes. While vector record updates are written directly to ASDB, index records are processed asynchronously from an indexing queue. This is done in batches and distributed across all AVS nodes, so it uses all the CPU cores in the AVS cluster and is scalable. Ingestion performance is highly dependent on host memory and storage layer configuration.
For each item in the indexing queue, AVS processes the vector for indexing, builds the clusters for each vector and commits those to ASDB. An index record contains a copy of the vector itself and the clusters for that vector at a given layer of the HNSW graph. Indexing uses vector extensions (AVX) for single instruction, multiple data parallel processing.
AVS queries during ingestion to “pre-hydrate” the index cache because records in the clusters are interconnected. These queries are not counted as query requests but show up as reads against the storage layer. This way, the cache is populated with relevant data and can improve query performance. This shows how AVS handles vector data and builds indexes for similarity search so it can scale for high-dimensional vector searches.
Key Differences
TiDB has vector search through external integrations and keeps its core as a distributed SQL database with HTAP capabilities. It’s MySQL compatible and combines vector search with traditional relational queries.
Aerospike takes a more specialized approach with Aerospike Vector Search (AVS) which uses HNSW indexing only. It allows concurrent index building across nodes and uses vector extensions for parallel processing.
Search Methodology:
TiDB integrates vector search through plugins, SQL and vector queries can work together. Aerospike uses HNSW indexes only, with async indexing queues for better performance.
Data:
TiDB handles transactional and analytical workloads with auto-sharding. Aerospike processes vector data async, immediate visibility for updates while index records are processed in batches.
Scalability:
TiDB distributes data across nodes automatically and maintains consistency. Aerospike’s distributed indexing uses all CPU cores in the cluster and scales based on host memory and storage.
Integration:
TiDB fits in MySQL environments and supports both SQL and vector operations. Aerospike’s AVS is more focused on vector search but integrates tightly with the core Aerospike database.
When to Choose TiDB
TiDB is best for enterprise applications that need both traditional database operations and vector search, especially in MySQL environments. Its HTAP architecture is perfect for organisations that have mixed workloads - from regular transactions to complex analytics - and need vector search. MySQL compatibility means existing teams can adopt it with minimal changes to their applications and workflows.
When to Choose Aerospike
Aerospike is best for applications where high performance vector search is the top priority, especially in real-time. Its AVS system with concurrent index building and vector processing optimisation is perfect for applications that need fast vector similarity search at scale, such as recommendation engines, image recognition systems or other AI powered applications that put search speed over traditional database operations.
Conclusion
TiDB and Aerospike are for different use cases in the vector search space. TiDB is a full database with vector capabilities, perfect for businesses that need both traditional database operations and vector search. Aerospike is high performance vector search with optimised indexing, perfect for specialist vector search applications. Choose what you need, a full featured database with vector search (TiDB) or a specialist vector search solution (Aerospike).
Read this to get an overview of TiDB and Aerospike but to evaluate these you need to evaluate based on your use case. One tool that can help with that is VectorDBBench, an open-source benchmarking tool for vector database comparison. In the end, thorough benchmarking with your own datasets and query patterns will be key to making a decision between these two powerful but different approaches to vector search in distributed database systems.
Using Open-source VectorDBBench to Evaluate and Compare Vector Databases on Your Own
VectorDBBench is an open-source benchmarking tool for users who need high-performance data storage and retrieval systems, especially vector databases. This tool allows users to test and compare different vector database systems like Milvus and Zilliz Cloud (the managed Milvus) using their own datasets and find the one that fits their use cases. With VectorDBBench, users can make decisions based on actual vector database performance rather than marketing claims or hearsay.
VectorDBBench is written in Python and licensed under the MIT open-source license, meaning anyone can freely use, modify, and distribute it. The tool is actively maintained by a community of developers committed to improving its features and performance.
Download VectorDBBench from its GitHub repository to reproduce our benchmark results or obtain performance results on your own datasets.
Take a quick look at the performance of mainstream vector databases on the VectorDBBench Leaderboard.
Read the following blogs to learn more about vector database evaluation.
Further Resources about VectorDB, GenAI, and ML
- What is a Vector Database?
- TiDB: Overview and Core Technology
- Aerospike: Overview and Core Technology
- Key Differences
- When to Choose TiDB
- When to Choose Aerospike
- Conclusion
- Using Open-source VectorDBBench to Evaluate and Compare Vector Databases on Your Own
- Further Resources about VectorDB, GenAI, and ML
Content
Start Free, Scale Easily
Try the fully-managed vector database built for your GenAI applications.
Try Zilliz Cloud for FreeKeep Reading
- Read Now
Navigating the Challenges of ML Management: Tools and Insights for Success
Learn how XetHub and vector databases like Milvus address ML model management challenges.
- Read Now
Build RAG with LangChainJS, Milvus, and Strapi
A step-by-step guide to building an AI-powered FAQ system using Milvus as the vector database, LangChain.js for workflow coordination, and Strapi for content management
- Read Now
Top 10 AI Agents to Watch in 2025 🚀
AI agents are like having a supercharged assistant by your side—analyzing data, making decisions, and seamlessly integrating with tools and environments.
The Definitive Guide to Choosing a Vector Database
Overwhelmed by all the options? Learn key features to look for & how to evaluate with your own data. Choose with confidence.