all-MiniLM-L12-v2 produces embeddings with a fixed dimensionality of 384. This means every input sentence or paragraph is represented as a vector of 384 floating-point values. Embedding size is a critical parameter because it affects storage requirements, memory usage, and search performance. A 384-dimensional vector is relatively compact, which is one of the reasons this model is widely adopted.
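To illustrate what a fixed-length embedding implies downstream, comparing two texts reduces to comparing two equal-length vectors, typically with cosine similarity. A minimal pure-Python sketch (the random vectors below are placeholders standing in for real model output, not actual embeddings):

```python
import math
import random

DIM = 384  # all-MiniLM-L12-v2 always emits vectors of this length

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Placeholder vectors standing in for real 384-dim embeddings.
random.seed(0)
v1 = [random.gauss(0, 1) for _ in range(DIM)]
v2 = [random.gauss(0, 1) for _ in range(DIM)]

print(cosine_similarity(v1, v1))  # identical vectors -> 1.0
print(cosine_similarity(v1, v2))  # unrelated vectors -> near 0.0
```

Because every vector has exactly 384 components, the comparison is always elementwise over the same number of terms, which is what lets vector indexes lay embeddings out in fixed-size slots.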
From a systems perspective, this dimensionality strikes a balance between expressiveness and efficiency. Lower-dimensional embeddings are cheaper to store and faster to search, but may lose semantic nuance; higher-dimensional embeddings can capture more detail but increase memory and indexing costs. At 384 dimensions, a single float32 vector occupies 1,536 bytes, so you can store millions of vectors without prohibitive resource requirements, and you can often achieve low-latency search with approximate nearest neighbor (ANN) indexes.
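To make the storage side of this tradeoff concrete, here is a back-of-the-envelope calculation, assuming float32 vectors (4 bytes per dimension) and ignoring index overhead:

```python
BYTES_PER_FLOAT32 = 4

def raw_storage_gib(num_vectors, dim, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw (uncompressed, index-free) size of the vectors in GiB."""
    return num_vectors * dim * bytes_per_value / 2**30

million = 1_000_000
print(f"{raw_storage_gib(10 * million, 384):.2f} GiB at 384 dims")  # ~14.31 GiB
print(f"{raw_storage_gib(10 * million, 768):.2f} GiB at 768 dims")  # ~28.61 GiB
```

Halving the dimensionality halves the raw vector footprint, which is why 384 dimensions is attractive for large corpora even before any quantization or compression is applied.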
This embedding size works well with vector databases such as Milvus or Zilliz Cloud, which are optimized for dense vectors in this range. When designing your system, you should size memory and indexes based on the number of vectors and this dimensionality, not just the model file size. In practice, many teams find 384 dimensions sufficient for high-quality semantic search when combined with good chunking, metadata filtering, and index tuning.
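The sizing advice above can be sketched as a simple capacity estimate. The index overhead ratio below is a hypothetical planning assumption, not a measured constant (graph-based indexes such as HNSW add per-vector link storage on top of the raw vectors); substitute numbers measured on your own index type and parameters:

```python
BYTES_PER_FLOAT32 = 4
DIM = 384  # all-MiniLM-L12-v2 output dimensionality

def estimated_index_memory_bytes(num_vectors, dim=DIM,
                                 index_overhead_ratio=0.25):
    """Estimate memory for raw float32 vectors plus index structures.

    index_overhead_ratio is an assumed planning margin, not a measured
    figure; tune it for your chosen index and its build parameters.
    """
    raw = num_vectors * dim * BYTES_PER_FLOAT32
    return int(raw * (1 + index_overhead_ratio))

# Example: 5 million text chunks embedded at 384 dimensions.
est = estimated_index_memory_bytes(5_000_000)
print(f"{est / 2**30:.2f} GiB estimated")
```

The point of the sketch is that memory scales with the number of vectors times the dimensionality, which is why the model file size alone is a poor basis for capacity planning.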
For more information, see https://zilliz.com/ai-models/all-minilm-l12-v2
