Two runs of the same Sentence Transformer model can produce slightly different embeddings due to inherent randomness in certain operations, even during inference. This variability is not unique to Sentence Transformers but is common in neural networks. The primary sources of randomness include model architecture choices (like dropout layers), hardware-level computation differences (especially on GPUs), and framework-level nondeterministic operations. For example, matrix multiplications on GPUs may use parallelized algorithms that introduce tiny numerical variations due to floating-point precision limits.
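To see why this matters, note that floating-point addition is not associative: summing the same numbers in a different order can change the last few bits of the result. The short sketch below (plain NumPy, not tied to any particular model) illustrates this; it is the same kind of drift that parallel GPU reductions can introduce.
import numpy as np

rng = np.random.default_rng(0)
values = rng.standard_normal(100_000).astype(np.float32)

# Floating-point addition is not associative, so the reduction order matters
forward_sum = np.sum(values)            # one summation order
reverse_sum = np.sum(values[::-1])      # same values, summed in reverse
print(forward_sum == reverse_sum)       # often False
print(abs(forward_sum - reverse_sum))   # tiny, but typically nonzero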
To control this randomness, you can enforce deterministic behavior. First, set random seeds for libraries like PyTorch (which Sentence Transformers is built on) and NumPy using torch.manual_seed(), numpy.random.seed(), and Python's random.seed(). Second, configure PyTorch to use deterministic algorithms with torch.backends.cudnn.deterministic = True and torch.backends.cudnn.benchmark = False. Third, ensure dropout layers are disabled by putting the model in evaluation mode (model.eval()). However, even with these steps, full determinism isn't guaranteed on GPUs due to hardware-level optimizations. Testing on a CPU (using model.to('cpu')) may yield more consistent results, but at the cost of speed.
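In practice, these settings are often bundled into a small helper that runs once at startup. Below is a minimal sketch assuming only PyTorch and NumPy; the function name set_deterministic is purely illustrative and not part of Sentence Transformers or PyTorch.
import random
import numpy as np
import torch

def set_deterministic(seed: int = 42) -> None:
    # Illustrative helper, not part of the Sentence Transformers API
    random.seed(seed)                 # Python's built-in RNG
    np.random.seed(seed)              # NumPy RNG
    torch.manual_seed(seed)           # PyTorch RNGs (CPU and CUDA)
    torch.cuda.manual_seed_all(seed)  # all CUDA devices, if present
    torch.backends.cudnn.deterministic = True  # prefer deterministic cuDNN kernels
    torch.backends.cudnn.benchmark = False     # disable kernel autotuning
    # Newer PyTorch versions can also enforce determinism globally, at the cost
    # of errors for ops that lack a deterministic implementation:
    # torch.use_deterministic_algorithms(True)
Calling set_deterministic() before loading the model covers the seeding and cuDNN configuration; putting the model in evaluation mode remains a separate step.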
Practical example: If you initialize the model and set all seeds and configurations, embeddings should match across runs. For instance:
import torch
from sentence_transformers import SentenceTransformer

# Set seeds and deterministic settings before encoding
torch.manual_seed(42)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

model = SentenceTransformer('all-MiniLM-L6-v2')
model.eval()  # Disables dropout

# encode() returns NumPy arrays by default; convert_to_tensor=True returns
# torch tensors so the embeddings can be compared with torch.allclose()
embedding1 = model.encode("test sentence", convert_to_tensor=True)
embedding2 = model.encode("test sentence", convert_to_tensor=True)
print(torch.allclose(embedding1, embedding2))  # Should print True if deterministic
Note that discrepancies might still occur across different hardware or library versions, so environment consistency is key.
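When bitwise-identical outputs are not achievable, for example across different GPUs or library versions, a common fallback is to compare embeddings with a small numerical tolerance instead of exact equality. A minimal sketch follows; the atol value is only an illustrative choice.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')

emb_a = model.encode("test sentence")  # NumPy array by default
emb_b = model.encode("test sentence")

# A small absolute tolerance absorbs benign floating-point drift
# while still catching genuinely different embeddings
print(np.allclose(emb_a, emb_b, atol=1e-6))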