Yes, Sentence Transformers can be used to detect changes in meaning over time by comparing document similarities across time periods. Sentence Transformers generate dense vector representations (embeddings) of text that capture semantic meaning. By comparing embeddings of documents from different eras, you can quantify how closely their semantic content aligns or how far it diverges. For example, embeddings of political speeches from the 1990s could be compared with those from the 2020s to identify shifts in topics, tone, or terminology. The key lies in measuring the cosine similarity or Euclidean distance between embeddings: lower cosine similarity (or larger Euclidean distance) between periods suggests evolving meanings or contextual shifts.
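The two measures mentioned above can be sketched in a few lines. This is only an illustration using NumPy on plain vectors; the function names `cosine_similarity` and `euclidean_distance` are chosen here for the example and are not part of any particular library.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def euclidean_distance(a, b):
    """Straight-line distance between two vectors (0.0 means identical)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))
```

For embeddings normalized to unit length, the two measures carry the same information, since the squared Euclidean distance equals 2 - 2 * cosine similarity. In that case either one can be used to rank how far periods have drifted apart.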
To implement this, you would first generate embeddings for documents grouped by time interval (e.g., decades) using a pre-trained Sentence Transformer model such as all-mpnet-base-v2. Next, compute pairwise similarity scores between embeddings from different periods. For instance, comparing the average embeddings of technical articles from the 2000s and the 2010s might reveal semantic drift in how terms like "machine learning" are used (e.g., the surrounding discussion shifting from classical statistical methods toward deep neural networks). The embeddings could also be projected to two dimensions with techniques like t-SNE or UMAP to visualize how document groups from different periods diverge over time. However, the model must be robust to vocabulary changes: older documents might use outdated terms that require careful normalization or domain adaptation.
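A minimal sketch of these steps follows, assuming the `sentence-transformers` package is installed and the model can be downloaded. The helper names (`embed`, `period_centroids`, `centroid_similarity`) are hypothetical and introduced only for illustration; averaging embeddings per period is one simple way to compare periods, not the only one.

```python
import numpy as np


def embed(texts, model_name="all-mpnet-base-v2"):
    """Encode texts into unit-length embeddings.

    The import is done lazily so the rest of the sketch can be used with
    precomputed embeddings when the package or model is unavailable.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(list(texts), normalize_embeddings=True, convert_to_numpy=True)


def period_centroids(embeddings, periods):
    """Average the embeddings of each period and renormalize to unit length."""
    embeddings = np.asarray(embeddings, dtype=float)
    periods = np.asarray(periods)
    centroids = {}
    for label in sorted(set(periods.tolist())):
        mean_vec = embeddings[periods == label].mean(axis=0)
        centroids[label] = mean_vec / np.linalg.norm(mean_vec)
    return centroids


def centroid_similarity(centroids):
    """Return period labels and the matrix of cosine similarities between centroids."""
    labels = list(centroids)
    matrix = np.stack([centroids[label] for label in labels])
    # Rows are unit vectors, so the dot product is the cosine similarity.
    return labels, matrix @ matrix.T
```

With a list of documents `texts` and a parallel list of period labels `periods` (for example "2000s" or "2010s"), calling `centroid_similarity(period_centroids(embed(texts), periods))` would give a small matrix in which low off-diagonal values point to periods whose content has moved apart.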
There are important considerations. First, Sentence Transformers trained on modern data may not fully capture archaic language or cultural context from older texts, leading to biased comparisons. Retraining or fine-tuning the model on historical data could improve accuracy. Second, detecting meaning changes requires defining a baseline, for example by establishing what counts as a "significant" drop in similarity. Statistical methods like permutation tests could help distinguish noise from meaningful drift. Finally, this approach works best for broad trends rather than pinpointing exact moments of change. For example, with documents grouped into coarse intervals, it might highlight a gradual shift in environmental policy language post-2010 but miss sudden semantic shifts caused by specific events. Combining Sentence Transformers with temporal analysis frameworks (e.g., dynamic topic modeling) could yield more nuanced insights.
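One way the permutation test mentioned above could look is sketched here. It assumes the embeddings for two periods are already available as NumPy arrays, and it uses the cosine similarity between the period averages as the test statistic. That statistic, the function names, and the default of 1000 permutations are choices made for this example.

```python
import numpy as np


def centroid_cosine(emb_a, emb_b):
    """Cosine similarity between the mean embeddings of two groups."""
    ca = emb_a.mean(axis=0)
    cb = emb_b.mean(axis=0)
    return float(ca @ cb / (np.linalg.norm(ca) * np.linalg.norm(cb)))


def permutation_test(emb_a, emb_b, n_permutations=1000, seed=0):
    """Estimate how often random period labels give a similarity as low as observed.

    Returns the observed centroid similarity and a one-sided p-value.
    A small p-value suggests the drop in similarity is unlikely to be noise.
    """
    rng = np.random.default_rng(seed)
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    observed = centroid_cosine(emb_a, emb_b)

    pooled = np.vstack([emb_a, emb_b])
    n_a = len(emb_a)
    at_least_as_low = 0
    for _ in range(n_permutations):
        # Shuffle documents across the two periods and recompute the statistic.
        idx = rng.permutation(len(pooled))
        shuffled = centroid_cosine(pooled[idx[:n_a]], pooled[idx[n_a:]])
        if shuffled <= observed:
            at_least_as_low += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (at_least_as_low + 1) / (n_permutations + 1)
    return observed, p_value
```

The threshold for calling a result "significant" (for example p < 0.05) is still a choice the analyst has to make. If many period pairs are compared, some correction for multiple comparisons would likely be needed as well.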
