Webinar
How to Optimize Your Embedding Model Selection and Development through TDA Clustering
Join the Webinar
About this webinar:
Embedding models are a crucial layer in vector database applications, yet choosing the best embedding model for your dataset is notoriously difficult. For many use cases, applying Topological Data Analysis (TDA) to your evaluation dataset offers an efficient and intuitive approach. A table showing performance for each semantic category of queries sent to your vector database makes it easy to identify, at scale, where your model performs poorly.
Topics covered:
- Risks and limitations of current evaluation approaches for embeddings
- Comparing embedding models on your own dataset using Navigable TDA clusters
- ML lifecycle case studies in ecommerce: model selection, fine-tuning, and post-deployment
Meet the Speakers
Join the session for live Q&A with the speakers
Gunnar Carlsson
Co-Founder, BluelightAI
Gunnar Carlsson is a co-founder of BluelightAI and a Professor of Mathematics (emeritus) at Stanford University. He works on applications of topology, the mathematics of shape represented in the form of graphs, to the study of large and complex data sets and to the design and behavior of deep neural networks. He led a multi-university DARPA initiative on these methods in the 2005-2010 time frame and has also commercialized them for the analysis of big data. He is now applying the same ideas to AI, in particular to the analysis of embeddings and the behavior of layers in large language models.
Gabriel Alon
Sr. Data Scientist, BluelightAI
Gabriel Alon is a Sr. Data Scientist at BluelightAI, where he leads applications to vector databases. He developed BluelightAI’s clustering-based methodology for comparing and improving embedding models for retrieval tasks. His academic work centers on recommendation algorithms and LLMs.