Sohu’s AI News Aggregator

Sohu is a Nasdaq-listed company that offers advertising, search engine, online multiplayer gaming, and many other services. The company built the Sohu News App to provide its users with personalized content. To improve the click-through rate (CTR) and relevance of the app’s recommendations, Sohu leveraged semantic vector similarity search. Sohu used Milvus, an open-source vector database built by Zilliz, to power the core semantic similarity search engine inside its intelligent news aggregator.
Objective
Build a news aggregator that automatically recommends relevant news articles based on user preferences, search history, engagement, and more.
Challenges
- Tens of millions of semantic vectors are generated each day from recently published news content, leading to day-long vector retrieval times.
- Useful content recommendations require an ideal balance between speed and precision, so any index used must be fast and maintain a high degree of accuracy.
Why a vector database
- Support for most mainstream indices makes Milvus a perfect fit for the news recommendation scenario.
- High recall and instantaneous retrieval on massive, trillion-vector datasets.
- Consumes much less memory than competing vector databases.
- Offers targeted similarity search based on a specified partition that accelerates vector retrieval.
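To illustrate why searching within a specified partition can speed up retrieval, here is a minimal sketch in plain Python. It is not the Milvus API or Sohu's implementation: the `PartitionedIndex` class, the date-based partition names, and the three-dimensional vectors are all hypothetical, and the brute-force cosine scan stands in for a real vector index. The point is only that restricting a query to one partition shrinks the set of candidates that must be scored.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class PartitionedIndex:
    """Hypothetical in-memory store that groups vectors by a partition key."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, partition, article_id, vector):
        self.partitions[partition].append((article_id, vector))

    def search(self, query, top_k=3, partition=None):
        # With no partition given, every stored vector is a candidate.
        # With a partition, only that partition's vectors are scored.
        if partition is None:
            candidates = [item for items in self.partitions.values() for item in items]
        else:
            candidates = self.partitions.get(partition, [])
        scored = [(cosine_similarity(query, vec), article_id) for article_id, vec in candidates]
        scored.sort(reverse=True)
        return [article_id for _, article_id in scored[:top_k]]


# Assumed example data: partitions keyed by publication date.
index = PartitionedIndex()
index.insert("2024-01-01", "old-sports", [0.9, 0.1, 0.0])
index.insert("2024-01-02", "new-sports", [0.8, 0.2, 0.0])
index.insert("2024-01-02", "new-finance", [0.0, 0.1, 0.9])

query = [1.0, 0.0, 0.0]
all_hits = index.search(query, top_k=2)
recent_hits = index.search(query, top_k=2, partition="2024-01-02")
```

In this sketch, the unrestricted search scores all three vectors, while the partition-restricted search scores only the two published on the chosen date. For a news recommender that mostly cares about recent articles, this kind of scoping avoids scanning older content at all.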
Results
- Vector retrieval time reduced from days to hours.
- Text classification accuracy above 95%.
"Milvus greatly improved our efficiency when discovering 'bad cases' of short text classification. Additionally, the vector database drastically accelerated retrieval of millions of semantic vectors. We saw a near tenfold advancement compared to how things were when we were using other vector similarity search engines."
