Veridien Powers EU Policy Analysis with Milvus

Prismatic Hybrid Search:
Fuses semantic understanding with keyword precision to classify content accurately across all 32 EU policy domains.
Multilingual Real-Time Analysis:
Enables real-time, multilingual classification of EU policy content for instant insights.
Scalable Knowledge Management:
Powers Policy Manager and AI Semantic Wiki, handling tens of thousands of documents efficiently.
I really like how Milvus' hybrid search allowed me to blend semantic and keyword search, which is crucial in a domain as technical and complex as EU policy.
Alessandro Saccoia
About Veridien
Veridien is a startup at the intersection of artificial intelligence and EU policy analysis. It was founded by a multidisciplinary team that includes AI experts, EU law specialists, and physics Ph.D.s. The company combines deep domain knowledge of European policy with cutting-edge AI capabilities to transform complex policy information into actionable insights. Its integrated solution, called Policy Manager, helps stakeholders navigate and understand the complex landscape of EU policy by providing real-time analysis, intelligent categorization, and automated knowledge management capabilities. Through innovative AI approaches, the company is making the vast ecosystem of EU policy information more accessible and actionable for decision-makers.
Technical Challenges: Managing Complex Policy Data at Scale
Veridien faced several critical challenges in building its EU policy analysis system, each requiring capabilities beyond traditional databases:
Neither Semantic Search nor Keyword Search Alone Is Enough
The EU policy domain presented a dual search challenge. First, when using pure semantic search with dense vectors, similar policy areas like "energy policy" and "environmental policy" became nearly indistinguishable due to overlapping terminology. Second, the policy landscape's specialized and constantly evolving terminology (across 32 distinct policy areas) required precise keyword matching capabilities. This indicated a need for hybrid search that could combine semantic understanding with traditional keyword matching. However, implementing such a hybrid approach required a vector database capable of efficiently handling both dense vectors for semantic search and sparse vectors for keyword-based matching within a single query.
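The dual search problem can be illustrated with a small toy example. The sketch below uses made-up three-dimensional dense vectors and made-up keyword weights (none of these values come from Veridien's data) to show how two policy areas can look almost identical under dense cosine similarity while a sparse keyword score still separates them cleanly.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def sparse_dot(a, b):
    """Inner product between two sparse vectors stored as {term: weight}."""
    return sum(weight * b[term] for term, weight in a.items() if term in b)

# Hypothetical documents: the dense vectors nearly coincide, the keywords do not.
energy = {
    "dense": [0.80, 0.60, 0.10],
    "sparse": {"grid": 0.9, "renewable": 0.7, "emissions": 0.3},
}
environment = {
    "dense": [0.78, 0.62, 0.12],
    "sparse": {"biodiversity": 0.9, "emissions": 0.8, "habitat": 0.6},
}
# Hypothetical query about electricity grids and renewables.
query = {
    "dense": [0.79, 0.61, 0.11],
    "sparse": {"grid": 1.0, "renewable": 0.5},
}

# How far apart the two candidates are under each signal.
dense_gap = abs(
    cosine(query["dense"], energy["dense"])
    - cosine(query["dense"], environment["dense"])
)
sparse_gap = abs(
    sparse_dot(query["sparse"], energy["sparse"])
    - sparse_dot(query["sparse"], environment["sparse"])
)

print(f"dense gap:  {dense_gap:.5f}")
print(f"sparse gap: {sparse_gap:.2f}")
```

In this toy setup the dense similarity gap is tiny, so the dense signal alone cannot decide between the two areas, while the keyword signal matches only the energy document.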
Language Complexity
The multilingual nature of EU documentation demanded a sophisticated vector search system that could maintain semantic relationships across languages. Traditional databases struggle with cross-lingual search, but vector databases can represent documents in a shared semantic space regardless of their original language. The challenge was finding a solution that could preserve technical precision while enabling multilingual search capabilities.
Solutions: Hybrid Search with Milvus Powers a Complete Policy Intelligence Platform
Veridien developed a sophisticated knowledge management ecosystem powered by Milvus, combining traditional keyword search with modern semantic search capabilities. At the core of their architecture is a parallel database system that pairs Django's robust backend with Milvus's vector search capabilities. The systems are synchronized through a custom wrapper class that maintains consistency across them.
"I never used just semantic (vector) databases before," Veridien Co-founder Alessandro Saccoia explains, "because in really vertical, deep domains with lots of technical jargon, you still need the usual keyword-based search. I really like how Milvus hybrid search allowed me to blend these two ways of searching." Their implementation centers around this hybrid search functionality, which proved crucial for handling EU policy's technical language.
To optimize classification accuracy, Veridien leverages Milvus's ability to handle both dense and sparse vectors simultaneously. "When you try to classify EU policy documents using just vector search," Saccoia notes, "the points in the multidimensional space are too close together. This kind of hybrid search with 50/50 weight distribution gave us the best results."
Each document undergoes parallel processing to generate:
- Dense vectors using the BGE-M3 embedding model to capture semantic meaning
- Sparse vectors from strategically transformed documents that emphasize domain-specific keywords
Veridien leverages Milvus's metadata capabilities to maintain a growing knowledge base of pre-classified policy documents. For each document stored in Milvus, they attach metadata containing the policy category classification. When processing new content, whether from documents or real-time streams, the system performs hybrid searches against this collection to find the most similar existing documents. Classification is then determined through majority voting of the policy categories stored in the metadata of the top 100 similar documents.
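The classification flow described above (a 50/50 blend of dense and sparse scores, followed by a majority vote over the most similar pre-classified documents) can be sketched in plain Python. This is an illustration under stated assumptions, not Veridien's code: the function names, document IDs, scores, and categories are all hypothetical, and the min-max normalization is a simplification chosen for the sketch rather than a description of how Milvus reranks results internally.

```python
from collections import Counter

def min_max(scores):
    """Rescale a {doc_id: score} mapping to the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}

def fuse(dense_scores, sparse_scores, w_dense=0.5, w_sparse=0.5):
    """Weighted sum of normalized dense and sparse scores per document."""
    dense, sparse = min_max(dense_scores), min_max(sparse_scores)
    doc_ids = set(dense) | set(sparse)
    return {
        doc_id: w_dense * dense.get(doc_id, 0.0) + w_sparse * sparse.get(doc_id, 0.0)
        for doc_id in doc_ids
    }

def classify(fused_scores, categories, top_k=100):
    """Majority vote over the categories of the top_k highest-scoring documents."""
    ranked = sorted(fused_scores, key=fused_scores.get, reverse=True)[:top_k]
    votes = Counter(categories[doc_id] for doc_id in ranked)
    return votes.most_common(1)[0][0]

# Hypothetical search results for one new piece of content.
dense_scores = {"doc1": 0.91, "doc2": 0.90, "doc3": 0.89, "doc4": 0.88}
sparse_scores = {"doc1": 0.2, "doc2": 3.1, "doc3": 2.8, "doc4": 0.1}

# Hypothetical category metadata attached to each stored document.
categories = {
    "doc1": "Environment",
    "doc2": "Energy",
    "doc3": "Energy",
    "doc4": "Environment",
}

fused = fuse(dense_scores, sparse_scores)         # 50/50 weighting by default
label = classify(fused, categories, top_k=3)      # the text uses the top 100
print(label)
```

With dense scores this close together, the dense ranking alone would put an Environment document first. The keyword scores shift the fused ranking toward the two Energy documents, which then win the vote. In a real deployment the two score sets would come from a single Milvus hybrid search request rather than from hand-written dictionaries.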
The solution spans three key products that all leverage Milvus's vector search and metadata capabilities:
- Policy Manager uses Milvus's hybrid search and metadata to automatically categorize and organize policy-related content across all EU domains
- Stream Scope extends this capability to real-time broadcast analysis, using Milvus for live classification of parliamentary sessions by matching against pre-classified content
- AI Semantic Wiki acts as a knowledge base, using Milvus's metadata system to store and retrieve relationships between policy documents, enabling the construction of interconnected policy knowledge graphs
Outcome: Democratizing Real-time Policy Intelligence
Milvus's hybrid search implementation has delivered exceptional technical performance:
- Successfully processes tens of thousands of documents while maintaining high classification accuracy across all 32 EU policy areas
- Enables real-time classification of streaming content from parliamentary sessions and press conferences
- Maintains consistent performance across multiple languages while preserving technical precision
The business impact has been transformative for both Veridien and its customers. Their solution has democratized access to EU policy insights in several key ways:
Real-time Analysis: Previously, politicians and companies would pay thousands of euros for manual transcription of EU Commission press conferences, often waiting hours or days for results. Stream Scope now delivers these insights instantly, automatically categorizing content into the 32 official EU policy topics. This has been particularly valuable for stakeholders who need to respond quickly to policy developments.
Cost-Effective Scaling: The Policy Manager's ability to process vast amounts of multilingual content efficiently has made sophisticated policy analysis accessible to a broader range of organizations. What once required teams of policy analysts can now be accomplished through automated classification and analysis.
Enhanced Decision Making: The AI Semantic Wiki has transformed how organizations understand policy relationships, enabling them to identify connections and implications across multiple domains simultaneously. Users particularly value the system's ability to surface relevant policy developments in context, regardless of their source or language.
Comprehensive Coverage: By leveraging Milvus's hybrid search capabilities across all three products, Veridien's system can effectively monitor and analyze policy developments across all EU institutions, from formal legislative documents to real-time parliamentary debates. This comprehensive coverage ensures stakeholders never miss relevant policy developments, regardless of where they originate.
Future Developments
Looking ahead, Veridien plans to enhance its use of Milvus by:
- Exploring advanced hybrid search configurations beyond the current 50/50 weighting
- Expanding their use of Milvus's metadata system for more sophisticated knowledge graph applications
- Leveraging Milvus's evolving capabilities for handling domain-specific keywords
The team envisions creating increasingly complex relationships between policy documents, enabling deeper analysis of how different policy areas influence each other. They're particularly interested in fine-tuning their hybrid search weights based on different policy domains to further improve classification accuracy in highly technical policy areas.