From Bottlenecks to Breakthroughs: How Orfium Scaled Billion-Vector Audio Search with Zilliz Cloud

1 Billion Vectors: handled with ease
Real-Time Response: for immediate copyright protection
Reduced Costs: while handling the same files
Single-Engineer Migration: for faster development cycles
With Zilliz Cloud, we moved from operating at our limits to building with confidence. It gave us the scale, performance, and flexibility to protect music rights in real time—something we couldn’t achieve with traditional systems.
George Kastrinakis
Imagine tracking billions of music snippets flowing across YouTube, TikTok, radio, and TV—every single day—and ensuring artists get paid fairly, no matter where their songs show up. For Orfium, a global music rights and copyright technology company, this isn’t a thought experiment. It’s their mission.
However, as their Elasticsearch/OpenSearch stack began to strain, engineers found themselves firefighting infrastructure instead of building new capabilities. The custom setup was heavy to maintain and optimize, latency crept up, throughput wasn’t keeping pace with the business, and indexing hit limits. Costs also became unpredictable. “We were operating at the edge of what was possible with our old system,” said George Kastrinakis, Director of Data Science and AI Services at Orfium.
About Orfium
Orfium is a global technology leader shaping the future of music rights management. They provide AI-powered technology and expert services to the world's leading music and entertainment companies, enabling them to optimize the management, licensing, reporting, and monetization of copyrighted content.
By combining deep expertise in digital rights management with robust broadcast monitoring and cue sheet management, Orfium accurately identifies, matches, and reports music usage across the entire media landscape. This delivers maximum revenue, unparalleled accuracy, and operational efficiency for their clients.
Since its founding in 2015–2016, Orfium has become a trusted partner to the world’s top record labels, publishers, broadcasters, and platforms—including YouTube, TikTok, the BBC, and Sky. By combining advanced content recognition, AI-powered data linking, and transparent royalty attribution, Orfium empowers artists, composers, and rights owners to protect and maximize the value of their work at scale, in real time, and around the world.
The Challenge: Billion-Vector Audio Search on Legacy Infrastructure
As Orfium’s business expanded rapidly, so did the volume of content it needed to analyze. This growth placed immense pressure on their existing infrastructure, which was foundational to their content recognition and copyright management services. The heart of the issue was scale: the reference database had grown to encompass hundreds of thousands of audio files, and the systems in place weren’t built to handle this volume of vectors.
Orfium’s pipeline doesn’t just store MP3s and MP4s; it uses machine learning models to extract audio embeddings for similarity matching. “A vector embedding is an information-rich, numerical representation of audio features in a high-dimensional space,” George explained. “For a two-minute audio file, we extract multiple embeddings, each one capturing the key audio features of a specific segment of the track.”
This approach generates one fingerprint per audio segment, meaning every track produces dozens—sometimes hundreds—of vectors. These high-dimensional vectors capture the unique acoustic signature of the audio, enabling precise detection of reused content across different contexts. “You can imagine combining these fingerprints to run a search and detect which segments of a song appear in another file,” George added.
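To make the "one fingerprint per segment" idea concrete, here is a minimal sketch of how a track could be cut into overlapping windows, each of which would yield one embedding. The 5-second window and 1-second hop are illustrative assumptions; the article does not state Orfium’s actual segment length or overlap, and `segment_starts` is a hypothetical helper.

```python
import math

def segment_starts(duration_s: float, window_s: float, hop_s: float) -> list[float]:
    """Start times (in seconds) of every full window that fits in the track."""
    if duration_s < window_s:
        return []
    # Number of full windows when sliding forward by hop_s each step.
    count = math.floor((duration_s - window_s) / hop_s) + 1
    return [i * hop_s for i in range(count)]

# Assumed parameters: a two-minute track, 5 s windows, 1 s hop.
starts = segment_starts(duration_s=120.0, window_s=5.0, hop_s=1.0)
print(len(starts))  # 116 segments, so 116 vectors for this one track
```

With these assumed settings a single two-minute file already produces over a hundred vectors, which is why the vector count grows so much faster than the file count.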
But this technique came with a cost. Orfium’s existing Elasticsearch and OpenSearch stack—initially designed for full-text keyword search—was not suited for high-dimensional vector similarity searches. “With traditional databases, you hit a wall fast. It becomes expensive and slow,” George said. The system was pushed to its limits. Indexing 500,000 audio files translated into a massive performance strain, leading to latency issues, skyrocketing costs, and an infrastructure operating at full throttle just to stay afloat.
The Search for a Vector-Native Solution
As Orfium’s infrastructure began to strain under the demands of large-scale audio fingerprinting, the engineering team launched a comprehensive search for a solution purpose-built for high-dimensional vector similarity search.
Benchmarking for Performance, Cost, and Scale
The Orfium team ran in-house benchmarks on several candidates, including open-source Milvus, Zilliz Cloud (a managed version of Milvus), TileDB, Snowflake, and pgvector, and evaluated them against three key criteria: retrieval accuracy, cost efficiency, and scalability.
Vector retrieval accuracy. Because the fingerprinting process generates many feature vectors per track and the vector space is becoming densely populated, even slight distortions introduced by aggressive quantisation can significantly degrade retrieval metrics.
Cost efficiency. With plans to scale from hundreds of thousands to potentially tens of millions of reference audio files—each producing multiple vectors—they projected a total footprint in the tens of billions of vectors. Under traditional pricing models, such growth would become prohibitively expensive.
Scalability and throughput. Their production pipeline processes massive volumes of audio from radio and TV broadcasts, as well as YouTube and TikTok. A typical workload involves reference databases of up to millions of audio files, which translates into billions of vectors. Any solution would need to support high-volume indexing and querying without bottlenecks.
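The accuracy criterion above turns on how quantisation can change which neighbours a search returns. The sketch below illustrates one way to measure that effect: it compares exact top-k results against results computed on coarsely quantised copies of the same vectors. Everything here is an assumption for illustration (random synthetic vectors, a simple uniform scalar quantiser, dot-product ranking); it is not Orfium’s benchmark or any vendor’s index implementation.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_k(query, vectors, k):
    """Indices of the k vectors with the highest dot product against query."""
    ranked = sorted(range(len(vectors)),
                    key=lambda i: dot(query, vectors[i]), reverse=True)
    return ranked[:k]

def quantize(vec, levels):
    """Snap each component in [-1, 1] onto `levels` evenly spaced values."""
    step = 2.0 / (levels - 1)
    return [round((x + 1.0) / step) * step - 1.0 for x in vec]

def recall_at_k(exact, approx):
    """Fraction of the exact top-k that the approximate search also returned."""
    return len(set(exact) & set(approx)) / len(exact)

rng = random.Random(0)  # fixed seed so runs are repeatable
dim, n, k = 16, 500, 10
vectors = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
query = [rng.uniform(-1, 1) for _ in range(dim)]

exact = top_k(query, vectors, k)
for levels in (256, 16, 2):  # from fine to very harsh quantisation
    coarse = [quantize(v, levels) for v in vectors]
    print(levels, recall_at_k(exact, top_k(query, coarse, k)))
```

Recall at k is a common way to express the "retrieval metrics" mentioned above: the closer the neighbours sit to one another, the more a coarse quantiser is likely to reshuffle them.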
The Breakthrough: Zilliz Cloud
Compared to other options, open-source Milvus offered promising flexibility, allowing the team to experiment with system-level tuning. However, the overhead was significant. While they appreciated the control it gave them, George admitted it “took a lot of effort to actually set up everything,” which ran counter to their goal of speeding up deployment and minimizing maintenance.
That operational burden made a fully managed alternative more attractive. After extensive testing, Zilliz Cloud, the managed Milvus service, came out on top as the most complete and production-ready solution. It offered the best of Milvus, was easy to adopt, performed well under load, and provided a managed experience that freed the team to focus on building applications rather than infrastructure.
Deployment was straightforward. One engineer led the full migration—from uploading the reference data and extracting features to configuring the system—entirely through the Zilliz Cloud console.
As George summarized, “it was the best thing to offer—performance-wise, cost-wise, and ease-of-use-wise.”
The Solution: Powering Audio Matching and Cover Song Detection with Zilliz Cloud
Now, Orfium uses Zilliz Cloud to power two mission-critical services: audio matching and cover song recognition. The first identifies the exact usage of known songs across different media platforms. The second goes a step further, detecting different versions or covers of those songs, even if they’re re-recorded or slightly altered.
To support these capabilities, Orfium relies on proprietary neural networks to create embeddings from audio content. These vectors are stored in Zilliz Cloud and retrieved using vector similarity searches. Traditional machine learning models and transformer-based architectures facilitate the analysis of metadata to determine the degree of relatedness between two assets. George explained that they “use neural networks to create embeddings and then do scoring on the vectors we retrieve,” while also applying models that assess the similarity of metadata between assets.
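The retrieve-then-score flow George describes can be sketched as follows. This is a toy illustration only: the brute-force cosine search stands in for the vector database, and the vote-counting rule, the `threshold` value, and the names `search` and `score_tracks` are assumptions, not Orfium’s proprietary scoring.

```python
import math
from collections import defaultdict

def cosine(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def search(query, index, k):
    """Brute-force top-k: (track_id, similarity) pairs, best first."""
    scored = [(track_id, cosine(query, vec)) for track_id, vec in index]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

def score_tracks(query_segments, index, k=2, threshold=0.9):
    """Count, per reference track, how many query segments found a strong match."""
    votes = defaultdict(int)
    for segment in query_segments:
        matched = {tid for tid, sim in search(segment, index, k) if sim >= threshold}
        for tid in matched:
            votes[tid] += 1
    return dict(votes)

# Reference index: one (track_id, embedding) row per audio segment.
index = [
    ("song_a", [1.0, 0.0, 0.0]),
    ("song_a", [0.0, 1.0, 0.0]),
    ("song_b", [0.0, 0.0, 1.0]),
]
# Two segments extracted from an unknown upload.
query_segments = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]
print(score_tracks(query_segments, index))  # {'song_a': 2}
```

Aggregating segment-level hits into a per-track score is what lets a system report which reference song, and how much of it, appears in another file.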
Zilliz Cloud now plays a central role in Orfium’s AWS-based infrastructure. Subscribed through the AWS Marketplace, it fits neatly alongside their existing cloud services for compute and storage.
The Result: Performance Breakthroughs and Operational Flexibility Unlock New Capabilities
Migrating to Zilliz Cloud delivered immediate and measurable improvements for Orfium, enhancing system performance, simplifying operations, and unlocking capabilities that were previously impossible with their legacy infrastructure.
Scalable Performance at Billion-Vector Scale
One of the most impactful gains was the ability to scale seamlessly without sacrificing performance. The team quickly transitioned from their initial setup to a configuration optimized for higher throughput, and the results exceeded expectations. What once felt like infrastructure limits turned out to be bottlenecks that their new system could easily overcome.
Today, Orfium handles a reference database of 500,000 to 1 million audio files in the cloud (roughly a quarter of a billion vectors) with ease. Under their previous Elasticsearch-based stack, this scale would have pushed them to the edge of system capacity. With Zilliz Cloud, those constraints are no longer a concern.
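As a quick back-of-the-envelope check on these figures, the snippet below derives the implied number of vectors per file and projects the footprint at a larger catalogue size. The 250 million total is a reading of "a quarter of a billion"; the 100 million file count is the upper end of the roadmap cited elsewhere in this article; the per-file averages are derived, not reported by Orfium.

```python
def vectors_per_file(total_vectors: int, files: int) -> float:
    """Average number of vectors each audio file contributes."""
    return total_vectors / files

def projected_vectors(files: int, per_file: float) -> int:
    """Total vectors if every file contributes `per_file` vectors on average."""
    return int(files * per_file)

total = 250_000_000  # "roughly a quarter of a billion vectors"
low = vectors_per_file(total, 1_000_000)   # 250.0 if the database holds 1M files
high = vectors_per_file(total, 500_000)    # 500.0 if it holds 500K files
print(low, high)

# Projection at 100 million files, using the lower per-file average.
print(projected_vectors(100_000_000, low))  # 25000000000
```

At 250 to 500 vectors per file, a catalogue of 100 million assets lands in the tens of billions of vectors, which matches the order of magnitude the team projects.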
Real-Time Response for Immediate Copyright Protection
Latency has gone from a challenge to a competitive advantage. With Zilliz Cloud’s vector-native architecture, Orfium is now able to run accelerated audio matching across broadcast, social, and streaming platforms. This capability supports their mission of protecting artists’ intellectual property the moment content is published or aired.
As George put it, “Latency is important. At this stage, it's probably the most important.” The speed and responsiveness of Zilliz Cloud enable Orfium to support time-sensitive detection at scale with confidence.
Predictable, Cost-Efficient Scaling
Where their previous setup caused costs to spike as data volumes grew, Zilliz Cloud offers a more sustainable model. Its pricing aligns with usage and value, allowing Orfium to expand confidently without worrying about runaway infrastructure expenses.
With the same 500,000 audio files that once pushed their Elasticsearch system to the limit, Orfium now experiences consistently high performance at a fraction of the cost. “It’s really performant in terms of accuracy and latency and everything,” George said.
Simplified Operations and Faster Iteration
Operational simplicity has been another standout benefit. Zilliz Cloud’s managed experience eliminated the complexity of maintaining vector infrastructure, making it easy for the team to deploy updates and scale workloads without disruption.
George highlighted how smooth the transition was: “It was very, very quick from the moment we decided to go with Zilliz to the moment we actually had something working.” The ability to make infrastructure changes without impacting pipelines has enabled Orfium to iterate more quickly and stay focused on delivering customer value.
What’s Next: Building a Smarter Copyright Detection Ecosystem
With vector-based audio matching well established, Orfium is now expanding its copyright detection ecosystem into new frontiers, leveraging Zilliz Cloud for use cases such as lyrics transcription, metadata matching, and hybrid search.
Lyrics-Based Detection for Covers and Adaptations. Instead of identifying songs by their audio alone, Orfium plans to extract lyrics from a file and match them against a stored lyrics database. This technique offers complementary protection, especially useful when instrumentation, tempo, or vocal styling significantly alter the fingerprint of a song.
“The idea is that you get an audio file, extract the lyrics, and then match those lyrics with the database you already have,” George explained.
Hybrid Search: Combining Vectors with Text. Zilliz Cloud can support lyrics matching through hybrid search, blending vector similarity with text-based phrase detection. This opens the door to combining semantic understanding with traditional keyword matching.
Semantic Metadata Matching and Relationship Discovery. By comparing associated data points such as artist names, track info, release dates, or genres, Orfium can surface relationships between songs and assets that aren’t obvious through audio alone. This would enable richer discovery mechanisms, from identifying covers and remixes to mapping musical influence networks.
Scaling for the Future: 100x Growth in Vector Volume. Orfium’s roadmap calls for aggressive growth. While the current deployment holds approximately a million audio files, the long-term vision is to index tens of millions to over 100 million audio assets, resulting in tens of billions of vectors. That scale would be unmanageable without a purpose-built vector database. Zilliz Cloud’s architecture provides the scalability and flexibility needed to support this growth while maintaining performance and reliability.
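For readers curious how the hybrid search item above might blend a vector ranking with a text ranking, one widely used fusion method is reciprocal rank fusion (RRF). The sketch below is a plain-Python illustration of RRF under assumed inputs; the track IDs are made up, `k=60` is the constant conventionally used with RRF, and nothing here claims to be how Orfium or Zilliz Cloud combines results.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each item scores sum(1 / (k + rank)) over the lists it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda item: scores[item], reverse=True)

# Hypothetical results for one query, best match first.
vector_hits = ["track_7", "track_2", "track_9"]  # from embedding similarity
text_hits = ["track_2", "track_5", "track_7"]    # from lyrics phrase matching

print(reciprocal_rank_fusion([vector_hits, text_hits]))
# ['track_2', 'track_7', 'track_5', 'track_9']
```

RRF only needs ranks, not comparable scores, which makes it a convenient choice when one retriever returns cosine similarities and the other returns keyword relevance.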
Conclusion: A Scalable Foundation for the Future of Copyright Protection
By adopting Zilliz Cloud, Orfium moved from operating at its limits to innovating with confidence. They now deliver real-time detection across massive audio libraries, simplify operations for their engineers, and unlock new capabilities they couldn’t have imagined before.
We’re proud that Zilliz Cloud plays a role in powering Orfium’s vision. Their technical leadership and focus on innovation continue to set a high bar for what’s possible in rights management, and we’re excited to support their mission as they build the future of audio and content intelligence at a global scale.