Jiang Chen: Why I Joined Zilliz
New challenges
Over the past decade, I have specialized in various aspects of data infrastructure, including access control, data privacy, NoSQL databases, and, most recently, web-scale data indexing. During the web and mobile era, big data emerged as a significant infrastructure innovation, with technologies like MapReduce, distributed computing, and structured data storage leading the way. The AI era requires a different technology stack, especially with the growing popularity of Large Language Models. Embeddings and vector stores now take center stage, and they are the focus of Zilliz.
Right before joining Zilliz, my focus was on search indexing at Google. I worked on building ultra-flexible infrastructures to understand billions of images and videos on the public web, generate structured labels for web pages that encompass different multimedia data, and transform them into searchable indices. These technologies laid the foundation for user-facing product innovations, such as short video search. However, traditional methods of understanding unstructured data, like text, images, and videos, required generating many labels using multiple machine learning models. This process relied on a complex infrastructure involving ML model inference, business logic orchestration, and retrieval algorithms, which presented significant challenges.
Furthermore, search content retrieval also faced limitations with traditional rule-based systems (matching keywords with inverted indices). It became evident that embedding-based retrieval offered powerful improvements in search result relevance. While embedding is not new to search and has been used in traditional web search for years, adopting embedding-based retrieval, or a hybrid approach, for numerous new search products proved a significant challenge, even for Google.
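To make the contrast concrete, here is a minimal sketch of the two retrieval styles mentioned above: keyword matching over an inverted index versus ranking by embedding similarity. Everything in it is an assumption made for illustration. The three documents, the hand-written 3-dimensional vectors, and the function names are hypothetical, and a real system would obtain embeddings from a trained model and search them with a vector index rather than a linear scan.

```python
import math
from collections import defaultdict

# Toy corpus, invented for illustration only.
DOCS = {
    "d1": "how to fix a flat bicycle tire",
    "d2": "repairing a punctured bike wheel",
    "d3": "best pasta recipes for dinner",
}

# Hand-made "embeddings" (hypothetical values, not produced by any model).
EMBEDDINGS = {
    "d1": [0.9, 0.1, 0.0],
    "d2": [0.8, 0.2, 0.1],
    "d3": [0.0, 0.1, 0.9],
}


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def keyword_search(index, query):
    """Return ids of documents that contain every query term."""
    postings = [index.get(term, set()) for term in query.split()]
    return set.intersection(*postings) if postings else set()


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embedding_search(embeddings, query_vec, top_k=2):
    """Rank document ids by cosine similarity to the query vector."""
    ranked = sorted(
        embeddings,
        key=lambda doc_id: cosine(embeddings[doc_id], query_vec),
        reverse=True,
    )
    return ranked[:top_k]


INDEX = build_inverted_index(DOCS)

# Keyword matching only finds documents that share the exact query terms.
KEYWORD_HITS = keyword_search(INDEX, "bicycle tire")

# Hypothetical embedding of the same query, placed near d1 and d2 by hand.
QUERY_VEC = [0.85, 0.15, 0.05]
EMBEDDING_HITS = embedding_search(EMBEDDINGS, QUERY_VEC)

print("keyword:", KEYWORD_HITS)
print("embedding:", EMBEDDING_HITS)
```

In this toy setup, the keyword query misses the document about a "punctured bike wheel" because it shares no terms with the query, while the similarity ranking surfaces it. A hybrid approach would combine the two signals.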
When I first met Robert, the product head at Zilliz, I was impressed by the ambition to democratize embedding and vector search technologies. I was also motivated to take part in this AI wave.
Empowering developers to unleash the potential of unstructured data
Despite recognizing the potential of embeddings, many developers and businesses lack the infrastructure to organize unstructured data effectively as vector embeddings. In the modern AI era, I firmly believe that AI-native infrastructure holds the key to the future of business. This infrastructure relies on large neural network models, vector storage, and computation systems designed specifically for them. Having observed such infrastructure operating at global scale, I am enthusiastic about democratizing it, so that resource-limited startups can harness AI technology and create groundbreaking applications that propel humanity forward.
Awesome people
I have been in the startup world before. Having experienced the roller coaster of founding a new project, burning out over poor product-market fit, and hunting for the next round of funding, I know firsthand that when tackling complex problems in an uncertain context, the most important thing is to work with a team of great people you can trust.
When I met the exceptional team of Charles, Robert, and James, I could immediately tell they were seasoned entrepreneurs, fully aware of the uncertainty of entrepreneurship and still willing to devote themselves to it. I knew they were the perfect group to work with, so I joined without hesitation.
This is ultimately why I joined Zilliz: to navigate through uncertainty, you need both a vital mission and a group of intelligent, resilient people to work with. Zilliz has both, and we will strive together for much more.
What excites me most here at Zilliz
What excites me the most now is the challenge of building highly usable and practical products that simplify unstructured data for developers. It's a complex problem with no one-size-fits-all solution, requiring us to find a roadmap for the future. At Zilliz, we are free to do whatever's right, a bonus of working for a startup. Solving this problem requires a strong engineering background, deep product insight, and empathy for developers. My daily routine involves:
- Engaging in discussions with sales and solution architects.
- Defining product details.
- Guiding the technical design.
- Reviewing code.
This work style is something I enjoy, and I learn new things every day.
Join us on the frontier of AI infrastructures
Here at Zilliz, we build a suite of tools and services that simplify information retrieval over unstructured data. We have Towhee, the all-in-one unstructured data ETL and embedding solution; Akcio, the open-source implementation of Retrieval Augmented Generation; and the vector database that efficiently stores and searches vector embeddings.
If what I have described here interests you, please let us know! We have a variety of openings in engineering roles. If building infrastructure to manage and use unstructured data effectively excites you, do not miss the chance to work with us. We also have openings for marketing and product roles; feel free to check out our careers page to learn more.