Building a Tax Appeal RAG with Milvus, LlamaIndex, and GPT
Zilliz, LlamaIndex, Microsoft, and enthusiasts in the Seattle area host monthly Hackathons to explore potential startup ideas. The chances of four strangers coming together to build a working prototype for an idea that originates the same day are slim at best. The SaveHaven project demonstrates such a story.
Imagine a Product Manager, a couple of full-stack developers with no prior knowledge of LlamaIndex, Zilliz Cloud, or the Retrieval Augmented Generation (RAG) framework, and an enthusiastic AI developer meeting for the first time. After an introduction to the RAG system and riffing on 3-4 ideas, we settled on one that seemed doable in the remaining 5 hours. This is the story of such a team.
The Bureaucracy Challenge
City, county, state, and federal bureaucrats collect over $850 billion in taxes, levies, and fines that go unchallenged by taxpayers. There are about half a million appeals, requests for information, objections, and follow-ups, which most individuals are not well equipped to handle, so most people pay the problem off instead of demanding fair treatment.
The SaveHaven RAG app we built changes all that by helping consumers contest, appeal, handle objections, make a case for themselves, and save over $12 billion in excess collections tagged as “unclaimed money” with government agencies. By helping the common population save on income and property tax, SaveHaven helps individuals stand up for their rights. Successful outcomes depend on making the first appeal and effectively responding to follow-ups, which takes 4 to 5 iterations over a year.
The Solution: SaveHaven, a RAG App for Easier Tax Appeal
We recognized that the information needed to make a case on behalf of individuals is available in county, state, and federal records, in addition to the regulations and requirements prescribed and published by the respective agencies. By filing targeted “Freedom of Information Act” requests, we can collect examples of communications, appeals, objection responses, and documentation that succeeded with the right agency. By leveraging a RAG system, we can mimic successful petitions, appeals, protests, and objection handling to drive positive outcomes at a fraction of the cost of creating the documents necessary to make a case.
In the Hackathon, we adapted a web scraper to pull in county records, regulations, and rules for property tax appeals in King County. Then, we fed this data into a vector database using the provided cookbook. We used React to create a web frontend and built a set of APIs that use LlamaIndex to orchestrate requests and fetch relevant embeddings from the vector database. We used Milvus as the vector database and OpenAI’s GPT as the LLM.
Our solution is simple: we ask only for the county parcel number or address as input. We then query the county records for comparable properties and compose an appeal, compliant with the regulations and rules, that uses the list of comparable properties to minimize the assessed land and property value.
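To make the comparable-property step concrete, here is a minimal sketch in plain Python. It is not the code from the Hackathon: the `Parcel` fields, the matching rule (same neighborhood, living area within 15%), and the prompt wording are all assumptions chosen for illustration, and a real county record has many more attributes.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class Parcel:
    # Hypothetical record shape; real county data will differ.
    parcel_number: str
    neighborhood: str
    living_area_sqft: float
    assessed_value: float


def pick_comparables(subject: Parcel, records: List[Parcel],
                     area_tolerance: float = 0.15, limit: int = 5) -> List[Parcel]:
    """Keep parcels in the same neighborhood with a similar living area."""
    low = subject.living_area_sqft * (1 - area_tolerance)
    high = subject.living_area_sqft * (1 + area_tolerance)
    candidates = [
        r for r in records
        if r.parcel_number != subject.parcel_number
        and r.neighborhood == subject.neighborhood
        and low <= r.living_area_sqft <= high
    ]
    # Closest in size first.
    candidates.sort(key=lambda r: abs(r.living_area_sqft - subject.living_area_sqft))
    return candidates[:limit]


def suggested_value(subject: Parcel, comparables: List[Parcel]) -> Optional[float]:
    """Median assessed value per square foot of the comparables, scaled to the subject."""
    if not comparables:
        return None
    per_sqft = median(c.assessed_value / c.living_area_sqft for c in comparables)
    return round(per_sqft * subject.living_area_sqft, 2)


def build_appeal_prompt(subject: Parcel, comparables: List[Parcel],
                        rules_excerpt: str) -> str:
    """Assemble the text that would be sent to the LLM along with retrieved rules."""
    lines = [
        f"- Parcel {c.parcel_number}: {c.living_area_sqft:.0f} sq ft, "
        f"assessed at ${c.assessed_value:,.0f}"
        for c in comparables
    ]
    return (
        "Draft a property tax appeal that complies with the rules below.\n"
        f"Rules:\n{rules_excerpt}\n"
        f"Subject parcel {subject.parcel_number}: "
        f"{subject.living_area_sqft:.0f} sq ft, assessed at ${subject.assessed_value:,.0f}\n"
        "Comparable properties:\n" + "\n".join(lines)
    )
```

The median is used here only because it is less sensitive to a single outlier than the mean; whichever statistic an appeal may cite depends on the county's own rules.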
The demonstrated output was generated text that would start the appeals process. In the next iteration, we would incorporate objection handling and successfully reduce property taxes by $19.99 for one case. As property taxes are adjusted yearly, this is expected to be an ongoing activity. Similar issues arise with state and federal income taxes, which were on our roadmap.
Knowledgebase
We leveraged open-source scrapers to fetch data from different government websites. Then, we chunked the data, transformed it into vector embeddings, and stored them in the Milvus vector database. We built this knowledge base once, and it keeps improving as more data is added while the platform is in use.
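The chunk, embed, and store steps can be sketched without any external services. The example below is a toy stand-in, not our Milvus setup: `InMemoryVectorStore` is a hypothetical class that plays the role of the vector database, the "embedding" is a simple word-count vector rather than a real embedding model, and the chunk sizes are arbitrary.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping windows of words (sizes are illustrative)."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts. A real pipeline calls an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for the vector database."""

    def __init__(self):
        self.rows = []

    def insert(self, text, source):
        self.rows.append({"text": text, "source": source, "vector": embed(text)})

    def search(self, query, top_k=3):
        """Return the top_k (score, row) pairs, most similar first."""
        query_vector = embed(query)
        scored = [(cosine(query_vector, row["vector"]), row) for row in self.rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


def ingest(store, document_text, source):
    """Chunk one scraped document and add every chunk to the store."""
    for chunk in chunk_text(document_text):
        store.insert(chunk, source)
```

Overlapping chunks are a common choice so that a rule split across a chunk boundary still appears whole in at least one chunk. In a real deployment the store and the embedding function would be replaced by the vector database and an embedding model.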
Request Orchestration
We used LlamaIndex as the base orchestrator and built a few pieces on top of it. If we don’t find relevant knowledge in Milvus, our LlamaIndex-based orchestrator scrapes the relevant data, updates the knowledge base, and then calls the LLM to get a final response.
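The fallback logic described above can be sketched independently of any particular library. This is an illustration of the control flow only, not LlamaIndex code: the `Orchestrator` class, its four injected callables, and the `min_score` threshold are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Orchestrator:
    # All four callables are injected, so any retriever, scraper, or LLM client fits.
    search: Callable[[str], List[Tuple[float, str]]]  # returns (score, passage) pairs
    scrape: Callable[[str], List[str]]                # fetches fresh passages
    add_to_store: Callable[[List[str]], None]         # updates the knowledge base
    llm: Callable[[str], str]                         # turns a prompt into a response
    min_score: float = 0.3                            # assumed relevance threshold

    def _relevant(self, question: str) -> List[Tuple[float, str]]:
        return [hit for hit in self.search(question) if hit[0] >= self.min_score]

    def answer(self, question: str) -> str:
        hits = self._relevant(question)
        if not hits:
            # Nothing relevant yet: scrape, update the knowledge base, retry once.
            self.add_to_store(self.scrape(question))
            hits = self._relevant(question)
        context = "\n".join(passage for _, passage in hits)
        prompt = f"Context:\n{context}\n\nQuestion: {question}"
        return self.llm(prompt)
```

Because newly scraped passages are written back to the store, a repeated question is served from the knowledge base without scraping again.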
APIs
We used the FastAPI framework to expose our APIs and built all these components in Python 3 to keep them simple and quick to build.
Summary
SaveHaven is a RAG app that can help individuals contest and appeal property and income tax assessments. It streamlines the tax appeal process, making it more accessible and manageable for the general public. The system employs technologies like LlamaIndex, Milvus, and GPT from OpenAI to automate the collection and analysis of data from public records, thus facilitating the preparation of effective tax appeals.
The SaveHaven team intends this blog to be an example so future entrepreneurs can learn from our experience and build meaningful innovations. We understand GenAI will transform our lives, and the technologies from LlamaIndex, Microsoft, and Zilliz will make it possible to go from an idea to a working solution within a day with the right team.