Technologies
Python, FastAPI, Qdrant, MongoDB
AI: OpenAI models, Google Gemini models, Multi-Agent Workflows, Advanced RAG
Duration
3 Months
Customer
The client serves the aviation industry, with a focus on pilot advocacy and HR. The goal was a system that accurately answers complex questions about different airlines' pilot contracts and compares them directly, so that pilots can make better-informed career decisions.
Background and problem
Pilot employment contracts are dense, complex documents that run from 500 to 1,000 pages. Comparing terms across airlines by hand is slow and error-prone. The client needed a solution that could understand natural-language questions about these documents and also compare them accurately, while coping with poorly scanned PDFs and complex legal jargon.
Solution
We built an AI chatbot on a multi-agent architecture. The pipeline begins with document understanding and structure repair: Gemini and OpenAI models process each PDF with a sliding-window technique, transforming poorly structured PDFs into clean, structured text with a precise coordinate mapping for each section.
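As an illustration of the sliding-window idea, here is a minimal sketch, not the production code. It assumes the PDF has already been split into per-page text, and that each window would then be sent to a model for structure repair. The function name `sliding_windows` and the `size` and `overlap` parameters are hypothetical. The overlap means a section that straddles a window boundary is seen whole in at least one window, provided it fits within the overlap.

```python
from typing import Iterator, List, Tuple


def sliding_windows(
    pages: List[str], size: int = 4, overlap: int = 1
) -> Iterator[Tuple[int, List[str]]]:
    """Yield (start_index, pages_in_window) for overlapping page windows.

    `size` and `overlap` are illustrative defaults, not values from the project.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    start = 0
    while start < len(pages):
        yield start, pages[start:start + size]
        if start + size >= len(pages):
            break  # the last window already reaches the end of the document
        start += step
```

The start index of each window can be kept alongside the model output, so that repaired sections can be traced back to their position in the original document.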
For each query, the system performs inquiry classification and feature extraction to understand the user's intent. It then uses query expansion to generate high-quality search queries, followed by a multi-step narrowing vector search in Qdrant that retrieves only the most relevant passages from these very long documents. A linked-document system ensures that every retrieved section arrives with the clarifying context it depends on.
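The narrowing and linking steps can be sketched as follows. This is a simplified, in-memory illustration and does not use the Qdrant client: plain cosine similarity over Python lists stands in for the vector store. The names `Chunk`, `linked_ids`, `narrowing_search`, `top_sections`, and `top_chunks` are all hypothetical. It also assumes each contract section has one summary vector and each chunk has its own vector.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class Chunk:
    chunk_id: str
    section_id: str
    vector: Sequence[float]
    # IDs of other chunks (e.g. definitions) that clarify this one.
    linked_ids: List[str] = field(default_factory=list)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def narrowing_search(
    query_vec: Sequence[float],
    section_vecs: Dict[str, Sequence[float]],
    chunks: List[Chunk],
    top_sections: int = 2,
    top_chunks: int = 3,
) -> List[str]:
    # Step 1: coarse pass, keep only the best-matching sections.
    ranked = sorted(
        section_vecs,
        key=lambda s: cosine(query_vec, section_vecs[s]),
        reverse=True,
    )
    keep = set(ranked[:top_sections])

    # Step 2: fine pass, rank chunks inside the surviving sections only.
    candidates = [c for c in chunks if c.section_id in keep]
    candidates.sort(key=lambda c: cosine(query_vec, c.vector), reverse=True)
    hits = candidates[:top_chunks]

    # Step 3: attach linked chunks so each hit carries its clarifying context.
    result: List[str] = []
    for chunk in hits:
        for cid in [chunk.chunk_id, *chunk.linked_ids]:
            if cid not in result:
                result.append(cid)
    return result
```

In this sketch a chunk from a discarded section is never returned, even if its own vector matches the query well. That is the trade-off of narrowing: less noise in the final context, at the cost of relying on the coarse pass to pick the right sections.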
The main challenge was achieving high accuracy on long, complex documents without exceeding the models' context window limits. We addressed this by combining document summarization, precise structure repair, and the multi-step retrieval process, which aggressively cuts irrelevant context while preserving all critical information.
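One simple way to picture staying inside a context window is a greedy budget: take retrieved passages in order of relevance and skip any that would overflow the limit. The sketch below is an assumption about how such a step could look, not a description of the delivered system. The name `pack_context` is hypothetical, and the whitespace word count is only a rough stand-in for a real tokenizer.

```python
from typing import List, Tuple


def pack_context(passages: List[Tuple[float, str]], max_tokens: int) -> List[str]:
    """Greedily select the highest-scoring passages that fit the budget.

    Each passage is (relevance_score, text). Token cost is approximated
    by whitespace word count, which is an assumption for illustration.
    """
    chosen: List[str] = []
    used = 0
    for _score, text in sorted(passages, key=lambda p: p[0], reverse=True):
        cost = len(text.split())
        if used + cost <= max_tokens:
            chosen.append(text)
            used += cost
    return chosen
```

A lower-scoring but shorter passage can still be included after a longer one is skipped, so the budget is used as fully as the candidates allow.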

