Lesson 8 · 17 min
Capstone: a document Q&A system for 500-page PDFs
Build a complete document Q&A pipeline that handles arbitrarily large PDFs with hierarchical summarization, dynamic retrieval, prompt caching, and context debugging.
The problem
A 500-page technical specification PDF is roughly 400k tokens, about twice the size of a 200k-token context window. Users ask questions like "what are the safety requirements for module 7?" and "how does this compare to the previous version?"
Naively, this fails: the whole document cannot fit in a single prompt. With context window engineering, it works.
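The size mismatch can be checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: it assumes about 800 tokens per page (400k tokens spread over 500 pages) and a 200k-token window (implied by "twice the context window"). The function names and the 50k-token reserve for instructions, question, and answer are hypothetical.

```python
import math

# Assumed figures, derived from the lesson's numbers rather than measured.
TOKENS_PER_PAGE = 800      # ~400_000 tokens / 500 pages
CONTEXT_WINDOW = 200_000   # the document is described as twice this size


def estimate_tokens(pages: int, tokens_per_page: int = TOKENS_PER_PAGE) -> int:
    """Rough token estimate for a document of the given page count."""
    return pages * tokens_per_page


def fits_in_context(doc_tokens: int, reserved: int = 0,
                    context_window: int = CONTEXT_WINDOW) -> bool:
    """True if the document plus reserved tokens fits in one prompt."""
    return doc_tokens + reserved <= context_window


def min_passes(doc_tokens: int, reserved: int = 0,
               context_window: int = CONTEXT_WINDOW) -> int:
    """Minimum number of prompts needed to read the whole document once."""
    budget = context_window - reserved
    if budget <= 0:
        raise ValueError("reserved tokens leave no room for document text")
    return math.ceil(doc_tokens / budget)


doc_tokens = estimate_tokens(500)
print(doc_tokens)                                # 400000
print(fits_in_context(doc_tokens))               # False
print(min_passes(doc_tokens, reserved=50_000))   # 3
```

Even with nothing reserved, the document needs at least two full prompts, which is why the pipeline has to choose what goes into context rather than sending everything.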