Lesson 6 · 12 min
Dynamic context assembly
Build a context assembler that retrieves, scores, and deduplicates content at request time — so every prompt contains the most relevant content for that specific query.
Context is not static
A common mistake in production RAG systems is treating retrieval as a binary include/exclude decision: "Retrieve 5 chunks → include them all → done."
Real-world context assembly is more nuanced:
- Relevance varies by query — chunk A is highly relevant for question 1, irrelevant for question 2
- Redundancy wastes tokens — chunk A and chunk B may say the same thing with 80% overlap
- Recency matters — a 2025 document beats a 2022 document on the same topic
- Source diversity matters — three chunks from the same source page aren't three independent facts
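One way to picture how the four concerns above combine is the minimal sketch below. It is an illustration under stated assumptions, not a reference implementation: the `Chunk` fields, the `jaccard` and `assemble` helpers, the word-level Jaccard overlap used as a redundancy measure, and every weight and threshold are hypothetical choices made for this example. The `relevance` value is assumed to come from your retriever, computed for the current query.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str       # hypothetical identifier, e.g. a page URL or document ID
    year: int         # publication year, used as a crude recency signal
    relevance: float  # query-specific score from the retriever, assumed in [0, 1]


def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two texts, in [0, 1]."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def assemble(chunks, current_year, max_chunks=3, overlap_threshold=0.8,
             max_per_source=1, recency_weight=0.05):
    """Greedily pick chunks by score, skipping near-duplicates and repeated sources.

    All defaults are illustrative assumptions and should be tuned on real data.
    """
    def score(chunk):
        # Relevance minus a small penalty per year of age.
        age = max(0, current_year - chunk.year)
        return chunk.relevance - recency_weight * age

    selected = []
    per_source = {}
    for chunk in sorted(chunks, key=score, reverse=True):
        if len(selected) == max_chunks:
            break
        # Source diversity: cap how many chunks one source may contribute.
        if per_source.get(chunk.source, 0) >= max_per_source:
            continue
        # Redundancy: skip chunks that mostly repeat an already selected one.
        if any(jaccard(chunk.text, kept.text) >= overlap_threshold for kept in selected):
            continue
        selected.append(chunk)
        per_source[chunk.source] = per_source.get(chunk.source, 0) + 1
    return selected
```

Calling `assemble(retrieved_chunks, current_year=2025)` returns at most `max_chunks` chunks, ordered by score. Word-level Jaccard is only a cheap stand-in for redundancy detection; embedding similarity is a common alternative when paraphrases need to be caught.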