Lesson 3 · 11 min
Chunking — the most important boring decision
Bad chunking is the #1 cause of bad RAG. There's no universally right strategy — but there are clear wrong ones.
The tradeoff
When you split a document into chunks, two forces fight:
- Smaller chunks → more precise retrieval, but the LLM gets fragmented context that may cut off the very information it needs.
- Larger chunks → more complete context per chunk, but each embedding has to summarize more text, so similarity scores get diluted — and you waste tokens stuffing irrelevant text into the prompt.
A reasonable starting point: 400–800 tokens per chunk, with 10–20% overlap. Adjust based on document type.
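That starting point can be sketched as a sliding-window chunker. This is a minimal illustration, not a production splitter: it treats whitespace-separated words as "tokens" to stay dependency-free, whereas a real pipeline would count tokens with the embedding model's own tokenizer. The function name `chunk_tokens` and the specific defaults (600 tokens, 15% overlap — the midpoints of the ranges above) are this sketch's choices, not a library API.

```python
def chunk_tokens(tokens, chunk_size=600, overlap_ratio=0.15):
    """Split a token list into fixed-size chunks with fractional overlap."""
    # Advance by chunk_size minus the overlap, so consecutive chunks
    # share ~overlap_ratio of their tokens.
    step = max(1, int(chunk_size * (1 - overlap_ratio)))
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the final window already reaches the end
    return chunks

# Whitespace "tokens" as a stand-in for a real tokenizer.
words = ("lorem ipsum " * 1000).split()        # 2000 pseudo-tokens
chunks = chunk_tokens(words, chunk_size=600, overlap_ratio=0.15)
```

With these numbers the step is 510 tokens, so each 600-token chunk shares its last 90 tokens with the next one. Overlap like this means a sentence straddling a boundary appears whole in at least one chunk, at the cost of some duplicate storage.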