Lesson 3 · 11 min

Chunking — the most important boring decision

Bad chunking is the #1 cause of bad RAG. There's no universally right strategy — but there are clear wrong ones.

The tradeoff

When you split a document into chunks, two forces fight:

  • Smaller chunks → more precise retrieval, but the LLM gets fragmented context and may miss information that was split across chunk boundaries.
  • Larger chunks → more complete context per chunk, but retrieval precision drops (one embedding now averages over mixed topics) and you waste tokens stuffing irrelevant text into the prompt.

A reasonable starting point: 400–800 tokens per chunk, with 10–20% overlap. Adjust based on document type.
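As a minimal sketch, here is fixed-size chunking with overlap over a pre-tokenized list. In a real pipeline you would tokenize with the same tokenizer your embedding model uses (e.g., tiktoken for OpenAI models); the function name and parameter defaults below are illustrative, not from any particular library.

```python
def chunk_tokens(tokens, chunk_size=600, overlap=100):
    """Split a token list into fixed-size chunks, each sharing
    `overlap` tokens with its predecessor."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # stop once the window has reached the end of the document
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

With `chunk_size=400` and `overlap=80` (inside the 10–20% band above), a 1,000-token document yields three chunks, and the last 80 tokens of each chunk reappear at the start of the next — so a sentence cut by one boundary survives intact in the neighboring chunk.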