Lesson 5 · 10 min

Chunking strategies that survive production

Chunking is the highest-leverage knob in retrieval-augmented generation (RAG). This lesson covers three strategies that consistently work, and the failure modes to avoid.

The three strategies that work

  1. Recursive character chunking with overlap. Split by paragraph, then by sentence, then by character if a piece is still too long. Target 200-400 tokens per chunk with 10-20% overlap between neighbors. A good default for most prose.
  2. Semantic chunking. Use the embedding model itself: embed adjacent sentences and start a new chunk where their similarity drops sharply. Better for technical docs with shifting topics; slower to build, because every sentence must be embedded at index time.
  3. Structural chunking. For markdown, code, or any structured content: split on natural boundaries (headers, function definitions, sections). Preserve the enclosing headers in each chunk's metadata.
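
A minimal sketch of the first strategy may make the mechanics concrete. It is an illustration under stated assumptions, not a reference implementation: whitespace-separated words stand in for model tokens, single words (rather than characters) are the finest split, and the names `count_tokens`, `split_recursive`, and `chunk_text` are hypothetical.

```python
import re

def count_tokens(text):
    # Assumption: whitespace-separated words approximate model tokens.
    return len(text.split())

# Boundaries tried from coarsest to finest: paragraph, sentence, word.
SEPARATORS = [r"\n\s*\n", r"(?<=[.!?])\s+", r"\s+"]

def split_recursive(text, max_tokens, level=0):
    """Return pieces of at most max_tokens, using the coarsest boundary that fits."""
    text = text.strip()
    if not text:
        return []
    if count_tokens(text) <= max_tokens or level >= len(SEPARATORS):
        return [text]
    pieces = []
    for part in re.split(SEPARATORS[level], text):
        pieces.extend(split_recursive(part, max_tokens, level + 1))
    return pieces

def chunk_text(text, max_tokens=300, overlap_ratio=0.15):
    """Greedily merge pieces into chunks of at most max_tokens with overlap."""
    overlap = int(max_tokens * overlap_ratio)
    # Reserve room so that carried-over overlap plus one piece always fits.
    pieces = split_recursive(text, max_tokens - overlap)
    chunks, current = [], []
    for piece in pieces:
        if current and count_tokens(" ".join(current + [piece])) > max_tokens:
            finished = " ".join(current)
            chunks.append(finished)
            # Start the next chunk with the tail of the one just finished.
            tail = finished.split()[-overlap:] if overlap else []
            current = [" ".join(tail)] if tail else []
        current.append(piece)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

In practice the word count would be swapped for the tokenizer of the embedding model in use, since word and token counts can differ substantially. Note that the overlap here is cut by word count, so it may begin mid-sentence.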

None of the three is universally best. Pick based on content type.
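
As an example of matching the strategy to the content type, here is a rough sketch of structural chunking for markdown. It assumes ATX-style headers (lines starting with `#`), treats lines inside fenced code blocks as body text, and stores the header path as a list under a `headers` key. The function name `chunk_markdown` and the chunk dictionary layout are hypothetical choices for illustration.

```python
import re

HEADER_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block

def chunk_markdown(markdown_text):
    """Split markdown at headers; each chunk records the header path above it."""
    chunks = []
    path = []   # stack of (level, title) for the headers currently in scope
    body = []   # lines collected since the last header

    def flush():
        text = "\n".join(body).strip()
        if text:
            chunks.append({"headers": [title for _, title in path], "text": text})
        body.clear()

    in_fence = False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # A '#' line inside a code fence is a comment, not a header.
        match = None if in_fence else HEADER_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Leave any section at the same or a deeper level.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks
```

A section that is still too long after this split could be passed through a size-based splitter, with the same header metadata copied onto each resulting piece.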