
Lesson 1 · 9 min

Why vector search is the right primitive

A look at why dense retrieval works at all, and why it took until the mid-2020s to become the default. The intuition behind embedding spaces.

The shift from BM25

For decades, search meant TF-IDF or BM25: count token overlap, weight by inverse document frequency, sort. It works, and it still does — lexical scoring remains half of nearly every hybrid retrieval pipeline.

What it does not handle: paraphrase. A query like 'how do I cancel my subscription' shares zero tokens with a doc titled 'ending a recurring plan'. BM25 misses it entirely.
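The failure is easy to see by checking token overlap directly — a minimal sketch using the example query and title from above:

```python
# Tokenize the query and the doc title, then intersect the token sets.
# Zero overlap means a purely lexical scorer like BM25 has nothing to rank on.
query = "how do I cancel my subscription"
title = "ending a recurring plan"

q_tokens = set(query.lower().split())
t_tokens = set(title.lower().split())

overlap = q_tokens & t_tokens
print(overlap)  # set() -- no shared tokens, so the lexical score is zero
```

Real BM25 implementations add stemming and stop-word removal, but neither helps here: no amount of normalization maps 'cancel' onto 'ending' or 'subscription' onto 'plan'.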

Dense retrieval (vector search) handles it. The query and the doc both get embedded into a high-dimensional space (typically 768-3072 dims). 'cancel my subscription' and 'ending a recurring plan' land near each other in that space. Cosine similarity finds the match.
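The mechanics reduce to one dot-product formula. A minimal sketch, using tiny made-up 4-dimensional vectors as stand-ins for real 768-3072-dim embedding model output (the numbers are illustrative, not from any actual model):

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product over the product of vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: paraphrases should point in similar directions,
# unrelated text should not.
cancel_subscription = [0.8, 0.1, 0.5, 0.2]
ending_plan = [0.7, 0.2, 0.6, 0.1]
pizza_recipe = [0.1, 0.9, 0.0, 0.7]

print(cosine(cancel_subscription, ending_plan))   # high: paraphrases land nearby
print(cosine(cancel_subscription, pizza_recipe))  # low: unrelated content
```

A real pipeline swaps the toy vectors for model output and the brute-force comparison for an approximate nearest-neighbor index, but the ranking signal is exactly this similarity score.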