News
What just shipped, what to read.
Model releases, paper summaries, ecosystem moves. Curated, opinionated, dated. Updated when something actually matters.
- model, Anthropic
Claude Opus 4.7 ships with 1M-token context window
Anthropic's flagship reasoning model now ingests 1M tokens — comfortably the entire codebase of most products in one shot. Pricing unchanged.
- ecosystem
MCP one year on — what stuck, what didn't
Model Context Protocol crossed 1k public servers in early 2026. The "USB-C for agents" pitch is mostly working. Where the rough edges remain.
- business
EU AI Act enforcement — what it means for your AI features
Q1 2026 brought the first round of EU AI Act enforcement on high-risk systems. Five things product teams should check.
- model
Llama 4 ships open weights — and the math finally favors self-hosting
Meta's Llama 4 70B and 405B are out. With API prices flat-to-rising, the breakeven point for self-hosting has shifted meaningfully.
- paper
Agentic benchmarks are still mostly broken — here's what isn't
GAIA, AgentBench, SWE-bench are widely cited and widely gamed. A practical take on what to use for your own agent eval.
- business
Gemini 2.5 family — Google cuts pricing 30% across the board
Gemini 2.5 Flash and Pro both saw 30% price cuts in April. Token economics are now genuinely competitive with Anthropic and OpenAI.
- ecosystem
RAG pop quiz: 73% of teams still use cosine similarity, 12% use hybrid
A community survey of 250 teams running production RAG. The split between "default cosine" and "hybrid + reranker" is still huge.
- ecosystem
Voice mode in 2025: lessons from a year of production deployments
Speech-to-speech models are everywhere now. The teams that succeeded share three patterns. The teams that failed share three failure modes.
- model
DeepSeek V4 distill — open weights with frontier reasoning
DeepSeek's V4 distillation series brings frontier-model reasoning quality to ~30B-parameter open-weight models. The self-hosting calculus changes again.
- model
Apple Intelligence: 3B on-device model now matches GPT-3.5
Apple's foundation model crossed an interesting line: small enough to run on-device, capable enough to be useful for many tasks.
- paper
100 failed agents — what we keep learning
A retro on a year of production agent failures across consulting engagements. Top causes ranked.
- ecosystem
llms.txt — the emerging standard for AI crawlers
Like robots.txt, but written for LLMs: a proposed Markdown file at the site root that points models to a site's key content. Sites adopting it now will shape how future LLMs understand the web.
Subscribe to the newsletter (footer) to be told when something major drops.