Paper / report · agents · production · lessons

100 failed agents — what we keep learning

A retro on a year of production agent failures across consulting engagements. Top causes ranked.

From a retrospective on ~100 production agent deployments that didn't hit their goals:

Top failure modes

  1. Should have been a workflow (32%). The "agent" decided steps that were always the same. A fixed pipeline would have been cheaper, faster, more reliable.
  2. Prompt injection on retrieved content (18%). The agent read a doc with hidden instructions and obediently exfiltrated data.
  3. Runaway loops on tool errors (15%). A flaky API caused the agent to retry indefinitely, burning budget.
  4. Tool authorization gaps (12%). The agent called a destructive tool because nothing stopped it.
  5. Eval theater (11%). Tested on "works on the demo" examples; production traffic exposed brittleness.
  6. Latency UX collapse (8%). 30-second responses for what users expected to be 5-second answers.
  7. Cost overrun (4%). No budget cap. One bad day cost more than the project saved in a month.
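Failure mode 3 is the easiest to show in code. Below is a minimal sketch of a retry wrapper that gives up after a fixed number of attempts instead of looping forever. The names (`call_with_bounded_retries`, `ToolRetriesExhausted`) and the default limits are illustrative assumptions, not taken from the retrospective.

```python
import time


class ToolRetriesExhausted(Exception):
    """Raised when a tool keeps failing after the allowed number of attempts."""


def call_with_bounded_retries(tool, *args, max_attempts=3, base_delay=0.5,
                              sleep=time.sleep):
    """Call tool(*args), retrying at most max_attempts times.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts.
    `sleep` is injectable so the wrapper can be tested without waiting.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return tool(*args)
        except Exception as exc:  # assumption: real code would catch only retryable errors
            last_error = exc
            if attempt < max_attempts:
                sleep(base_delay * 2 ** (attempt - 1))
    # Surface the failure to the caller instead of retrying indefinitely.
    raise ToolRetriesExhausted(
        f"gave up after {max_attempts} attempts"
    ) from last_error
```

The point of raising a distinct exception is that the agent loop can treat "tool is down" as a terminal condition and hand off to a human, rather than asking the model whether to try again.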

Fixes

  • Build the workflow first. Reach for an agent only when the workflow gets uglier than the agent.
  • Tool privilege scoping (read-only by default).
  • Hard caps on step count and dollar spend; when either is hit, halt the run and alert.
  • Filter retrieved content for prompt-injection patterns before the agent acts on it.
  • A real eval set drawn from production traffic, run weekly, with trends tracked over time.
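Two of these fixes, read-only tools by default and hard caps with halt and alert, can be sketched as a single per-run guard. This is one possible shape under stated assumptions: `RunGuard`, its fields, and the `alert` callback are hypothetical names, and the per-step cost is assumed to be supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable


class BudgetExceeded(Exception):
    """Raised when a run crosses its step or dollar cap."""


class ToolNotPermitted(Exception):
    """Raised when the agent requests a write tool that was not granted."""


@dataclass
class RunGuard:
    max_steps: int
    max_dollars: float
    # Empty by default, so every write tool is denied unless listed here.
    allowed_write_tools: frozenset = frozenset()
    alert: Callable[[str], None] = print  # assumption: swap in a pager or chat hook
    steps: int = 0
    dollars: float = 0.0

    def authorize(self, tool_name: str, is_write: bool) -> None:
        """Allow read tools; allow write tools only if explicitly granted."""
        if is_write and tool_name not in self.allowed_write_tools:
            self.alert(f"blocked write tool: {tool_name}")
            raise ToolNotPermitted(tool_name)

    def charge(self, cost_dollars: float) -> None:
        """Record one step and its cost; halt and alert once a cap is exceeded."""
        self.steps += 1
        self.dollars += cost_dollars
        if self.steps > self.max_steps or self.dollars > self.max_dollars:
            self.alert(
                f"run halted: steps={self.steps}, dollars={self.dollars:.2f}"
            )
            raise BudgetExceeded("step or dollar cap exceeded")
```

In an agent loop, `authorize` would run before each tool call and `charge` after each model or tool step. Both raise rather than return a flag, so a cap breach cannot be silently ignored by the calling code.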
