Skip to main content

Lesson 2 · 11 min

Specifying an AI feature — the eval-first PM template

The 7-section PM template that makes AI features actually shippable. Eval criteria first, success metric second, edge cases and recovery third.

The template

  1. One-line description. What the user sees. (E.g. "Auto-summarize a support ticket in one sentence.")
  2. Eval criteria. A 30-case eval set with scoring rules. This goes first, not last.
  3. Success metric. What does the team commit to? (E.g. "≥90% of summaries pass the rubric on the eval set; <2% production refusal rate.")
  4. Failure modes + recovery. What happens when the model is wrong? (Refuse, escalate to human, show with caveat.)
  5. Privacy posture. What user data flows to which provider? Retention?
  6. Cost envelope. Per-call cost cap; monthly budget at projected volume.
  7. Lifecycle. Quarterly review trigger; signal that prompts a re-eval (provider model update, eval-set regression, user complaint volume).