Skip to main content

Lesson 10 · 8 min

A safety checklist you can actually ship with

The capstone — a copy-pasteable checklist your team uses on every AI feature before launch. Not aspirational. The minimum.

Pre-launch safety checklist

Inputs

  • [ ] Input boundary checks (PII detection, injection-pattern detection)
  • [ ] User-content / retrieved-content separation in the prompt structure
  • [ ] Untrusted-data tagging for retrieved content

Tools and actions

  • [ ] Tools have typed JSON schemas, not free-form natural-language commands
  • [ ] Tools have declared scopes (read / write / destructive)
  • [ ] Destructive actions require explicit user confirmation, not just an agent decision
  • [ ] Audit log of tool calls retained per privacy policy

Outputs

  • [ ] Provider harm filter on by default
  • [ ] Domain-specific filter for your policy concerns
  • [ ] Schema validation on structured outputs; reject + retry on shape mismatch

Eval and monitoring

  • [ ] Red-team eval set with ≥30 cases mixing direct + indirect injection + jailbreaks + PII extraction
  • [ ] Per-segment quality eval with fairness gate
  • [ ] CI gate: red-team and fairness eval on every prompt or pipeline change
  • [ ] Production monitoring for refusal rate, length anomalies, unexpected tool calls

Privacy

  • [ ] Pre-send PII redaction (regex / lightweight NER)
  • [ ] Provider zero-retention mode where applicable
  • [ ] Logs default-redacted; raw access gated and audited
  • [ ] Right-to-delete plumbing tested

Policy and procurement

  • [ ] EU AI Act tier classified and documented
  • [ ] Sub-processor list current (provider, model version, region)
  • [ ] AI involvement disclosed to end users
  • [ ] Procurement-ready answer doc for the standard 8 questions

Incident response

  • [ ] Containment playbook defined and tested
  • [ ] Severity matrix and notification SLAs documented
  • [ ] Reproduction runbook (prompt + retrieval + model version captured per call)