AI and Liability: Who Pays When Models Mislead? A Practical Playbook

AI is now a product liability question, not just a tech debate. Sparked by Simon Willison’s “AI and liability”, here’s a clear, practical playbook for teams shipping AI features today. (Not legal advice.)

Why liability is shifting to deployers

Control = responsibility: The party that chooses prompts, guardrails, and UX copy often owns the risk in users’ eyes, and sometimes in contracts.
Proximity to harm: Deployers control the last mile (claims, defaults, disclosures), so they’re first in line when things go wrong.
Contracts beat vibes: Vendor TOS may limit model providers’ liability, pushing obligations to the business that ships the feature.

Practical steps to cut AI liability now

Design for verification: Add citations, sources, and confidence hints. Prefer retrieval-augmented generation (RAG) with linked references over free-form answers.
Human-in-the-loop for high stakes: Route edge cases to human review (medical, legal, financial, safety). Make the escalation path visible.
Precise UX copy: Replace generic “may hallucinate” with task-specific guidance and what the model can’t do. Show examples of safe/unsafe inputs.
Guardrails by default: Implement content filters, allow/deny lists, and policy-constrained system prompts. Test jailbreak resilience before launch.
Measure and prove it: Run red-team scenarios and task-level evaluations. Track failure rates, coverage, and time-to-correction across versions.
Data hygiene: Control what enters the context window. Strip PII, maintain document provenance, and watermark internal content to trace errors.
Contracts and insurance: Negotiate vendor SLAs, IP indemnity, and clear limits of liability. Consider E&O and cyber insurance riders for AI features.
Kill switches: Version your prompts and models. Keep a rollback plan and feature flag to disable risky behaviors instantly.
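
As one way to picture the human-in-the-loop and kill-switch steps above, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Release` class, the `route` function, the keyword list, and the model name `example-model-1` are hypothetical, and a production system would likely use a proper classifier and a real feature-flag service rather than keyword matching.

```python
from dataclasses import dataclass

# Hypothetical keywords standing in for a real high-stakes topic classifier.
HIGH_STAKES_TERMS = {"diagnosis", "dosage", "lawsuit", "contract", "tax", "investment"}


@dataclass(frozen=True)
class Release:
    """A versioned prompt + model pairing that can be rolled back as a unit."""
    prompt_version: str
    model: str
    enabled: bool = True  # feature flag acting as the kill switch


def route(user_text: str, release: Release) -> str:
    """Return 'disabled', 'human_review', or 'model' for a request."""
    if not release.enabled:
        return "disabled"  # kill switch flipped: never call the model
    words = {w.strip(".,!?").lower() for w in user_text.split()}
    if words & HIGH_STAKES_TERMS:
        return "human_review"  # escalate medical/legal/financial topics
    return "model"


current = Release(prompt_version="v3", model="example-model-1")
killed = Release(prompt_version="v3", model="example-model-1", enabled=False)

print(route("What dosage should I take?", current))
print(route("Summarize this meeting note", current))
print(route("Summarize this meeting note", killed))
```

Keeping the prompt version, model name, and flag in one immutable record is a design choice made for this sketch: it lets a rollback swap a single value, and the same record can be written to the logs alongside each request.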

What to log for defensibility

Prompt stack: system, developer, and user prompts; tool calls; retries.
Model + params: provider, model name, version, temperature, top_p, safety settings.
Context: retrieved docs, embeddings index version, and chunk IDs.
UX state: user role, consent state, feature flags, and A/B test bucket.
Timestamps and IDs: request ID, session ID, user/device ID, and IP address (per policy).
Outcomes: user edits, overrides, reported issues, and human-review decisions.
Retention policy: how long you keep logs and how you purge sensitive data.
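
The fields above could be captured as one structured record per request. The sketch below is only an illustration under stated assumptions: the `InferenceLog` class and its field names are hypothetical, not a standard schema, and it covers only a subset of the list (no tool calls, retries, or retention handling).

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class InferenceLog:
    # Prompt stack
    system_prompt: str
    user_prompt: str
    # Model + params
    provider: str
    model: str
    temperature: float
    top_p: float
    # Context
    index_version: str
    chunk_ids: List[str]
    # UX state
    user_role: str
    consent: bool
    feature_flags: Dict[str, bool]
    # Outcome, updated later by user edits or human review
    outcome: str = "pending"
    # Timestamps and IDs, generated per record
    request_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize to one JSON line, suitable for an append-only log."""
        return json.dumps(asdict(self), sort_keys=True)


record = InferenceLog(
    system_prompt="Answer only from the provided documents.",
    user_prompt="What is our refund window?",
    provider="example-provider",  # placeholder values throughout
    model="example-model-1",
    temperature=0.2,
    top_p=1.0,
    index_version="2024-05-01",
    chunk_ids=["doc-12#3"],
    user_role="customer",
    consent=True,
    feature_flags={"ai_answers": True},
)
print(record.to_json())
```

Writing one JSON line per request keeps the log easy to search by `request_id`. Any prompt text that may contain PII would still need to be redacted or purged according to your retention policy.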

Regulatory watchlist and guidance

Ground your program in credible playbooks: the NIST AI Risk Management Framework for governance, the FTC’s guidance on truthful AI claims for marketing and disclosures, and the EU’s AI Act overview to map likely obligations by risk tier.

Takeaway

Assume you’re accountable for how AI is used in your product. Build for verification, put humans over hype, log everything you need to explain a bad outcome, and let contracts reflect reality.

