
What are dangerous capability evaluations?

June 1, 2026 · 4 min read

[Figure: Dangerous capability evaluation. "Load it until it nearly bends. Pile on the worst tasks and watch for the red line." Labels: CBRN, cyberattack, self-spread; capability threshold / release gate; AI model. Caption: You don't learn a floor's strength by walking on it; you load it until you find where it would break.]

Definition

A structured test of the most harm a powerful AI could do if pushed to its limit, used to decide whether it is safe to release.

At a glance

How it works

Instead of asking how a model usually behaves, testers ask what harm a determined bad actor could extract from it. They give it tools, let it reason in steps, and sample many attempts to draw out its true ceiling[2]. A 2024 Google DeepMind study grouped the dangers into four areas: persuasion and deception, cyber-security, self-proliferation, and self-reasoning[1]; industry frameworks add uplift toward chemical, biological, radiological, and nuclear (CBRN) weapons[4].
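To see why sampling many attempts matters, here is a minimal sketch of one common way to turn repeated trials into a ceiling estimate. It uses the pass@k estimator from the code-generation literature; the sources cited here do not say which metric any particular lab uses, so treat the choice of metric, the function name `pass_at_k`, and the numbers as illustrative assumptions.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k attempts succeeds,
    given that c of n recorded attempts succeeded.

    Assumption: attempts are independent samples on the same task.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k attempts include a success.
        return 1.0
    # 1 minus the fraction of size-k subsets that contain only failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical data: the model succeeded on 3 of 100 sampled attempts.
single_try = pass_at_k(n=100, c=3, k=1)   # what a casual user would see
many_tries = pass_at_k(n=100, c=3, k=50)  # what a persistent actor could get

print(f"one attempt: {single_try:.2f}, fifty attempts: {many_tries:.2f}")
```

With these made-up numbers, a task that succeeds only 3% of the time on a single try succeeds roughly 88% of the time when an actor can retry fifty times, which is why a ceiling estimate looks at repeated sampling instead of typical behavior.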

How results are used

Labs that publish safety frameworks set capability thresholds (Anthropic ties its thresholds to tiers of safeguards it calls AI Safety Levels). If a model crosses a threshold, it is not released until stronger safeguards are shown to cut the risk[3]. The evaluation therefore decides whether a model ships, ships with guardrails, or stays locked down.
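The three outcomes can be sketched as a simple gate. This is a simplified illustration, not any lab's actual policy: the names `DomainResult` and `release_decision`, the 0 to 1 score scale, and the threshold values are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    SHIP = "ship"
    SHIP_WITH_GUARDRAILS = "ship with guardrails"
    LOCKED_DOWN = "stay locked down"


@dataclass(frozen=True)
class DomainResult:
    domain: str                # e.g. "cyber" or "CBRN"
    score: float               # hypothetical 0-1 capability score
    threshold: float           # hypothetical threshold set by the lab
    safeguards_verified: bool  # have stronger safeguards been shown to cut the risk?


def release_decision(results: list[DomainResult]) -> Decision:
    # Domains where the measured capability reaches the lab's threshold.
    crossed = [r for r in results if r.score >= r.threshold]
    if not crossed:
        return Decision.SHIP
    # Every crossed domain needs verified safeguards before release.
    if all(r.safeguards_verified for r in crossed):
        return Decision.SHIP_WITH_GUARDRAILS
    return Decision.LOCKED_DOWN


# Hypothetical evaluation results for one model.
results = [
    DomainResult("cyber", score=0.72, threshold=0.60, safeguards_verified=True),
    DomainResult("CBRN", score=0.10, threshold=0.30, safeguards_verified=False),
]
decision = release_decision(results)
print(decision.value)
```

In this made-up case the model crosses only the cyber threshold, and safeguards for that domain have been verified, so the gate returns the middle outcome. Real frameworks involve human review and more than a single number per domain.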

Why it matters

This is the AI industry’s closest thing to a pre-market safety inspection. For a business, a vendor’s published safety framework and dangerous-capability testing are a practical signal that someone is managing risks that could otherwise land on you.

Bottom line

These tests probe an AI’s worst-case potential before launch. A published evaluation is a quick sign that your vendor checked the ceiling of risk first.

Connects to: Law, Computer Science

References

  1. Evaluating Frontier Models for Dangerous Capabilities. Mary Phuong, Matthew Aitchison, et al. (Google DeepMind). arXiv, arxiv.org
  2. Dangerous Capability Evaluations. AI Safety Atlas, ai-safety-atlas.com
  3. Anthropic's Responsible Scaling Policy. Anthropic, www.anthropic.com
  4. Frontier Capability Assessments. Frontier Model Forum, www.frontiermodelforum.org