Definition
AI security blocks intentional attacks on your AI system; AI safety stops a correctly working system from causing harm.
At a glance
- The test: works as intended but still causes harm = safety problem; an attacker pushes it off track = security problem.[2]
- Security threats are deliberate: prompt injection, data poisoning[4], model theft.
- Safety risks show up in normal use: biased decisions, hallucinated falsehoods, harmful advice.
- You need both: governance frameworks treat them together, not as an either/or choice.
How they split
Intent is the dividing line: security defends against deliberate attackers, safety against unintended consequences.[3] Security aims to keep data confidential, unaltered, and available (the classic confidentiality, integrity, availability triad).[1] A locked-down model can still quietly discriminate; a fair model can still be hijacked.
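The intent test above can be sketched as a small triage helper. This is an illustrative sketch only: the names `Incident`, `Category`, `classify`, and the `deliberate_attack` flag are hypothetical and not drawn from any framework, and a real incident may need both labels.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    SECURITY = "security"
    SAFETY = "safety"


@dataclass
class Incident:
    description: str
    # Hypothetical flag: did someone intentionally push the system off track?
    deliberate_attack: bool


def classify(incident: Incident) -> Category:
    """Apply the intent test: deliberate attack -> security, otherwise safety."""
    return Category.SECURITY if incident.deliberate_attack else Category.SAFETY


incidents = [
    Incident("prompt injection exfiltrates customer records", deliberate_attack=True),
    Incident("loan model rejects one postcode at a higher rate", deliberate_attack=False),
]

for incident in incidents:
    print(f"{classify(incident).value}: {incident.description}")
```

In practice the hard part is deciding the flag, not applying the rule: a biased outcome caused by poisoned training data, for example, would count as a security incident with a safety-style symptom.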
Why it matters to you
Security failures usually mean a breach or data leak. Safety failures usually mean legal, reputational, or discrimination exposure, because the harm comes from the product behaving as designed. The NIST AI Risk Management Framework folds both together, listing "safe" and "secure and resilient" alongside privacy and managed bias among its characteristics of trustworthy AI.[5]
Bottom line
Ask two questions of any AI tool: can someone break in, and can it hurt us even when it works?