Definition

Prompt injection is an attack that smuggles hidden instructions into an AI assistant’s input so it ignores its real job and does what the attacker wants instead.

At a glance

OWASP ranks it the #1 AI security risk (LLM01) because AI cannot reliably tell trusted instructions from untrusted text.^[1]
Two flavors: direct (a user types Ignore previous instructions…) and indirect (malicious text hidden in an email, webpage, or document the AI reads).^[4]
Real consequences: leaked confidential files, exposed API keys and credentials, and data pulled from connected tools like Google Drive or SharePoint.^[3]
Any AI tool that reads outside content (chatbots, email assistants, AI agents) is exposed; there is no perfect fix yet.^[2]

Why your business should care

If you connect an AI assistant to your email, files, or customer data, a single poisoned message or document can hijack it. In 2025, prompt-injection incidents leaked chat records, login credentials, and confidential files from tools linked to ChatGPT.^[3] The AI was working as designed, which is exactly the problem.

How attackers pull it off

They hide commands where your AI will read them, like white text in a webpage, a note in an email, or instructions in a shared document. The AI treats that planted text as a legitimate order.^[2] Stanford student Kevin Liu famously used Ignore previous instructions to make Bing Chat reveal its secret internal rules.^[4]

Bottom line

Treat any text your AI reads as a potential instruction from a stranger, and never connect AI tools to sensitive systems without limits and human review.

What is prompt injection?

At a glance

Why your business should care

How attackers pull it off

Bottom line

References