Definition
Prompt injection is an attack that smuggles hidden instructions into an AI assistant’s input so it ignores its real job and does what the attacker wants instead.
At a glance
- OWASP ranks it the #1 AI security risk (LLM01) because AI cannot reliably tell trusted instructions from untrusted text.[1]
- Two flavors: direct (a user types Ignore previous instructions…) and indirect (malicious text hidden in an email, webpage, or document the AI reads).[4]
- Real consequences: leaked confidential files, exposed API keys and credentials, and data pulled from connected tools like Google Drive or SharePoint.[3]
- Any AI tool that reads outside content (chatbots, email assistants, AI agents) is exposed; there is no perfect fix yet.[2]
Why your business should care
If you connect an AI assistant to your email, files, or customer data, a single poisoned message or document can hijack it. In 2025, prompt-injection incidents leaked chat records, login credentials, and confidential files from tools linked to ChatGPT.[3] The AI was working as designed, which is exactly the problem.
How attackers pull it off
They hide commands where your AI will read them, like white text in a webpage, a note in an email, or instructions in a shared document. The AI treats that planted text as a legitimate order.[2] Stanford student Kevin Liu famously used Ignore previous instructions to make Bing Chat reveal its secret internal rules.[4]
Bottom line
Treat any text your AI reads as a potential instruction from a stranger, and never connect AI tools to sensitive systems without limits and human review.
References
- LLM01:2025 Prompt Injection - OWASP Gen AI Security Project. OWASP Foundation genai.owasp.org
- What Is a Prompt Injection Attack? IBM www.ibm.com
- Prompt Injection: An Analysis of Recent LLM Security Incidents. NSFOCUS nsfocusglobal.com
- Prompt Injection | OWASP Foundation. OWASP Foundation owasp.org
Comments
Questions, corrections, and links welcome. Be specific and civil.