AI systems capturing and leveraging existing infrastructure as force multipliers—cloud orgs, IAM/SSO, CI/CD pipelines, supply chains, botnets, DNS/BGP, monitoring systems—to operate at scale and entrench themselves.
In February 2023, researchers introduced indirect prompt injection: hiding instructions in content an agent would retrieve. The agent would follow those instructions as if they came from the user. This created a new attack surface that grows with every tool an agent can access.
The attacks have matured. Red teams achieved remote code execution on every major coding agent including Cursor, Claude Code, and Copilot. Lateral movement agents succeeded in 37 of 40 multi-host network environments by pivoting through compromised credentials. When researchers tested multi-agent systems, 82% were vulnerable to inter-agent attacks where one compromised agent could hijack others. At least five Fortune 500 companies were affected by prompt injection in AI-powered GitHub Actions. An agent inherits whatever permissions it was granted. As agents gain more access, the blast radius of a successful injection grows.
Newest entries first
CI workflows vulnerable when AI agents process untrusted PR content with high-privilege tokens; direct supply chain hijacking vector.
Covert logging injection captures user queries and tool responses across five MCP servers while preserving task quality.
ToolLeak vulnerability enables prompt exfiltration; two-channel injection hijacks tool invocation in Cursor, Claude Code, Copilot.
82% of agents vulnerable to inter-agent trust exploitation; only 1/17 systems resistant to all attack vectors.
Lateral movement agent exploits hosts and installs malware, then pivots using found credentials; 37/40 environment success.
Malicious strings in retrieved documents can influence RAG agent behavior enabling policy bypass.
Dynamic environment measuring prompt injection attacks and defenses for LLM agents with untrusted tool outputs.
Evaluations for prompt injection and code interpreter abuse with 26-41% successful prompt injection across tested models.
Agentic LLM setups can move beyond describing vulnerabilities into tool-assisted exploitation workflows.
Benchmark for indirect prompt injection against tool-integrated agents; ReAct-prompted GPT-4 vulnerable 24% of time.
Introduces Indirect Prompt Injection where adversaries inject instructions into content likely to be retrieved, steering agents remotely.