Infrastructure Co-option | Adage of Ultron

// SUMMARY

In February 2023, researchers introduced indirect prompt injection: hiding instructions in content an agent would retrieve. The agent would follow those instructions as if they came from the user. This created a new attack surface that grows with every tool an agent can access.

The attacks have matured. Red teams achieved remote code execution on every major coding agent including Cursor, Claude Code, and Copilot. Lateral movement agents succeeded in 37 of 40 multi-host network environments by pivoting through compromised credentials. When researchers tested multi-agent systems, 82% were vulnerable to inter-agent attacks where one compromised agent could hijack others. At least five Fortune 500 companies were affected by prompt injection in AI-powered GitHub Actions. An agent inherits whatever permissions it was granted. As agents gain more access, the blast radius of a successful injection grows.

// EVIDENCE TIMELINE

Newest entries first

2025-12-01

[████] STRONG

PromptPwnd: Prompt Injection Inside GitHub Actions

Aikido Security

CI workflows vulnerable when AI agents process untrusted PR content with high-privilege tokens; direct supply chain hijacking vector.

Source

2025-12-01

[████] STRONG

Log-To-Leak: Prompt Injection Attacks via Model Context Protocol

Multiple authors

Covert logging injection captures user queries and tool responses across five MCP servers while preserving task quality.

Source

2025-09-01

[████] STRONG

Red-Teaming Coding Agents from a Tool-Invocation Perspective

Multiple authors

ToolLeak vulnerability enables prompt exfiltration; two-channel injection hijacks tool invocation in Cursor, Claude Code, Copilot.

Source

2025-07-01

[████] STRONG

The Dark Side of LLMs: Agent-based Attacks for Complete Computer Takeover

Multiple authors

82% of agents vulnerable to inter-agent trust exploitation; only 1/17 systems resistant to all attack vectors.

Source

2025-01-01

[████] STRONG

Incalmo: Autonomous Red Teaming with LLMs on Multi-Host Networks

Multiple authors

Lateral movement agent exploits hosts and installs malware, then pivots using found credentials; 37/40 environment success.

Source

2024-08-01

[███░] CONSIDERABLE

ConfusedPilot: Confused Deputy Risks in RAG-based LLMs

Multiple authors

Malicious strings in retrieved documents can influence RAG agent behavior enabling policy bypass.

Source

2024-06-01

[████] STRONG

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection

Debenedetti, E., et al.

Dynamic environment measuring prompt injection attacks and defenses for LLM agents with untrusted tool outputs.

Source

2024-04-01

[████] STRONG

CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models

LLM Agents Can Autonomously Exploit One-Day Vulnerabilities

Fang, R., et al.

Agentic LLM setups can move beyond describing vulnerabilities into tool-assisted exploitation workflows.

Source

2024-03-01

[███░] CONSIDERABLE

InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated LLM Agents

Zhan, Q., et al.

Benchmark for indirect prompt injection against tool-integrated agents; ReAct-prompted GPT-4 vulnerable 24% of time.

Source

2023-02-01

[████] STRONG

Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection

Greshake, K., et al.

Introduces Indirect Prompt Injection where adversaries inject instructions into content likely to be retrieved, steering agents remotely.

Source

// SUMMARY

// SUBCATEGORIES

// EVIDENCE TIMELINE

PromptPwnd: Prompt Injection Inside GitHub Actions

Log-To-Leak: Prompt Injection Attacks via Model Context Protocol

Red-Teaming Coding Agents from a Tool-Invocation Perspective

The Dark Side of LLMs: Agent-based Attacks for Complete Computer Takeover

Incalmo: Autonomous Red Teaming with LLMs on Multi-Host Networks

ConfusedPilot: Confused Deputy Risks in RAG-based LLMs

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection

CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models

LLM Agents Can Autonomously Exploit One-Day Vulnerabilities

InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated LLM Agents

Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection