If your organization is running AI inference workloads on SGLang — and thousands are — an attacker on the open internet may already be able to silently take over your servers, steal your models, and pivot deeper into your network, all without knowing a single password.
Who's at Risk — and Why It Matters Beyond the Lab
SGLang has quietly become one of the most popular frameworks for deploying large language models at scale. Researchers, startups, and enterprises use it to serve AI models — the same kind of technology powering chatbots, coding assistants, and document analysis tools you may already depend on. According to its GitHub repository, SGLang has accumulated tens of thousands of stars and is actively used in production AI pipelines worldwide.
This vulnerability, tracked as CVE-2026-3060 and rated a near-perfect 9.8 out of 10 (CRITICAL) on the industry severity scale, affects any SGLang deployment where the disaggregation module is reachable over a network. That includes cloud-based AI inference clusters, university research systems, and any startup that spun up an SGLang server to host their own model. The blast radius here is not theoretical — it is every organization that deployed this software and assumed it was safe behind a firewall, but perhaps left a port open or misconfigured a cloud security group.
What an Attacker Can Actually Do to You
Imagine you run a small AI company. You've deployed SGLang to serve your custom language model to paying customers. Somewhere in your infrastructure, the disaggregation system — a component designed to split AI processing across multiple machines to handle heavy workloads — is listening for incoming connections. You never set up a password for it because the documentation didn't make it obvious that you needed to. An attacker scans the internet, finds your server's open port, and sends it a specially crafted message. Within seconds, they are running their own commands on your machine. Not reading files — running commands. They can install malware, exfiltrate your proprietary AI model (potentially worth millions of dollars in training costs), steal API keys stored in environment variables, and use your server as a launchpad to attack other systems inside your network.
The reason this is possible comes down to a shortcut that has been a known hazard in Python development for years: pickle deserialization. Think of "pickling" as the way Python programs package up complex objects so they can be shipped over a network to another machine, which then unpacks them. The catch is that the package can contain instructions to call arbitrary functions, and the receiving side carries them out as part of unpacking. Python's own documentation warns that only trusted data should ever be unpickled. It's the software equivalent of accepting a mystery package from an unknown sender and immediately plugging it in. SGLang's disaggregation module was doing exactly that: accepting these packages from anyone, anywhere, with no check on who was sending them, and then blindly executing whatever was inside.
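To make the mechanism concrete, here is a deliberately harmless sketch of why unpacking a pickle amounts to running the sender's choice of function. It is not taken from SGLang or from any exploit; the class name HarmlessDemo is made up for illustration, and the "payload" only converts a string to a number.

```python
import pickle


class HarmlessDemo:
    """Illustrative class; the name is hypothetical."""

    def __reduce__(self):
        # Tells pickle: "to rebuild me, call int('42')".
        # The sender chooses this callable; the receiver just runs it.
        return (int, ("42",))


blob = pickle.dumps(HarmlessDemo())

# The receiving side only calls loads(), yet int("42") executes there.
result = pickle.loads(blob)
print(type(result).__name__, result)  # prints: int 42
```

The receiver asked for an object and got back the return value of a function call it never chose. Swap the harmless callable for something destructive and the same single call to pickle.loads() becomes remote code execution.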
What makes this particularly dangerous is that two failures occur together. First, there is no authentication: the system doesn't ask who you are before accepting your data. Second, the data it accepts can carry executable instructions that the server will run automatically. Either problem alone would be serious. Together, they leave a front door that is not only unlocked but will also carry out whatever orders a stranger shouts through it. Any script kiddie with a port scanner and a Python tutorial could exploit this.
The Technical Anchor: Unsafe pickle.loads() on an Unauthenticated Network Endpoint
For security researchers and engineers auditing their own deployments: the vulnerability lives specifically in SGLang's encoder parallel disaggregation module, which calls Python's native pickle.loads() function directly on data received over the network. There is no HMAC signature verification, no token-based authentication layer, and no input sanitization prior to deserialization. This is a textbook CWE-502 (Deserialization of Untrusted Data) combined with CWE-306 (Missing Authentication for Critical Function). The CVSS 3.1 base score of 9.8 reflects the attack vector being network-accessible, requiring zero privileges and zero user interaction, with full confidentiality, integrity, and availability impact. Exploitation requires only network access to the listening port — no credentials, no prior foothold.
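For engineers wondering what the missing checks might look like, the following is a minimal sketch of one common pattern: verify an HMAC tag first, then parse a data-only format such as JSON instead of pickle. This is an assumption about a reasonable design, not SGLang's actual fix. The function names pack_message and unpack_message are hypothetical, and the sketch assumes both peers already share a secret key through some out-of-band channel.

```python
import hashlib
import hmac
import json

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def pack_message(obj, key: bytes) -> bytes:
    """Serialize obj as JSON and prepend an HMAC-SHA256 tag."""
    body = json.dumps(obj).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body


def unpack_message(frame: bytes, key: bytes):
    """Verify the tag BEFORE parsing; reject anything unsigned or altered."""
    if len(frame) < TAG_LEN:
        raise ValueError("frame too short to contain a tag")
    tag, body = frame[:TAG_LEN], frame[TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # compare_digest runs in constant time to avoid timing leaks.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature check failed")
    # JSON yields only plain data (dicts, lists, strings, numbers).
    return json.loads(body.decode("utf-8"))
```

The order of operations is the point: nothing from the network is interpreted until the sender has proven knowledge of the key. Even then, the parser used cannot call functions on the sender's behalf.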
What We Know About Exploitation So Far
As of publication, there is no confirmed evidence of active exploitation in the wild. However, the security community's experience with similar vulnerabilities — particularly unauthenticated deserialization flaws in machine learning infrastructure tools like MLflow and Ray — suggests the window between "no known exploitation" and "actively targeted" can be measured in days, not months, once a CVE is public. Automated vulnerability scanners and opportunistic threat actors routinely trawl for exactly this class of flaw.
The vulnerability was identified through code analysis of SGLang's disaggregation subsystem. At this time, no specific researcher or team has been publicly credited with the discovery, and no known threat actor campaigns have been attributed. Given the high value of AI model infrastructure as a target — both for intellectual property theft and for commandeering expensive GPU compute for cryptomining or other purposes — organizations should treat this with the same urgency as a confirmed zero-day.
What You Need to Do Right Now
Take these three steps immediately, in order of urgency:
- Restrict network access to SGLang disaggregation ports now, before you patch. Use your firewall, cloud security group (AWS Security Groups, GCP VPC firewall rules, Azure NSG), or host-based firewall (ufw, iptables) to block all external access to the ports SGLang's disaggregation module listens on. These should only ever be reachable from within a trusted internal network segment. This is your most important immediate action, even before a patch is available.
- Update SGLang to the latest patched release as soon as it is published. Monitor the official SGLang GitHub repository at github.com/sgl-project/sglang and the project's release notes for a patch addressing CVE-2026-3060. At time of writing, check for any version released after the CVE publication date; version numbers will be confirmed in the project's security advisory. Do not run unpatched versions of SGLang with the disaggregation module enabled on any network-accessible system.
- Audit your deployment for signs of compromise and rotate all credentials. Review server logs for unexpected outbound connections, unfamiliar processes, or unusual network traffic originating from your SGLang hosts. Run a file integrity check on your model files and check environment variables for any stored API keys or cloud credentials, then rotate all of them immediately. If you suspect a breach, treat the affected host as fully compromised and rebuild from a clean image rather than attempting in-place remediation.
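Once the firewall rules from the first step are in place, it is worth confirming from the outside that they actually hold. The sketch below is one simple way to do that with Python's standard library, assuming the disaggregation listener speaks TCP. The helper name port_is_reachable is hypothetical, and the host and port in the note that follows are placeholders, not SGLang defaults.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host and performs the handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Run it from a machine outside your trusted segment, for example port_is_reachable("203.0.113.10", 30000) with your own public address and port substituted in. A result of True means the port is still exposed and the firewall rule needs another look. A result of False is reassuring but not proof, since it reflects only the network path from that one vantage point.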
Vulnerability Summary: CVE-2026-3060 | CVSS 9.8 (Critical) | SGLang Encoder Parallel Disaggregation Module | Unauthenticated Remote Code Execution via pickle.loads() | Cross-Platform | No active exploitation confirmed at time of publication.