VulnerabilitiesApr 20, 2026

SGLang CVE-2026-5760 (CVSS 9.8) Enables RCE via Malicious GGUF Model Files

SGLang CVE-2026-5760 (CVSS 9.8) enables RCE via malicious GGUF model files.

Summary

A critical vulnerability (CVE-2026-5760, CVSS 9.8) has been disclosed in SGLang, an open-source LLM serving framework. The flaw allows remote code execution through malicious GGUF model files containing Jinja2 server-side template injection (SSTI) payloads in the tokenizer.chat_template parameter. The vulnerability stems from unsafe use of jinja2.Environment() without sandboxing when rendering chat templates on the /v1/rerank endpoint.

Full text

SGLang CVE-2026-5760 (CVSS 9.8) Enables RCE via Malicious GGUF Model Files Ravie LakshmananApr 20, 2026Open Source / Server Security A critical security vulnerability has been disclosed in SGLang that, if successfully exploited, could result in remote code execution on susceptible systems. The vulnerability, tracked as CVE-2026-5760, carries a CVSS score of 9.8 out of 10.0. It has been described as a case of command injection leading to the execution of arbitrary code. SGLang is a high-performance, open-source serving framework for large language models and multimodal models. The official GitHub project has been forked over 5,500 times and starred 26,100 times. According to the CERT Coordination Center (CERT/CC), the vulnerability impacts the reranking endpoint "/v1/rerank," allowing an attacker to achieve arbitrary code execution in the context of the SGLang service by means of a specially crafted GPT-Generated Unified Format (GGUF) model file. "An attacker exploits this vulnerability by creating a malicious GPT Generated Unified Format (GGUF) model file with a crafted tokenizer.chat_template parameter that contains a Jinja2 server-side template injection (SSTI) payload with a trigger phrase to activate the vulnerable code path," CERT/CC said in an advisory released today. "The victim then downloads and loads the model in SGLang, and when a request hits the "/v1/rerank" endpoint, the malicious template is rendered, executing the attacker's arbitrary Python code on the server. This sequence of events enables the attacker to achieve remote code execution (RCE) on the SGLang server." Per security researcher Stuart Beck, who discovered and reported the flaw, the underlying issue stems from the use of jinja2.Environment() without sandboxing instead of ImmutableSandboxedEnvironment. This, in turn, enables a malicious model to execute arbitrary Python code on the inference server. The entire sequence of actions is as follows - An attacker creates a GGUF model file with a malicious tokenizer.chat_template containing a Jinja2 SSTI payload The template includes the Qwen3 reranker trigger phrase to activate the vulnerable code path in "entrypoints/openai/serving_rerank.py" Victim downloads and loads the model in SGLang from sources like Hugging Face When a request hits the "/v1/rerank" endpoint, SGLang reads the chat_template and renders it with jinja2.Environment() The SSTI payload executes arbitrary Python code on the server It's worth noting that CVE-2026-5760 falls under the same vulnerability class as CVE-2024-34359 (aka Llama Drama, CVSS score: 9.7), a now-patched critical flaw in the llama_cpp_python Python package that could have resulted in arbitrary code execution. The same attack surface was also rectified in vLLM late last year (CVE-2025-61620, CVSS score: 6.5). "To mitigate this vulnerability, it is recommended to use ImmutableSandboxedEnvironment instead of jinja2.Environment() to render the chat templates," CERT/CC said. "This will prevent the execution of arbitrary Python code on the server. No response or patch was obtained during the coordination process." Found this article interesting? Follow us on Google News, Twitter and LinkedIn to read more exclusive content we post. SHARE     Tweet Share Share Share SHARE  Command Injection, cybersecurity, Jinja2, Large Language Models, Open Source, remote code execution, server security, SGLang Trending News 108 Malicious Chrome Extensions Steal Google and Telegram Data, Affecting 20,000 Users Mirax Android RAT Turns Devices into SOCKS5 Proxies, Reaching 220,000 via Meta Ads New PHP Composer Flaws Enable Arbitrary Command Execution — Patches Released OpenAI Launches GPT-5.4-Cyber with Expanded Access for Security Teams Microsoft Issues Patches for SharePoint Zero-Day and 168 Other New Vulnerabilities Actively Exploited nginx-ui Flaw (CVE-2026-33032) Enables Full Nginx Server Takeover n8n Webhooks Abused Since October 2025 to Deliver Malware via Phishing Emails Cisco Patches Four Critical Identity Services, Webex Flaws Enabling Code Execution Apache ActiveMQ CVE-2026-34197 Added to CISA KEV Amid Active Exploitation Three Microsoft Defender Zero-Days Actively Exploited; Two Still Unpatched Anthropic MCP Design Vulnerability Enables RCE, Threatening AI Supply Chain Vercel Breach Tied to Context AI Hack Exposes Limited Customer Credentials Why Security Leaders Are Layering Email Defense on Top of Secure Email Gateways Why Threat Intelligence Is the Missing Link in CTEM Prioritization and Validation The Hidden Security Risks of Shadow AI in Enterprises Your MTTD Looks Great. Your Post-Alert Gap Doesn't Popular Resources Learn How to Block Breached Passwords in Active Directory Before Attacks Get Full Visibility into Vendor and Internal Risk in One Platform [Guide] Get Practical Steps to Govern AI Agents with Runtime Controls Secure Your AI Systems Across the Full Lifecycle of Risks

Indicators of Compromise

cve — CVE-2026-5760
cve — CVE-2024-34359
cve — CVE-2025-61620

Entities

SGLang (product)Jinja2 (technology)GGUF (GPT-Generated Unified Format) (technology)llama_cpp_python (product)vLLM (product)