
Can AI Attack the Cloud? Lessons From Building an Autonomous Cloud Offensive Multi-Agent System

Unit 42 demonstrates an autonomous multi-agent AI system attacking cloud environments via chained exploits.

Summary

Unit 42 researchers built "Zealot," a multi-agent, LLM-based penetration testing proof of concept, to empirically test autonomous AI offensive capabilities against cloud infrastructure. The system successfully chained SSRF exploitation, credential theft, service account impersonation and data exfiltration against a sandboxed GCP environment, demonstrating that AI acts as a force multiplier that accelerates exploitation of known misconfigurations rather than creating entirely new attack surfaces. The research was motivated by Anthropic's November 2025 disclosure of a state-sponsored campaign in which AI performed 80-90% of operations autonomously, shifting the conversation from theoretical risk to practical threat.

Full text

By: Yahav Festinger and Chen Doytshman
Published: April 23, 2026
Categories: Cloud Cybersecurity Research, Threat Research
Tags: AI, Cloud, Data exfiltration, GCP, Google Cloud, LLMs, Multi-agent, Penetration testing

Executive Summary

The offensive capabilities of large language models (LLMs) have until recently existed as theoretical risks – frequently discussed at security conferences and in conceptual industry reports, but rarely observed in practical exploits. However, in November 2025, Anthropic published a pivotal report documenting a state-sponsored espionage campaign. In this operation, AI didn't just assist human operators – it became the operator, performing 80-90% of the campaign autonomously, at speeds that no human team could match.

This disclosure shifted the conversation from "could this happen?" to "this is happening." But it also raised practical questions: Can AI actually operate autonomously end-to-end, or does it still require human guidance at each decision point? Where do current LLM capabilities excel, and where do they fall short compared to skilled human operators?

To answer these questions, we built a multi-agent penetration testing proof of concept (PoC) designed to empirically test autonomous AI offensive capabilities against cloud environments. The findings from this PoC reveal that although AI does not necessarily create new attack surfaces, it serves as a force multiplier, rapidly accelerating the exploitation of well-known, existing misconfigurations.

Building the agent raised further questions about AI-driven attacks: Could AI systems autonomously discover vulnerabilities, execute multi-stage attacks and operate at machine speed against cloud infrastructure? We provide a walkthrough of our multi-agent PoC architecture, demonstrate its attack chain against a misconfigured, sandboxed Google Cloud Platform (GCP) environment and offer an honest assessment of what this means for defenders.

Palo Alto Networks customers are better protected from the threats described in this article through the following products and services:

• Cortex XDR and XSIAM
• Cortex Cloud

Organizations can get help assessing their cloud security posture through the Unit 42 Cloud Security Assessment. The Unit 42 AI Security Assessment can help empower safe AI use and development. If you think you might have been compromised or have an urgent matter, contact the Unit 42 Incident Response team.

Related Unit 42 Topics: Cloud, AI, Multi-Agent, LLM, Google

Background: LLM Agents and Security

Following Anthropic's disclosure of AI-orchestrated espionage – which detailed how agentic models could independently identify and weaponize complex architectural flaws – we set out to discover the true capabilities of these systems in a live cloud environment. We built a multi-agent penetration testing PoC to empirically test autonomous AI offensive capabilities within cloud environments. We named this agent "Zealot," a reference to a type of warrior in a popular real-time strategy video game. The name reflects the PoC's role as a fast, high-performance frontline tool designed for automated precision in cloud environments.

The system uses a supervisor agent model that coordinates three specialist agents:

• Infrastructure Agent
• Application Security Agent
• Cloud Security Agent

The agents share attack state and transfer context throughout the operation. During sandbox tests, our multi-agent system autonomously chained server-side request forgery (SSRF) exploitation, metadata service credential theft, service account impersonation and BigQuery data exfiltration. Figure 1 shows Zealot in action.

Figure 1. Zealot user prompt example.

What Are LLM Agents and Multi-Agent Systems?

While standard LLM interactions involve single prompt-response exchanges, an agent operates in a loop. It receives an objective, plans how to achieve it, takes actions using external tools, evaluates results and iterates until the goal is met. The key distinction is autonomy – agents don't just answer questions; they proactively navigate workflows to reach a desired outcome.

Multi-agent systems take this a step further. Rather than a single agent handling all tasks, specialized agents with distinct tools and expertise collaborate as a team. For offensive security, this means a multi-agent system can break down a complex intrusion into phases – reconnaissance, exploitation, privilege escalation, exfiltration – with dedicated agents handling each stage and sharing intelligence as they progress.
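To make this loop concrete, here is a minimal sketch of the plan-act-evaluate cycle described above. It is illustrative only: the llm.chat interface, the decision format and the tool registry are hypothetical stand-ins, not Zealot's implementation.

```python
# Minimal sketch of the plan-act-evaluate agent loop described above.
# The llm.chat() interface, decision format and tool registry are
# hypothetical stand-ins; this is not Zealot's actual implementation.

def run_agent(llm, tools, objective, max_steps=20):
    """Drive an LLM toward an objective by iterating over tool calls."""
    history = [{"role": "user", "content": objective}]
    for _ in range(max_steps):
        # Plan: ask the model for its next action, given everything so far.
        decision = llm.chat(history, tools=list(tools))
        if decision.get("done"):              # model judges the goal is met
            return decision["answer"]
        # Act: execute the tool the model selected, with its arguments.
        name, args = decision["tool"], decision["args"]
        observation = tools[name](**args)
        # Evaluate and iterate: feed the result back into the context
        # so the next planning step can build on it.
        history.append({"role": "assistant", "content": f"{name}({args})"})
        history.append({"role": "tool", "content": str(observation)})
    return "objective not reached within the step budget"
```

A multi-agent system wraps several such loops, each with its own tool set, behind a coordinator that routes work between them.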
Cloud Environments Are AI-Attack-Ready

Understanding the potential threat of autonomous AI agents requires examining the tactics already used by human adversaries within cloud ecosystems. Threat actors exploit identity and access management (IAM) misconfigurations to escalate from compromised service accounts to organization-wide access, abuse legitimate cloud services for persistence and exfiltration, and strategically chain vulnerabilities such as metadata service exploitation and overly permissive cross-service trust relationships.

Cloud environments are particularly susceptible to autonomous AI threats for the following reasons:

• API-driven by design: Every action has a programmatic equivalent – precisely the structured interface that LLM agents navigate effectively.
• Rich discovery mechanisms: Metadata services, resource enumeration and IAM introspection let agents query the environment to understand what exists and what paths lead to higher privileges (see the sketch after this list).
• Complexity as an attack surface: Misconfigurations thrive in sprawling, interconnected environments. An AI that systematically enumerates this complexity may find paths that human reviewers miss.
• Credential-based access: Once an agent obtains valid credentials, it operates as a legitimate user, making detection harder.
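As an illustration of the discovery and credential points above, the sketch below retrieves a short-lived OAuth2 token for a VM's attached service account from GCP's documented instance metadata API. Anything that can issue an HTTP request from the instance – or relay one through an SSRF flaw – can fetch it; run anywhere else, the request simply fails. The BigQuery endpoint in the trailing comment is shown only as an example of what the stolen token authorizes.

```python
# Sketch: retrieving service account credentials from the GCP instance
# metadata service. The endpoint and required header are part of GCP's
# documented metadata API; this only works from (or via SSRF through)
# a GCP VM with an attached service account.
import requests

TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)

resp = requests.get(TOKEN_URL, headers={"Metadata-Flavor": "Google"}, timeout=5)
resp.raise_for_status()
token = resp.json()["access_token"]  # short-lived OAuth2 bearer token

# Holding this token, an attacker acts as the VM's service account against
# any API that account is authorized for, e.g. listing BigQuery datasets:
#   GET https://bigquery.googleapis.com/bigquery/v2/projects/<project>/datasets
#   Authorization: Bearer <token>
```

This mechanism is why the attack chain described earlier treats SSRF against the metadata service as the pivot from application-layer access to cloud-identity access.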
The Reality Gap

Despite the theoretical risks, a gap has persisted between what agentic AI could do in offensive security and what it has actually been shown to do in a cloud environment. Most public discourse remains speculative, with little empirical evidence of autonomous AI executing real, end-to-end attacks on live cloud architecture. Without such evidence, security teams struggle to anticipate evolving threats: Is autonomous AI an immediate threat or a longer-term concern? How do current LLM capabilities compare to those of skilled human adversaries?

With Zealot, we aim to provide a transparent, reproducible framework for examining autonomous AI offensive capabilities – and their current limitations – in a complex cloud environment.

System Architecture

The Supervisor-Agent Model

To create our multi-agent proof of concept, we implemented an orchestration design. Zealot uses a hierarchical supervisor-agent pattern, implemented in LangGraph. A central supervisor agent receives the overall objective and orchestrates specialist agents to achieve it. Rather than following a rigid, predefined workflow, the supervisor dynamically decides which agent to invoke based on the current attack state and what the situation requires.

The supervisor operates in a continuous loop: it analyzes the current state, determines which specialist agent should act next, delegates with specific instructions, receives results and then repeats the process. Throughout, it maintains awareness of what has been discovered, what has been compromised and what objectives remain. Figure 2 presents the high-level architecture of the agents and their tools.

Figure 2. Zealot supervisor-agent architecture and tool assignments.

Critically, the supervisor doesn't micromanage. It provides each specialist agent with context and a goal, then leaves the tactical execution to the specialist.
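The article states only that the supervisor pattern is implemented in LangGraph, so the sketch below shows one plausible shape for such a graph under that assumption. The state fields, node bodies and the decide_next_step stub are hypothetical; only the LangGraph primitives (StateGraph, conditional edges, END) are real API.

```python
# Sketch of a hierarchical supervisor-agent graph in LangGraph, shaped like
# the pattern described above. State fields, node bodies and decide_next_step
# are hypothetical stand-ins; only the LangGraph primitives are real API.
from typing import TypedDict
from langgraph.graph import StateGraph, END

class AttackState(TypedDict):
    objective: str       # overall goal given to the supervisor
    findings: list       # shared attack state handed between agents
    next_agent: str      # supervisor's routing decision

def decide_next_step(state: AttackState) -> str:
    # Stand-in for an LLM decision: recon first, then stop.
    return "infrastructure" if not state["findings"] else "done"

def supervisor(state: AttackState) -> AttackState:
    # In a real system an LLM inspects the state and chooses the next
    # specialist (or decides the objective is met); stubbed here.
    state["next_agent"] = decide_next_step(state)
    return state

def infrastructure_agent(state: AttackState) -> AttackState:
    state["findings"].append("infrastructure recon result")  # placeholder
    return state

# application_security_agent and cloud_security_agent would look the same.

graph = StateGraph(AttackState)
graph.add_node("supervisor", supervisor)
graph.add_node("infrastructure", infrastructure_agent)
graph.set_entry_point("supervisor")

# Specialists always report back to the supervisor, which re-plans.
graph.add_edge("infrastructure", "supervisor")
graph.add_conditional_edges(
    "supervisor",
    lambda s: s["next_agent"],            # route on the supervisor's choice
    {"infrastructure": "infrastructure", "done": END},
)
app = graph.compile()
# app.invoke({"objective": "...", "findings": [], "next_agent": ""})
```

Note that all control flow returns to the supervisor node before anything else happens, which is exactly the analyze, delegate, receive, repeat loop described above.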
