Silent Drift: How LLMs Are Quietly Breaking Organizational Access Control
LLMs can generate syntactically valid but semantically flawed access control policies, quietly expanding organizational access boundaries beyond intent.
Summary
Researcher Vatsal Gupta warns that large language models used to generate policy as code (Rego, Cedar) often produce access control policies that compile successfully but contain subtle semantic flaws (missing conditions, hallucinated attributes, omitted deny logic) that silently expand permissions beyond intent. These failures are difficult to detect because they don’t trigger build errors or alerts, and they accumulate over time as policies are continuously generated and deployed, creating a large and poorly understood attack surface. The solution is not to abandon LLM assistance but to introduce validation layers and testing, and to treat authorization logic as a high-risk domain rather than trusting generated code by default.
Full text
Business efficiency demands maximum use of AI assistance, but where policy as code is concerned, AI can introduce serious policy flaws.

The shift to policy as code for organizational security, compliance, and operational rules is being accompanied by increased use of LLMs to help produce the raw code. This makes sense. A primary purpose of AI within business is to improve human efficiency, and writing policy in languages like Rego or Cedar is not easy. AI is increasingly used to streamline the process.

But there is a problem. These generated policies often look correct, compile successfully, and still grant the wrong access. This shouldn’t be a complete surprise. AI-generated applications are already known to introduce security issues by choosing the simplest solution over the most secure one. However, a security flaw in an organizational policy that is designed to prevent security flaws is especially problematic.

Independent researcher (and senior security engineer at Apple) Vatsal Gupta has been examining these issues, and discussed them with SecurityWeek. “LLMs are being introduced into engineering workflows. Developers are using them to generate infrastructure code, security rules, and now even access control policies,” he says. The appeal is obvious. “Instead of writing policy logic manually, teams can describe intent in plain language and let the model generate the enforcement logic.”

But it doesn’t always work that way. “LLM-generated policies are often syntactically valid but semantically incorrect,” continues Gupta. “One missing condition, a misinterpreted attribute, or an incorrect action can completely redefine who gets access to what.”

These are not obvious failures. They don’t break builds or trigger alerts. But they quietly expand access boundaries. And Gupta’s research has found several recurring failure patterns.

A common issue, he tells us, is missing contextual constraints. “A policy that is supposed to limit access based on region, department, or ownership may omit that condition entirely. The generated policy still looks clean and valid, but it now applies globally instead of within the intended scope.”

A second is missing deny logic. “Many access control policies rely on a baseline deny posture with specific exceptions. LLMs often capture the exception but fail to encode the underlying restriction. The result is a policy that allows more than intended, even though it appears to implement the requirement.”

Then there is the standard recurring problem with LLMs: the potential to hallucinate. “Models sometimes introduce attributes that do not exist in the actual system schema. The policy compiles, but at runtime it behaves unpredictably because it relies on data that is not present or incorrectly mapped.”

Temporal and contextual conditions are frequently dropped. “Policies that depend on time windows, approvals, or session context are simplified into static rules. What was meant to be controlled, time-bound access becomes always-on access.”

And the last concern: “Even action misclassification can occur. A policy intended to restrict a sensitive action like deletion may be translated into a broader or different operation. The difference may be small in wording, but large in impact.”

All these failings are natural outcomes of an LLM’s tendency to interpret and simplify language.
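To make these failure patterns concrete, consider a minimal Rego sketch. It is illustrative only: the package name, rule names, and input fields (input.user.department and so on) are assumptions made for this example, not taken from Gupta’s research.

package authz

import rego.v1

# Baseline deny posture: nothing is allowed unless an allow rule matches.
default allow := false

# Intended rule: engineers may read documents, but only within their own
# department, the contextual constraint that generated policies often drop.
allow if {
    input.action == "read"
    input.user.role == "engineer"
    input.resource.department == input.user.department
}

# A subtly flawed variant a model might emit instead. It parses, compiles,
# and looks clean, but the department check is gone, so the rule now
# applies globally:
#
#   allow if {
#       input.action == "read"
#       input.user.role == "engineer"
#   }
#
# Hallucinated attributes fail in a similar, quieter way: a condition on a
# field such as input.user.clearance_level compiles without complaint, but
# if the real schema never supplies that field, runtime behavior rests on
# data that is not there.

Both versions pass a syntax check; only the first enforces the stated intent.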
The result can be a policy that looks good, feels good, and tastes good, but simply isn’t good. And the not-goodness is difficult to detect.

Over time, these small deviations accumulate. Policies are no longer static artifacts reviewed occasionally: they are generated, updated, and deployed continuously. “As more policies are generated, deployed, and reused, the risk compounds,” continues Gupta. Organizations may believe they are enforcing least privilege while actually drifting toward over-permissioned environments.

“If the generation process is not reliable, the risk becomes systemic,” he adds. “Organizations may end up with thousands of subtly flawed policies. Each flaw may be individually small, but collectively they create a large and difficult-to-understand attack surface.”

The solution, he says, is not to abandon LLMs but to change our trust model, especially where policy is concerned. “Generated policies should not be treated as correct by default; validation layers between generation and enforcement should be introduced to ensure all required components are present, correct, and consistent with expected behavior; policies should be tested, not just compiled; and deny-by-default principles should be enforced explicitly.”

Most importantly, he adds, “Organizations need to treat authorization logic as a high-risk domain.” Just because a model can generate code does not mean that code is safe to deploy without scrutiny.

“As we move toward AI-assisted security engineering, the goal should not just be automation. It should be correctness, auditability, and trust, because in authorization, ‘almost correct’ isn’t good enough,” Gupta told SecurityWeek.
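One practical way to act on that advice is to gate every generated policy behind behavioral tests rather than a compile check alone. The sketch below, which assumes the hypothetical authz package from the earlier example, uses OPA’s built-in test runner (opa test):

package authz_test

import rego.v1

import data.authz

# Positive case: an engineer can read a document in their own department.
test_engineer_reads_own_department if {
    authz.allow with input as {
        "action": "read",
        "user": {"role": "engineer", "department": "payments"},
        "resource": {"department": "payments"}
    }
}

# Negative case: the same engineer must be denied outside their department.
# This is the assertion that exposes the flawed variant: the over-broad
# policy compiles happily but fails this test.
test_engineer_denied_other_department if {
    not authz.allow with input as {
        "action": "read",
        "user": {"role": "engineer", "department": "payments"},
        "resource": {"department": "hr"}
    }
}

The negative test is the one that matters: it encodes the deny expectation explicitly, the deny-by-default discipline Gupta recommends, and turns silent permission drift into a visible test failure.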