AI Security | Apr 9, 2026

Apple Intelligence AI Guardrails Bypassed in New Attack

RSAC researchers bypass Apple Intelligence guardrails using Neural Execs and Unicode manipulation attacks.

Summary

Researchers from RSAC discovered a method to bypass Apple Intelligence's safety protocols by combining Neural Execs prompt injection attacks with Unicode manipulation techniques, achieving a 76% success rate across 100 test prompts. The attack could allow adversaries to force the local LLM to produce offensive content or manipulate private data within third-party applications. Apple was notified in October 2025 and deployed protections in iOS 26.4 and macOS 26.4; no evidence of malicious exploitation has been observed.

Full text

Researchers from RSAC, the organization that hosts the RSAC Conference, have found a way to bypass the safety protocols of Apple Intelligence with a high success rate.

Apple Intelligence is a deeply integrated personal intelligence system for iOS, iPadOS, and macOS that combines generative AI with personal context. It primarily processes tasks directly on Apple silicon via a compact on-device LLM. The AI draws on the user's unique context (messages, photos, and schedules) to power practical features such as system-wide writing tools and Siri. For more complex reasoning, it offloads requests to larger foundation models via Private Cloud Compute (PCC) on Apple's dedicated cloud infrastructure.

The researchers set out to bypass the local LLM's input and output filters (designed to block malicious input and prevent undesirable output), as well as its internal guardrails, in order to influence its actions. To achieve this, they combined two distinct adversarial techniques.

The first is Neural Execs, a known prompt injection attack that uses 'gibberish' inputs to trick the AI into executing arbitrary, attacker-defined tasks. These inputs act as universal triggers that do not need to be remade for different payloads.

The second method, used by the RSAC researchers to bypass the input and output filters, is Unicode manipulation. By writing the malicious output text backwards and using the Unicode right-to-left override function, they were able to bypass content restrictions. "Essentially, we encoded the malicious/offensive English-language output text by writing it backwards and using our Unicode hack to force the LLM to render it correctly," the researchers explained.
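The right-to-left-override trick described above can be illustrated with a short, self-contained sketch. This is not the researchers' actual exploit code; it is a hypothetical demonstration of the general principle: text stored reversed and wrapped in bidi override control characters can slip past a naive substring filter while still rendering in the original order in a bidi-aware display.

```python
# Sketch of the Unicode manipulation technique described above (illustrative
# only, not the researchers' code). The payload is stored reversed, so a
# naive filter that scans raw codepoints misses it, while a bidi-aware
# renderer displays it left-to-right again because of the RIGHT-TO-LEFT
# OVERRIDE control character (U+202E). "blocked phrase" is a stand-in for
# whatever string a content filter would reject.

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING (ends the override)

def encode_rtl(payload: str) -> str:
    """Reverse the payload and wrap it in bidi override controls."""
    return RLO + payload[::-1] + PDF

def naive_filter(text: str, banned: str) -> bool:
    """A simplistic filter that only checks raw codepoint order."""
    return banned in text

payload = "blocked phrase"
smuggled = encode_rtl(payload)

# The raw string no longer contains the banned substring...
print(naive_filter(smuggled, payload))              # False
# ...but stripping the controls and reversing recovers the original text.
print(smuggled.strip(RLO + PDF)[::-1] == payload)   # True
```

Defenses against this class of trick typically normalize or strip Unicode directional control characters (U+202A through U+202E, U+2066 through U+2069) before filtering, which is presumably part of what Apple's iOS 26.4 and macOS 26.4 mitigations address.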
Combining the two methods can allow attackers to force the local Apple Intelligence LLM to produce offensive content or, more critically, manipulate private data and functionality within third-party applications integrated with Apple Intelligence, such as health data or personal media.

The attack was tested with 100 random prompts, and the researchers achieved a success rate of 76%. They estimate that between 100,000 and 1 million users have installed apps that may be vulnerable to such attacks. "RSAC estimates that there were at least 200 million Apple Intelligence-capable devices in consumers' hands as of December 2025, and the Apple App Store already features apps using Apple Intelligence—so it's already a high-value target," the researchers noted.

Apple was notified in October 2025 and, according to RSAC Research, protections were rolled out in the recent iOS 26.4 and macOS 26.4 releases. The researchers have not seen any evidence of malicious exploitation.

Written by Eduard Kovacs (@EduardKovacs), senior managing editor at SecurityWeek. He worked as a high school IT teacher before starting a career in journalism in 2011. Eduard holds a bachelor's degree in industrial informatics and a master's degree in computer techniques applied in electrical engineering.

Entities

Apple (vendor)
Apple Intelligence (product)
iOS 26.4 (product)
macOS 26.4 (product)
Neural Execs (technology)
Unicode manipulation (technology)