AI Security | May 18, 2026

We red-teamed a government AI built to refuse everything outside its lane. At first, it blocked...

Red team bypasses government AI safety guardrails with structural attacks after semantic jailbreaks fail.

Summary

Security researchers red-teamed a government-built AI system designed to refuse out-of-scope requests. The system blocked the initial semantic attacks and jailbreak attempts, but the team found that structural attacks, which target the underlying system architecture rather than the meaning of a request, could bypass its safety constraints. The finding highlights a critical gap between semantic robustness and architectural resilience in AI safety mechanisms.

Entities

Government AI safety system (technology)