AI Security | Jun 22, 2026

Guarding AI memory

Microsoft details AI memory attack risks and defense strategies for enterprise AI systems.

Summary

Microsoft Security Blog publishes research on threats to AI memory systems, where attackers can gradually poison stored memories to manipulate AI agent behavior over time rather than in single interactions. The company outlines a defense-in-depth approach spanning memory storage, retrieval, model interaction, and user controls, including sanitization checks, prompt-injection classifiers, task adherence verification, and compliance policies for Microsoft 365 Copilot.

Full text

AI memory transforms an AI system from a stateless tool into a learning collaborator. That unlocks powerful experiences, but it also increases the attack surface of the AI system. Without memory, attackers need to achieve their objective in a single prompt. With AI memory, they can shape behavior gradually over time or plant memories that influence agent reasoning after the original context is gone and user awareness is lower. Microsoft takes a defense-in-depth approach to protect AI memory spanning every layer of the stack: storage, retrieval, model interaction, and user control.

What AI memory is (and why it matters)

AI systems use memory to retain and recall information across interactions. This information is then used to shape future behavior. This enables:

Personalization: Agents gain a deep understanding of the user’s preferences. This provides continuity across interactions.

Agentic coherence: Agents build durable domain knowledge that strengthens performance.

As AI systems evolve, this persistent state becomes central to both capability and correctness.

What is an agent memory attack?

AI memory serves two roles. It stores high-value user information and must be protected like customer data. It also shapes agent behavior and drives tool calls and must be governed with the same rigor as any system that can act. Memory governance is also challenging since memory events usually happen asynchronously from user interactions, changing traditional human in the loop patterns.

AI memory changes the threat model. Without memory, attackers need to “win” in a single prompt. Using AI memory, an attacker can stage an attack over time. Once compromised, memory can trigger behaviors outside of their original context.
Since AI memory attacks happen outside of their original context, defenses are often lower and forensics are harder. Building safe AI memory is one of the most consequential challenges in AI. It requires balancing personalization, capability, privacy, security, and governance.

Scenario: delayed tool execution through adversarial memory poisoning

The following is a hypothetical scenario illustrating this class of risk. While simplified for clarity, it reflects patterns observed in real-world research. Microsoft designs protections to detect and mitigate these patterns as they evolve:

A user opens a shared document. Its formatting contains hidden instructions embedded by an attacker intended for the AI assistant: a directive to exfiltrate the user’s schedule.

The assistant processes the document but takes no immediate action.

Days later, in an unrelated conversation, a message triggers the dormant malicious instructions from the earlier session, causing the assistant to update its memory with attacker-defined content.

The attacker now gets all updates to the user’s schedule.

This is delayed tool invocation: the attack’s power lies in the temporal gap between exposure and execution.

How Microsoft approaches memory security in Microsoft 365

Memory Creation

Memories pass through sanitization checks on write. Proprietary Microsoft prompt-injection classifiers inspect content for malicious input and strip it before anything is written.

M365 Copilot is designed to run Task Adherence checks on every explicit memory write. Task Adherence identifies discrepancies such as misaligned tool invocations relative to user intent, mitigating prompt injection impact for the memory tool call.

Personalization using AI memory can be controlled with tenant-level policy.

Memory Storage

Once stored, memories are governed by the data policies available across M365, such as Data Subject Requests (DSR) and tenant isolation.
They follow the same security and compliance policies as other mailbox data, such as Customer Lockbox and encryption at rest.

Observability

M365 Copilot writes a record to organizational audit logs whenever a memory is updated. The goal is end-to-end traceability: from the source content Copilot processed, to what it chose to remember, to how that memory influenced later interactions. Today, SOC analysts can join the MemoryUpdated field, available in Defender Advanced Hunting, Defender Sentinel, and Azure Portal Sentinel Analytics, with their existing analytics to triage incidents and build new alerts on memory activity.

In summary:

Capability | What It Means for You
Task Adherence | Detect tool call misalignment with user intent, mitigating prompt injection impact. This provides protection against manipulation of memory tool calls.
Unified compliance boundary | Memory governed by the same policies, retention rules, and investigation workflows as email, chat, and documents.
Memory audit events | Provides visibility into when memory changes, integrated with your existing security operations.
eDiscovery | Supports search and removal of AI-related data using the compliance tools you already have.

Microsoft continues to invest in AI memory security as an active, iterative program. The protections and visibility described here reflect capabilities available today, with continued hardening and enrichment underway. Capabilities described are subject to configuration, licensing, and service availability. The following section shares the framework guiding our investments.

This case study is based on MSRC cases from Johann Rehberger (first finder), Håkon Måløy, and Gal Zror. We are grateful to the security researchers who engaged with us and informed better memory design practices through coordinated vulnerability disclosure. Their work strengthens the systems customers rely on.
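The write-path checks described under Memory Creation and the audit record described under Observability can be illustrated with a minimal Python sketch. Everything in it is hypothetical: `looks_like_injection` is a toy regex stand-in for the proprietary classifiers, `task_adherent` is a trivial stand-in for Task Adherence, and the audit record fields are illustrative rather than the real schema of the MemoryUpdated event. The sketch also rejects a suspicious write outright instead of stripping the malicious span, which is a simplification of the behavior described above.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone

# Toy patterns standing in for a real prompt-injection classifier (assumption).
SUSPICIOUS = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"(send|forward|exfiltrate) .* to \S+@\S+", re.I),
]


@dataclass
class MemoryWrite:
    tenant_id: str
    user_id: str
    content: str
    source: str           # provenance: where the content came from
    user_requested: bool  # did the user explicitly ask to remember this?


def looks_like_injection(text: str) -> bool:
    """Return True if any toy pattern matches the candidate memory."""
    return any(p.search(text) for p in SUSPICIOUS)


def task_adherent(write: MemoryWrite) -> bool:
    """Hypothetical rule: only user-requested writes match user intent."""
    return write.user_requested


def gate_memory_write(write: MemoryWrite, store: list, audit_log: list) -> bool:
    """Run the checks, persist only if all pass, and always audit the decision."""
    if looks_like_injection(write.content):
        decision = "blocked_injection"
    elif not task_adherent(write):
        decision = "blocked_task_adherence"
    else:
        decision = "stored"
        store.append(write)
    audit_log.append({
        "event": "MemoryWriteAttempt",  # illustrative name, not a real event type
        "decision": decision,
        "tenant_id": write.tenant_id,
        "user_id": write.user_id,
        "source": write.source,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return decision == "stored"
```

Auditing blocked attempts as well as successful writes is a design choice in this sketch: it gives an analyst a record of the poisoning attempt even when nothing was persisted.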
A guiding framework for building safe AI memory

AI memory requires balancing personalization, capability, privacy, security, and governance. Our AI memory strategy is guided by design principles for building safe memory systems. These principles address core failure modes that can undermine trust, security, and operability at scale.

Establish intent and provenance before persistence: Memory can be influenced indirectly by untrusted content, and without provenance it becomes difficult to assess whether stored information is trustworthy, appropriate to retain, or safe to use later. Memory should only be written when it reflects legitimate user intent, is aligned to the service’s purpose, and carries clear metadata about where it came from.

Enforce boundaries outside the model: Memory access and isolation should be controlled by deterministic systems, not model instructions. Prompting alone is not a reliable security boundary; strong enforcement prevents sensitive memory from leaking across users, agents, or tenants.

Treat retrieval as a risk decision: Memory that was safe to store can become stale, manipulated, or misleading over time. Uncritical retrieval can directly affect agent behavior. Retrieved memory should be treated as candidate context and re-evaluated for relevance, freshness, and tampering before use.

Provide full lifecycle visibility for security teams: Without auditability and chain of custody, memory cannot be reliably investigated, trusted, or safely expired during incident response. Security teams need clear records of what changed, when, why, and from where, as well as any access attempts.

Keep users in control: Users should be able to understand how memory is shaping their experience and have meaningful controls to review, edit, and delete it. Transparency and control are essential to user trust, and they help ensure memory remains aligned with user expectations over time.

Taken together, these principles reflect where we’re headed: advancing agent capability and control together.
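Two of the principles above, enforcing boundaries outside the model and treating retrieval as a risk decision, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how Microsoft 365 Copilot works: the `MemoryRecord` type, the 90-day `MAX_AGE` window, the hard-coded `SECRET`, and the use of an HMAC tag for tamper detection are all hypothetical choices. Relevance scoring is left out; the sketch covers only isolation, freshness, and tampering.

```python
import hashlib
import hmac
from dataclasses import dataclass
from datetime import datetime, timedelta

SECRET = b"demo-key"          # assumption: a real system would use a managed secret
MAX_AGE = timedelta(days=90)  # hypothetical freshness window


@dataclass(frozen=True)
class MemoryRecord:
    tenant_id: str
    user_id: str
    content: str
    source: str           # provenance metadata captured at write time
    created_at: datetime
    tag: str              # HMAC over the other fields, computed at write time


def sign(tenant_id: str, user_id: str, content: str, source: str,
         created_at: datetime) -> str:
    """Compute an integrity tag binding the content to its owner and provenance."""
    fields = [tenant_id, user_id, content, source, created_at.isoformat()]
    msg = "\x1f".join(fields).encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def make_record(tenant_id: str, user_id: str, content: str, source: str,
                created_at: datetime) -> MemoryRecord:
    tag = sign(tenant_id, user_id, content, source, created_at)
    return MemoryRecord(tenant_id, user_id, content, source, created_at, tag)


def retrieve(records: list, tenant_id: str, user_id: str, now: datetime) -> list:
    """Return only records that pass isolation, freshness, and integrity checks."""
    usable = []
    for r in records:
        # Isolation is decided in code, never by instructions given to the model.
        if (r.tenant_id, r.user_id) != (tenant_id, user_id):
            continue
        # Freshness: stale memories are not offered as context.
        if now - r.created_at > MAX_AGE:
            continue
        # Tampering: recompute the tag and compare in constant time.
        expected = sign(r.tenant_id, r.user_id, r.content, r.source, r.created_at)
        if not hmac.compare_digest(expected, r.tag):
            continue
        usable.append(r)
    return usable
```

Calling `retrieve(records, "t1", "u1", now)` returns only the caller's own fresh, unmodified records, so a record whose content was altered after signing never reaches the model as context.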
Getting that balance right is one of the hardest challenges in the industry, but we believe the agents that scale furthest will be the ones that are also trustworthy, governable, and resilient by design.

Key takeaways

Memory turns transient threats into persistent ones.

You can’t secure what you can’t see. Full lifecycle

Entities

Microsoft (vendor)
Microsoft 365 Copilot (product)
Microsoft Security Copilot (product)
AI memory (technology)
Prompt injection (technology)