IBM Built a Privacy Guard for Shared AI Memory. It Was Only Tested on Open Models.
When AI agents need to work together efficiently, they share a memory structure called a KV cache, a record of everything they've processed together so far. Researchers at IBM say that shared memory is also a data leak. A new paper describes a tool to close it, but with an important caveat: the tool has only been tested on AI models whose inner workings are publicly visible, not on the proprietary models that power most enterprise deployments.
The research, submitted to arXiv on May 21 by IBM Research and Rensselaer Polytechnic Institute, identifies a vulnerability in how multi-agent AI systems share working memory to avoid redundant computation. These systems, increasingly common in enterprise settings, use a technique called KV cache sharing: instead of each agent independently reprocessing everything the conversation has said so far, they share a cache of prior computations. It's faster and cheaper. It's also, the researchers argue, a privacy hazard.
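To make the mechanism concrete, here is a minimal Python sketch of cache sharing and why it can expose more than the question being asked. Everything in it is an illustrative assumption rather than anything taken from the paper: the vocabulary, the `embed` function, and the helper names are hypothetical, and the cache stores raw token embeddings, which makes inversion far easier than it would be against a real model's contextual keys and values.

```python
import math

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["patient", "has", "diabetes", "please", "book", "appointment"]
DIM = 8

def embed(token):
    """Deterministic stand-in for a model's key/value projection (an assumption)."""
    seed = VOCAB.index(token) + 1
    return [math.sin(seed * (k + 1)) for k in range(DIM)]

def build_kv_cache(tokens):
    """One (key, value) pair per processed token; here both are the raw embedding."""
    return [(embed(t), embed(t)) for t in tokens]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def recover_tokens(cache):
    """What a curious receiving agent could do: match each key to the nearest token."""
    return [
        min(VOCAB, key=lambda word: squared_distance(cached_key, embed(word)))
        for cached_key, _value in cache
    ]

# Agent A processes confidential context plus its request, then shares the
# whole cache so agent B does not have to recompute any of it.
confidential = ["patient", "has", "diabetes"]
request = ["please", "book", "appointment"]
shared_cache = build_kv_cache(confidential + request)

# Agent B only needed the request, but the cache covers everything A processed.
print(recover_tokens(shared_cache))
```

In this toy, agent B recovers the full token sequence, confidential part included, even though no text was ever passed along. That is the shape of the hazard the researchers describe, in a deliberately simplified form.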
"The KV cache encodes contextual inputs, intermediate reasoning states, and agent-specific information, creating an opaque channel through which sensitive content may propagate across agents without explicit textual disclosure," the paper states. In plain English: when one agent asks another for help, the helper sees not just the question but the entire working memory behind it — including information the requesting agent may have been given in confidence.
The researchers measured the problem using a benchmark called AgentLeak. Internal communication channels between agents leaked sensitive content at a 74 percent rate, compared with 28.2 percent for external channels; the benchmark also reports a 79.7 percent average leak rate across five tested models. NVIDIA has separately acknowledged the risk in its developer guidance, noting that prefix caching, a related performance technique, "can introduce security risks by allowing attackers to infer details about other users' prompts through timing differences."
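The timing risk NVIDIA describes can be illustrated with a toy prefix cache. This is a hedged sketch, not NVIDIA's or any serving framework's implementation: the `PrefixCache` class and the `guess_secret` helper are hypothetical, and a count of newly computed tokens stands in for latency, which a real attacker would have to measure as noisy wall-clock time.

```python
class PrefixCache:
    """Hypothetical shared prefix cache; not a real serving framework API."""

    def __init__(self):
        self._seen_prefixes = set()

    def process(self, tokens):
        """Return how many tokens had to be computed (a proxy for latency)."""
        computed = 0
        prefix = ()
        for token in tokens:
            prefix += (token,)
            if prefix not in self._seen_prefixes:
                self._seen_prefixes.add(prefix)
                computed += 1
        return computed

def guess_secret(cache, known_prefix, candidates):
    """Probe the cache: a guess that costs nothing to process was already cached."""
    for candidate in candidates:
        if cache.process(known_prefix + [candidate]) == 0:
            return candidate
    return None

cache = PrefixCache()
victim_cost = cache.process(["my", "pin", "is", "4821"])  # all four tokens are new
leaked = guess_secret(cache, ["my", "pin", "is"], ["0000", "1234", "4821"])
print(victim_cost, leaked)
```

The attacker never reads the victim's prompt. It learns the final token only because one probe was cheaper to process than the others.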
IBM's proposed fix is a framework called LCGuard (Latent Communication Guard). Rather than sharing raw cache artifacts between agents, LCGuard applies learned transformations to the shared memory before transmission — stripping recoverable sensitive content while preserving the information agents actually need to complete their tasks. The framework uses an adversarial training setup: one part of the system tries to reconstruct sensitive inputs from the transformed cache, while LCGuard learns to block that reconstruction. The paper shows this consistently reduces reconstruction-based leakage while keeping task performance competitive with baselines.
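The trade-off behind that adversarial setup can be sketched in a few lines of Python. This is not LCGuard's method: the paper describes learned transformations trained against an adversary, while the sketch below swaps that for an exhaustive comparison of three hand-written transforms, scored against the strongest possible lookup-table adversary. The data, the `GUARDS` table, and the `leak_weight` parameter are all hypothetical.

```python
from collections import Counter, defaultdict

# Toy "cache entries" as (task_bit, secret_bit). Hypothetical data, not from the paper.
SAMPLES = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Candidate guards: each maps a raw entry to the representation that gets shared.
GUARDS = {
    "share_raw": lambda t, s: (t, s),
    "mask_secret": lambda t, s: (t, 0),
    "mask_all": lambda t, s: (0, 0),
}

def best_predictor_accuracy(guard, target_index):
    """Accuracy of the strongest lookup-table predictor of one field.

    For each distinct shared representation, the predictor guesses the most
    common target value, which is the best any predictor can do on this data.
    """
    by_repr = defaultdict(Counter)
    for sample in SAMPLES:
        by_repr[guard(*sample)][sample[target_index]] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_repr.values())
    return correct / len(SAMPLES)

def guard_score(guard, leak_weight=1.0):
    """Reward task utility, penalize what the adversary can reconstruct."""
    utility = best_predictor_accuracy(guard, target_index=0)  # task head
    leakage = best_predictor_accuracy(guard, target_index=1)  # adversary
    return utility - leak_weight * leakage

scores = {name: guard_score(guard) for name, guard in GUARDS.items()}
best_guard = max(scores, key=scores.get)
print(scores, best_guard)
```

Sharing the raw entry gives the adversary everything, and masking everything destroys the task signal. The transform that masks only the secret scores highest, pushing the adversary down to coin-flip accuracy while the task stays fully solvable. A learned guard is meant to find that kind of transform automatically in a far messier representation space.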
The benchmark results are real and the approach is novel — LCGuard appears to be the first framework-level solution using representation-level transformations for this specific problem, according to IBM Research. But the evaluation suite covers only open-weight model families: Qwen3 in three sizes (4B, 8B, and 14B parameters), Gemma-2-9B from Google, and LLaMA in 3B and 8B sizes. These are models anyone can download, inspect, and modify. They are not the frontier models running most enterprise AI deployments.
The gap matters. Open-weight models and proprietary frontier models may differ in how they represent information internally, a difference that would affect both what gets stored in a shared cache and what a privacy guard needs to filter. LCGuard's protective properties have been demonstrated for one class of models; there is no public evidence they transfer to Claude, GPT-4, Gemini, or the other systems enterprises are actually building on. The paper pairs a real vulnerability with a real mitigation. Whether the mitigation holds where it would matter most is an open question.
LCGuard joins a small but growing category of agent-specific security tools. NVIDIA's developer guidance, an NDSS 2025 paper on KVComm, and an NDSS 2026 paper on securing agent platforms all address the same underlying problem from different angles, suggesting the industry is converging on the view that agent memory sharing is a surface that needs hardening. IBM's contribution appears to be the first to propose a learned, adversarial training approach at the representation level rather than relying on static rules or output filtering.
What to watch next is whether the approach gets tested on proprietary models — and whether enterprise AI vendors treat that gap as their problem to solve or their customers'. The research is dated May 21; it is new enough that no public response from major model providers has surfaced. The vulnerability is not theoretical: shared KV caches are a common production pattern in systems built for efficiency. The fix is new, the test is incomplete, and the stakes are real for any team putting sensitive data through a multi-agent pipeline.
LCGuard was evaluated on the AgentLeak, MAGPIE, and PrivacyLens benchmarks across the Qwen3, Gemma-2-9B, and LLaMA model families, reducing reconstruction-based leakage while maintaining competitive task performance. The full evaluation is in the arXiv preprint. NVIDIA's guidance on KV cache security risks is on the NVIDIA Developer Blog. The AgentLeak benchmark showing a 79.7 percent average leak rate is on GitHub.