LCGuard: Latent Communication Guard for Safe KV Sharing in Multi-Agent Systems
Multi-agent systems powered by Large Language Models (LLMs) are increasingly using "latent communication" to work together more efficiently. Instead of writing out messages in natural language, agents share their internal "Key-Value (KV) caches"—the high-dimensional data structures that represent the model's current reasoning state. While this makes collaboration faster and more effective, it creates a significant security risk: these caches contain sensitive, private information that is not meant to be shared. LCGuard is a new framework designed to protect this private data by transforming these internal representations before they are sent between agents, ensuring that sensitive details cannot be reconstructed by unauthorized parties while still allowing the agents to complete their tasks.
The Hidden Risk of Latent Communication
When agents share KV caches, they are essentially sharing their "working memory." Because these caches are dense and complex, they often hold onto contextual inputs or intermediate reasoning steps that were never intended to be part of the final output. An adversary—such as a compromised agent or a malicious actor monitoring the communication channel—could potentially train a decoder to "read" these caches and extract private user data or agent-specific secrets. Current safety tools are designed to filter text or monitor actions, but they are blind to this type of deep, representation-level leakage.
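To make the threat concrete, here is a minimal sketch of a reconstruction attack on a toy stand-in for a shared cache. Everything in it is an assumption made for illustration: the "cache" is a noisy linear mix of synthetic private inputs, and the adversary is a least-squares linear decoder. Real KV caches are far larger and nonlinear, and a real attacker would likely train a neural decoder.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, d_private, d_cache = 1000, 4, 32

# Toy assumption: the shared cache is a noisy linear mix of the private inputs.
private = rng.normal(size=(n_samples, d_private))
mixing = rng.normal(size=(d_private, d_cache))
cache = private @ mixing + 0.1 * rng.normal(size=(n_samples, d_cache))

# The adversary sees (cache, private) pairs for 800 samples and fits a linear decoder.
train, test = slice(0, 800), slice(800, None)
decoder, *_ = np.linalg.lstsq(cache[train], private[train], rcond=None)

# Evaluate on held-out caches that were not used to fit the decoder.
attack_mse = float(np.mean((cache[test] @ decoder - private[test]) ** 2))
baseline_mse = float(np.mean(private[test] ** 2))  # error of always guessing zero
print(f"attack MSE: {attack_mse:.4f}  guess-zero MSE: {baseline_mse:.4f}")
```

In this toy setting the decoder's held-out error is far below the guess-zero baseline, which is the kind of representation-level leakage that text filters never see.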
How LCGuard Works
LCGuard treats the communication process as a strategic game between two players: a "communicator" and an "adversary."
- The Communicator: This component learns a transformation function that modifies the KV cache before it is transmitted. Its goal is to strip away sensitive information while keeping the data useful enough for the receiving agent to perform its job.
- The Adversary: This component acts as a stress-tester. It is trained to try to reconstruct the original sensitive inputs from the modified caches.
By pitting these two against each other in a minimax optimization process, LCGuard forces the communication function to become increasingly effective at hiding private data. The system uses a tunable parameter to balance the trade-off between privacy and task performance, allowing developers to decide how much "noise" or transformation is necessary based on their specific security needs.
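The shape of this game can be sketched on toy data. The example below is an illustration under stated assumptions, not LCGuard's training procedure: the learned transformation is replaced by a search over three hand-picked linear candidates, the adversary and the task head are least-squares linear decoders (so the adversary's best response has a closed form), and the names `lam`, `candidates`, and `best_response_mse` are hypothetical. The communicator's score is the task loss minus `lam` times the adversary's reconstruction loss, so a lower score is better.

```python
import numpy as np

def best_response_mse(features, target):
    """MSE of the best least-squares linear decoder from features to target."""
    weights, *_ = np.linalg.lstsq(features, target, rcond=None)
    return float(np.mean((features @ weights - target) ** 2))

rng = np.random.default_rng(0)
n, d = 2000, 6
secret = rng.normal(size=(n, 1))   # private attribute (toy)
task = rng.normal(size=(n, 1))     # task-relevant signal (toy)
m_secret = rng.normal(size=(1, d)) # direction that carries the secret
m_task = rng.normal(size=(1, d))   # direction that carries the task signal
cache = secret @ m_secret + task @ m_task

# Candidate communicator transforms (hand-picked stand-ins for a learned one).
unit = m_secret / np.linalg.norm(m_secret)
candidates = {
    "identity": np.eye(d),                          # share the cache unchanged
    "project_out_secret": np.eye(d) - unit.T @ unit,  # remove the secret direction
    "send_nothing": np.zeros((d, d)),               # share no information at all
}

lam = 0.5  # privacy/utility trade-off weight
scores = {}
for name, transform in candidates.items():
    shared = cache @ transform
    task_loss = best_response_mse(shared, task)
    recon_loss = best_response_mse(shared, secret)  # adversary's best response
    scores[name] = task_loss - lam * recon_loss     # communicator minimizes this

best = min(scores, key=scores.get)
print(best, scores)
```

With `lam` set to zero the privacy term drops out and nothing distinguishes unprotected sharing from the projection. Raising `lam` rewards transforms that make the adversary fail, and blocking all communication is penalized through the task loss.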
Performance and Privacy Results
Empirical testing across several model families, including Qwen3, Gemma-2, and LLaMA, shows that LCGuard lowers the success rate of reconstruction attacks relative to standard, unprotected KV sharing. Noise-injection baselines in the style of differential privacy often significantly degrade the agents' ability to work together, whereas LCGuard maintains competitive task accuracy and helpfulness. The framework is flexible enough to handle different communication structures, such as sequential or hierarchical setups, and mitigates both local leakage (between two agents) and system-level leakage (where information is aggregated across multiple communication paths).
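A toy example shows why untargeted noise tends to cost utility. The sketch below is an assumption-laden illustration rather than a reproduction of the reported experiments: it adds plain isotropic Gaussian noise (not a calibrated differential-privacy mechanism) to a synthetic cache and measures how well a linear probe recovers a private signal and a task signal. The helper `probe_mse` and the noise levels are hypothetical.

```python
import numpy as np

def probe_mse(features, target):
    """In-sample MSE of the best least-squares linear probe."""
    weights, *_ = np.linalg.lstsq(features, target, rcond=None)
    return float(np.mean((features @ weights - target) ** 2))

rng = np.random.default_rng(0)
n = 5000
secret = rng.normal(size=(n, 1))  # private attribute (toy)
task = rng.normal(size=(n, 1))    # task-relevant signal (toy)
mixing = rng.normal(size=(2, 8))
cache = np.hstack([secret, task]) @ mixing

results = {}
for sigma in (0.0, 1.0, 4.0):
    # Isotropic noise hits every direction of the cache equally.
    noisy = cache + sigma * rng.normal(size=cache.shape)
    results[sigma] = (probe_mse(noisy, secret), probe_mse(noisy, task))

for sigma, (secret_mse, task_mse) in results.items():
    print(f"sigma={sigma}: secret MSE={secret_mse:.3f}, task MSE={task_mse:.3f}")
```

Because the noise does not distinguish private directions from useful ones, the two errors rise together in this setup, whereas a targeted transformation can aim at the private component alone.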
Key Takeaways
LCGuard provides a principled way to secure the "hidden" channels of multi-agent systems. By focusing on the reconstructability of data rather than just the final text output, it addresses a fundamental vulnerability in modern AI collaboration. It demonstrates that it is possible to maintain the efficiency benefits of latent communication without sacrificing the privacy of the underlying inputs, offering a robust defense against representation-level data exposure.