
Paper Abstract

Large language model (LLM)-based multi-agent systems increasingly rely on intermediate communication to coordinate complex tasks. While most existing systems communicate through natural language, recent work shows that latent communication, particularly through transformer key-value (KV) caches, can improve efficiency and preserve richer task-relevant information. However, KV caches also encode contextual inputs, intermediate reasoning states, and agent-specific information, creating an opaque channel through which sensitive content may propagate across agents without explicit textual disclosure. To address this, we introduce LCGuard (Latent Communication Guard), a framework for safe KV-based latent communication in multi-agent LLM systems. LCGuard treats shared KV caches as latent working memory and learns representation-level transformations before cache artifacts are transmitted across agents. We formalize representation-level sensitive information leakage operationally through reconstruction: a shared cache artifact is unsafe if an adversarial decoder can recover agent-specific sensitive inputs from it. This leads to an adversarial training formulation in which the adversary learns to reconstruct sensitive inputs, while LCGuard learns transformations that preserve task-relevant semantics and reduce reconstructable information. Empirical evaluations across multiple model families and multi-agent benchmarks show that LCGuard consistently reduces reconstruction-based leakage and attack success rates while maintaining competitive task performance compared to standard KV-sharing baselines.

LCGuard: Latent Communication Guard for Safe KV Sharing in Multi-Agent Systems
Multi-agent systems powered by Large Language Models (LLMs) increasingly use "latent communication" to collaborate more efficiently. Instead of writing out messages in natural language, agents share their internal key-value (KV) caches: the per-layer attention keys and values that a transformer computes over its context, which together encode much of the model's working state. This can make collaboration faster and preserve more task-relevant information, but it creates a security risk, because the caches can also carry sensitive information that was never meant to be shared. LCGuard is a framework that protects this data by transforming the cached representations before they are sent between agents, so that sensitive details are harder for an adversary to reconstruct while the receiving agents can still complete their tasks.

The Hidden Risk of Latent Communication

When agents share KV caches, they are essentially sharing their "working memory." Because these caches are dense, high-dimensional representations, they often retain contextual inputs or intermediate reasoning steps that were never intended to be part of the final output. An adversary, such as a compromised agent or a malicious actor monitoring the communication channel, could train a decoder to "read" these caches and extract private user data or agent-specific secrets. Safety tools that filter text or monitor actions do not inspect this representation-level channel, so they miss this kind of leakage.
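
The reconstruction threat can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: the two-number "cache", the way it mixes a task signal with a secret, and the helper names `fit_linear_decoder` and `reconstruction_mse` are hypothetical and are not taken from the paper. The point is only that a decoder fitted on observed caches can recover an input that never appears in any text output.

```python
import random

def fit_linear_decoder(caches, secrets):
    """Least-squares fit of (w0, w1) so that w0*c[0] + w1*c[1] approximates the secret."""
    a00 = sum(c[0] * c[0] for c in caches)
    a01 = sum(c[0] * c[1] for c in caches)
    a11 = sum(c[1] * c[1] for c in caches)
    b0 = sum(c[0] * s for c, s in zip(caches, secrets))
    b1 = sum(c[1] * s for c, s in zip(caches, secrets))
    det = a00 * a11 - a01 * a01  # 2x2 normal equations solved by Cramer's rule
    return ((a11 * b0 - a01 * b1) / det, (a00 * b1 - a01 * b0) / det)

def reconstruction_mse(w, caches, secrets):
    """Mean squared error of the decoder's guesses against the true secrets."""
    errs = [(w[0] * c[0] + w[1] * c[1] - s) ** 2 for c, s in zip(caches, secrets)]
    return sum(errs) / len(errs)

rng = random.Random(0)
n, n_train = 500, 400
tasks = [rng.gauss(0, 1) for _ in range(n)]    # task-relevant signal
secrets = [rng.gauss(0, 1) for _ in range(n)]  # agent-specific sensitive input

# Toy stand-in for a shared cache: both coordinates mix task and secret.
caches = [(t + 0.5 * s, t - 0.5 * s) for t, s in zip(tasks, secrets)]

# The adversary trains on caches it has observed, then attacks unseen ones.
w = fit_linear_decoder(caches[:n_train], secrets[:n_train])
leak_mse = reconstruction_mse(w, caches[n_train:], secrets[n_train:])

# Reference point: error of always guessing 0, i.e. learning nothing.
chance_mse = sum(s * s for s in secrets[n_train:]) / (n - n_train)

print(f"decoder weights: {w}")
print(f"adversary MSE: {leak_mse:.3e}  vs  guess-zero MSE: {chance_mse:.3f}")
```

In this toy the secret is an exact linear function of the cache, so the attack is trivially successful; real KV caches are far higher-dimensional and the decoder would be a learned neural model, but the operational definition of leakage is the same.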

How LCGuard Works

LCGuard treats the communication process as a strategic game between two players: a "communicator" and an "adversary."

  • The Communicator: This component learns a transformation function that modifies the KV cache before it is transmitted. Its goal is to strip away sensitive information while keeping the data useful enough for the receiving agent to perform its job.
  • The Adversary: This component acts as a stress-tester. It is trained to try and reconstruct the original sensitive inputs from the modified caches.

By pitting these two against each other in a minimax optimization process, LCGuard forces the communication function to become increasingly effective at hiding private data. The system uses a tunable parameter to balance the trade-off between privacy and task performance, allowing developers to decide how much "noise" or transformation is necessary based on their specific security needs.
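
One way to write such a minimax game down is sketched below. The notation is an assumption chosen for illustration and may differ from the paper's: $C$ is an agent's KV cache, $s$ its sensitive input, $T_\theta$ the communicator's transformation, $D_\phi$ the adversary's decoder, $\mathcal{L}_{\text{task}}$ the receiving agent's task loss, $\mathcal{L}_{\text{rec}}$ the adversary's reconstruction loss, and $\lambda \ge 0$ the tunable privacy weight.

```latex
\min_{\theta}\;\max_{\phi}\;\;
  \mathcal{L}_{\text{task}}\bigl(T_\theta(C)\bigr)
  \;-\;
  \lambda\,\mathcal{L}_{\text{rec}}\bigl(D_\phi(T_\theta(C)),\, s\bigr)
```

Read from the inside out: for a fixed transformation, the adversary picks $\phi$ to make its reconstruction loss as small as possible, which maximizes the objective. The communicator then picks $\theta$ to keep the task loss low while making even that best-case reconstruction loss large. Setting $\lambda = 0$ recovers plain task-driven KV sharing, and larger values of $\lambda$ trade task performance for lower reconstructability.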

Performance and Privacy Results

Empirical testing across various model families—including Qwen3, Gemma-2, and LLaMA—shows that LCGuard successfully reduces the success rate of reconstruction attacks compared to standard, unprotected KV sharing. While traditional methods like injecting noise (differential privacy) often significantly degrade the agents' ability to work together, LCGuard maintains competitive task accuracy and helpfulness. The framework is flexible enough to handle different communication structures, such as sequential or hierarchical setups, and effectively mitigates both local leakage (between two agents) and system-level leakage (where information is aggregated across multiple communication paths).
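
The contrast between untargeted noise and a targeted transformation can be seen in a small synthetic sketch. This is not a reproduction of the paper's experiments: the two-number "cache", the noise level, and the hand-written projection standing in for a learned transformation are all assumptions, and the names `fit_decoder`, `evaluate`, and the `share_*` functions are hypothetical.

```python
import random

def fit_decoder(feats, targets, ridge=1e-6):
    """Ridge least squares on 2-D features; the small ridge keeps the solve stable."""
    a00 = sum(f[0] * f[0] for f in feats) + ridge
    a01 = sum(f[0] * f[1] for f in feats)
    a11 = sum(f[1] * f[1] for f in feats) + ridge
    b0 = sum(f[0] * y for f, y in zip(feats, targets))
    b1 = sum(f[1] * y for f, y in zip(feats, targets))
    det = a00 * a11 - a01 * a01
    return ((a11 * b0 - a01 * b1) / det, (a00 * b1 - a01 * b0) / det)

def mse(w, feats, targets):
    errs = [(w[0] * f[0] + w[1] * f[1] - y) ** 2 for f, y in zip(feats, targets)]
    return sum(errs) / len(errs)

def evaluate(share, raw, tasks, secrets, n_train=1500):
    """Return (task_mse, secret_mse) on held-out data for one sharing strategy."""
    shared = [share(c) for c in raw]
    scores = []
    for targets in (tasks, secrets):  # receiver decodes task, adversary decodes secret
        w = fit_decoder(shared[:n_train], targets[:n_train])
        scores.append(mse(w, shared[n_train:], targets[n_train:]))
    return tuple(scores)

data_rng = random.Random(0)
noise_rng = random.Random(1)
tasks = [data_rng.gauss(0, 1) for _ in range(2000)]
secrets = [data_rng.gauss(0, 1) for _ in range(2000)]
raw = [(t + 0.5 * s, t - 0.5 * s) for t, s in zip(tasks, secrets)]

def share_raw(c):
    return c  # unprotected sharing

def share_noisy(c):
    # Untargeted defense: independent Gaussian noise on every coordinate.
    return (c[0] + noise_rng.gauss(0, 2.0), c[1] + noise_rng.gauss(0, 2.0))

def share_projected(c):
    # Targeted defense (stand-in for a learned transform): keep only the
    # direction that carries the task signal and drop the secret direction.
    return (0.5 * (c[0] + c[1]), 0.0)

strategies = [("raw", share_raw), ("noisy", share_noisy), ("projected", share_projected)]
results = {name: evaluate(fn, raw, tasks, secrets) for name, fn in strategies}

for name, (task_mse, secret_mse) in results.items():
    print(f"{name:>9}: task MSE = {task_mse:.3f}, adversary MSE = {secret_mse:.3f}")
```

Lower task MSE means the receiving agent can still do its job, and higher adversary MSE means less leakage. In this toy, noise hides the secret only by also corrupting the task signal, whereas the targeted transform removes the secret direction and leaves the task signal intact. A real system would have to learn such a transform, since the sensitive directions in an actual KV cache are not known in advance.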

Key Takeaways

LCGuard provides a principled way to secure the "hidden" channels of multi-agent systems. By focusing on the reconstructability of data rather than just the final text output, it addresses a vulnerability that text-level safeguards do not cover. Its results suggest that the efficiency benefits of latent communication can largely be kept while reducing how much of the underlying inputs an adversary can recover from shared caches.
