MemPrivacy: Secure Edge-Cloud Framework for AI Memory

Key Takeaways

  • Enables personalized AI memory without exposing sensitive user data like passwords or health records to cloud providers.
  • Limits memory utility loss to within 1.6%, compared with the 41.87% accuracy drop observed with traditional irreversible masking, which often breaks AI reasoning.
  • Provides a modular, edge-based framework that integrates easily with existing memory systems like Mem0 and LangMem.

As LLM-powered agents transition from research to production, developers face a significant design tension: the more personalized and useful cloud-hosted memory becomes, the more sensitive user data it exposes. To address this, researchers from MemTensor, HONOR Device, and Tongji University have introduced MemPrivacy, an edge-cloud framework that protects user data through local reversible pseudonymization without compromising the semantic utility of AI memory systems.

Solving the Cloud Memory Privacy Gap

In standard edge-cloud deployments, raw user data—including health information, financial details, and passwords—is transmitted to the cloud for processing. This architecture leaves sensitive information vulnerable to multi-turn memory attacks and leakage, which have been shown to reach success rates of up to 69% and 75%, respectively. Previous attempts to mitigate these risks, such as masking sensitive values with generic tokens, often destroy the semantic structure required for AI agents to function effectively.
MemPrivacy resolves this by replacing sensitive content with typed placeholders before the data leaves the user's device. Because these placeholders retain semantic context, the cloud model can reason and store memories normally without ever accessing the actual private values. When the cloud returns a response, the local device performs a secure lookup to restore the original information, so the user receives a coherent, personalized experience.
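
To make the mask-and-restore idea concrete, here is a minimal Python sketch of reversible pseudonymization. It is an illustration only, not the MemPrivacy implementation: the regular-expression detectors stand in for the on-device model, and the `Pseudonymizer` class, the type names, and the `<TYPE_N>` placeholder format are all assumptions made for this example.

```python
import re

# Hypothetical detectors. A real deployment would use an on-device model;
# these regular expressions are illustrative stand-ins only.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


class Pseudonymizer:
    """Keeps the value <-> placeholder mapping on the local device."""

    def __init__(self):
        self.forward = {}   # (kind, value) -> placeholder
        self.reverse = {}   # placeholder -> original value
        self.counters = {}  # kind -> number of placeholders issued

    def _placeholder(self, kind, value):
        key = (kind, value)
        if key not in self.forward:
            n = self.counters.get(kind, 0) + 1
            self.counters[kind] = n
            token = f"<{kind}_{n}>"  # assumed placeholder format
            self.forward[key] = token
            self.reverse[token] = value
        return self.forward[key]

    def mask(self, text):
        """Replace each detected value with a typed placeholder."""
        for kind, pattern in PATTERNS.items():
            text = pattern.sub(
                lambda m, k=kind: self._placeholder(k, m.group(0)), text
            )
        return text

    def restore(self, text):
        """Swap known placeholders back to their original values."""
        return re.sub(
            r"<[A-Z]+_\d+>",
            lambda m: self.reverse.get(m.group(0), m.group(0)),
            text,
        )


p = Pseudonymizer()
masked = p.mask("Email me at alice@example.com or call 555-123-4567")
print(masked)             # the cloud only ever sees this string
print(p.restore(masked))  # the device recovers the original text
```

Because the placeholder keeps the type of the value it replaces, a cloud model can still tell that an email address or phone number was mentioned, which is what distinguishes this approach from masking with a single generic token.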

A Granular Approach to Data Protection

The research team established a four-level privacy taxonomy, ranging from PL1 to PL4, to define the sensitivity of different data types. PL1 covers general preferences and habits, while PL2 includes identifiable information like names and email addresses. PL3 encompasses highly sensitive data such as medical records and financial details, and PL4 covers immediately exploitable credentials like passwords and API keys. Users can configure their masking threshold, allowing for granular control over the balance between privacy and utility.
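
A configurable threshold over these four levels could look like the following Python sketch. The level names come from the taxonomy above, but the entity-type names, the `TYPE_LEVELS` table, the `should_mask` helper, and the "mask everything at or above the threshold" semantics are assumptions for illustration; the article does not specify how the threshold is applied.

```python
from enum import IntEnum


class PrivacyLevel(IntEnum):
    PL1 = 1  # general preferences and habits
    PL2 = 2  # identifiable information (names, email addresses)
    PL3 = 3  # highly sensitive data (medical records, financial details)
    PL4 = 4  # immediately exploitable credentials (passwords, API keys)


# Hypothetical mapping from entity types to levels, for illustration only.
TYPE_LEVELS = {
    "PREFERENCE": PrivacyLevel.PL1,
    "NAME": PrivacyLevel.PL2,
    "EMAIL": PrivacyLevel.PL2,
    "MEDICAL_RECORD": PrivacyLevel.PL3,
    "FINANCIAL_DETAIL": PrivacyLevel.PL3,
    "PASSWORD": PrivacyLevel.PL4,
    "API_KEY": PrivacyLevel.PL4,
}


def should_mask(entity_type, threshold):
    """Return True if the entity's level is at or above the threshold.

    Unknown types are treated as PL4 so the sketch fails closed
    (an assumption, not something stated in the article).
    """
    level = TYPE_LEVELS.get(entity_type, PrivacyLevel.PL4)
    return level >= threshold


# With a PL3 threshold, emails stay visible but passwords are masked.
print(should_mask("EMAIL", PrivacyLevel.PL3))
print(should_mask("PASSWORD", PrivacyLevel.PL3))
```

Under this reading, lowering the threshold toward PL1 trades utility for privacy, while raising it toward PL4 masks only credentials.
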
To evaluate the framework, the researchers developed MemPrivacy-Bench, a dataset containing over 155,000 privacy instances. The best-performing model, MemPrivacy-4B-RL, achieved an F1 score of 85.97% on the benchmark, significantly outperforming general-purpose models like Gemini-3.1-Pro. Furthermore, when tested across memory systems such as LangMem, Mem0, and Memobase, the framework limited memory utility loss to within 1.6%, a substantial improvement over the 41.87% accuracy drop observed with traditional irreversible masking.

Deployment and Integration

MemPrivacy is designed for practical implementation, with models available in 0.6B, 1.7B, and 4B parameter scales based on the Qwen3 architecture. The framework operates in a three-stage pipeline: uplink desensitization on the device, cloud-side processing of sanitized inputs, and downlink restoration of original data. By keeping the original-to-placeholder mappings in a local secure database, the system ensures that the same values are consistently pseudonymized across different sessions.
This modular design allows developers to integrate MemPrivacy into existing memory-augmented agent architectures without requiring changes to cloud-side configurations. With per-message inference times under two seconds, the framework provides a viable path for deploying privacy-preserving AI agents on edge hardware.
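
The three-stage pipeline and the local mapping store can be sketched in Python with the standard-library `sqlite3` module. This is a simplified illustration under stated assumptions, not the project's code: the table schema, the function names (`open_store`, `placeholder_for`, `uplink`, `downlink`, `fake_cloud`), and the placeholder format are hypothetical, the entity list is assumed to come from an on-device detector, and a plain SQLite file stands in for whatever secure database a real deployment would use.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the local mapping store. Pass a file path to persist it."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mapping ("
        " kind TEXT NOT NULL,"
        " value TEXT NOT NULL,"
        " placeholder TEXT NOT NULL UNIQUE,"
        " PRIMARY KEY (kind, value))"
    )
    return conn


def placeholder_for(conn, kind, value):
    """Return the existing placeholder for a value, or allocate a new one."""
    row = conn.execute(
        "SELECT placeholder FROM mapping WHERE kind = ? AND value = ?",
        (kind, value),
    ).fetchone()
    if row:
        return row[0]
    count = conn.execute(
        "SELECT COUNT(*) FROM mapping WHERE kind = ?", (kind,)
    ).fetchone()[0]
    token = f"<{kind}_{count + 1}>"  # assumed placeholder format
    conn.execute("INSERT INTO mapping VALUES (?, ?, ?)", (kind, value, token))
    conn.commit()
    return token


def uplink(conn, text, entities):
    """Stage 1: desensitize on the device.

    `entities` is a list of (kind, value) pairs, assumed to be produced
    by an on-device detector that is out of scope for this sketch.
    """
    for kind, value in entities:
        text = text.replace(value, placeholder_for(conn, kind, value))
    return text


def fake_cloud(sanitized):
    """Stage 2: stand-in for the cloud model, which sees only placeholders."""
    return f"Noted: {sanitized}"


def downlink(conn, text):
    """Stage 3: restore original values on the device."""
    rows = conn.execute("SELECT value, placeholder FROM mapping").fetchall()
    for value, token in rows:
        text = text.replace(token, value)
    return text


conn = open_store()
sanitized = uplink(conn, "My password is hunter2", [("PASSWORD", "hunter2")])
print(sanitized)
print(downlink(conn, fake_cloud(sanitized)))
```

Because `placeholder_for` looks a value up before allocating a new token, the same value maps to the same placeholder on every call; opening the store with a file path instead of `:memory:` would carry that consistency across sessions.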
