Separable Expert Architecture: Toward Privacy-Preserving LLM Personalization via Composable Adapters and Deletable User Proxies
Modern large language models (LLMs) are often personalized by fine-tuning their weights on user data. This creates a significant privacy problem: once a user’s data is baked into the model’s shared parameters, it becomes nearly impossible to remove that specific user's influence without retraining the entire model. This paper introduces the Separable Expert Architecture (SEA), a design that keeps user-specific data entirely separate from the model's shared core. By storing personalization in isolated, deletable files, the authors turn the complex task of "machine unlearning" into a simple, deterministic deletion operation.
A Three-Layer Design
The SEA architecture organizes the model into three distinct layers to ensure that user information never enters the shared weights:
1. Base Layer: A frozen, shared model that provides general language capabilities. It is never modified by user data.
2. Expert Layer: A set of shared, domain-specific adapters (e.g., for coding or security) that are dynamically combined based on the user's query.
3. User Layer: A "proxy" artifact for each user. This small file (roughly 2–5 MB) contains a routing bias, steering vectors, and a personal adapter. These components work together at inference time to personalize the output without ever changing the underlying model weights.
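The layering above can be illustrated with a toy sketch. It is not the paper's implementation: the names `UserProxy`, `route`, and `apply_steering` are hypothetical, the softmax router is an assumed stand-in for however SEA combines experts, and the personal adapter is omitted for brevity. The sketch only shows that the user's proxy is an optional input at inference time and that the shared expert scores are never modified.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class UserProxy:
    """Hypothetical per-user artifact (personal adapter omitted in this sketch)."""
    routing_bias: dict = field(default_factory=dict)  # expert name -> additive bias
    steering_vector: tuple = ()                       # added to a hidden state


def route(expert_scores: dict, proxy: Optional[UserProxy] = None) -> dict:
    """Softmax over shared expert scores, shifted by the user's routing bias."""
    bias = proxy.routing_bias if proxy is not None else {}
    biased = {name: s + bias.get(name, 0.0) for name, s in expert_scores.items()}
    top = max(biased.values())  # subtract the max for numerical stability
    exps = {name: math.exp(s - top) for name, s in biased.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def apply_steering(hidden: list, proxy: Optional[UserProxy] = None) -> list:
    """Add the user's steering vector; without a proxy the state is unchanged."""
    if proxy is None or not proxy.steering_vector:
        return list(hidden)
    return [h + s for h, s in zip(hidden, proxy.steering_vector)]
```

With equal scores for two experts, `route` returns equal weights when no proxy is supplied; a proxy that biases one expert shifts weight toward it without touching the shared scores.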
Deterministic Deletion
Because the user’s personal information is stored in a separate, external file rather than being absorbed into the model's shared weights, "unlearning" is straightforward. When a user requests data removal, the system simply deletes their specific proxy file. The authors verify this process by comparing the model's output after deletion to a non-personalized baseline. If the output matches the baseline, the system confirms that all user-specific influence has been removed. This approach provides a structural guarantee of privacy that traditional, weight-based models cannot offer.
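A minimal sketch of the delete-then-verify flow described above, under stated assumptions: proxies are stored as one JSON file per user, `generate` is a toy stand-in for model inference, and verification uses exact string equality against the non-personalized output. The paper's actual comparison criterion is not given in this summary, and all function names here are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def generate(prompt, proxy):
    """Toy stand-in for inference: the proxy only decorates the base answer."""
    base = f"answer({prompt})"
    return base if proxy is None else f"{base} [style={proxy['style']}]"


def load_proxy(store: Path, user_id: str):
    """Return the user's proxy, or None if no proxy file exists."""
    path = store / f"{user_id}.json"
    return json.loads(path.read_text()) if path.exists() else None


def delete_user(store: Path, user_id: str) -> None:
    """'Unlearning' reduces to removing one file."""
    (store / f"{user_id}.json").unlink(missing_ok=True)


def verification_pass_rate(store: Path, user_id: str, prompts) -> float:
    """Fraction of prompts whose output equals the non-personalized baseline."""
    proxy = load_proxy(store, user_id)
    passed = sum(generate(p, proxy) == generate(p, None) for p in prompts)
    return passed / len(prompts)


def demo():
    """Create a proxy, measure the pass rate, delete the proxy, measure again."""
    prompts = ["sort a list", "explain TLS"]
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        (store / "alice.json").write_text(json.dumps({"style": "terse"}))
        before = verification_pass_rate(store, "alice", prompts)
        delete_user(store, "alice")
        after = verification_pass_rate(store, "alice", prompts)
    return before, after
```

In this toy setting the pass rate is 0.0 while the proxy exists and 1.0 after deletion, because nothing user-specific lives outside the deleted file. A real system would compare model outputs, where sampling and numerical effects can keep the measured rate below 100%.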
Performance and Privacy
The researchers tested this architecture on Phi-3.5-mini and Llama-3.1-8B models. Their results show that the system successfully provides personalized responses while maintaining strict isolation between users. In tests, the deletion protocol achieved an 82–89% verification pass rate, indicating that in most tested cases the model's output returns to its baseline behavior once the proxy is removed. Furthermore, because user data never enters the shared weights, the architecture is inherently resistant to common privacy threats like model inversion and training-data extraction.
Key Considerations
While the SEA approach solves the problem of intractable unlearning, it involves a trade-off. The personalization is intentionally kept moderate in scope, limited by the size of the personal adapter, to ensure the system remains efficient and the separation guarantee stays clear. Additionally, while the architecture is compatible with privacy-preserving training methods like DP-SGD for the shared components, the system requires that cached baselines be refreshed whenever the shared model is updated. Overall, the paper demonstrates that by architecting for separation from the start, developers can provide personalized AI experiences that respect user privacy and the right to be forgotten.
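The baseline-refresh requirement mentioned above can be sketched as a cache that is invalidated whenever the shared model changes. This is an illustrative assumption about how such a cache might be organized, not the authors' design; `BaselineCache` and its methods are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class BaselineCache:
    """Caches non-personalized outputs for one version of the shared model."""
    generate_baseline: Callable[[str], str]
    model_version: str
    _entries: dict = field(default_factory=dict)

    def get(self, prompt: str) -> str:
        """Return the cached baseline, computing it on first use."""
        if prompt not in self._entries:
            self._entries[prompt] = self.generate_baseline(prompt)
        return self._entries[prompt]

    def on_model_update(self, new_version: str,
                        new_generate: Callable[[str], str]) -> None:
        """Drop every cached baseline when the shared model version changes."""
        if new_version != self.model_version:
            self.model_version = new_version
            self.generate_baseline = new_generate
            self._entries.clear()
```

Clearing the cache on update keeps deletion verification from comparing post-deletion outputs against baselines produced by an older shared model.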