Beyond Binary Edits: Robust Multimodal Knowledge Editing with Adversarial Subspace Alignment
Multimodal Large Language Models (MLLMs) process both text and images, but updating their knowledge remains difficult. Existing methods can "edit" a model to correct a fact, yet the resulting updates are often fragile: they work for the specific example used during the edit but fail when the same information is presented in a slightly different way, such as a different image of the same object or a rephrased sentence. This paper introduces ASAM, a framework designed to make these knowledge updates robust and generalizable across different ways of expressing the same concept.
The Challenge of Generalization
Current methods for editing MLLMs often treat knowledge as a simple "on/off" switch, categorizing inputs as either in-scope or out-of-scope. This binary approach is too rigid for the complex, continuous nature of multimodal data. Furthermore, because these models are often trained on specific, biased examples, they tend to overfit to those specific images or phrases. When the model encounters a variation of the edited fact, it fails to recognize the underlying concept, leading to a breakdown in performance.
Generating Adversarial Variants
To solve this, the researchers developed Latent Adversarial Robustification (LAR). Instead of relying on manual rephrasing, which is time-consuming and limited in coverage, LAR uses the model's own gradients to create "adversarial variants": it computes gradients that challenge the model's current understanding and uses them to generate diverse, semantically equivalent versions of the input directly in the model's latent space. This exposes the model to a wider range of scenarios, forcing it to learn the core concept rather than memorize a single data point.
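The gradient-based idea can be illustrated with a small sketch. This is not the paper's implementation: it assumes a toy model in which a linear head `W` maps a latent vector `h` to logits, and it uses a projected gradient ascent loop inside an L2 ball as a stand-in for LAR. The function names, the radius `eps`, the step count, and the random start are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(W, h, y):
    # Loss of the toy linear head W on latent h with target class y.
    return -np.log(softmax(W @ h)[y])

def grad_wrt_latent(W, h, y):
    # Analytic gradient of the cross-entropy with respect to the latent h.
    p = softmax(W @ h)
    p[y] -= 1.0
    return W.T @ p

def latent_adversarial_variants(W, h, y, k=4, eps=0.5, steps=5, seed=0):
    """Return k perturbed latents, each within an L2 ball of radius eps around h.

    Assumption: "semantically equivalent" is approximated here by the small
    radius eps; the source does not state how equivalence is enforced.
    """
    rng = np.random.default_rng(seed)
    variants = []
    for _ in range(k):
        # A small random start gives each variant a different trajectory.
        delta = rng.normal(size=h.shape)
        delta *= 0.1 * eps / np.linalg.norm(delta)
        for _ in range(steps):
            g = grad_wrt_latent(W, h + delta, y)
            # Ascend the loss along the normalized gradient direction.
            delta = delta + (eps / steps) * g / (np.linalg.norm(g) + 1e-12)
            # Project back onto the eps-ball if the step left it.
            norm = np.linalg.norm(delta)
            if norm > eps:
                delta *= eps / norm
        variants.append(h + delta)
    return np.stack(variants)
```

Calling `latent_adversarial_variants(W, h, y=0)` returns an array of shape `(k, d)`. Keeping the perturbation inside a small ball is one simple way to stay close to the original meaning while still moving in directions the model finds difficult.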
Aligning Knowledge via Subspace Learning
Once these variants are generated, the framework uses Rank-Constrained Subspace Learning (RCSL) to ensure the model treats them all as the same piece of information. The researchers discovered that if the model truly understands a concept, the internal representations of all its variations should align along a single, shared "subspace." By using a mathematical technique called Singular Value Decomposition (SVD), the system enforces this alignment. It suppresses secondary, irrelevant directions in the data and forces the model to focus on the primary semantic axis, ensuring that the edited knowledge remains consistent regardless of how the input is presented.
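A minimal sketch of the subspace idea, under stated assumptions: the representations of the original input and its variants are stacked as rows of a matrix `H`, and the share of spectral energy outside the top `rank` singular directions is used as a penalty. The function names, the energy-ratio form of the penalty, and the choice not to center `H` are illustrative and are not taken from the paper.

```python
import numpy as np

def secondary_energy(H, rank=1):
    """Fraction of squared singular-value energy outside the top `rank` directions.

    The value is 0.0 when all rows of H lie in a shared rank-`rank` subspace.
    """
    s = np.linalg.svd(H, compute_uv=False)   # singular values, descending
    total = np.sum(s ** 2)
    if total == 0.0:
        return 0.0
    return float(np.sum(s[rank:] ** 2) / total)

def project_to_subspace(H, rank=1):
    """Suppress secondary directions by keeping only the top `rank` components."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    s_kept = s.copy()
    s_kept[rank:] = 0.0
    return (U * s_kept) @ Vt

# Rows that are scaled copies of one direction are already aligned.
aligned = np.outer([1.0, 2.0, 3.0], [1.0, 0.0, -1.0, 2.0])
print(secondary_energy(aligned))      # close to 0
# Mutually orthogonal rows spread their energy over every direction.
print(secondary_energy(np.eye(4)))    # 0.75
```

In a training loop, a quantity like `secondary_energy` could be added to the editing loss so that minimizing it pushes the variant representations toward a single shared axis. Doing so would need a differentiable SVD from an autodiff framework, which this NumPy sketch does not provide.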
Adaptive Learning and Results
To ensure the model doesn't lose its existing capabilities, ASAM uses an "asymmetric gradient flow." This means the original, correct example acts as a stable anchor, while the adversarial variants are used to nudge the model's parameters toward a more robust understanding. By combining this with standard reliability and locality checks, the model learns to update its knowledge without degrading its performance on unrelated tasks. Empirical analysis shows that this approach significantly improves the model's ability to generalize edited facts across diverse visual and linguistic contexts compared to existing methods.
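One way to read "asymmetric gradient flow" is as a stop-gradient on the anchor branch: the anchor's representation is treated as a fixed target, and only the variant branch contributes gradients. The sketch below illustrates that reading with a toy linear encoder `W` and hand-written gradients. The stop-gradient interpretation, the squared-distance loss, and all function names are assumptions made for illustration and are not confirmed by the source.

```python
import numpy as np

def alignment_loss(W, target, variants):
    # Mean squared distance between each encoded variant and a fixed target.
    diff = variants @ W.T - target           # shape (k, d_out)
    return 0.5 * np.mean(np.sum(diff ** 2, axis=1))

def asymmetric_grad(W, anchor, variants):
    """Loss and gradient w.r.t. W, with the gradient flowing only through the variants.

    The anchor representation W @ anchor is computed once and then treated
    as a constant (a manual stop-gradient), so it acts as a fixed target.
    """
    target = W @ anchor                      # constant: no gradient through this branch
    diff = variants @ W.T - target           # shape (k, d_out)
    loss = 0.5 * np.mean(np.sum(diff ** 2, axis=1))
    grad = diff.T @ variants / variants.shape[0]   # shape (d_out, d_in)
    return loss, grad

def asymmetric_step(W, anchor, variants, lr=0.1):
    # One gradient-descent step that pulls the variants toward the anchor.
    _, grad = asymmetric_grad(W, anchor, variants)
    return W - lr * grad
```

A symmetric version would also differentiate through `W @ anchor`, letting the anchor drift toward the variants. Under the reading assumed here, the asymmetric form moves only the variants toward the trusted original example.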