ResilPhase: Plug-and-Play Phase Mapping and Noise-Resilient Macro-Trajectory Extrapolation for Diffusion Acceleration

Key Takeaways

  • The adoption of powerful diffusion models is hindered by their significant inference latency.
  • Recent "cache-then-forecast" schemes alleviate this issue by accelerating DiTs using derivative-based polynomials, but they suffer from severe quality degradation at high acceleration ratios.
  • Our analysis reveals the root cause: the discrete extrapolation is performed on representations that are misaligned with the continuous diffusion trajectory and numerically unstable.
  • Thus, accelerated DiTs suffer from accumulated spatial errors, noisy derivative amplification, and high-order instability.
Paper Abstract

The adoption of powerful diffusion models is hindered by their significant inference latency. Recent "cache-then-forecast" schemes alleviate this issue by accelerating DiTs using derivative-based polynomials, but they suffer from severe quality degradation at high acceleration ratios. Our analysis reveals the root cause: the discrete extrapolation is performed on representations that are misaligned with the continuous diffusion trajectory and numerically unstable. Thus, accelerated DiTs suffer from accumulated spatial errors, noisy derivative amplification, and high-order instability. We therefore reformulate accelerated inference as stable macro-trajectory extrapolation in ordinary differential equation (ODE) space. Instead of predicting intermediate features, we align forecasting with the model's Global Drift (GD), i.e., the end-to-end state evolution, thereby eliminating feature inconsistency and memory overhead. However, even this smooth macro-trajectory remains vulnerable to the derivative fallacy: its higher-order temporal derivatives are intrinsically noisy. Thus, we introduce a derivative-free barycentric Lagrange extrapolator to effectively bypass derivative instability and approximation error. We further propose a bounded Phase Mapping that regularizes the extrapolation domain, suppressing oscillatory error growth. These elements collectively constitute ResilPhase, a noise-resilient acceleration framework. Experiments on FLUX.1-dev and HunyuanVideo demonstrate state-of-the-art fidelity under aggressive acceleration ratios.

ResilPhase: Plug-and-Play Phase Mapping and Noise-Resilient Macro-Trajectory Extrapolation for Diffusion Acceleration addresses the significant latency issues inherent in Diffusion Transformers (DiTs). While these models are powerful, their iterative denoising process requires many sequential steps, making real-time deployment difficult. Current acceleration methods attempt to predict intermediate features to skip steps, but they often suffer from poor image quality at high acceleration ratios due to numerical instability and error accumulation. ResilPhase introduces a new framework that stabilizes this process by changing how the model predicts its own evolution.

Moving Beyond Layer-by-Layer Prediction

Existing acceleration methods typically try to predict the output of every individual layer within the Transformer. This approach is problematic because small errors in early layers are amplified as they pass through the rest of the network, a "cascading" effect that degrades the final output. ResilPhase replaces this with "Global Drift" targeting. Instead of predicting internal features, the framework forecasts the total change (the "drift") from the network's input to its final output. By treating the entire network as a single, unified step, it avoids accumulating errors across layers and significantly reduces memory overhead, since per-layer features no longer need to be cached.
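
The caching pattern can be illustrated with a minimal Python sketch. It assumes the Global Drift is the network's end-to-end output used as the update direction of a simple Euler sampler; `toy_model`, `linear_forecast`, and `sample_with_skips` are hypothetical names, and the two-point linear forecast is only a placeholder, not the paper's extrapolator. What it shows is that a single cached tensor per full forward pass is enough to drive the skipped steps.

```python
import numpy as np

def toy_model(x, t):
    """Hypothetical stand-in for a full DiT forward pass (input -> final output)."""
    return -x * (1.0 + t)

def linear_forecast(history, t):
    """Placeholder forecast: linear extrapolation from the last two cached drifts."""
    (t0, d0), (t1, d1) = history[-2], history[-1]
    return d1 + (d1 - d0) * (t - t1) / (t1 - t0)

def sample_with_skips(x, timesteps, interval):
    """Euler sampler that runs the full model only every `interval` steps."""
    history = []      # at most two (time, end-to-end output) pairs are kept
    full_calls = 0
    for i in range(len(timesteps) - 1):
        t, t_next = timesteps[i], timesteps[i + 1]
        if i % interval == 0 or len(history) < 2:
            drift = toy_model(x, t)                   # full forward pass
            history = (history + [(t, drift)])[-2:]   # cache only the final output
            full_calls += 1
        else:
            drift = linear_forecast(history, t)       # skipped step: no model call
        x = x + (t_next - t) * drift
    return x, full_calls

timesteps = np.linspace(1.0, 0.0, 21)   # 20 sampling steps
x0 = np.ones(4)
x_final, full_calls = sample_with_skips(x0, timesteps, interval=4)
print(full_calls, "full passes for", len(timesteps) - 1, "steps")
```

In this toy setup the cache holds two tensors of the same shape as the state, independent of how many layers the real network has.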

Eliminating Derivative Noise

Many previous acceleration techniques rely on calculating derivatives (rates of change) to forecast future steps. However, the researchers found that while the overall trajectory of a diffusion model is smooth, its higher-order temporal derivatives are intrinsically noisy, so using them for prediction introduces significant errors. ResilPhase addresses this with a "derivative-free" approach based on barycentric Lagrange extrapolation. This method predicts future states directly from previously computed data points, bypassing the need to estimate unstable derivatives and yielding a more stable prediction.
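
For readers unfamiliar with the barycentric form, here is a minimal NumPy sketch of the standard formula, evaluated outside the range of the known points. The function names and the toy data are assumptions for illustration; how ResilPhase chooses its nodes and history length is not specified here.

```python
import numpy as np

def barycentric_weights(nodes):
    """w_j = 1 / prod_{k != j} (x_j - x_k) for distinct nodes."""
    nodes = np.asarray(nodes, dtype=float)
    diffs = nodes[:, None] - nodes[None, :]
    np.fill_diagonal(diffs, 1.0)          # skip the k == j factor
    return 1.0 / diffs.prod(axis=1)

def barycentric_extrapolate(nodes, values, target):
    """Evaluate the polynomial through (nodes, values) at `target`.

    `values` may have shape (n,) or (n, ...); no derivatives are estimated.
    """
    nodes = np.asarray(nodes, dtype=float)
    values = np.asarray(values, dtype=float)
    hit = np.isclose(nodes, target)
    if hit.any():                          # target coincides with a node
        return values[np.argmax(hit)]
    coeffs = barycentric_weights(nodes) / (target - nodes)
    return np.tensordot(coeffs, values, axes=(0, 0)) / coeffs.sum()

# Toy history: three past "states" sampled from t^2 - 3t + 1 and 2t.
nodes = np.array([0.0, 1.0, 2.0])
values = np.stack([nodes**2 - 3 * nodes + 1, 2 * nodes], axis=1)  # shape (3, 2)
prediction = barycentric_extrapolate(nodes, values, 3.0)
print(prediction)
```

Because three points determine a quadratic exactly, the prediction at t = 3 matches the underlying functions; with noisy real data the result is only an approximation.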

Stabilizing Predictions with Phase Mapping

Even with a stable prediction method, polynomial extrapolation can suffer from "Runge’s phenomenon," a numerical issue in which high-degree polynomials fitted to evenly spaced points oscillate with rapidly growing error near the edges of the interval, causing chaotic oscillations in the generated output. ResilPhase introduces a "Phase Mapping" mechanism to fix this. By non-linearly projecting discrete time steps into a bounded "phase space" using either Chebyshev nodes or a data-driven "Balanced Mapping," the framework regularizes the domain where predictions occur. This turns a divergent numerical problem into a stable one and keeps the extrapolation error bounded.
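
The effect of Chebyshev-style node placement can be checked numerically. The sketch below uses a hypothetical cosine mapping, `phase_map`, that sends uniform step indices to Chebyshev-type points in [-1, 1]; the paper's actual Phase Mapping and its "Balanced Mapping" variant may differ. It compares the Lebesgue constant, a standard measure of how much polynomial fitting can amplify errors in the data, for uniform and mapped nodes.

```python
import numpy as np

def phase_map(steps, total):
    """Hypothetical mapping: step index in [0, total] -> phase in [-1, 1]."""
    steps = np.asarray(steps, dtype=float)
    return -np.cos(np.pi * steps / total)

def lebesgue_constant(nodes, grid):
    """Max over `grid` of sum_j |l_j(x)|, with l_j the Lagrange basis polynomials."""
    nodes = np.asarray(nodes, dtype=float)
    total = np.zeros_like(grid)
    for j in range(len(nodes)):
        others = np.delete(nodes, j)
        basis = np.prod((grid[:, None] - others[None, :])
                        / (nodes[j] - others)[None, :], axis=1)
        total += np.abs(basis)
    return total.max()

steps = np.arange(21)                          # 21 discrete time steps
uniform_nodes = np.linspace(-1.0, 1.0, 21)     # evenly spaced in time
phase_nodes = phase_map(steps, 20)             # same steps after mapping
grid = np.linspace(-1.0, 1.0, 2001)

lam_uniform = lebesgue_constant(uniform_nodes, grid)
lam_phase = lebesgue_constant(phase_nodes, grid)
print(f"uniform: {lam_uniform:.1f}, mapped: {lam_phase:.2f}")
```

With 21 nodes the uniform spacing gives an amplification factor several orders of magnitude larger than the mapped spacing, which is the kind of oscillatory error growth the mapping is meant to suppress.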

Performance and Impact

Experiments on large models such as FLUX.1-dev and HunyuanVideo demonstrate that ResilPhase achieves approximately 5x speedups while maintaining high-fidelity results. Because the Phase Mapping mechanism is designed as a plug-and-play component, it can also be used to stabilize existing acceleration frameworks. By combining Global Drift forecasting, derivative-free extrapolation, and Phase Mapping, ResilPhase provides a robust, noise-resilient way to make large-scale diffusion models faster and more efficient.
