🤖 AI Summary
This work addresses the limited local Lipschitz stability of the self-attention mechanism in Transformers. We establish, for the first time, an explicit theoretical connection between the distribution of attention scores and the local Lipschitz constant, revealing that the concentration of the softmax output distribution directly governs model robustness. To this end, we propose JaSMin, a Jacobian-based softmax regularization method that leverages a closed-form expression for the spectral norm of the softmax Jacobian to directly constrain attention score distributions in the gradient domain. JaSMin significantly reduces the local Lipschitz constant, yielding consistent improvements in adversarial robustness and generalization across text classification and question answering tasks. Furthermore, empirical evaluation in security-critical settings validates the effectiveness of interpretable, distribution-level control over attention mechanisms.
📝 Abstract
We present a novel local Lipschitz bound for the self-attention blocks of Transformers. The bound is based on a refined closed-form expression for the spectral norm of the softmax Jacobian. The resulting bound is not only tighter than those in prior work, but also reveals how the Lipschitz constant depends on the attention score maps. Based on these findings, we offer an explanation, from the Lipschitz-constant perspective, of how the distributions inside the attention map affect robustness. We also introduce a new lightweight regularization term called JaSMin (Jacobian Softmax norm Minimization), which boosts the Transformer's robustness and decreases the local Lipschitz constant of the whole network.
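The quantity at the heart of the abstract can be illustrated numerically: the Jacobian of softmax at a score vector z with output s is diag(s) − s sᵀ, and its spectral norm varies with how concentrated s is. The sketch below is illustrative only; the function names are mine, and it does not reproduce the paper's refined closed-form bound or the exact JaSMin penalty.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(z - z.max())
    return e / e.sum()

def softmax_jacobian(z):
    # Jacobian of softmax at z: J = diag(s) - s s^T (symmetric PSD).
    s = softmax(z)
    return np.diag(s) - np.outer(s, s)

def jacobian_spectral_norm(z):
    # Spectral norm = largest singular value; for this symmetric
    # PSD matrix it equals the largest eigenvalue.
    return np.linalg.norm(softmax_jacobian(z), 2)

# Uniform scores give the most spread-out softmax output;
# a large gap concentrates the mass on one entry.
uniform = np.zeros(4)                    # softmax -> [0.25, 0.25, 0.25, 0.25]
peaked = np.array([4.0, 0.0, 0.0, 0.0])  # softmax puts ~0.95 on the first entry

print(jacobian_spectral_norm(uniform))  # 0.25
print(jacobian_spectral_norm(peaked))   # noticeably smaller
```

In this toy example the highly concentrated attention distribution yields a smaller Jacobian spectral norm than the uniform one, which is the kind of dependence on the attention map's distribution that the paper's bound makes explicit; a JaSMin-style regularizer would penalize this norm (or a closed-form surrogate for it) during training.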