HeSS: Head Sensitivity Score for Sparsity Redistribution in VGGT

📅 2026-03-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the significant accuracy degradation observed in the VGGT model when applying uniform sparsification to global attention layers. To mitigate this issue, the authors propose a two-stage non-uniform sparsification approach. First, leveraging a small calibration set, they approximate the Hessian to compute the sensitivity of each attention head to sparsification, introducing the Head Sensitivity Score (HeSS) to quantitatively capture inter-head sensitivity variations. Subsequently, attention budgets are dynamically reallocated based on HeSS, enabling sensitivity-aware heterogeneous sparsification. This method substantially alleviates performance loss under high sparsity ratios and demonstrates remarkable robustness across varying sparsity levels.
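The sensitivity computation described above can be sketched roughly as follows. This is an illustrative reconstruction, not the paper's actual implementation: it assumes a second-order Taylor view in which the loss increase caused by sparsifying a head is estimated from the head's output perturbation and a diagonal Hessian approximation (e.g., mean squared gradients over the calibration set). All names and shapes here are hypothetical.

```python
import numpy as np

def head_sensitivity_scores(errors, hess_diag):
    """Illustrative per-head sensitivity score (not the paper's exact HeSS).

    errors:    (H, D) average per-head output perturbation caused by
               sparsification, measured on a small calibration set.
    hess_diag: (H, D) diagonal Hessian approximation, e.g. mean squared
               gradients of the loss w.r.t. each head's output.

    Returns a length-H vector: the second-order Taylor estimate of the
    loss increase for each head, 0.5 * sum_d H_dd * e_d^2.
    """
    errors = np.asarray(errors, dtype=float)
    hess_diag = np.asarray(hess_diag, dtype=float)
    return 0.5 * np.sum(hess_diag * errors**2, axis=1)
```

Under this sketch, a head whose sparsification error is large in directions of high curvature receives a high score and would later be assigned a denser attention budget.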

📝 Abstract
Visual Geometry Grounded Transformer (VGGT) has advanced 3D vision, yet its global attention layers suffer from quadratic computational costs that hinder scalability. Several sparsification-based acceleration techniques have been proposed to alleviate this issue, but they often suffer from substantial accuracy degradation. We hypothesize that the accuracy degradation stems from the heterogeneity in head-wise sparsification sensitivity, as existing methods apply a uniform sparsity pattern across all heads. Motivated by this hypothesis, we present a two-stage sparsification pipeline that effectively quantifies and exploits head-wise sparsification sensitivity. In the first stage, we measure head-wise sparsification sensitivity using a novel metric, the Head Sensitivity Score (HeSS), which approximates the Hessian with respect to two distinct error terms on a small calibration set. In the inference stage, we perform HeSS-Guided Sparsification, leveraging the pre-computed HeSS to reallocate the total attention budget: assigning denser attention to sensitive heads and sparser attention to more robust ones. We demonstrate that HeSS effectively captures head-wise sparsification sensitivity and empirically confirm that attention heads in the global attention layers exhibit heterogeneous sensitivity characteristics. Extensive experiments further show that our method effectively mitigates performance degradation under high sparsity, demonstrating strong robustness across varying sparsification levels. Code is available at https://github.com/libary753/HeSS.
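The budget-reallocation stage of the abstract can be sketched as below. This is a minimal illustration, assuming the total attention budget (e.g., tokens kept across all heads) is split in proportion to each head's sensitivity score, with largest-remainder rounding so the per-head budgets sum exactly to the total; the paper's actual allocation rule may differ, and all names here are illustrative.

```python
import numpy as np

def reallocate_budget(scores, total_budget):
    """Split an integer attention budget across heads proportionally
    to their sensitivity scores (illustrative sketch).

    scores:       length-H vector of positive per-head sensitivity scores.
    total_budget: total integer budget to distribute.

    Returns a length-H integer array summing to total_budget, so that
    sensitive heads keep denser attention and robust heads keep sparser.
    """
    scores = np.asarray(scores, dtype=float)
    raw = scores / scores.sum() * total_budget  # ideal fractional shares
    budgets = np.floor(raw).astype(int)
    # Hand the leftover units to heads with the largest fractional remainders.
    remainder = total_budget - budgets.sum()
    for i in np.argsort(-(raw - budgets))[:remainder]:
        budgets[i] += 1
    return budgets
```

For example, with scores `[1, 2, 3]` and a total budget of 10, the most sensitive head receives half of the budget while the least sensitive receives a fifth.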
Problem

Research questions and friction points this paper is trying to address.

sparsity
attention heads
accuracy degradation
heterogeneous sensitivity
VGGT
Innovation

Methods, ideas, or system contributions that make the work stand out.

Head Sensitivity Score
sparsity redistribution
heterogeneous attention heads
Hessian approximation
efficient transformers
Yongsung Kim
IPAI, Seoul National University
Wooseok Song
ECE, Seoul National University
Jaihyun Lew
Seoul National University
Computer Vision
Hun Hwangbo
IPAI, Seoul National University
Jaehoon Lee
IPAI, Seoul National University
Sungroh Yoon
Professor, Electrical and Computer Engineering & Artificial Intelligence, Seoul National University
AI, deep learning, machine learning, on-device AI, bioinformatics