Reg4Pru: Regularisation Through Random Token Routing for Token Pruning

📅 2026-02-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the quadratic growth in Vision Transformer computation with token count, a cost that existing token pruning methods reduce at the expense of dense prediction performance in deeper layers. To mitigate this, the authors propose Reg4Pru, a training regularisation framework that, for the first time, integrates stochastic token routing into the pruning process. By reactivating pruned representations during training, Reg4Pru alleviates the performance degradation of pruning while preserving the original Transformer architecture, achieving a favourable trade-off between computational efficiency and accuracy. Evaluated on the FIVES retinal vessel segmentation dataset, Reg4Pru yields an absolute improvement of 46% in average precision over pruning baselines without routing and delivers a 29% speedup in inference.

📝 Abstract
Transformers are widely adopted in modern vision models due to their strong ability to scale with dataset size and to generalise. However, this comes with a major drawback: computation scales quadratically with the total number of tokens. Numerous methods have been proposed to mitigate this. As an example, we consider token pruning that reactivates tokens from preserved representations; however, the gain in computational efficiency comes at the cost of reduced stability of those preserved representations, leading to poorer dense prediction performance at deeper layers. In this work, we introduce Reg4Pru, a training regularisation technique that mitigates token-pruning performance loss for segmentation. We compare our models on the FIVES blood vessel segmentation dataset and find that Reg4Pru improves average precision by an absolute 46% compared to the same model trained without routing. This increase is observed with a configuration that achieves a 29% relative speedup in wall-clock time over the non-pruned baseline. These findings indicate that Reg4Pru is a valuable regulariser for token reduction strategies.
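The abstract does not spell out the routing mechanism, but the core idea of combining top-k token pruning with stochastic reactivation can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the function name `select_tokens`, the parameter `p_route`, and the swap-based routing rule are all assumptions made for the example.

```python
import numpy as np

def select_tokens(scores, keep_ratio=0.5, p_route=0.1, training=True, rng=None):
    """Keep the top-scoring tokens; during training, randomly route some
    pruned tokens back into the kept set as a regulariser.

    Hypothetical sketch of Reg4Pru-style random token routing -- the
    actual routing rule in the paper may differ.
    """
    rng = rng if rng is not None else np.random.default_rng()
    n = scores.shape[0]
    k = max(1, int(n * keep_ratio))
    order = np.argsort(-scores)            # tokens ranked by importance score
    kept, pruned = order[:k], order[k:]
    if training and len(pruned) > 0:
        # Each kept slot is swapped with a random pruned token
        # with probability p_route, reactivating that token.
        swap_mask = rng.random(k) < p_route
        n_swap = min(int(swap_mask.sum()), len(pruned))
        if n_swap > 0:
            slots = np.flatnonzero(swap_mask)[:n_swap]
            replacements = rng.choice(pruned, size=n_swap, replace=False)
            kept = kept.copy()
            kept[slots] = replacements
    return np.sort(kept)
```

At inference (`training=False`) this reduces to plain top-k pruning, so the 29% speedup reported in the abstract would be unaffected; the routing only perturbs which tokens survive during training.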
Problem

Research questions and friction points this paper is trying to address.

token pruning
computational efficiency
dense prediction
representation stability
vision transformers
Innovation

Methods, ideas, or system contributions that make the work stand out.

token pruning
regularization
random token routing
vision transformers
dense prediction