Robust Basis Spline Decoupling for the Compression of Transformer Models

📅 2026-05-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing tensor-based decoupling methods for neural network compression rely on polynomial or piecewise-linear parameterizations of the internal nonlinear functions, which can limit expressiveness and cause numerical instability. This work proposes a structured decoupling framework based on B-splines, modeling multivariate functions as compositions of linear transformations and univariate B-spline nonlinearities. The local support and controllable smoothness of B-splines make the representation both more expressive and more numerically stable. The authors formulate a constrained coupled matrix-tensor factorization and develop a robust alternating least-squares algorithm (R-CMTF-BSD) that incorporates normalization and Tikhonov regularization. Experiments on synthetic data and on the Vision Transformer and Swin Transformer architectures show substantial parameter reduction while maintaining competitive accuracy.
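For readers unfamiliar with Tikhonov regularization, the generic form of a regularized least-squares subproblem is sketched below. This is the textbook formulation only; the symbols $A$, $b$, $x$, and $\lambda$ are placeholders introduced here and are not taken from the paper, whose actual R-CMTF-BSD updates (including the normalization step and constraints) may differ.

```latex
% Generic Tikhonov-regularized least squares (illustrative notation, not the paper's)
\hat{x} \;=\; \arg\min_{x}\; \lVert A x - b \rVert_2^2 + \lambda \lVert x \rVert_2^2
        \;=\; \left(A^{\top} A + \lambda I\right)^{-1} A^{\top} b,
\qquad \lambda > 0 .
```

Because $A^{\top} A + \lambda I$ is positive definite for any $\lambda > 0$, the solve stays well posed even when $A$ is rank deficient or ill conditioned, which is the usual motivation for adding such a term inside an alternating least-squares loop.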
📝 Abstract
Decoupling is a powerful modeling paradigm for representing multivariate functions as compositions of linear transformations and univariate nonlinear functions. A single-layer decoupling can be viewed as a fully connected neural network with a single hidden layer and flexible activation functions, providing a direct link with neural networks. Because of this, the use of decoupling methods has gained increasing attention in neural network domains, particularly compression, since it enables structured approximations with reduced parameter complexity. Existing tensor-based decoupling methods typically rely on polynomial or piecewise-linear parameterizations of the internal nonlinear functions, which can suffer from numerical instability or limited expressiveness. In this work, we introduce a B-spline-based decoupling framework that generalizes these existing approaches. By exploiting the local support and flexible smoothness control of B-splines, the proposed formulation yields a more numerically stable and expressive representation. We derive a constrained coupled matrix-tensor factorization and propose a robust alternating least-squares algorithm, called R-CMTF-BSD, incorporating normalization and Tikhonov regularization. The proposed method is validated through experiments on synthetic data and transformer model compression. Results on the Vision and Swin Transformer architectures demonstrate that B-spline decoupling enables substantial parameter reduction while maintaining competitive accuracy, making the R-CMTF-BSD algorithm a promising tool for structured neural network compression.
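The single-layer decoupling described above can be illustrated with a minimal pure-Python sketch. It assumes the common form f(x) = W g(Vᵀx), where each g_i is a univariate B-spline evaluated with the Cox-de Boor recursion. The names `bspline_basis`, `decoupled_forward`, `V`, `W`, and `coeffs`, and the choice of a shared clamped cubic knot vector, are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def bspline_basis(z: float, knots: Sequence[float], degree: int) -> List[float]:
    """Evaluate all B-spline basis functions of `degree` at z (Cox-de Boor)."""
    n_intervals = len(knots) - 1
    # Degree 0: indicator of the half-open knot interval containing z.
    basis = [1.0 if knots[i] <= z < knots[i + 1] else 0.0
             for i in range(n_intervals)]
    # Close the last non-empty interval on the right so z == knots[-1] is covered.
    if z == knots[-1]:
        for i in reversed(range(n_intervals)):
            if knots[i] < knots[i + 1]:
                basis[i] = 1.0
                break
    # Raise the degree one step at a time; zero-width spans contribute nothing.
    for p in range(1, degree + 1):
        raised = []
        for i in range(n_intervals - p):
            left_den = knots[i + p] - knots[i]
            right_den = knots[i + p + 1] - knots[i + 1]
            left = (z - knots[i]) / left_den * basis[i] if left_den > 0 else 0.0
            right = ((knots[i + p + 1] - z) / right_den * basis[i + 1]
                     if right_den > 0 else 0.0)
            raised.append(left + right)
        basis = raised
    return basis


def decoupled_forward(x, V, W, coeffs, knots, degree):
    """Compute y = W g(V^T x) with one B-spline nonlinearity per branch.

    V: (inputs x branches), W: (outputs x branches),
    coeffs[i]: spline coefficients of branch i (hypothetical layout).
    Inputs z outside [knots[0], knots[-1]] evaluate to zero in this sketch.
    """
    r = len(coeffs)
    z = [sum(V[j][i] * x[j] for j in range(len(x))) for i in range(r)]
    g = [sum(c * b for c, b in zip(coeffs[i], bspline_basis(z[i], knots, degree)))
         for i in range(r)]
    return [sum(W[o][i] * g[i] for i in range(r)) for o in range(len(W))]


# Usage: two branches sharing a clamped cubic knot vector on [0, 1].
knots = [0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0, 1.0]
# Greville abscissae as coefficients make each spline reproduce g(z) = z.
greville = [sum(knots[i + 1:i + 4]) / 3 for i in range(5)]
V = [[1.0, 0.0], [0.0, 1.0]]
W = [[1.0, 1.0]]
y = decoupled_forward([0.2, 0.7], V, W, [greville, greville], knots, 3)
```

Each basis function is nonzero on only a few adjacent knot spans, so changing one coefficient alters g_i only locally. This is the local-support property the abstract credits for better numerical behavior than global polynomial parameterizations.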
Problem

Research questions and friction points this paper is trying to address.

decoupling
neural network compression
B-spline
tensor factorization
nonlinear function approximation
Innovation

Methods, ideas, or system contributions that make the work stand out.

B-spline decoupling
structured compression
matrix-tensor factorization
alternating least squares
Transformer compression