🤖 AI Summary
Can nonstandard vector spaces be constructed such that conventional nonlinear neural networks become linear operators over them?
Method: We propose the Linearizer framework, a compositional architecture that sandwiches a linear operator between invertible neural networks (INNs), to explicitly define self-consistent vector-space operations (addition and scalar multiplication), thereby inducing a pair of vector spaces over which the nonlinear network acts as a linear operator.
Contribution/Results: We theoretically establish that standard linear-algebraic tools, including the singular value decomposition (SVD), the Moore–Penrose pseudoinverse, and orthogonal projection, apply rigorously within these spaces, and that composing Linearizer modules that share a network preserves linearity. This brings end-to-end linear-algebraic operations to nonlinear models: in diffusion models, it collapses hundreds of sampling steps into a single step, enforcing idempotency yields a globally projective generative model, and the framework supports modular style transfer.
📝 Abstract
Neural networks are famously nonlinear. However, linearity is defined relative to a pair of vector spaces, $f: X \to Y$. Is it possible to identify a pair of non-standard vector spaces for which a conventionally nonlinear function is, in fact, linear? This paper introduces a method that makes such vector spaces explicit by construction. We find that if we sandwich a linear operator $A$ between two invertible neural networks, $f(x)=g_y^{-1}(A g_x(x))$, then the corresponding vector spaces $X$ and $Y$ are induced by newly defined addition and scaling actions derived from $g_x$ and $g_y$. We term this kind of architecture a Linearizer. This framework makes the entire arsenal of linear algebra, including SVD, pseudo-inverse, orthogonal projection and more, applicable to nonlinear mappings. Furthermore, we show that the composition of two Linearizers that share a neural network is also a Linearizer. We leverage this property and demonstrate that training diffusion models using our architecture makes the hundreds of sampling steps collapse into a single step. We further utilize our framework to enforce idempotency (i.e. $f(f(x))=f(x)$) on networks leading to a globally projective generative model and to demonstrate modular style transfer.
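To make the construction concrete, here is a minimal toy sketch of the idea (not the paper's implementation): we stand in for the trained invertible networks $g_x$ and $g_y$ with simple elementwise cubes, which are invertible on the reals, and check that $f(x)=g_y^{-1}(A g_x(x))$ is additive and homogeneous under the induced operations $x \oplus x' = g_x^{-1}(g_x(x)+g_x(x'))$ and $c \odot x = g_x^{-1}(c\, g_x(x))$. All names here (`g_x`, `g_xi`, `add_x`, etc.) are illustrative, not from the paper.

```python
import numpy as np

# Stand-ins for the invertible networks g_x, g_y (elementwise cubes,
# chosen only because they are cheap to invert; the paper uses INNs).
g_x  = lambda x: x**3
g_xi = lambda z: np.cbrt(z)        # g_x^{-1}
g_y  = lambda y: y**3
g_yi = lambda z: np.cbrt(z)        # g_y^{-1}

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))    # ordinary linear operator

def f(x):
    # f(x) = g_y^{-1}(A g_x(x)): nonlinear in the standard spaces.
    return g_yi(A @ g_x(x))

# Induced vector-space operations on X and Y.
add_x  = lambda x1, x2: g_xi(g_x(x1) + g_x(x2))
scal_x = lambda c, x:   g_xi(c * g_x(x))
add_y  = lambda y1, y2: g_yi(g_y(y1) + g_y(y2))
scal_y = lambda c, y:   g_yi(c * g_y(y))

x1, x2 = rng.standard_normal(3), rng.standard_normal(3)

# f is linear over the induced spaces: additivity and homogeneity hold.
assert np.allclose(f(add_x(x1, x2)), add_y(f(x1), f(x2)))
assert np.allclose(f(scal_x(2.5, x1)), scal_y(2.5, f(x1)))
```

The checks pass because $f(x \oplus x') = g_y^{-1}(A g_x(x) + A g_x(x'))$, which is exactly $f(x) \oplus_Y f(x')$; the nonlinearity is absorbed into the definition of the vector-space operations rather than the map itself.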