DiffATS: Diffusion in Aligned Tensor Space

📅 2026-05-09
🤖 AI Summary
This work addresses the computational cost of direct diffusion modeling of high-resolution spatiotemporal fields and the non-uniqueness of conventional low-rank tensor representations. The authors build data-dependent tensor primitives: Tucker decomposition extracts the multilinear low-rank structure, and orthogonal Procrustes alignment to medoid anchor matrices selected from the data resolves the rotational (gauge) ambiguity of the factors. The resulting Grassmannian primitives are compact and directly decodable by explicit multilinear reconstruction, so no pretrained autoencoder is needed. The authors prove that the primitive maps are homeomorphisms between low-rank tensors and their primitive spaces, which certifies that the representations are non-degenerate and topologically faithful. Diffusion models are then trained directly on the aligned primitives. On image, video, and PDE solution datasets, the method reports strong unconditional and conditional generation while compressing the original data by 3.9× to 210×.
📝 Abstract
Direct diffusion modeling of high-resolution spatiotemporal fields is computationally challenging. Parameter-efficient primitives address this by representing high-dimensional data with a compact set of parameters. In this paper, we construct data-dependent tensor primitives without pretrained compression autoencoders. Our construction starts from Tucker decomposition, which captures low-rank multilinear structure through a core tensor and mode-wise factors. However, Tucker factors are non-unique: the same tensor can be represented by different rotated factors, which complicates generative modeling. We address this issue with orthogonal Procrustes (OP) alignment. Specifically, we select medoid anchor matrices from the data and align the factor matrices to resolve the gauge ambiguity. This yields matrix Grassmannian primitives and tensor Grassmannian primitives that are compact, data-adaptive, and directly decodable by explicit multilinear reconstruction. Theoretically, we prove that the proposed primitive maps are homeomorphisms between low-rank tensors and their corresponding primitive spaces, certifying that the representations are non-degenerate and topologically faithful. Building on these primitives, we propose *Diffusion in Aligned Tensor Space* (DiffATS), a generative framework that trains diffusion models directly on aligned tensor primitives. Across images, videos, and PDE solutions, DiffATS achieves strong unconditional and conditional generation performance while compressing original data by $3.9\times$ to $210\times$, without relying on any pretrained deep compression autoencoders.
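The gauge ambiguity and its resolution by alignment can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes a truncated higher-order SVD as a simple stand-in for the Tucker step, and it uses random orthonormal matrices as hypothetical anchors where the paper selects medoid anchors from the data. All function names (`mode_product`, `hosvd`, `reconstruct`, `procrustes_rotation`, `align`) are illustrative.

```python
import numpy as np

def mode_product(tensor, matrix, mode):
    """Mode-n product: contract axis `mode` of `tensor` with `matrix` (new_dim x old_dim)."""
    moved = np.moveaxis(tensor, mode, -1)
    return np.moveaxis(moved @ matrix.T, -1, mode)

def hosvd(tensor, ranks):
    """Truncated higher-order SVD, one simple way to get a Tucker core and factors."""
    factors = []
    for mode, rank in enumerate(ranks):
        unfolding = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
        u, _, _ = np.linalg.svd(unfolding, full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Explicit multilinear reconstruction: core x_1 U_1 x_2 U_2 ..."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out

def procrustes_rotation(factor, anchor):
    """Orthogonal Q minimizing ||factor @ Q - anchor||_F (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(factor.T @ anchor)
    return u @ vt

def align(core, factors, anchors):
    """Rotate each factor toward its anchor and counter-rotate the core."""
    aligned = []
    for mode, (u, anchor) in enumerate(zip(factors, anchors)):
        q = procrustes_rotation(u, anchor)
        aligned.append(u @ q)
        core = mode_product(core, q.T, mode)  # keeps the reconstruction unchanged
    return core, aligned

rng = np.random.default_rng(0)
shape, ranks = (8, 9, 10), (3, 3, 3)

def random_orthonormal(rows, cols):
    return np.linalg.qr(rng.standard_normal((rows, cols)))[0]

# An exactly low-rank test tensor.
x = reconstruct(rng.standard_normal(ranks),
                [random_orthonormal(n, r) for n, r in zip(shape, ranks)])
core, factors = hosvd(x, ranks)

# Gauge ambiguity: rotated factors with a counter-rotated core give the same tensor.
rotations = [random_orthonormal(r, r) for r in ranks]
rot_factors = [u @ q for u, q in zip(factors, rotations)]
rot_core = core
for mode, q in enumerate(rotations):
    rot_core = mode_product(rot_core, q.T, mode)

# Hypothetical anchors (assumption: random, not the paper's data medoids).
anchors = [random_orthonormal(n, r) for n, r in zip(shape, ranks)]
core_a, factors_a = align(core, factors, anchors)
core_b, factors_b = align(rot_core, rot_factors, anchors)

same_tensor = np.allclose(reconstruct(rot_core, rot_factors), x)
same_after_alignment = all(np.allclose(a, b) for a, b in zip(factors_a, factors_b))
```

Here `same_tensor` checks that two different factor sets decode to the same tensor, and `same_after_alignment` checks that aligning both to shared anchors collapses them to a single representative, which is the property that makes the aligned primitives a well-defined target for a generative model.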
Problem

Research questions and friction points this paper is trying to address.

diffusion modeling
spatiotemporal fields
tensor decomposition
gauge ambiguity
parameter-efficient representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

tensor decomposition
orthogonal Procrustes alignment
Grassmannian primitives
diffusion models
data-adaptive representation