Spline-Based Transformers

📅 2025-04-03
🏛️ European Conference on Computer Vision
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Traditional Transformers rely on positional encodings to model sequence order, yet these encodings extrapolate poorly to unseen sequence lengths and offer limited interpretability. This paper proposes Spline-based Transformers, a novel Transformer architecture that eliminates explicit positional encodings. Its core innovation is embedding each input sequence as a smooth, learnable spline trajectory in latent space, where sequence order is modeled implicitly via learnable control points; integrating spline interpolation with self-attention enables arbitrary-length sequence extrapolation and interactive user editing of control points. To the authors' knowledge, this is the first work to incorporate spline geometric priors into sequence modeling. Extensive experiments on synthetic 2D data and real-world multimodal tasks, including image, 3D shape, and animation generation, demonstrate substantial improvements over state-of-the-art positional encoding schemes: superior generalization, enhanced interpretability, and fine-grained controllability.

📝 Abstract
We introduce Spline-based Transformers, a novel class of Transformer models that eliminate the need for positional encoding. Inspired by workflows using splines in computer animation, our Spline-based Transformers embed an input sequence of elements as a smooth trajectory in latent space. Overcoming drawbacks of positional encoding such as sequence length extrapolation, Spline-based Transformers also provide a novel way for users to interact with transformer latent spaces by directly manipulating the latent control points to create new latent trajectories and sequences. We demonstrate the superior performance of our approach in comparison to conventional positional encoding on a variety of datasets, ranging from synthetic 2D to large-scale real-world datasets of images, 3D shapes, and animations.
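The core idea in the abstract can be illustrated with a toy sketch: instead of adding positional encodings, a sequence is represented by a handful of latent control points, and the latent state for each position is read off a smooth curve through those points. The sketch below is a hypothetical illustration, not the paper's implementation; it assumes a Bézier curve as the spline family and uses made-up names (`bezier_trajectory`), with control points fixed by hand rather than predicted by an encoder.

```python
import numpy as np
from math import comb

def bezier_trajectory(control_points, num_steps):
    """Evaluate a Bézier curve defined by (K, D) control points at
    num_steps evenly spaced parameter values t in [0, 1]."""
    K, D = control_points.shape
    t = np.linspace(0.0, 1.0, num_steps)[:, None]  # (T, 1)
    traj = np.zeros((num_steps, D))
    for i in range(K):
        # Bernstein basis polynomial of degree K - 1
        basis = comb(K - 1, i) * t**i * (1 - t)**(K - 1 - i)
        traj += basis * control_points[i]
    return traj  # (T, D): one latent vector per sequence position

# Four control points defining a cubic curve in a 2D latent space.
ctrl = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]])
curve = bezier_trajectory(ctrl, num_steps=16)
```

Because `num_steps` is a free parameter, the same four control points can be sampled at any resolution, which is one intuition for why a trajectory-based representation sidesteps the fixed-length assumptions of learned positional encodings.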
Problem

Research questions and friction points this paper is trying to address.

Positional encodings extrapolate poorly beyond training sequence lengths
Transformer latent spaces are difficult to interpret and edit directly
Sequence order lacks a smooth, geometrically meaningful latent structure
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spline-based Transformers eliminate positional encoding
Embed sequences as smooth latent trajectories
Enable direct manipulation of latent control points
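The last contribution, direct manipulation of latent control points, can be sketched with a toy example. This is a hypothetical illustration under the same assumption as above (a cubic Bézier curve standing in for the paper's latent spline); `cubic_bezier` is a made-up helper name.

```python
import numpy as np

def cubic_bezier(p, t):
    """Evaluate a cubic Bézier curve with control points p (4, D) at parameters t (T,)."""
    t = t[:, None]
    return ((1 - t) ** 3 * p[0] + 3 * (1 - t) ** 2 * t * p[1]
            + 3 * (1 - t) * t ** 2 * p[2] + t ** 3 * p[3])

t = np.linspace(0.0, 1.0, 8)
p = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]])
original = cubic_bezier(p, t)

# "Editing" the latent trajectory: move one interior control point.
p_edited = p.copy()
p_edited[1] = [0.0, 2.0]
edited = cubic_bezier(p_edited, t)
# The endpoints stay fixed while the interior of the trajectory deforms smoothly,
# which is the kind of fine-grained, interpretable control the paper describes.
```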
Prashanth Chandran
DisneyResearch|Studios, Switzerland
Agon Serifi
ETH Zurich, Switzerland
Markus Gross
ETH Zurich, Switzerland
Moritz Bächer
Disney Research
Computational Robotics, Computational Fabrication, Computer Graphics