🤖 AI Summary
Traditional Transformers rely on positional encodings to model sequential order, yet these encodings extrapolate poorly to unseen sequence lengths and offer limited interpretability. This paper proposes Spline-based Transformers, a novel Transformer architecture that eliminates explicit positional encodings. Its core innovation is embedding each input sequence as a smooth, learnable spline trajectory in latent space, where sequence order is implicitly captured by the curve parameterization over learnable control points; coupling spline interpolation with self-attention enables extrapolation to arbitrary sequence lengths and lets users edit sequences interactively by manipulating control points. To the authors' knowledge, this is the first work to incorporate spline geometric priors into sequence modeling. Extensive experiments on synthetic 2D data and real-world tasks—including image, 3D shape, and animation generation—demonstrate substantial improvements over state-of-the-art positional encoding schemes, with superior generalization, enhanced interpretability, and fine-grained controllability.
📝 Abstract
We introduce Spline-based Transformers, a novel class of Transformer models that eliminate the need for positional encoding. Inspired by workflows using splines in computer animation, our Spline-based Transformers embed an input sequence of elements as a smooth trajectory in latent space. Besides overcoming drawbacks of positional encoding such as poor sequence-length extrapolation, Spline-based Transformers also provide a novel way for users to interact with transformer latent spaces: directly manipulating the latent control points creates new latent trajectories and sequences. We demonstrate the superior performance of our approach in comparison to conventional positional encoding on a variety of datasets, ranging from synthetic 2D data to large-scale real-world datasets of images, 3D shapes, and animations.
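To make the core idea concrete, here is a minimal sketch of how a sequence's latent positions could be read off a spline defined by control points. This assumes a cubic Bézier curve and NumPy purely for illustration; the paper's actual spline family, number of control points, and parameterization may differ. The key property shown is that position `i` maps to curve parameter `t_i`, so order is implicit and longer sequences simply sample the same curve more densely.

```python
import numpy as np

def bezier_trajectory(control_points, seq_len):
    """Evaluate a cubic Bezier curve at seq_len evenly spaced parameters.

    control_points: (4, d) array of latent control points (in the model,
                    these would be learned or predicted by an encoder).
    Returns a (seq_len, d) array: one latent vector per sequence position.
    """
    P = np.asarray(control_points, dtype=float)       # (4, d)
    t = np.linspace(0.0, 1.0, seq_len)[:, None]       # (seq_len, 1)
    # Bernstein basis functions of a cubic Bezier curve.
    basis = np.hstack([
        (1 - t) ** 3,
        3 * (1 - t) ** 2 * t,
        3 * (1 - t) * t ** 2,
        t ** 3,
    ])                                                # (seq_len, 4)
    return basis @ P                                  # (seq_len, d)

# Position i is tied to t_i = i / (seq_len - 1); extrapolating to a longer
# sequence just means sampling more parameter values along the same curve,
# and dragging a control point smoothly deforms the whole trajectory.
trajectory = bezier_trajectory(
    np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 2.0], [4.0, 0.0]]),
    seq_len=8,
)
```

Note how user editing falls out for free: moving one control point changes all interpolated latents smoothly, which is the interactive manipulation the paper describes.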