RubiConv -- Efficient Boundary-Respecting Convolutions

📅 2026-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Standard FFT-based convolution cannot easily respect sequence boundaries while staying hardware-efficient on packed sequence data, so its theoretical complexity advantage often fails to translate into practical training gains. This work proposes RubiConv, a boundary-aware, hardware-efficient algorithm for FFT-based convolution on packed sequences that preserves the boundaries between the packed documents. According to the abstract, extensive experiments show significant speedups over both attention and standard FFT-based baselines, narrowing the gap between theoretical complexity and real-world training performance.

📝 Abstract

Convolutional architectures have emerged as powerful alternatives to Transformers for sequence modeling. Their primary advantage is improved theoretical complexity in sequence length, obtained by leveraging the Fast Fourier Transform (FFT). However, this theoretical improvement does not always translate into practice. One critical obstacle is that standard FFT algorithms are not easily amenable to document packing, the step in large-scale training pipelines where data from different sources is packed into a single sequence for hardware efficiency. Existing workarounds suffer from severe inefficiencies, crippling the practical performance of convolutional architectures. We close this gap with RubiConv, a novel algorithm for performing hardware-efficient, boundary-respecting convolutions on packed sequences. Extensive experiments show that RubiConv achieves significant speedups over both attention and standard FFT-based baselines. This work makes the theoretical efficiency of long convolutional models a practical reality for large-scale, real-world data packing.
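To make the packing obstacle concrete, here is a minimal NumPy sketch of the problem the abstract describes. It is not the RubiConv algorithm, whose details are not given here. The function names (`fft_causal_conv`, `packed_conv_per_doc`), the toy data, and the choice of a causal kernel are all assumptions made for illustration. A single FFT convolution over a packed sequence lets the kernel reach back across document boundaries, while the simple workaround of convolving each document separately respects boundaries at the cost of many small, irregular FFTs.

```python
import numpy as np

def fft_causal_conv(x, k):
    """Causal convolution of x with kernel k, computed with zero-padded FFTs."""
    n = len(x)
    m = n + len(k) - 1  # pad so the circular convolution equals the linear one
    y = np.fft.irfft(np.fft.rfft(x, m) * np.fft.rfft(k, m), m)
    return y[:n]  # keep only the first n outputs (causal part)

def packed_conv_per_doc(x, k, doc_lengths):
    """Boundary-respecting workaround (illustrative): one FFT per document."""
    out = np.empty_like(x)
    start = 0
    for length in doc_lengths:
        stop = start + length
        out[start:stop] = fft_causal_conv(x[start:stop], k)
        start = stop
    return out

# Two hypothetical documents of lengths 3 and 5 packed into one sequence.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
doc_lengths = [3, 5]
k = np.array([1.0, 0.5, 0.25])  # toy causal kernel

naive = fft_causal_conv(x, k)                     # one FFT, ignores boundaries
per_doc = packed_conv_per_doc(x, k, doc_lengths)  # respects boundaries

# Position 3 is the first token of the second document. The naive output
# mixes in tokens from the first document; the per-document output does not.
print("naive  :", np.round(naive, 3))
print("per-doc:", np.round(per_doc, 3))
```

The per-document loop is correct but processes each document on its own, which is the kind of inefficiency the abstract attributes to existing workarounds; a real packed batch would contain many documents of varying length.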

Problem

Research questions and friction points this paper is trying to address.

convolution

sequence packing

FFT

boundary handling

hardware efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

RubiConv

boundary-respecting convolution

FFT-based convolution