🤖 AI Summary
This work addresses two obstacles to fast text generation: autoregressive decoding is inherently sequential, and continuous flow models, when applied to discrete language data, rely on Euclidean regression losses that are geometrically ill-suited to it. The authors propose Discrete Flow Maps, a framework that compresses the generative trajectory into a single-step mapping while respecting the geometry of the probability simplex. They recast standard flow map training for the discrete domain so that the training dynamics align with the discrete nature of language. Empirically, the method surpasses previous state-of-the-art results in discrete flow modeling.
📝 Abstract
The sequential nature of autoregressive next-token prediction imposes a fundamental speed limit on large language models. While continuous flow models offer a path to parallel generation, they traditionally demand expensive iterative integration. Flow Maps bypass this bottleneck by compressing generative trajectories into single-step mappings, theoretically enabling the generation of full text sequences from noise in a single forward pass. However, standard formulations rely on Euclidean regression losses that are geometrically ill-suited for discrete data. In this work, we resolve this conflict with Discrete Flow Maps, a framework that reconciles trajectory compression with the geometry of the probability simplex. We recast standard flow map training for the discrete domain, aligning the training dynamics with the discrete nature of language. Empirically, this strict geometric alignment allows our method to surpass previous state-of-the-art results in discrete flow modeling.
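To make the geometric point concrete, the sketch below compares a Euclidean regression loss with a simplex-aware loss on a single one-hot token target. It is an illustration only: the abstract does not specify the training objective, so cross-entropy is used here as a familiar stand-in for a loss that respects the probability simplex, and the names `softmax`, `squared_error`, and `cross_entropy` are hypothetical helpers, not part of the authors' method.

```python
import math


def softmax(logits):
    """Map arbitrary real-valued logits onto the probability simplex."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def squared_error(pred, target):
    """Euclidean regression loss: squared L2 distance to the target."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def cross_entropy(pred, target, eps=1e-12):
    """Simplex-aware stand-in loss (an assumption, not the paper's objective)."""
    return -sum(t * math.log(max(p, eps)) for p, t in zip(pred, target))


# One-hot target over a toy 3-token vocabulary: the correct token is index 0.
target = [1.0, 0.0, 0.0]

# Two wrong predictions: mildly wrong, and confidently wrong on token 1.
mild = softmax([0.0, 1.0, 0.0])
severe = softmax([0.0, 20.0, 0.0])

for name, pred in [("mild", mild), ("severe", severe)]:
    print(
        name,
        "squared_error =", round(squared_error(pred, target), 4),
        "cross_entropy =", round(cross_entropy(pred, target), 4),
    )
```

For a one-hot target, the squared error between two points on the simplex can never exceed 2, so a confidently wrong prediction is penalized only slightly more than a mildly wrong one. The cross-entropy penalty, by contrast, grows without bound as the probability of the correct token approaches zero. This is one simple way in which a Euclidean loss can be a poor fit for categorical data.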