Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces

📅 2024-10-13
🏛️ arXiv.org
📈 Citations: 3
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of balancing inference efficiency and accuracy in Transformer models by proposing Dualformer, a single switchable model that emulates human dual-system cognition (fast, intuitive System 1 and slow, deliberative System 2). Methodologically, it introduces a training paradigm based on structured, randomized reasoning traces, combining trace-dropping data augmentation, dual-mode decoding control, and LLM fine-tuning. Key contributions include: (i) dynamic, on-the-fly switching among fast, slow, and automatic inference modes within a single model; (ii) on unseen 30×30 maze tasks, slow mode achieves a 97.6% optimal solution rate (4.3 percentage points above the Searchformer baseline) while using 45.5% fewer reasoning steps; (iii) fast mode attains an 80% optimal rate, 50 percentage points above the Solution-Only baseline's 30%; and (iv) substantial gains in mathematical reasoning when the technique is applied to LLM fine-tuning.
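The trace-dropping augmentation can be pictured with a short sketch. The Python below is a minimal illustration, not the paper's implementation: it assumes a Searchformer-style A* search trace represented as a list of create/close clauses, and the level weights and the 30% subset rate are placeholder values rather than the paper's hyperparameters.

```python
import random

def randomize_trace(trace, rng=random):
    """Sample a dropping level and apply it to one training example.

    `trace` is a list of clause dicts such as
        {"kind": "create", "node": (0, 1), "cost": 1}
        {"kind": "close",  "node": (0, 1), "cost": 1}
    (an illustrative encoding of an A* search trace).
    """
    # Level 0 keeps the full trace; levels 1-3 drop progressively more
    # structure; level 4 yields a solution-only (fast-mode) example.
    # Uniform weights here are placeholders, not the paper's settings.
    level = rng.choices([0, 1, 2, 3, 4], weights=[1, 1, 1, 1, 1])[0]
    if level == 0:
        return trace
    if level == 4:
        return []
    # Level >= 1: drop all "close" clauses, keeping only "create" clauses.
    out = [dict(c) for c in trace if c["kind"] == "create"]
    if level >= 2:
        # Level >= 2: additionally drop the cost token from each clause.
        for c in out:
            c.pop("cost", None)
    if level == 3:
        # Level 3: additionally drop a random subset of "create" clauses.
        out = [c for c in out if rng.random() > 0.3]
    return out

# Example: each call may keep, thin out, or fully drop the trace.
trace = [
    {"kind": "create", "node": (0, 0), "cost": 0},
    {"kind": "close",  "node": (0, 0), "cost": 0},
    {"kind": "create", "node": (0, 1), "cost": 1},
]
print(randomize_trace(trace))
```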

📝 Abstract
In human cognitive theory, thinking is governed by two systems: the fast and intuitive System 1 and the slower but more deliberative System 2. Recent studies have shown that incorporating the System 2 process into Transformers, including large language models (LLMs), significantly enhances their reasoning capabilities. Nevertheless, models that purely resemble System 2 thinking require substantially higher computational costs and are much slower to respond. To address this challenge, we present Dualformer, a single Transformer model that seamlessly integrates both the fast and slow reasoning modes. Dualformer is obtained by training on data with randomized reasoning traces, where different parts of the traces are dropped during training. The dropping strategies are specifically tailored to the trace structure, analogous to analyzing our thinking process and creating shortcuts with patterns. At inference time, our model can be configured to output only the solutions (fast mode), both the reasoning chain and the final solution (slow mode), or to automatically decide which mode to engage (auto mode). In all cases, Dualformer outperforms the corresponding baseline models in both performance and computational efficiency: (1) in slow mode, Dualformer optimally solves unseen 30×30 maze navigation tasks 97.6% of the time, surpassing the Searchformer baseline (trained on data with complete reasoning traces) performance of 93.3%, while using 45.5% fewer reasoning steps; (2) in fast mode, Dualformer completes those tasks with an 80% optimal rate, significantly outperforming the Solution-Only model (trained on solution-only data), which has an optimal rate of only 30%. For math problems, our technique also improves performance when used for LLM fine-tuning, showing that it generalizes beyond task-specific models.
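The three inference modes in the abstract amount to constraining how decoding begins. Below is a minimal sketch under assumed conventions: the output vocabulary is taken to delimit the reasoning trace and the final solution with special tokens, here called <trace> and <plan>, and the `tokenizer.token_id` and `model.generate` calls are hypothetical stand-ins, not the released API.

```python
def dualformer_generate(model, tokenizer, prompt_ids, mode="auto"):
    """Decode in fast, slow, or auto mode by seeding the output prefix."""
    if mode == "fast":
        # Force decoding to start at the solution: the trace is skipped.
        prefix = prompt_ids + [tokenizer.token_id("<plan>")]
    elif mode == "slow":
        # Force decoding to start with the reasoning trace.
        prefix = prompt_ids + [tokenizer.token_id("<trace>")]
    elif mode == "auto":
        # Let the model itself decide whether to emit a trace first.
        prefix = prompt_ids
    else:
        raise ValueError(f"unknown mode: {mode}")
    return model.generate(prefix)  # hypothetical autoregressive decode
```

Because the same network serves all three modes, switching costs nothing at inference time; only the forced prefix changes.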
Problem

Research questions and friction points this paper is trying to address.

How to integrate fast and slow reasoning modes within a single Transformer
How to cut the computational cost of explicit System 2 reasoning without sacrificing accuracy
How to enhance the reasoning capabilities of large language models efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unifies fast and slow reasoning modes in one Transformer
Trains on randomized reasoning traces with structure-aware dropping strategies
Exposes configurable inference modes (fast, slow, auto) for controllable efficiency