Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces

📅 2024-10-13
🏛️ arXiv.org
📈 Citations: 3
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of balancing inference efficiency and accuracy in Transformer models by proposing Dualformer, a single switchable model that emulates human dual-system cognition (fast, intuitive System 1 and slow, deliberative System 2). Methodologically, it introduces a training paradigm based on structured, randomized reasoning traces, combining trace-dropping data augmentation, dual-mode decoding control, and LLM fine-tuning. Key contributions include: (i) dynamic, on-the-fly switching among fast, slow, and automatic inference modes within a single model; (ii) on unseen 30×30 maze tasks, slow mode achieves a 97.6% optimal solution rate (4.3 percentage points above the Searchformer baseline) while using 45.5% fewer reasoning steps; (iii) fast mode attains an 80% optimal rate, 50 percentage points above the Solution-Only baseline's 30%; and (iv) substantial gains in mathematical reasoning when the technique is applied to LLM fine-tuning.
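The trace-dropping augmentation can be pictured with a short sketch. The Python below is a minimal illustration, not the paper's implementation: it assumes a Searchformer-style A* search trace represented as a list of create/close clauses, and the level weights and the 30% subset rate are placeholder values rather than the paper's hyperparameters.

```python
import random

def randomize_trace(trace, rng=random):
    """Sample a dropping level and apply it to one training example.

    `trace` is a list of clause dicts such as
        {"kind": "create", "node": (0, 1), "cost": 1}
        {"kind": "close",  "node": (0, 1), "cost": 1}
    (an illustrative encoding of an A* search trace).
    """
    # Level 0 keeps the full trace; levels 1-3 drop progressively more
    # structure; level 4 yields a solution-only (fast-mode) example.
    # Uniform weights here are placeholders, not the paper's settings.
    level = rng.choices([0, 1, 2, 3, 4], weights=[1, 1, 1, 1, 1])[0]
    if level == 0:
        return trace
    if level == 4:
        return []
    # Level >= 1: drop all "close" clauses, keeping only "create" clauses.
    out = [dict(c) for c in trace if c["kind"] == "create"]
    if level >= 2:
        # Level >= 2: additionally drop the cost token from each clause.
        for c in out:
            c.pop("cost", None)
    if level == 3:
        # Level 3: additionally drop a random subset of "create" clauses.
        out = [c for c in out if rng.random() > 0.3]
    return out

# Example: each call may keep, thin out, or fully drop the trace.
trace = [
    {"kind": "create", "node": (0, 0), "cost": 0},
    {"kind": "close",  "node": (0, 0), "cost": 0},
    {"kind": "create", "node": (0, 1), "cost": 1},
]
print(randomize_trace(trace))
```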

📝 Abstract
In human cognitive theory, thinking is governed by two systems: the fast and intuitive System 1 and the slower but more deliberative System 2. Recent studies have shown that incorporating the System 2 process into Transformers, including large language models (LLMs), significantly enhances their reasoning capabilities. Nevertheless, models that purely resemble System 2 thinking require substantially higher computational costs and are much slower to respond. To address this challenge, we present Dualformer, a single Transformer model that seamlessly integrates both the fast and slow reasoning modes. Dualformer is obtained by training on data with randomized reasoning traces, where different parts of the traces are dropped during training. The dropping strategies are specifically tailored to the trace structure, analogous to analyzing our thinking process and creating shortcuts with patterns. At inference time, our model can be configured to output only the solutions (fast mode), both the reasoning chain and the final solution (slow mode), or to automatically decide which mode to engage (auto mode). In all cases, Dualformer outperforms the corresponding baseline models in both performance and computational efficiency: (1) in slow mode, Dualformer optimally solves unseen 30×30 maze navigation tasks 97.6% of the time, surpassing the Searchformer baseline (trained on data with complete reasoning traces) performance of 93.3%, while using 45.5% fewer reasoning steps; (2) in fast mode, Dualformer completes those tasks with an 80% optimal rate, significantly outperforming the Solution-Only model (trained on solution-only data), which has an optimal rate of only 30%. For math problems, our technique also improves performance when used for LLM fine-tuning, showing that it generalizes beyond task-specific models.
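The three inference modes in the abstract amount to constraining how decoding begins. Below is a minimal sketch under assumed conventions: the output vocabulary is taken to delimit the reasoning trace and the final solution with special tokens, here called <trace> and <plan>, and the `tokenizer.token_id` and `model.generate` calls are hypothetical stand-ins, not the released API.

```python
def dualformer_generate(model, tokenizer, prompt_ids, mode="auto"):
    """Decode in fast, slow, or auto mode by seeding the output prefix."""
    if mode == "fast":
        # Force decoding to start at the solution: the trace is skipped.
        prefix = prompt_ids + [tokenizer.token_id("<plan>")]
    elif mode == "slow":
        # Force decoding to start with the reasoning trace.
        prefix = prompt_ids + [tokenizer.token_id("<trace>")]
    elif mode == "auto":
        # Let the model itself decide whether to emit a trace first.
        prefix = prompt_ids
    else:
        raise ValueError(f"unknown mode: {mode}")
    return model.generate(prefix)  # hypothetical autoregressive decode
```

Because the same network serves all three modes, switching costs nothing at inference time; only the forced prefix changes.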
Problem

Research questions and friction points this paper is trying to address.

How to integrate fast and slow reasoning modes within a single Transformer
How to cut the computational cost of explicit System 2 reasoning without sacrificing accuracy
How to enhance the reasoning capabilities of large language models efficiently
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unifies fast and slow reasoning modes in one Transformer
Trains on randomized reasoning traces with structure-aware dropping strategies
Exposes configurable inference modes (fast, slow, auto) for controllable efficiency