FReM: A Flexible Reasoning Mechanism for Balancing Quick and Slow Thinking in Long-Context Question Answering

📅 2025-03-29
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In long-context question answering (LCQA), large language models (LLMs) often rely either on “fast thinking” (shallow pattern matching), which yields insufficient logical reasoning, or on “slow thinking” (exhaustive step-by-step inference), which incurs redundant computation and degraded efficiency. Method: We propose a dynamic multi-granularity reasoning mechanism that adaptively schedules reasoning depth according to question complexity. We introduce the first explicit chain-of-thought (CoT) injection strategy guided by synthetically generated reference QA pairs, enabling synergistic fast–slow thinking. The approach integrates synthetic data augmentation, dynamic path control, and multi-granularity scheduling. Contribution/Results: Evaluated on seven QA benchmarks, the method significantly improves accuracy on multi-hop questions, reduces average inference overhead by 32%, and scales better to long texts, overcoming the limitations of fixed-depth reasoning paradigms.
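The complexity-gated routing described above can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the overlap-counting complexity heuristic, the `hop_threshold` parameter, and the function names are hypothetical stand-ins, not the authors' implementation, and the fast/slow answerers are passed in as plain callables rather than real LLM calls.

```python
import re

def estimate_complexity(question: str, reference_qa: list[tuple[str, str]]) -> int:
    """Crude complexity proxy (illustrative assumption): count how many
    synthetic reference QA pairs the question lexically overlaps with.
    Multi-hop questions tend to touch several reference facts."""
    q_tokens = set(re.findall(r"\w+", question.lower()))
    hops = 0
    for ref_q, ref_a in reference_qa:
        ref_tokens = set(re.findall(r"\w+", (ref_q + " " + ref_a).lower()))
        if q_tokens & ref_tokens:
            hops += 1
    return hops

def schedule_reasoning(question, reference_qa, fast_answer, slow_answer,
                       hop_threshold: int = 2):
    """Route simple questions to quick thinking; escalate likely multi-hop
    questions to slow thinking, narrowed by a reference chain of thought."""
    hops = estimate_complexity(question, reference_qa)
    if hops < hop_threshold:
        # Quick path: shallow answer suffices for single-fact questions.
        return "quick", fast_answer(question)
    # Slow path: inject the reference answers as an explicit CoT scaffold
    # so the slow model does not explore the full reasoning space.
    cot = " -> ".join(answer for _, answer in reference_qa)
    return "slow", slow_answer(question, cot)

if __name__ == "__main__":
    refs = [("Who wrote Hamlet?", "Shakespeare"),
            ("Where was Shakespeare born?", "Stratford-upon-Avon")]
    fast = lambda q: "fast-stub"
    slow = lambda q, cot: "slow-stub"
    print(schedule_reasoning("Who wrote Hamlet?", refs, fast, slow)[0])
    print(schedule_reasoning("Where was the author of Hamlet born?",
                             refs, fast, slow)[0])
```

Here the single-fact question overlaps only one reference pair and stays on the quick path, while the bridging question touches both pairs and is escalated to slow thinking with the reference chain of thought attached.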

📝 Abstract
Long-context question-answering (LCQA) systems have greatly benefited from the powerful reasoning capabilities of large language models (LLMs), which can be categorized into slow and quick reasoning modes. However, both modes have limitations. Slow thinking tends to explore every possible reasoning path, which leads to heavy overthinking and wasted time. Quick thinking usually relies on pattern matching rather than a genuine understanding of the query logic, and thus misses the question's intent. To address these issues, we propose FReM: Flexible Reasoning Mechanism, a method that adjusts reasoning depth according to the complexity of each question. Specifically, FReM leverages synthetic reference QA examples to provide an explicit chain of thought, enabling efficient handling of simple queries while allowing deeper reasoning for more complex ones. By doing so, FReM helps quick-thinking models move beyond superficial pattern matching and narrows the reasoning space for slow-thinking models to avoid unnecessary exploration. Experiments on seven QA datasets show that FReM improves reasoning accuracy and scalability, particularly on complex multi-hop questions, indicating its potential to advance LCQA methodologies.
Problem

Research questions and friction points this paper is trying to address.

Balancing quick and slow reasoning in LCQA systems
Reducing overthinking in slow reasoning modes
Avoiding superficial pattern matching in quick reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adjusts reasoning depth by question complexity
Uses synthetic reference QA examples
Balances quick and slow thinking modes
Zhengyi Zhao
The Chinese University of Hong Kong
Natural Language Processing · Machine Learning · Information Extraction
Shubo Zhang
University of International Relations
Zezhong Wang
Institute of Science Tokyo
VLSI physical design
Bin Liang
The Chinese University of Hong Kong
Binyang Li
University of International Relations
Kam-Fai Wong
The Chinese University of Hong Kong