Improving Complex Reasoning with Dynamic Prompt Corruption: A soft prompt Optimization Approach

📅 2025-03-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing prompt tuning (PT) methods often suffer from performance degradation in complex reasoning tasks due to error propagation induced by accumulated soft prompt information—particularly in later reasoning steps of deep models. This work is the first to identify and characterize the dual nature of soft prompts in multi-step reasoning. We propose Dynamic Prompt Contamination Suppression (DPC), a novel method that employs a reasoning-path-aware dynamic discrimination mechanism to selectively mask harmful soft prompt tokens, thereby enabling real-time, layer-wise modulation of their influence on attention mechanisms. Evaluated on rigorous benchmarks—including GSM8K, MATH, and AQuA—DPC consistently outperforms standard PT, achieving absolute accuracy gains of 4–8%. These results demonstrate substantial improvements in model robustness and generalization for challenging mathematical and logical reasoning tasks.

Technology Category

Application Category

📝 Abstract
Prompt-tuning (PT) for large language models (LLMs) can facilitate the performance on various conventional NLP tasks with significantly fewer trainable parameters. However, our investigation reveals that PT provides limited improvement and may even degrade the primitive performance of LLMs on complex reasoning tasks. Such a phenomenon suggests that soft prompts can positively impact certain instances while negatively affecting others, particularly during the later phases of reasoning. To address these challenges, We first identify an information accumulation within the soft prompts. Through detailed analysis, we demonstrate that this phenomenon is often accompanied by erroneous information flow patterns in the deeper layers of the model, which ultimately lead to incorrect reasoning outcomes. we propose a novel method called extbf{D}ynamic extbf{P}rompt extbf{C}orruption (DPC) to take better advantage of soft prompts in complex reasoning tasks, which dynamically adjusts the influence of soft prompts based on their impact on the reasoning process. Specifically, DPC consists of two stages: Dynamic Trigger and Dynamic Corruption. First, Dynamic Trigger measures the impact of soft prompts, identifying whether beneficial or detrimental. Then, Dynamic Corruption mitigates the negative effects of soft prompts by selectively masking key tokens that interfere with the reasoning process. We validate the proposed approach through extensive experiments on various LLMs and reasoning tasks, including GSM8K, MATH, and AQuA. Experimental results demonstrate that DPC can consistently enhance the performance of PT, achieving 4%-8% accuracy gains compared to vanilla prompt tuning, highlighting the effectiveness of our approach and its potential to enhance complex reasoning in LLMs.
Problem

Research questions and friction points this paper is trying to address.

Limited improvement of prompt-tuning in complex reasoning tasks
Erroneous information flow in deeper model layers affects reasoning
Dynamic Prompt Corruption enhances reasoning by adjusting soft prompts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Prompt Corruption optimizes soft prompts.
DPC adjusts soft prompt influence dynamically.
Selective token masking improves reasoning accuracy.
🔎 Similar Papers
No similar papers found.
S
Sinan Fan
Hangzhou YunQi Academy of Engineering, Zhejiang University
Liang Xie
Liang Xie
Wuhan University of Technology
Time Series ForecastingCross-modal Learning
C
Chen Shen
Alibaba Cloud Computing
G
Ge Teng
Alibaba Cloud Computing
Xiaosong Yuan
Xiaosong Yuan
Jilin University | Alibaba Group
NLPLLMDeep Learning
X
Xiaofeng Zhang
Alibaba Cloud Computing
C
Chenxi Huang
Alibaba Cloud Computing
W
Wenxiao Wang
Zhejiang University, Alibaba Cloud Computing
Xiaofei He
Xiaofei He
Professor of Computer Science, Zhejiang University
machine learningcomputer visiondata mining
J
Jieping Ye
Alibaba Cloud Computing