Efficient OpAmp Adaptation for Zoom Attention to Golden Contexts

πŸ“… 2025-02-18
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
To address attention dispersion in LLMs, where noisy retrieved documents in retrieval-augmented generation (RAG) and long-context question answering pull focus away from critical information, this paper proposes the OpAmp Adapter, a lightweight plug-in module inspired by the common-mode rejection mechanism of operational amplifiers (OpAmps). It is the first work to bring the high common-mode rejection ratio (CMRR) principle from analog electronics into Transformer self-attention, enabling differential attention calibration without modifying the backbone architecture, and it achieves both strong CMRR and low computational overhead. Evaluated on noisy long-context benchmarks, Qwen2.5-OpAmp-72B outperforms DeepSeek-V3 and GPT-4o, with significant gains in both question-answering accuracy and attention alignment with the gold context.

πŸ“ Abstract
Large language models (LLMs) have shown significant promise in question-answering (QA) tasks, particularly in retrieval-augmented generation (RAG) scenarios and long-context applications. However, their performance is hindered by noisy reference documents, which often distract from essential information. Despite fine-tuning efforts, Transformer-based architectures struggle to prioritize relevant content. This is evidenced by their tendency to allocate disproportionate attention to irrelevant or later-positioned documents. Recent work proposes the differential attention mechanism to address this issue, but this mechanism is limited by an unsuitable common-mode rejection ratio (CMRR) and high computational costs. Inspired by the operational amplifier (OpAmp), we propose the OpAmp adaptation to address these challenges, which is implemented with adapters efficiently. By integrating the adapter into pre-trained Transformer blocks, our approach enhances focus on the golden context without costly training from scratch. Empirical evaluations on noisy-context benchmarks reveal that our Qwen2.5-OpAmp-72B model, trained with our OpAmp adaptation, surpasses the performance of state-of-the-art LLMs, including DeepSeek-V3 and GPT-4o.
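The differential attention idea the abstract builds on can be sketched in a few lines: two attention maps are computed, and the second is subtracted from the first so that "common-mode" mass spread across noisy documents cancels, analogous to an OpAmp rejecting the signal shared by both inputs. The sketch below is a minimal NumPy illustration under assumed details; the weight names, the scale `lam`, and the single-head shapes are illustrative, not taken from the paper, which implements the mechanism as adapters on a frozen pre-trained backbone rather than as a standalone layer.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def differential_attention(x, Wq1, Wk1, Wq2, Wk2, Wv, lam=0.5):
    """Differential attention sketch (single head, no masking).

    Two attention maps are formed from separate query/key projections;
    subtracting lam * a2 from a1 suppresses attention mass that both
    maps assign to irrelevant (common-mode) tokens.
    """
    d = Wq1.shape[1]
    a1 = softmax((x @ Wq1) @ (x @ Wk1).T / np.sqrt(d))  # primary map
    a2 = softmax((x @ Wq2) @ (x @ Wk2).T / np.sqrt(d))  # "noise" map
    return (a1 - lam * a2) @ (x @ Wv)
```

Because each softmax row sums to 1, every row of the combined map `a1 - lam * a2` sums to `1 - lam`, so `lam` directly controls how much common-mode attention is rejected; the adapter formulation in the paper adds such a calibration on top of frozen Transformer blocks instead of retraining them.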
Problem

Research questions and friction points this paper is trying to address.

Improve attention in LLMs
Reduce noise in reference documents
Enhance focus on relevant content
Innovation

Methods, ideas, or system contributions that make the work stand out.

OpAmp adaptation
efficient adapters
golden context focus
πŸ”Ž Similar Papers
No similar papers found.