LIFT: Improving Long Context Understanding of Large Language Models through Long Input Fine-Tuning

📅 2025-02-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) struggle with long-document understanding because their context windows are limited. This paper proposes Long Input Fine-Tuning (LIFT), a framework that, instead of expanding the context window, dynamically encodes the long input into model parameters, so that short-context LLMs can answer questions even when the required information is not loaded into the context at inference time. Its core contributions are: (1) a “parameterized input storage” paradigm, which stores the long input in model parameters through fine-tuning; and (2) Gated Memory, an attention adapter that automatically balances recall of the memorized long input against in-context learning (ICL) from the local context. The reported experiments show that LIFT improves the performance of short-context LLMs on long-document question answering and multi-hop reasoning while maintaining their original ICL capabilities. LIFT thus offers an approach to long-context modeling that applies to arbitrary short-context LLMs.

📝 Abstract
Long-context understanding remains challenging for large language models because of their limited context windows. This paper presents Long Input Fine-Tuning (LIFT), a novel framework for long-context modeling that can improve the long-context performance of arbitrary (short-context) LLMs by dynamically adapting model parameters based on the long input. Importantly, rather than endlessly extending the context window to accommodate ever longer inputs in context, LIFT stores and absorbs the long input in the model parameters. By fine-tuning the long input into model parameters, LIFT allows short-context LLMs to answer questions even when the required information is not provided in the context during inference. Furthermore, to enhance LIFT performance while maintaining the original in-context learning (ICL) capabilities, we introduce Gated Memory, a specialized attention adapter that automatically balances long-input memorization and ICL. We provide a comprehensive analysis of the strengths and limitations of LIFT on long-context understanding, offering valuable directions for future research.
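
To fine-tune a long input into a short-context model, the input first has to be cut into pieces that fit the context window. The abstract does not say how LIFT does this, so the sketch below is only an illustration under assumptions: the function name `split_into_segments`, the `window` and `overlap` parameters, and the use of overlapping segments are all hypothetical choices, not details confirmed by this page.

```python
def split_into_segments(tokens, window, overlap):
    """Split a token list into overlapping segments of at most `window` tokens.

    Hypothetical helper: the segmentation scheme is an assumption made for
    illustration, not the procedure described by the LIFT authors.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if not tokens:
        return []

    stride = window - overlap  # how far the window advances each step
    segments = []
    start = 0
    while True:
        segments.append(tokens[start:start + window])
        # Stop once the current window has reached the end of the input.
        if start + window >= len(tokens):
            break
        start += stride
    return segments


# Each segment could then serve as one language-modeling training example.
segments = split_into_segments(list(range(10)), window=4, overlap=2)
```

The overlap is meant to keep some shared context between neighbouring segments, so that content spanning a segment boundary still appears intact in at least one training example.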
Problem

Research questions and friction points this paper is trying to address.

How to improve long-context understanding in LLMs with limited context windows.
How to adapt model parameters dynamically to a long input.
How to balance long-input memorization with in-context learning.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Long Input Fine-Tuning (LIFT)
Gated Memory attention adapter
Dynamic model parameter adaptation
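
The idea of balancing memorized input against in-context learning can be pictured with a toy gate. This page does not give the Gated Memory formulation, so the following is a generic sigmoid-gated blend of two vectors, offered purely as an assumption-laden illustration: `gated_blend`, `memory_out`, `context_out`, and `gate_logit` are hypothetical names, and the real adapter operates inside attention and may differ substantially.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_blend(memory_out, context_out, gate_logit):
    """Blend two equal-length vectors with a scalar sigmoid gate.

    Assumption for illustration only: a single scalar gate, where a gate
    near 1 favours the memory branch and a gate near 0 favours the
    in-context branch. This is not the paper's Gated Memory adapter.
    """
    if len(memory_out) != len(context_out):
        raise ValueError("both branches must have the same length")
    g = sigmoid(gate_logit)
    return [g * m + (1.0 - g) * c for m, c in zip(memory_out, context_out)]


# A gate logit of 0 weights both branches equally.
mixed = gated_blend([1.0, 0.0], [0.0, 1.0], gate_logit=0.0)
```

In a trained model the gate value would be learned or computed from the input rather than passed in by hand; a strongly negative logit in this sketch simply returns the in-context branch almost unchanged.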

Yansheng Mao
Institute for Artificial Intelligence, Peking University

Yufei Xu
Institute for Artificial Intelligence, Peking University; State Key Laboratory of General Artificial Intelligence, BIGAI

Jiaqi Li
State Key Laboratory of General Artificial Intelligence, BIGAI

Fanxu Meng
Institute for Artificial Intelligence, Peking University; State Key Laboratory of General Artificial Intelligence, BIGAI

Haotong Yang
Peking University
Machine learning, Large language model, Knowledge graph, Interpretable machine learning

Zilong Zheng
State Key Laboratory of General Artificial Intelligence, BIGAI

Xiyuan Wang
Institute for Artificial Intelligence, Peking University

Muhan Zhang
Peking University
Machine Learning, Graph Neural Network, Large Language Models