Beyond In-Context Learning: Aligning Long-form Generation of Large Language Models via Task-Inherent Attribute Guidelines

📅 2025-06-02
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the difficulty of teaching task-specific language and format distributions through in-context learning (ICL) in long-form generation tasks (e.g., summarization), this paper proposes LongGuide, a dual-stream guidance mechanism. *Metric Guidelines* (MGs) instruct the model to optimize self-evaluated quality metrics, while *Output Constraint Guidelines* (OCGs) constrain generation at both the token and sentence level. LongGuide generates the two guideline streams in parallel, searches over their combinations to select the best one, can be learned by weaker models to enhance stronger ones, and integrates seamlessly with automatic prompt optimizers. Empirically, it yields consistent gains of over 5% on strong open- and closed-source large language models in both zero- and few-shot settings, while demonstrating strong generalizability and plug-and-play compatibility.

📝 Abstract
In-context learning (ICL) is an important yet not fully understood ability of pre-trained large language models (LLMs). It can greatly enhance task performance using a few examples, termed demonstrations, without fine-tuning. Although effective in question answering, ICL often underperforms in long-form generation tasks such as summarization. Under appropriately realistic assumptions, we empirically and theoretically show that ICL demonstrations alone are insufficient to teach LLMs the task language and format distributions for generation. We argue for explicit exposure to the task distributions and hypothesize that defining them by prompting enhances model performance. To this end, we present LongGuide, which efficiently generates two parallel streams of guidelines capturing task language and format properties: (i) Metric Guidelines (MGs) that instruct models to optimize self-evaluated metrics; and (ii) Output Constraint Guidelines (OCGs) that constrain generation at both token and sentence levels. LongGuide automatically selects the best combination of guidelines, improving both strong open- and closed-source LLMs by over 5% in both zero- and few-shot settings. We show that LongGuide is generalizable, learnable by weak models to enhance strong ones, and integrates synergistically with automatic prompt optimizers.
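The selection step described in the abstract can be sketched as a small combinatorial search: generate candidate MGs and OCGs, score each combination on a held-out development set, and keep the best-scoring subset. The guideline strings and the toy scorer below are illustrative assumptions, not the paper's actual prompts, metrics, or search procedure.

```python
from itertools import chain, combinations

# Hypothetical guideline strings standing in for the two streams
# (Metric Guidelines and Output Constraint Guidelines).
MGS = ["Maximize fluency and coverage of key facts."]
OCGS = ["Answer in at most 3 sentences.", "Use at most 60 tokens."]

def powerset(items):
    """All subsets of the candidate guidelines, including the empty set."""
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

def select_guidelines(candidates, score_fn):
    """Return the guideline combination with the highest dev-set score."""
    best, best_score = (), float("-inf")
    for combo in powerset(candidates):
        s = score_fn(combo)
        if s > best_score:
            best, best_score = combo, s
    return best, best_score

# Toy scorer: a stand-in for evaluating generations on a dev set.
# Here it simply rewards including each guideline.
def toy_score(combo):
    return sum(1.0 for g in combo if g in MGS) + sum(0.5 for g in combo if g in OCGS)

best, score = select_guidelines(MGS + OCGS, toy_score)
```

In practice the scorer would run the LLM with each candidate guideline combination prepended to the task prompt and measure output quality, so the exhaustive powerset search is only tractable because the number of candidate guidelines is small.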
Problem

Research questions and friction points this paper is trying to address.

ICL demonstrations alone are insufficient for long-form generation tasks
Models need explicit exposure to task language and format distributions
Guideline-based prompting is hypothesized to close this gap
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates parallel streams of task-language and format guidelines
Automatically selects the best guideline combination
Improves strong open- and closed-source LLMs by over 5% in zero- and few-shot settings