Improving Attributed Long-form Question Answering with Intent Awareness

πŸ“… 2026-03-28
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work addresses the challenge that large language models often produce lower-quality knowledge-intensive long-form reports because they neglect the implicit writing and citation intentions of human authors. To mitigate this, the study introduces an intent-aware mechanism into long-form question answering for the first time, explicitly modeling such intentions through a structured tagging schema. The approach combines zero-shot reasoning, synthetic data generation, and fine-tuning of smaller models to incorporate intent information into the generation process. Experiments demonstrate consistent gains across multiple scientific report generation tasks: large models improve by an average of 2.9 absolute percentage points, while smaller models show a more pronounced gain of 12.3 points. The method also substantially improves both citation appropriateness and overall report readability.
πŸ“ Abstract
Large language models (LLMs) are increasingly being used to generate comprehensive, knowledge-intensive reports. However, while these models are trained on diverse academic papers and reports, they are not exposed to the reasoning processes and intents that guide authors in crafting these documents. We hypothesize that enhancing a model's intent awareness can significantly improve the quality of generated long-form reports. We develop and employ structured, tag-based schemes to better elicit underlying implicit intents to write or cite. We demonstrate that these extracted intents enhance both zero-shot generation capabilities in LLMs and enable the creation of high-quality synthetic data for fine-tuning smaller models. Our experiments reveal improved performance across various challenging scientific report generation tasks, with an average improvement of +2.9 and +12.3 absolute points for large and small models over baselines, respectively. Furthermore, our analysis illuminates how intent awareness enhances model citation usage and substantially improves report readability.
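To make the idea of a structured, tag-based intent scheme concrete, the sketch below shows one hypothetical way such a schema could annotate citations and feed an intent-aware generation prompt. The tag names (BACKGROUND, METHOD, COMPARISON, RESULT) and all function names here are illustrative assumptions, not the paper's actual schema or code.

```python
# Hypothetical sketch of a tag-based citation-intent schema for attributed
# long-form QA. Tag names and helpers are illustrative, not from the paper.

INTENT_TAGS = {
    "BACKGROUND": "cite to establish prior context",
    "METHOD": "cite to borrow or adapt a technique",
    "COMPARISON": "cite to contrast with another approach",
    "RESULT": "cite to support an empirical claim",
}

def tag_citation(sentence: str, citation_key: str, intent: str) -> str:
    """Attach an explicit intent tag to a sentence's citation."""
    if intent not in INTENT_TAGS:
        raise ValueError(f"unknown intent tag: {intent}")
    return f"{sentence} [{intent}: {citation_key}]"

def build_intent_prompt(question: str, tagged_evidence: list[str]) -> str:
    """Assemble an intent-aware prompt from intent-tagged evidence sentences."""
    schema = "\n".join(f"[{tag}] = {desc}" for tag, desc in INTENT_TAGS.items())
    body = "\n".join(tagged_evidence)
    return (
        f"Citation intent schema:\n{schema}\n\n"
        f"Question: {question}\n"
        f"Evidence with intents:\n{body}\n"
        "Write a report; every citation must match its tagged intent."
    )
```

A prompt built this way could be used directly for zero-shot generation with a large model, or the tagged outputs could serve as synthetic training data for fine-tuning a smaller model, mirroring the two settings the abstract describes.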
Problem

Research questions and friction points this paper is trying to address.

intent awareness
long-form question answering
citation intent
scientific report generation
attributed generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

intent awareness
structured tagging
long-form question answering
synthetic data generation
citation reasoning