GenerationPrograms: Fine-grained Attribution with Executable Programs

📅 2025-06-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current large language models (LLMs) lack fine-grained, verifiable attribution in source-conditioned generation, which undermines the credibility and interpretability of their outputs; existing approaches also fail to reveal how models use source documents to produce responses. To address this, we propose GenerationPrograms, a two-stage executable program framework: first, planning a modular program composed of explicit text operations (e.g., paraphrasing, compression, fusion); second, executing the program to generate the response. Decoupling generation into planning and execution enables attribution-based interpretability, local editability, and post-hoc correction. Inspired by code-agent paradigms, we design structured text operators and a program-driven execution engine. Experiments on two long-form question-answering tasks and a multi-document summarization task show significant improvements in both document-level and sentence-level attribution quality. Used as a post-hoc attribution method, the approach also outperforms traditional techniques, and its interpretable programs support module-level refinement.

📝 Abstract
Recent large language models (LLMs) achieve impressive performance in source-conditioned text generation but often fail to correctly provide fine-grained attributions for their outputs, undermining verifiability and trust. Moreover, existing attribution methods do not explain how and why models leverage the provided source documents to generate their final responses, limiting interpretability. To overcome these challenges, we introduce a modular generation framework, GenerationPrograms, inspired by recent advancements in executable "code agent" architectures. Unlike conventional generation methods that simultaneously generate outputs and attributions or rely on post-hoc attribution, GenerationPrograms decomposes the process into two distinct stages: first, creating an executable program plan composed of modular text operations (such as paraphrasing, compression, and fusion) explicitly tailored to the query, and second, executing these operations following the program's specified instructions to produce the final response. Empirical evaluations demonstrate that GenerationPrograms significantly improves attribution quality at both the document level and sentence level across two long-form question-answering tasks and a multi-document summarization task. We further demonstrate that GenerationPrograms can effectively function as a post-hoc attribution method, outperforming traditional techniques in recovering accurate attributions. In addition, the interpretable programs generated by GenerationPrograms enable localized refinement through modular-level improvements that further enhance overall attribution quality.
Problem

Research questions and friction points this paper is trying to address.

Improving fine-grained attribution in LLM outputs
Enhancing interpretability of source document usage
Modular generation for better attribution quality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular generation framework for fine-grained attribution
Executable program plan with tailored text operations
Two-stage process: program creation then execution
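
The two-stage idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Call` type, the `execute` and `run` helpers, the sentence IDs, and the example sentences are all hypothetical, and the three operations are plain string functions standing in for the LLM-backed modules the paper describes. The sketch only shows how a program over source sentences yields both the output text and its attribution.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

# Hypothetical stand-ins for LLM-backed text modules (assumption: a real
# system would call a model here; simple string functions keep this runnable).
def paraphrase(texts: List[str]) -> str:
    return " ".join(texts[0].split())  # only normalizes whitespace

def compress(texts: List[str]) -> str:
    # Keep the clause before the first comma.
    return texts[0].split(",")[0].strip().rstrip(".") + "."

def fuse(texts: List[str]) -> str:
    return " ".join(t.strip() for t in texts)

OPS: Dict[str, Callable[[List[str]], str]] = {
    "paraphrase": paraphrase,
    "compress": compress,
    "fuse": fuse,
}

@dataclass(frozen=True)
class Call:
    """One program step: an operation applied to source IDs or nested calls."""
    op: str
    args: Tuple[Union["Call", str], ...]

def execute(node: Union[Call, str], sources: Dict[str, str]):
    """Stage 2: run a step, returning (text, set of source sentence IDs used)."""
    if isinstance(node, str):  # leaf: a source sentence ID
        return sources[node], frozenset({node})
    results = [execute(arg, sources) for arg in node.args]
    texts = [text for text, _ in results]
    cited = frozenset().union(*(ids for _, ids in results))
    return OPS[node.op](texts), cited

def run(program: List[Call], sources: Dict[str, str]):
    """Execute one program line per output sentence, with its citations."""
    return [(text, sorted(ids)) for text, ids in
            (execute(step, sources) for step in program)]

# Made-up source sentences, keyed by hypothetical document/sentence IDs.
sources = {
    "D1.S2": "The committee approved the budget, after three rounds of debate.",
    "D2.S1": "Funding starts in January.",
}

# Stage 1 would be produced by a planner; here the program is written by hand.
program = [Call("fuse", (Call("compress", ("D1.S2",)),
                         Call("paraphrase", ("D2.S1",))))]

response = run(program, sources)
for sentence, citations in response:
    print(sentence, citations)
```

Because every output sentence is produced by an explicit program line, its sentence-level citations are simply the leaves of that line, and a single step can be replaced or re-run without regenerating the rest of the response.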