On Synthesizing Data for Context Attribution in Question Answering

📅 2025-02-21

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language model (LLM)-based question answering can hallucinate, producing answers that lack verifiable support in the provided text. To address this, the paper proposes SynQA, a generative data synthesis strategy for context attribution. Given selected context sentences, an LLM generates question-answer pairs that those sentences support, so every synthetic example carries a clear attribution path. The paper systematically compares zero-shot inference, LLM ensembling, and fine-tuning of small language models on the synthetic data, and reports that SynQA data is highly effective for fine-tuning small LMs for context attribution across QA tasks and domains. A user study further validates the usefulness of these fine-tuned small LMs for context attribution in QA.

📝 Abstract

Question Answering (QA) accounts for a significant portion of LLM usage "in the wild". However, LLMs sometimes produce false or misleading responses, also known as "hallucinations". Therefore, grounding the generated answers in contextually provided information -- i.e., providing evidence for the generated text -- is paramount for LLMs' trustworthiness. Providing this information is the task of context attribution. In this paper, we systematically study LLM-based approaches for this task, namely we investigate (i) zero-shot inference, (ii) LLM ensembling, and (iii) fine-tuning of small LMs on synthetic data generated by larger LLMs. Our key contribution is SynQA: a novel generative strategy for synthesizing context attribution data. Given selected context sentences, an LLM generates QA pairs that are supported by these sentences. This leverages LLMs' natural strengths in text generation while ensuring clear attribution paths in the synthetic training data. We show that the attribution data synthesized via SynQA is highly effective for fine-tuning small LMs for context attribution in different QA tasks and domains. Finally, with a user study, we validate the usefulness of small LMs (fine-tuned on synthetic data from SynQA) in context attribution for QA.
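
The synthesis step in the abstract (select context sentences first, then have an LLM write a QA pair that they support) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The prompt wording, the random selection of sentences, and the names AttributionExample, build_prompt, synthesize_example, and generate_qa are all assumptions made for illustration; generate_qa stands in for whatever LLM call is actually used.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class AttributionExample:
    question: str
    answer: str
    context: List[str]   # all context sentences, in document order
    support: List[int]   # indices of the sentences the QA pair was generated from


def build_prompt(selected: List[str]) -> str:
    # Hypothetical prompt wording; the paper's actual prompt is not given here.
    joined = "\n".join(f"- {s}" for s in selected)
    return (
        "Write one question and its answer such that the answer is "
        "supported by the sentences below.\n"
        f"{joined}"
    )


def synthesize_example(
    context: List[str],
    generate_qa: Callable[[str], Tuple[str, str]],
    k: int = 2,
    rng: Optional[random.Random] = None,
) -> AttributionExample:
    """Pick k sentences, then ask the generator for a QA pair grounded in them.

    Because the supporting sentences are chosen before generation, their
    indices serve as the attribution label without any extra annotation.
    Uniform random selection is an assumption of this sketch.
    """
    rng = rng or random.Random()
    k = min(k, len(context))
    support = sorted(rng.sample(range(len(context)), k))
    prompt = build_prompt([context[i] for i in support])
    question, answer = generate_qa(prompt)
    return AttributionExample(question, answer, context, support)
```

In this sketch, a small LM would then be fine-tuned to map (question, answer, context) to the support indices. Choosing the evidence before generating the QA pair is what yields the clear attribution path the abstract mentions.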

Problem

Research questions and friction points this paper is trying to address.

Reducing hallucinations in LLM-generated QA responses

Grounding answers with contextual evidence for trustworthiness

Synthesizing data to train small LMs for attribution

Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic data generation for QA context attribution

Fine-tuning small LMs with LLM-generated synthetic data

Ensembling and zero-shot inference for context attribution
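
One simple way to picture LLM ensembling for context attribution is a vote over the sentence indices each model marks as evidence. The page does not say how the paper combines models, so the majority-vote rule below is an assumption, and ensemble_attributions is a hypothetical helper rather than anything from the paper.

```python
from collections import Counter
from typing import Iterable, List, Optional, Set


def ensemble_attributions(
    predictions: Iterable[Set[int]],
    min_votes: Optional[int] = None,
) -> List[int]:
    """Combine per-model sets of supporting-sentence indices by voting.

    A sentence index is kept if at least `min_votes` models selected it.
    The default threshold is a strict majority (an assumption of this sketch).
    """
    preds = list(predictions)
    if not preds:
        return []
    threshold = min_votes if min_votes is not None else len(preds) // 2 + 1
    # Each model contributes at most one vote per index because inputs are sets.
    votes = Counter(i for p in preds for i in p)
    return sorted(i for i, count in votes.items() if count >= threshold)
```

For example, with three models predicting {0, 2}, {2}, and {1, 2}, only sentence 2 reaches the two-vote majority, so the call returns [2].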

🔎 Similar Papers

Benchmarking Large Language Models in Complex Question Answering Attribution using Knowledge Graphs