Sample, Align, Synthesize: Graph-Based Response Synthesis with ConGrs

📅 2025-10-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing methods struggle to efficiently integrate the cognitive diversity and semantic signals spread across long-form responses produced by sampling an LLM multiple times. To address this, the paper proposes Consensus Graphs (ConGrs), a directed-acyclic-graph data structure inspired by bioinformatics. ConGrs model shared semantics and divergent reasoning paths across responses through lightweight sequence alignment, and combine targeted secondary-LLM adjudication with task-adaptive decoding for robust knowledge fusion and response generation. The approach reduces reliance on LLM-based adjudication by over 80% while delivering substantial improvements: up to a 31% gain in factual precision for biography generation, up to 6 percentage points higher accuracy in mathematical reasoning, and up to a 56% increase in abstention rate on unanswerable questions. The framework demonstrates strong scalability and robustness across diverse reasoning tasks and model scales.

📝 Abstract
Language models can be sampled multiple times to access the distribution underlying their responses, but existing methods cannot efficiently synthesize rich epistemic signals across different long-form responses. We introduce Consensus Graphs (ConGrs), a flexible DAG-based data structure that represents shared information, as well as semantic variation in a set of sampled LM responses to the same prompt. We construct ConGrs using a light-weight lexical sequence alignment algorithm from bioinformatics, supplemented by the targeted usage of a secondary LM judge. Further, we design task-dependent decoding methods to synthesize a single, final response from our ConGr data structure. Our experiments show that synthesizing responses from ConGrs improves factual precision on two biography generation tasks by up to 31% over an average response and reduces reliance on LM judges by more than 80% compared to other methods. We also use ConGrs for three refusal-based tasks requiring abstention on unanswerable queries and find that abstention rate is increased by up to 56%. We apply our approach to the MATH and AIME reasoning tasks and find an improvement over self-verification and majority vote baselines by up to 6 points of accuracy. We show that ConGrs provide a flexible method for capturing variation in LM responses and using the epistemic signals provided by response variation to synthesize more effective responses.
Problem

Research questions and friction points this paper is trying to address.

Synthesizing epistemic signals from multiple language model responses
Improving factual precision in biography generation tasks
Enhancing refusal capability for unanswerable queries
Innovation

Methods, ideas, or system contributions that make the work stand out.

Constructs consensus graphs using sequence alignment algorithm
Uses secondary LM judge for targeted information supplementation
Designs task-dependent decoding for final response synthesis
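The paper's actual alignment and decoding algorithms are not reproduced here, but the core idea — align sampled responses token-by-token, merge shared spans into consensus slots, and decode by majority agreement — can be sketched in a few lines. The sketch below is illustrative only: it uses Python's `difflib.SequenceMatcher` as a stand-in for the bioinformatics-style sequence alignment the paper uses, and the function names (`build_congr`, `consensus_decode`) are hypothetical.

```python
# Toy consensus-graph sketch, NOT the paper's algorithm.
# Each sampled response is aligned to a reference response; tokens that align
# to the same position are merged into one "slot" (a consensus node), and a
# majority-vote decode emits a single synthesized response.
from collections import Counter
import difflib


def build_congr(responses):
    """Map each reference-token position to a Counter of aligned candidates."""
    ref = responses[0].split()
    slots = [Counter() for _ in ref]
    for resp in responses:
        toks = resp.split()
        sm = difflib.SequenceMatcher(a=ref, b=toks, autojunk=False)
        for op, i1, i2, j1, j2 in sm.get_opcodes():
            if op in ("equal", "replace"):
                # zip truncates unequal replace spans; good enough for a toy
                for i, j in zip(range(i1, i2), range(j1, j2)):
                    slots[i][toks[j]] += 1
            # 'insert'/'delete' spans are divergences with no aligned slot;
            # the real ConGr keeps them as branch nodes, this sketch drops them
    return ref, slots


def consensus_decode(responses, min_support=None):
    """Keep a token at a position only if a majority of samples agrees."""
    _, slots = build_congr(responses)
    thresh = min_support if min_support is not None else len(responses) // 2 + 1
    out = []
    for counter in slots:
        if counter:
            tok, cnt = counter.most_common(1)[0]
            if cnt >= thresh:
                out.append(tok)
    return " ".join(out)
```

For example, given three sampled biographies where one hallucinates a birth year and city, `consensus_decode` keeps only the majority-supported tokens; the paper's task-dependent decoders and targeted LM-judge calls replace this naive majority rule.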
Sayan Ghosh
Thomas Lord Department of Computer Science, University of Southern California
Shahzaib Saqib Warraich
Thomas Lord Department of Computer Science, University of Southern California
Dhruv Tarsadiya
Thomas Lord Department of Computer Science, University of Southern California
Gregory Yauney
Cornell University
Machine Learning, Digital Humanities
Swabha Swayamdipta
University of Southern California
Natural Language Processing, Machine Learning