🤖 AI Summary
To address the distribution distortion and excessive computational overhead caused by post-hoc processing in LLM constrained generation, this paper proposes Approximate Alignment Decoding (AAD), a lightweight decoding algorithm that requires neither auxiliary sampling nor fine-tuning. AAD explicitly balances distribution fidelity against computational efficiency during decoding via logit correction and dynamic threshold alignment, achieving a Pareto-optimal trade-off between these objectives for the first time in constrained decoding. Experiments on highly constrained tasks, including factual consistency, show that AAD matches or exceeds distortion-free baselines (+3.2% accuracy), reduces KL divergence by 67%, and accelerates inference by 5.8×. Crucially, AAD is model-agnostic, requires no architectural modifications, and remains stable over long-sequence generation across arbitrary autoregressive language models.
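The paper does not include code, but the logit-masking step that constrained decoders build on can be sketched concretely. The snippet below is a minimal illustration, not the paper's actual AAD algorithm: it shows the per-step operation of zeroing out constraint-violating tokens and renormalizing, which satisfies constraints locally but, applied greedily, is exactly the source of the sequence-level distortion that methods like this one aim to reduce. The function name `constrained_step` and the toy vocabulary are assumptions for illustration.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D logit vector."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def constrained_step(logits: np.ndarray, allowed: np.ndarray) -> np.ndarray:
    """One step of naive constrained decoding (hypothetical helper).

    Disallowed tokens get -inf logits, so after softmax they have zero
    probability and the remaining mass is renormalized over the allowed
    tokens. Relative probabilities among allowed tokens are preserved at
    this step, but chaining such steps greedily can amplify sequences
    that were unlikely under the unconstrained model.
    """
    masked = np.where(allowed, logits, -np.inf)
    return softmax(masked)

# Toy vocabulary of 4 tokens; token 2 violates the constraint.
probs = constrained_step(
    np.array([2.0, 1.0, 5.0, 0.0]),
    np.array([True, True, False, True]),
)
```

After masking, `probs[2]` is exactly zero, the distribution still sums to one, and the odds ratio between any two allowed tokens matches the unconstrained model, which is why per-step masking alone cannot correct the global distribution.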
📝 Abstract
It is common to reject undesired outputs of Large Language Models (LLMs); however, current methods to do so either require an excessive amount of computation or severely distort the distribution of outputs. We present a method to balance the distortion of the output distribution with computational efficiency, allowing for the generation of long sequences of text with difficult-to-satisfy constraints, with less amplification of low-probability outputs than existing methods. We show through a series of experiments that the task-specific performance of our method is comparable to methods that do not distort the output distribution, while being much more computationally efficient.