Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation

๐Ÿ“… 2024-08-24
๐Ÿ›๏ธ arXiv.org
๐Ÿ“ˆ Citations: 2
โœจ Influential: 1
๐Ÿค– AI Summary
To address the challenge of simultaneously ensuring safety and diversity in open-ended generation by large language models (LLMs), this paper proposes a context-aware dynamic truncation sampling optimization method. The core contribution is twofold: first, it introduces a prefix-tree structure built over complete-sentence contexts to quantitatively model truncation sampling capacity; second, it establishes a joint diversityโ€“risk evaluation framework that enables interpretable, adaptive co-selection of temperature and top-k/top-p parameters. Experimental results on multiple open-generation benchmarks demonstrate significant improvements in generation stability: high-risk repetition and hallucination rates decrease by 12.7%, while the Dist-2 diversity metric improves by 3.2% over baseline methods. The approach thus achieves a principled trade-off between safety and expressiveness without compromising either dimension.
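The summary above refers to the co-selection of temperature and top-k/top-p truncation parameters. As a minimal sketch of that standard sampling pipeline (parameter names follow common usage, not the paper's implementation):

```python
import numpy as np

def truncation_sample(logits, temperature=1.0, top_k=0, top_p=1.0, rng=None):
    """Temperature scaling followed by top-k / top-p (nucleus) tail truncation.

    Illustrative sketch only: the paper studies how to choose these
    parameters jointly, not this particular implementation.
    """
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(logits - logits.max())        # stable softmax
    probs /= probs.sum()

    order = np.argsort(probs)[::-1]              # tokens sorted by probability
    sorted_p = probs[order]

    keep = np.ones_like(sorted_p, dtype=bool)
    if top_k > 0:
        keep[top_k:] = False                     # top-k: keep k most likely tokens
    if top_p < 1.0:
        cum = np.cumsum(sorted_p)
        # top-p: keep the smallest prefix whose cumulative mass reaches top_p
        keep &= np.concatenate(([True], cum[:-1] < top_p))

    trunc = np.where(keep, sorted_p, 0.0)
    trunc /= trunc.sum()                         # renormalize over surviving tokens
    return int(order[rng.choice(len(trunc), p=trunc)])
```

For example, `top_k=1` reduces to greedy decoding, while a very small `top_p` also collapses onto the most likely token; larger values admit more of the distribution's tail, which is exactly the diversity-versus-risk dial the paper analyzes.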

๐Ÿ“ Abstract
Sampling-based decoding strategies have been widely adopted for Large Language Models (LLMs) in numerous applications, targeting a balance between diversity and quality via temperature tuning and tail truncation. Considering the strong dependency of the candidate next tokens on different prefixes, recent studies propose to adaptively truncate the tail of LLMs' predicted distribution. Although improved results have been reported with these methods on open-ended text generation tasks, the results are highly dependent on the curated parameters and the limited exemplar text. In this paper, we propose a systematic way to estimate the capacity of a truncation sampling method by considering the trade-off between diversity and risk at each decoding step, based on our collected prefix tree which preserves the context of a full sentence. Our work offers a comprehensive comparison of existing truncation sampling methods and serves as a practical user guideline for their parameter selection.
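The abstract describes estimating, at each decoding step, a trade-off between diversity and risk against a prefix tree built from full sentences. A hypothetical toy version of that idea (the tree layout, and the use of truncated-mass-outside-observed-continuations as the risk proxy, are illustrative assumptions, not the paper's exact definitions):

```python
from collections import defaultdict

def build_prefix_tree(sentences):
    """Map each prefix (tuple of tokens) to the set of next tokens observed
    after it in the corpus. A stand-in for the paper's full-sentence
    prefix tree; the exact structure is not specified here."""
    tree = defaultdict(set)
    for tokens in sentences:
        for i in range(len(tokens)):
            tree[tuple(tokens[:i])].add(tokens[i])
    return tree

def diversity_and_risk(kept_probs, prefix, tree):
    """Per-step trade-off for a truncated distribution `kept_probs`
    (token -> renormalized probability after truncation).

    diversity: number of distinct tokens the truncation keeps.
    risk: kept probability mass on tokens never observed after `prefix`
          in the reference tree (a proxy for implausible continuations).
    """
    allowed = tree.get(tuple(prefix), set())
    diversity = len(kept_probs)
    risk = sum(p for tok, p in kept_probs.items() if tok not in allowed)
    return diversity, risk
```

Under this toy framing, an aggressive truncation lowers risk by discarding tokens unseen after the prefix but shrinks diversity, while a loose truncation does the opposite; sweeping the parameters traces out the capacity curve the paper uses to compare methods.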
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Safety
Diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Tail Trimming
Sampling Decoding Optimization
Text Generation Safety
Yuxuan Zhou
CISPA Helmholtz Center for Information Security, Germany
M. Keuper
University of Mannheim, Germany
Mario Fritz
Faculty, CISPA Helmholtz Center for Information Security; Professor, Saarland University
Computer Vision · Machine Learning · Trustworthy AI · Security · Privacy