LLM Output Homogenization is Task Dependent

📅 2025-09-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Whether output homogenization in large language models (LLMs) is a problem depends on the task, yet existing work lacks task-adapted definitions of diversity and matching evaluation frameworks. Method: We propose a task-dependent view of diversity, establish a taxonomy of eight task categories (including mathematical reasoning and creative writing), and introduce task-anchored functional diversity, a metric designed to better evaluate homogenization by avoiding task mismatch. We further introduce task-anchored sampling, which increases diversity where the task calls for it while preserving homogenization where it is desired, thereby challenging the assumed diversity-quality trade-off. Contribution/Results: Our approach increases functional diversity on tasks where homogenization is undesired, such as creative writing, while keeping answers consistent on objective tasks and maintaining response quality, so that output homogenization can be evaluated and mitigated per task category.

📝 Abstract

A large language model can be less helpful if it exhibits output response homogenization. But whether two responses are considered homogeneous, and whether such homogenization is problematic, both depend on the task category. For instance, in objective math tasks, we often expect no variation in the final answer but anticipate variation in the problem-solving strategy. For creative writing tasks, by contrast, we may expect variation in key narrative components (e.g., plot, genre, setting), beyond the vocabulary or embedding diversity produced by temperature sampling. Previous work addressing output homogenization often fails to conceptualize diversity in a task-dependent way. We address this gap in the literature directly by making the following contributions. (1) We present a task taxonomy comprising eight task categories that each have distinct conceptualizations of output homogenization. (2) We introduce task-anchored functional diversity to better evaluate output homogenization. (3) We propose a task-anchored sampling technique that increases functional diversity for task categories where homogenization is undesired, while preserving homogenization where it is desired. (4) We challenge the perceived existence of a diversity-quality trade-off by increasing functional diversity while maintaining response quality. Overall, we demonstrate how task dependence improves the evaluation and mitigation of output homogenization.
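The abstract does not spell out how task-anchored functional diversity is computed, so the following is only an illustrative sketch of the idea that "same or different" is decided by a task-specific criterion. It assumes diversity is the fraction of response pairs that a task-specific predicate judges functionally different; the function `functional_diversity` and the two predicates `same_final_answer` and `same_genre` are hypothetical names, not the paper's implementation.

```python
from itertools import combinations
from typing import Callable, Sequence


def functional_diversity(
    responses: Sequence[str], same: Callable[[str, str], bool]
) -> float:
    """Fraction of response pairs that the task-specific predicate judges different.

    Assumption: this pairwise definition is a stand-in for the paper's metric.
    """
    pairs = list(combinations(responses, 2))
    if not pairs:
        return 0.0
    return sum(not same(a, b) for a, b in pairs) / len(pairs)


def same_final_answer(a: str, b: str) -> bool:
    # Hypothetical predicate for an objective math task:
    # only the text after the last "=" (the final answer) is compared.
    return a.rsplit("=", 1)[-1].strip() == b.rsplit("=", 1)[-1].strip()


def same_genre(a: str, b: str) -> bool:
    # Hypothetical predicate for creative writing:
    # only a leading "Genre:" tag is compared, case-insensitively.
    return a.split(":", 1)[0].strip().lower() == b.split(":", 1)[0].strip().lower()


math_responses = ["2 + 2 = 4", "1 + 3 = 4", "4 * 1 = 4"]
story_responses = ["Noir: rain on the docks", "Fantasy: a dragon wakes", "noir: a cold case"]

math_score = functional_diversity(math_responses, same_final_answer)
story_score = functional_diversity(story_responses, same_genre)
```

Under these assumptions the three math responses differ in wording but count as fully homogeneous in their final answer, while the stories are scored on genre rather than vocabulary. A real predicate would more plausibly be an LLM judge or a parser than a string split.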

Problem

Research questions and friction points this paper is trying to address.

Evaluating LLM output homogenization in a way that accounts for differences across task categories

Proposing task-anchored diversity metrics to assess homogenization appropriately

Developing sampling methods to control homogenization based on task needs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Task taxonomy categorizing output homogenization conceptualizations

Task-anchored functional diversity for better homogenization evaluation

Task-anchored sampling technique that increases functional diversity only where homogenization is undesired
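One way to picture task-anchored sampling is as a prompt that tells the model what should vary and what should stay fixed for the given task category. The sketch below assumes exactly that; the mapping `DIVERSITY_TARGETS`, the category keys, and the function `build_sampling_prompt` are hypothetical illustrations based on the two examples in the abstract, not the paper's actual procedure.

```python
from typing import Optional, TypedDict


class Target(TypedDict):
    vary: str
    keep: Optional[str]


# Hypothetical mapping from task category to what should vary and what should not.
# Only the two categories mentioned in the abstract are filled in.
DIVERSITY_TARGETS: dict[str, Target] = {
    "objective_math": {
        "vary": "the problem-solving strategy",
        "keep": "the final answer",
    },
    "creative_writing": {
        "vary": "key narrative components such as plot, genre, and setting",
        "keep": None,
    },
}


def build_sampling_prompt(task_category: str, user_prompt: str, n: int) -> str:
    """Build a prompt asking for n responses that differ only where the task wants it."""
    target = DIVERSITY_TARGETS.get(task_category)
    if target is None:
        # Unknown category: fall back to a plain request for n responses.
        return f"{user_prompt}\n\nGive {n} responses."
    lines = [user_prompt, "", f"Give {n} responses that differ in {target['vary']}."]
    if target["keep"]:
        lines.append(f"Keep {target['keep']} the same across all responses.")
    return "\n".join(lines)


math_prompt = build_sampling_prompt("objective_math", "Solve 12 * 13.", 3)
story_prompt = build_sampling_prompt("creative_writing", "Write a short story.", 3)
```

The resulting strings would be sent to whichever model is being sampled. Keeping the "vary" and "keep" targets in a table makes it explicit that homogenization is preserved for some categories and discouraged for others.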

🔎 Similar Papers

LLMs Are Biased Towards Output Formats! Systematically Evaluating and Mitigating Output Format Bias of LLMs