StyleBench: Evaluating Speech Language Models on Conversational Speaking Style Control

📅 2026-03-08
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses the lack of systematic evaluation of speaking-style intensity control in speech language models (SLMs) within multi-turn dialogues. To bridge this gap, we introduce StyleBench, the first benchmark specifically designed for evaluating style control in conversational speech. StyleBench comprises a multi-turn dialogue dataset annotated along four stylistic dimensions (emotion, speech rate, volume, and pitch) and uses user prompts to request fine-grained changes in style intensity. Through comprehensive stylistic annotations and automated evaluation metrics, StyleBench establishes a standardized framework that reveals performance gaps between leading SLMs and omni language models (OLMs) in controllable style generation. The benchmark provides both a diagnostic tool and a foundation to guide future research in controllable and expressive spoken dialogue systems.

📝 Abstract
Speech language models (SLMs) have significantly extended the interactive capability of text-based Large Language Models (LLMs) by incorporating paralinguistic information. To offer a more realistic interactive experience with customized styles, current SLMs can interpret and control speaking style intensity from user prompts during a dialogue. However, there remains a lack of systematic benchmarks that quantify and evaluate this style intensity control ability in conversations. In this paper, we propose StyleBench, a multi-turn dialogue benchmark for comprehensively evaluating style intensity control across four dimensions: emotion, speed, volume, and pitch. Our results reveal performance gaps between leading SLMs and omni language models (OLMs), and suggest underlying reasons and promising approaches for future exploration.
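To make the idea of multi-turn style intensity control more concrete, the following is a minimal illustrative sketch, not StyleBench's actual data format or metric. It assumes each dialogue turn carries a user prompt that requests a change along one of the four dimensions, and that some external estimator (not shown here) has already produced a scalar intensity score for the model's spoken reply. The names `Turn`, `StyleDimension`, `requested_change`, `measured_intensity`, and `direction_accuracy` are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class StyleDimension(Enum):
    # The four dimensions named in the abstract.
    EMOTION = "emotion"
    SPEED = "speed"
    VOLUME = "volume"
    PITCH = "pitch"


@dataclass
class Turn:
    # Hypothetical record layout; the real benchmark schema may differ.
    user_prompt: str
    dimension: StyleDimension
    requested_change: int      # +1 = stronger, -1 = weaker, 0 = keep as is
    measured_intensity: float  # assumed output of an external estimator


def _sign(x: float, tol: float = 1e-6) -> int:
    """Return +1, -1, or 0, treating tiny differences as no change."""
    if x > tol:
        return 1
    if x < -tol:
        return -1
    return 0


def direction_accuracy(turns: List[Turn]) -> float:
    """Fraction of turn-to-turn transitions whose measured intensity
    moved in the direction the user asked for (an assumed toy metric)."""
    if len(turns) < 2:
        raise ValueError("need at least two turns to compare intensities")
    matches = 0
    for prev, cur in zip(turns, turns[1:]):
        observed = _sign(cur.measured_intensity - prev.measured_intensity)
        if observed == cur.requested_change:
            matches += 1
    return matches / (len(turns) - 1)


dialogue = [
    Turn("Please talk to me.", StyleDimension.SPEED, 0, 0.2),
    Turn("Speak a bit faster.", StyleDimension.SPEED, 1, 0.5),
    Turn("Even faster, please.", StyleDimension.SPEED, 1, 0.4),
]
print(direction_accuracy(dialogue))  # 0.5
```

In this toy dialogue the model speeds up after the first request but slows down after the second, so only one of the two transitions follows the prompt. A real evaluation would also need to account for how much the intensity changes, not just its direction.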
Problem

Research questions and friction points this paper is trying to address.

speech language models
speaking style control
conversational benchmark
style intensity
paralinguistic information
Innovation

Methods, ideas, or system contributions that make the work stand out.

StyleBench
speech language models
style intensity control
multi-turn dialogue
paralinguistic evaluation