MuChator: Enabling Active Music Discovery via Conversational Music LLMs in Douyin Music

📅 2026-05-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the difficulty users have expressing situational, ambiguous music preferences in natural language on Douyin Music. To overcome the limitations of conventional passive recommendation paradigms, the authors propose MuChator, a framework that improves large language models' ability to understand and respond to contextualized music requests. MuChator integrates three key components: music knowledge-infused pretraining, context-aware instruction fine-tuning on synthetically generated user-query-music triplets, and a hybrid reward model that jointly optimizes intent relevance, personalized preferences, and basic constraints. Experimental results show that MuChator outperforms leading proprietary models such as Gemini-3-Pro on industrial-scale datasets. After deployment in the Douyin Music app, online A/B testing showed a 46.49% increase in user active days.

📝 Abstract

Douyin Music, a large-scale platform with millions of daily users, adopts an immersive, feed-based discovery paradigm in which users passively explore music through continuous recommendations. While effective for passive music discovery, this paradigm restricts users to recommendation results and provides limited support for explicitly specifying listening intents. Unlike conventional search, where users express well-defined intents through explicit queries such as specific songs or artists, real-world active music discovery is often situational and colloquial, involving vague or underspecified requests. Although LLMs enable natural language interaction, their direct use in music discovery remains limited by insufficient music-domain knowledge, a lack of music-query collaborative reasoning, and a shallow understanding of personalized preferences. To address these challenges, we introduce MuChator, an interactive MusicLLM-based framework that enables users to actively express situational music intents in natural language. MuChator incorporates three key components: (1) Music Knowledge Pre-training, a three-stage scheme that incrementally injects objective music knowledge, subjective music knowledge, and personalized music preferences into LLMs; (2) Context-aware Instruction Tuning, which constructs high-quality user-query-music triplets through an automated synthesis pipeline to align LLMs with active and situational user intents; and (3) Preference Alignment with Hybrid RM, which jointly models intent relevance, personalized preferences, and basic constraints, and is optimized using GRPO-based reinforcement learning. Extensive evaluations on industrial music recommendation datasets demonstrate that MuChator outperforms leading proprietary models such as Gemini-3-Pro. The model has been deployed in the Douyin Music app within ByteDance, where it achieved a 46.49% improvement in user active days in an online A/B test.
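
The abstract names three reward signals (intent relevance, personalized preferences, basic constraints) and GRPO-based optimization, but it does not say how the signals are combined. The sketch below is only an illustration under stated assumptions: the `RewardSignals` fields, the equal weights, and the use of the constraint check as a hard gate are all hypothetical and not taken from the paper. The only part that follows the general GRPO recipe is the last function, which standardizes rewards within a group of responses sampled for the same query.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class RewardSignals:
    """Hypothetical per-response scores; field names are assumptions."""
    intent_relevance: float  # assumed to lie in [0, 1]
    personalization: float   # assumed to lie in [0, 1]
    constraints_ok: bool     # assumed pass/fail check (e.g. valid output format)


def hybrid_reward(s: RewardSignals, w_intent: float = 0.5,
                  w_personal: float = 0.5) -> float:
    """Combine the three signals into one scalar (assumed weighting)."""
    # Assumption: a response that breaks a basic constraint earns no reward.
    if not s.constraints_ok:
        return 0.0
    return w_intent * s.intent_relevance + w_personal * s.personalization


def group_relative_advantages(rewards: list[float],
                              eps: float = 1e-8) -> list[float]:
    """GRPO-style advantages: standardize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; a sample std is another option
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three hypothetical responses sampled for the same user query.
group = [
    RewardSignals(0.9, 0.8, True),
    RewardSignals(0.4, 0.6, True),
    RewardSignals(0.9, 0.9, False),  # fails the constraint gate
]
rewards = [hybrid_reward(s) for s in group]
advantages = group_relative_advantages(rewards)
```

In this toy group, the response that fails the constraint check receives the lowest advantage even though its other scores are high, which is the behavior a hard gate is meant to produce. A soft penalty would be an equally plausible design; the source does not indicate which is used.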

Problem

Research questions and friction points this paper is trying to address.

active music discovery

conversational music LLMs

situational music intents

personalized music preferences

natural language interaction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Conversational Music LLM

Active Music Discovery

Music Knowledge Pre-training