TalkPlay-Tools: Conversational Music Recommendation with LLM Tool Calling

📅 2025-10-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing LLM-based music recommendation systems predominantly rely on natural language interaction, neglecting essential capabilities such as metadata filtering and Boolean retrieval, which leads to inflexible, monolithic recommendation behavior. To address this, the authors propose a conversational music recommendation framework built on multimodal tool calling. It unifies heterogeneous retrieval operations—SQL-based Boolean filtering, BM25 sparse retrieval, embedding-based dense retrieval, and generative semantic-ID matching—behind a single LLM-driven planning interface. The LLM orchestrates which tools to invoke, in what order, and with what arguments, enabling dynamic path selection and coordinated multi-module execution, and integrating diverse database query paradigms into an end-to-end conversational recommendation pipeline. Evaluation shows competitive performance across diverse recommendation scenarios, suggesting a paradigm for controllable, interpretable, LLM-powered music recommendation.

📝 Abstract
While the recent developments in large language models (LLMs) have successfully enabled generative recommenders with natural language interactions, their recommendation behavior is limited, leaving other simpler yet crucial components such as metadata or attribute filtering underutilized in the system. We propose an LLM-based music recommendation system with tool calling to serve as a unified retrieval-reranking pipeline. Our system positions an LLM as an end-to-end recommendation system that interprets user intent, plans tool invocations, and orchestrates specialized components: boolean filters (SQL), sparse retrieval (BM25), dense retrieval (embedding similarity), and generative retrieval (semantic IDs). Through tool planning, the system predicts which types of tools to use, their execution order, and the arguments needed to find music matching user preferences, supporting diverse modalities while seamlessly integrating multiple database filtering methods. We demonstrate that this unified tool-calling framework achieves competitive performance across diverse recommendation scenarios by selectively employing appropriate retrieval methods based on user queries, envisioning a new paradigm for conversational music recommendation systems.
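The four retrieval paradigms the abstract names (SQL filters, BM25, embedding similarity, semantic IDs) map naturally onto function-calling tool definitions that a planner LLM can choose among. A minimal sketch in the OpenAI-style function-calling schema; all tool names and parameters here are illustrative assumptions, not the paper's actual interface:

```python
# Hypothetical tool schemas for the four retrieval paradigms described in the
# abstract. Names, descriptions, and parameters are illustrative only.

def tool(name, description, props, required):
    """Build one function-calling tool schema (OpenAI-style JSON)."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": props,
                "required": required,
            },
        },
    }

TOOLS = [
    tool("sql_filter",
         "Boolean metadata filtering over the track database.",
         {"where": {"type": "string",
                    "description": "SQL WHERE clause, e.g. genre='jazz' AND year>=2010"}},
         ["where"]),
    tool("bm25_search",
         "Sparse keyword retrieval over titles, tags, and descriptions.",
         {"query": {"type": "string"}},
         ["query"]),
    tool("dense_search",
         "Embedding-similarity retrieval from a free-form preference query.",
         {"query": {"type": "string"},
          "top_k": {"type": "integer", "description": "number of candidates"}},
         ["query"]),
    tool("semantic_id_match",
         "Generative retrieval: match tracks by semantic-ID token sequence.",
         {"prefix": {"type": "string",
                     "description": "partial semantic-ID token sequence"}},
         ["prefix"]),
]

print([t["function"]["name"] for t in TOOLS])
```

Passing a list like `TOOLS` to a chat-completion call is the standard way to let the model emit the tool name and arguments itself, which is the "tool planning" step the abstract describes.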
Problem

Research questions and friction points this paper is trying to address.

Unifying metadata filtering and retrieval methods in music recommendation
Enabling conversational music recommendation through LLM tool calling
Orchestrating specialized retrieval components based on user intent
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based system interprets user intent and plans tools
Unified pipeline integrates SQL, BM25, embedding, and semantic IDs
Tool calling framework selectively employs multiple retrieval methods
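The orchestration idea in the bullets above can be sketched as a tiny executor that runs an LLM-emitted plan step by step, with each tool narrowing the candidate pool before reranking. Everything here (the toy catalog, tool names, and the plan itself) is a hypothetical illustration; crude term overlap stands in for BM25:

```python
# Toy catalog standing in for the music database.
CATALOG = [
    {"id": "t1", "title": "Blue in Green", "genre": "jazz", "year": 1959},
    {"id": "t2", "title": "So What", "genre": "jazz", "year": 1959},
    {"id": "t3", "title": "Creep", "genre": "rock", "year": 1992},
]

def sql_filter(cands, genre=None, min_year=None):
    """Boolean metadata filtering (stand-in for a SQL WHERE clause)."""
    out = []
    for t in cands:
        if genre is not None and t["genre"] != genre:
            continue
        if min_year is not None and t["year"] < min_year:
            continue
        out.append(t)
    return out

def keyword_search(cands, query):
    """Crude stand-in for BM25: rank by term overlap with the title."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(t["title"].lower().split())), t) for t in cands]
    return [t for s, t in sorted(scored, key=lambda x: -x[0]) if s > 0]

TOOL_REGISTRY = {"sql_filter": sql_filter, "keyword_search": keyword_search}

def execute_plan(plan):
    """Run a planned sequence of (tool_name, args) calls; each step narrows
    the candidate pool, mirroring a retrieval-then-rerank pipeline."""
    cands = CATALOG
    for name, args in plan:
        cands = TOOL_REGISTRY[name](cands, **args)
    return cands

# In the full system, the LLM would emit this plan from the conversation.
plan = [("sql_filter", {"genre": "jazz"}),
        ("keyword_search", {"query": "so what"})]
print([t["id"] for t in execute_plan(plan)])  # → ['t2']
```

The key design point is that the same executor handles any ordering or subset of tools, so the planner can pick a pure metadata query, a pure similarity query, or a chained combination depending on user intent.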