🤖 AI Summary
This work addresses the difficulty of balancing stylistic diversity against music-theoretic feasibility in chord generation, which existing end-to-end approaches handle poorly because they tightly couple candidate generation with constraint satisfaction. The authors propose a Retrieval-Edit-Rerank (RER) framework that decouples the task into three interpretable, controllable stages: retrieval gathers diverse candidates to capture stylistic variation; rule-based or learned editing ensures harmonic validity; and a learned reranker integrates soft user preferences to select the final outputs. Although end-to-end trainable, this functionally modular pipeline improves controllability and transparency in navigating the trade-off between creativity and constraint adherence. Experiments show that it consistently outperforms current end-to-end baselines in both objective metrics and subjective evaluations, and ablation studies confirm the complementary roles of each stage in creative exploration and theoretical compliance.
📝 Abstract
Chord generation is an inherently constrained creative task that requires balancing stylistic diversity with music-theoretic feasibility. Existing approaches typically entangle candidate generation and constraint enforcement within a single model, making the diversity-feasibility trade-off difficult to control and interpret. In this work, we approach chord generation from a system-level perspective, introducing a Retrieval-Edit-Rerank (RER) framework that decomposes the task into three explicit stages: i) retrieval, which defines a stylistically plausible candidate space; ii) editing, which enforces music-theoretic feasibility through minimal modifications; and iii) reranking, which resolves soft preferences among feasible candidates. This separation provides a controllable pipeline, where each component addresses a distinct aspect of the generation process, thereby enhancing both the interpretability and adjustability of the output chords. Through objective metrics and subjective evaluation, our decomposed system outperforms all end-to-end chord generation baselines in balancing chord diversity and music-theoretic feasibility. Ablation studies further confirm the complementary roles of each stage in creative exploration and constraint satisfaction.
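To make the three-stage decomposition concrete, the following is a minimal Python sketch of a retrieve, edit, rerank pipeline. It is not the authors' implementation: the toy corpus, the C-major diatonic rule, the substitution table, the overlap-based retrieval score, and the "preferred ending chord" preference are all hypothetical stand-ins chosen only to show how each stage handles a separate concern.

```python
# Illustrative sketch only; every name and rule below is an assumption,
# not something taken from the paper.

# Hypothetical candidate corpus of four-chord progressions.
CORPUS = [
    ["C", "Am", "F", "G"],
    ["C", "F#", "F", "G"],
    ["Dm", "G", "C", "C"],
    ["C", "Eb", "Ab", "G"],
]

# Assumed feasibility rule: every chord must be diatonic to C major.
DIATONIC_C_MAJOR = {"C", "Dm", "Em", "F", "G", "Am", "Bdim"}

# Assumed substitutions used to repair out-of-key chords.
SUBSTITUTE = {"F#": "F", "Eb": "Em", "Ab": "Am"}


def retrieve(query, corpus, k):
    """Stage 1: pick the k progressions sharing the most chords with the query."""
    def overlap(progression):
        return len(set(progression) & set(query))
    return sorted(corpus, key=overlap, reverse=True)[:k]


def edit(progression):
    """Stage 2: change only the chords that violate the rule, leave the rest."""
    return [
        chord if chord in DIATONIC_C_MAJOR else SUBSTITUTE.get(chord, "C")
        for chord in progression
    ]


def rerank(candidates, preferred_ending):
    """Stage 3: order feasible candidates by a soft preference.

    Progressions ending on the preferred chord come first; ties are broken
    by the number of distinct chords (more variety ranks higher).
    """
    def score(progression):
        return (progression[-1] == preferred_ending, len(set(progression)))
    return sorted(candidates, key=score, reverse=True)


def generate(query, k=3, preferred_ending="G"):
    candidates = retrieve(query, CORPUS, k)       # stylistic candidate space
    feasible = [edit(p) for p in candidates]      # hard constraints enforced
    return rerank(feasible, preferred_ending)     # soft preferences resolved


if __name__ == "__main__":
    for progression in generate(["C", "F", "G"]):
        print(" ".join(progression))
```

Because each stage is a separate function, one can widen the candidate space (raise `k`), tighten or relax the feasibility rule, or swap the preference used for ranking without touching the other two stages, which is the kind of controllability the abstract attributes to the decomposition.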