🤖 AI Summary
Existing attribution methods for music generation reduce influence to a single scalar score, which cannot reveal how training data influences specific musical dimensions. To address this limitation, this work proposes ARIA, a framework for multidimensional attribution tailored to copyright analysis. ARIA decomposes influence across five symbolic-music dimensions and three audio dimensions, and adds reliability diagnostics computed from the segment-level score matrix. These diagnostics use the matrix's singular value decomposition, its column statistics, and a comparison of within-group similarity among the top-K attributed tracks against random reference groups drawn from the training pool. On a symbolic-music model, the diagnostics rank four attribution methods in the same order as ground truth obtained through counterfactual retraining. On an audio model, ARIA flags score matrices whose retrieved tracks are nearly identical across queries and characterizes embedding-similarity retrieval baselines by the musical dimension each encoder surfaces.
📝 Abstract
Training data attribution (TDA) for music generation must answer two questions posed by copyright analysis: which training songs influence a generated output, and along which musical aspects that influence operates. Existing methods reduce influence to a single scalar, without revealing which musical aspects dominate it. We propose ARIA, a framework that decomposes attribution along musical aspects (five for symbolic music, three for audio) and pairs the decomposition with reliability diagnostics computed from the segment-level score matrix. ARIA measures within-group similarity among the top-K attributed tracks against random reference groups drawn from the training pool, and diagnoses the score matrix through its singular value decomposition and column statistics. On a symbolic-music model where attribution ground truth is available through counterfactual retraining, the reliability diagnostics rank four attribution methods identically to that ground truth. On an audio music generation model, ARIA reveals attribution behaviors that vary substantially across TDA methods, flags score matrices whose retrieved tracks are nearly identical across queries rather than reflecting per-query attribution, and characterizes embedding-similarity retrieval baselines by the musical aspect each encoder surfaces. Overall, ARIA produces per-aspect attribution evidence aligned with the musical aspects considered under the idea-expression distinction in copyright analysis.
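The diagnostics described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the choice of cosine similarity, the leading-energy ratio, the top-K hit rate, and the Jaccard overlap are all assumptions chosen for illustration, since the abstract does not specify the exact statistics. The sketch assumes a score matrix with one row per query segment and one column per training track, plus a per-aspect feature matrix with one non-zero row per training track.

```python
import numpy as np

def top_k_indices(scores, k):
    """Indices of the k highest-scoring training tracks for each query row."""
    return np.argsort(-scores, axis=1)[:, :k]

def leading_energy_share(scores):
    """Share of squared singular-value energy held by the first component.
    A value near 1 suggests every query sees almost the same score profile."""
    s = np.linalg.svd(scores, compute_uv=False)
    energy = s ** 2
    return float(energy[0] / energy.sum())

def column_hit_rate(scores, k):
    """Column statistic (illustrative choice): fraction of queries whose
    top-k list contains each training track."""
    top = top_k_indices(scores, k)
    counts = np.bincount(top.ravel(), minlength=scores.shape[1])
    return counts / scores.shape[0]

def mean_topk_jaccard(scores, k):
    """Mean Jaccard overlap between the top-k sets of every pair of queries."""
    sets = [set(row.tolist()) for row in top_k_indices(scores, k)]
    overlaps = [
        len(a & b) / len(a | b)
        for i, a in enumerate(sets)
        for b in sets[i + 1:]
    ]
    return float(np.mean(overlaps))

def mean_pairwise_cosine(features):
    """Mean cosine similarity over all distinct pairs of rows (rows non-zero)."""
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = unit @ unit.T
    n = unit.shape[0]
    return float((sim.sum() - np.trace(sim)) / (n * (n - 1)))

def within_group_gap(features, group, n_random=200, seed=0):
    """Within-group similarity of `group` minus the mean similarity of
    random reference groups of the same size drawn from the training pool."""
    rng = np.random.default_rng(seed)
    observed = mean_pairwise_cosine(features[group])
    reference = [
        mean_pairwise_cosine(
            features[rng.choice(len(features), size=len(group), replace=False)]
        )
        for _ in range(n_random)
    ]
    return observed - float(np.mean(reference))
```

Under these assumptions, a leading-energy share near 1 together with a high mean Jaccard overlap would point to the failure the abstract describes, where retrieved tracks are nearly identical across queries. A clearly positive `within_group_gap` on one aspect's features would suggest that the top-K tracks are more alike on that aspect than random groups from the pool.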