The ICASSP 2026 Automatic Song Aesthetics Evaluation Challenge

📅 2026-01-12
🤖 AI Summary
This paper summarizes the ICASSP 2026 Automatic Song Aesthetics Evaluation (ASAE) Challenge, in which participants build systems that predict human aesthetic ratings of AI-generated songs. The challenge has two tracks: Track 1 covers the overall musicality score, and Track 2 covers five fine-grained aesthetic scores. Top-performing submissions significantly outperformed the official baseline, bringing objective metrics closer to human aesthetic preferences. The results establish a standardized benchmark for human-aligned evaluation of music generation systems.

📝 Abstract
This paper summarizes the ICASSP 2026 Automatic Song Aesthetics Evaluation (ASAE) Challenge, which focuses on predicting the subjective aesthetic scores of AI-generated songs. The challenge consists of two tracks: Track 1 targets the prediction of the overall musicality score, while Track 2 focuses on predicting five fine-grained aesthetic scores. The challenge attracted strong interest from the research community and received numerous submissions from both academia and industry. Top-performing systems significantly surpassed the official baseline, demonstrating substantial progress in aligning objective metrics with human aesthetic preferences. The outcomes establish a standardized benchmark and advance human-aligned evaluation methodologies for modern music generation systems.
Problem

Research questions and friction points this paper is trying to address.

Automatic Song Aesthetics Evaluation
AI-generated songs
subjective aesthetic scores
musicality prediction
human-aligned evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automatic Song Aesthetics Evaluation
music generation
aesthetic scoring
human-aligned evaluation
benchmark
Guobin Ma
Northwestern Polytechnical University
Yuxuan Xia
Researcher at Shanghai Jiao Tong University
Sensor fusion, Multiple object tracking, SLAM
Jixun Yao
Audio, Speech and Language Processing Group (ASLP@NPU), School of Computer Science, Northwestern Polytechnical University, Xi’an, China
Huixin Xue
Shanghai Conservatory of Music, Shanghai, China
Hexin Liu
Nanyang Technological University
Speech recognition, language identification
Shuai Wang
Nanjing University
AI
Hao Liu
Shanghai Conservatory of Music, Shanghai, China
Lei Xie
Northwestern Polytechnical University
speech processing, speech recognition, speech synthesis, multimedia, artificial intelligence