🤖 AI Summary
Problem: The spatial audio field lacks systematic surveys, with fragmented research methodologies, divergent technical approaches, and inconsistent evaluation protocols.
Method: This paper introduces the first unified classification framework grounded in input–output representations, structured along two orthogonal dimensions—temporal evolution and task taxonomy. It systematically surveys representation paradigms for both spatial audio generation and understanding, clarifies technical lineages across AR/VR applications, and consolidates mainstream datasets, objective/subjective evaluation metrics, and benchmark suites. Through cross-modal analysis integrating generative modeling, signal processing, and multimodal perception, it proposes a standardized research paradigm.
Contribution/Results: We release an open-source platform encompassing curated datasets, reproducible benchmarks, and integrated toolchains—significantly enhancing standardization, comparability, and reproducibility in spatial audio training, evaluation, and experimentation.
📝 Abstract
With the rapid development of spatial audio technologies, applications in AR, VR, and other scenarios have garnered extensive attention. Unlike traditional mono sound, spatial audio offers a more realistic and immersive auditory experience. Despite notable progress in the field, there remains a lack of comprehensive surveys that systematically organize and analyze these methods and their underlying technologies. To address this gap, we provide a comprehensive overview of spatial audio and systematically review recent literature in the area. We chronologically outline existing work related to spatial audio and categorize these studies based on input-output representations, as well as generation and understanding tasks, thereby summarizing various research aspects of spatial audio. In addition, we review related datasets, evaluation metrics, and benchmarks, offering insights from both training and evaluation perspectives. Related materials are available at https://github.com/dieKarotte/ASAudio.