
Joan Serrà

Google Scholar ID: sZLj96sAAAAJ

Sony AI

Representation Learning, Generative Models, Machine Listening, Music Information Retrieval


Citations & Impact

Citations (all-time): 6,277

Co-authors: 121

Contact

Twitter, GitHub, LinkedIn

Publications

8 items

- Training data attribution in diffusion models via mirrored unlearning and noise-consistent skew (2026)
- Woosh: A Sound Effects Foundation Model (2026)
- Automatic Music Mixing using a Generative Model of Effect Embeddings (2025)
- Automatic Music Sample Identification with Multi-Track Contrastive Learning (2025)
- Leveraging Whisper Embeddings for Audio-based Lyrics Matching (2025)
- Attribution-by-design: Ensuring Inference-Time Provenance in Generative Music Systems (2025)
- Enhancing Neural Audio Fingerprint Robustness to Audio Degradation for Music Identification (2025)
- Large-Scale Training Data Attribution for Music Generative Models via Unlearning (2025)

Resume (English only)

Academic Achievements

Involved in several research projects, co-inventor of over 20 patents, and co-author of over 150 publications, many of them highly cited and/or published in top-tier venues. Recent publications include:
- 'Towards blind data cleaning: a case study in music source separation' (preprint 2025)
- 'Automatic music sample identification with multi-track contrastive learning' (preprint 2025)
- 'Leveraging Whisper embeddings for audio-based lyrics matching' (preprint 2025)
- 'Attribution-by-design: ensuring inference-time provenance in generative music systems' (preprint 2025)
- 'System and method for attributing an output of a generative artificial intelligence (AI) system' (patent 2025)
- 'Enhancing neural audio fingerprint robustness to audio degradation for music identification' (ISMIR 2025)
- 'A comprehensive real-world assessment of audio watermarking algorithms: will they survive neural codecs?' (INTERSPEECH 2025)
- 'Large-scale training data attribution for music generative models via unlearning' (NeurIPS 2025)
- 'Supervised contrastive learning from weakly-labeled audio segments for musical version matching' (ICML 2025)

Research Experience

Machine learning researcher at Telefónica R&D (2015-2019); AI researcher and manager at Dolby Laboratories (2019-2024).

Education

MSc and PhD (2006-2011) in machine learning for audio at the Music Technology Group of Universitat Pompeu Fabra; postdoc (2011-2015) in artificial intelligence at IIIA-CSIC.

Background

Research interests: machine learning, with a focus on audio and multimedia analysis, synthesis, and retrieval. Brief bio: staff research scientist and team lead at Sony AI since 2024.

Miscellany

Occasionally acts as a reviewer or area chair for some venues (provided the articles are free to access and free of charge), and gives talks and lectures on subjects of interest, lately mainly related to representation learning and generative modeling.

Co-authors

121 total