🤖 AI Summary
This work addresses the problem of generating multiple appropriate, diverse, realistic, and synchronised listener facial reactions in response to a speaker's audio-visual behaviour in spontaneous two-party conversations. Building on the REACT 2023 and REACT 2024 challenges, the REACT 2025 challenge provides participants with MARS, the first natural, large-scale multimodal MAFRG dataset (137 human-human dyadic interactions, 2,856 interaction sessions across five topics), and defines two complementary sub-challenges: offline generation, where the model observes the full speaker clip, and online streaming generation, where listener reactions must be produced as speaker behaviour arrives. Key contributions include: (1) public release of the MARS dataset together with a unified evaluation benchmark; (2) open-sourced baseline implementations; and (3) baseline performance results on both sub-challenges. The challenge aims to advance listener reaction generation toward greater realism and practical deployability.
📝 Abstract
In dyadic interactions, a broad spectrum of human facial reactions may be appropriate in response to a given speaker behaviour. Following the successful organisation of the REACT 2023 and REACT 2024 challenges, we propose the REACT 2025 challenge, which encourages the development and benchmarking of Machine Learning (ML) models that generate multiple appropriate, diverse, realistic and synchronised human-style facial reactions expressed by human listeners in response to an input stimulus (i.e., the audio-visual behaviours expressed by their corresponding speakers). As a key component of the challenge, we provide participants with the first natural, large-scale multi-modal Multiple Appropriate Facial Reaction Generation (MAFRG) dataset, called MARS, which records 137 human-human dyadic interactions comprising a total of 2,856 interaction sessions across five different topics. This paper also presents the challenge guidelines and the performance of our baselines on the two proposed sub-challenges: Offline MAFRG and Online MAFRG. The challenge baseline code is publicly available at https://github.com/reactmultimodalchallenge/baseline_react2025
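The two sub-challenges differ mainly in what speaker context the generator may condition on: the offline task sees the full speaker clip before emitting listener reactions, while the online task must emit each listener frame causally from past speaker frames only. A minimal sketch of that interface contrast, assuming frame-level feature arrays; the function names and the toy noise-based "generator" are illustrative stand-ins, not the baseline repository's actual API:

```python
import numpy as np

def generate_offline(speaker_av, n_samples=3, rng=None):
    """Hypothetical offline MAFRG interface: consume the whole speaker
    clip, then sample several appropriate listener reaction sequences."""
    rng = rng or np.random.default_rng(0)
    T, D = speaker_av.shape
    # Toy stand-in for a learned generative model: perturb a summary of
    # the full (non-causal) speaker context to get diverse samples.
    ctx = speaker_av.mean(axis=0)
    return [ctx + 0.1 * rng.standard_normal((T, D)) for _ in range(n_samples)]

def generate_online(speaker_frames, rng=None):
    """Hypothetical online MAFRG interface: stream speaker frames and
    yield one listener reaction frame per input frame, using only the
    frames seen so far (causal context)."""
    rng = rng or np.random.default_rng(0)
    history = []
    for frame in speaker_frames:
        history.append(frame)
        ctx = np.mean(history, axis=0)  # summary of past frames only
        yield ctx + 0.1 * rng.standard_normal(frame.shape)
```

The offline variant can sample several sequences from the same clip (reflecting the "multiple appropriate reactions" framing), while the online variant is a generator whose output length matches the input stream.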