🤖 AI Summary
Existing affective motion generation methods are constrained by fixed-scale datasets, limiting their adaptability to dynamically expanding open-domain scenarios (e.g., sports, dance) and hindering real-world generalization. To address this, the paper proposes LLM-Centric Lifelong Empathic Motion Generation (L²-EMG), a new task that establishes a continual learning paradigm for unseen scenarios and poses two key challenges: emotion decoupling and scenario adaptation. Methodologically, the paper introduces an Emotion-Transferable and Scenario-Adapted Mixture of Experts (ES-MoE) approach, which pairs a causal-guided emotion decoupling block with a scenario-adapted expert constructing block within an LLM-centric architecture, and constructs multiple L²-EMG datasets for evaluation. Extensive experiments show that ES-MoE outperforms advanced baselines, validating its emotion generalization and scenario adaptability and pointing toward closed-loop, self-evolving embodied agents equipped with both empathy and intelligence.
📝 Abstract
Existing human-centric emotional motion generation methods primarily focus on boosting performance within a single scale-fixed dataset, largely neglecting flexible, scale-increasing motion scenarios (e.g., sports, dance), even though effectively learning such newly emerging scenarios can significantly enhance a model's real-world generalization ability. Inspired by this, this paper proposes a new LLM-Centric Lifelong Empathic Motion Generation (L^2-EMG) task, which aims to equip LLMs with the capability to continually acquire emotional motion generation knowledge across different unseen scenarios, potentially contributing to a closed-loop, self-evolving embodied agent equipped with both empathy and intelligence. Further, this paper identifies two key challenges in the L^2-EMG task, i.e., the emotion-decoupling challenge and the scenario-adapting challenge. To address them, this paper proposes an Emotion-Transferable and Scenario-Adapted Mixture of Experts (ES-MoE) approach, which designs a causal-guided emotion decoupling block and a scenario-adapted expert constructing block to tackle the two challenges, respectively. In addition, this paper constructs multiple L^2-EMG datasets to validate the effectiveness of the ES-MoE approach. Extensive evaluations show that ES-MoE outperforms advanced baselines.