🤖 AI Summary
Emotion computing models exhibit a significant generalization gap between controlled laboratory settings (in vitro) and real-world deployment scenarios (in vivo). To bridge this gap, we propose a novel multimodal pretraining framework that synergistically integrates Learning Using Privileged Information (LUPI) with supervised contrastive learning (SCL)—the first such integration in emotion modeling. During training, high-dimensional privileged information—available only at training time but inaccessible during inference—is leveraged via teacher–student knowledge distillation to guide the student model toward learning more discriminative and robust cross-modal representations. Evaluated on the RECOLA and AGAIN benchmarks, our method substantially outperforms end-to-end and conventional LUPI baselines; notably, on several metrics it achieves performance comparable to fully supervised multimodal models. This demonstrates a marked reduction in the in vitro–in vivo performance gap, validating the framework's strong generalization capability and scalability for real-world affective computing applications.
📝 Abstract
Affective Computing (AC) has made significant progress with the advent of deep learning, yet a persistent challenge remains: the reliable transfer of affective models from controlled laboratory settings (in-vitro) to uncontrolled real-world environments (in-vivo). To address this challenge, we introduce the Privileged Contrastive Pretraining (PriCon) framework, in which models are first pretrained via supervised contrastive learning (SCL) and then act as teacher models within a Learning Using Privileged Information (LUPI) setting. PriCon both leverages privileged information during training and enhances the robustness of derived affect models via SCL. Experiments conducted on two benchmark affective corpora, RECOLA and AGAIN, demonstrate that models trained using PriCon consistently outperform LUPI and end-to-end models. Remarkably, in many cases, PriCon models achieve performance comparable to models trained with access to all modalities during both training and testing. The findings underscore the potential of PriCon as a paradigm for further bridging the gap between in-vitro and in-vivo affective modelling, offering a scalable and practical solution for real-world applications.
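To make the two-stage pipeline concrete, the sketch below shows the two loss functions a PriCon-style setup rests on: a supervised contrastive loss for pretraining the (privileged-modality) teacher, and a soft-target distillation loss for transferring its knowledge to a student that sees only the deployment-time modalities. This is a minimal NumPy illustration of the general techniques, not the paper's implementation; all function names, the temperature values, and the loss formulations (Khosla-style SupCon, Hinton-style KL distillation) are assumptions for illustration.

```python
import numpy as np

def supervised_contrastive_loss(z, labels, temperature=0.1):
    """Supervised contrastive loss over a batch of embeddings z (n, d).
    Samples sharing a label are treated as positives for each anchor."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # L2-normalise embeddings
    sim = z @ z.T / temperature                        # pairwise cosine logits
    n = labels.shape[0]
    not_self = ~np.eye(n, dtype=bool)
    logits = np.where(not_self, sim, -np.inf)          # exclude self-similarity
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    positives = (labels[:, None] == labels[None, :]) & not_self
    counts = positives.sum(axis=1)
    valid = counts > 0                                 # skip anchors with no positive
    # mean negative log-probability of positives, averaged over anchors
    per_anchor = np.where(positives, log_prob, 0.0).sum(axis=1)[valid] / counts[valid]
    return -per_anchor.mean()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Soft-target KL distillation: student matches the teacher's
    temperature-softened output distribution."""
    def softmax(x):
        e = np.exp(x - x.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)
    p_t = softmax(teacher_logits / T)
    p_s = softmax(student_logits / T)
    # KL(teacher || student), rescaled by T^2 as is conventional
    return (p_t * (np.log(p_t) - np.log(p_s))).sum(axis=1).mean() * T * T
```

In a PriCon-like training loop, the teacher would first be optimised with `supervised_contrastive_loss` on embeddings that include privileged modalities; the student would then minimise `distillation_loss` (optionally combined with a task loss) using only the modalities available at inference time.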