🤖 AI Summary
This work addresses the noise sensitivity of automated depression assessment in real-world settings, where existing deterministic methods yield uncalibrated point estimates that risk overconfident clinical misdiagnosis. To this end, we propose EviDep, a framework that combines disentangled representation learning with evidential learning, using a Normal-Inverse-Gamma distribution to jointly model depression severity and both aleatoric and epistemic uncertainty. A Frequency-aware Feature Extraction module built on a wavelet-based Mixture-of-Experts filters task-irrelevant noise, and a Disentangled Evidential Learning strategy separates multimodal consensus from modality-specific information before fusion, thereby mitigating overconfident predictions caused by redundant cross-modal evidence. Evaluated on the AVEC 2013, AVEC 2014, DAIC-WOZ, and E-DAIC datasets, EviDep achieves state-of-the-art prediction performance while improving uncertainty calibration, offering a reliable safety mechanism for clinical screening applications.
📝 Abstract
Automated depression estimation is highly vulnerable to signal corruption and ambient noise in real-world deployment. Prevailing deterministic methods produce uncalibrated point estimates, exposing safety-critical clinical systems to the severe risk of overconfident misdiagnoses. To establish a resilient and trustworthy assessment paradigm, we propose EviDep, an evidential learning framework that jointly quantifies depression severity alongside aleatoric and epistemic uncertainties via a Normal-Inverse-Gamma distribution. A fundamental vulnerability in multimodal evidential fusion is the uncontrolled accumulation of cross-modal redundancies: double-counting overlapping evidence artificially inflates diagnostic confidence. EviDep addresses this flaw in two stages. First, a Frequency-aware Feature Extraction module leverages a wavelet-based Mixture-of-Experts to dynamically isolate task-irrelevant noise, preserving the fidelity of diagnostic signals. Subsequently, a Disentangled Evidential Learning strategy separates the shared consensus from modality-specific nuances. By explicitly decoupling these representations before Bayesian fusion, EviDep systematically mitigates evidence redundancy. Extensive experiments on AVEC 2013, AVEC 2014, DAIC-WOZ, and E-DAIC confirm that EviDep achieves state-of-the-art predictive accuracy and superior uncertainty calibration, delivering a robust fail-safe mechanism for trustworthy clinical screening.
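The redundancy problem described in the abstract can be made concrete with a small sketch. It uses the standard moments of a Normal-Inverse-Gamma (NIG) distribution with parameters (γ, ν, α, β): the prediction is γ, the aleatoric uncertainty is β/(α−1), and the epistemic uncertainty is β/(ν(α−1)). The fusion rule shown is the NIG summation operator from the evidential multimodal regression literature; the abstract does not state which fusion rule EviDep uses, so this is an assumption, and the names `NIG`, `fuse`, and `audio` as well as all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NIG:
    """Normal-Inverse-Gamma parameters for one evidential prediction."""
    gamma: float  # predicted mean (here: a depression severity score)
    nu: float     # virtual observations supporting the mean, nu > 0
    alpha: float  # inverse-gamma shape, alpha > 1
    beta: float   # inverse-gamma scale, beta > 0

    def aleatoric(self) -> float:
        # E[sigma^2] = beta / (alpha - 1): expected noise in the data
        return self.beta / (self.alpha - 1.0)

    def epistemic(self) -> float:
        # Var[mu] = beta / (nu * (alpha - 1)): uncertainty about the mean
        return self.beta / (self.nu * (self.alpha - 1.0))


def fuse(a: NIG, b: NIG) -> NIG:
    """NIG summation of two predictions.

    Assumed fusion rule for illustration; not confirmed as EviDep's own.
    """
    nu = a.nu + b.nu
    gamma = (a.nu * a.gamma + b.nu * b.gamma) / nu
    alpha = a.alpha + b.alpha + 0.5
    beta = (
        a.beta
        + b.beta
        + 0.5 * a.nu * (a.gamma - gamma) ** 2
        + 0.5 * b.nu * (b.gamma - gamma) ** 2
    )
    return NIG(gamma, nu, alpha, beta)


# Hypothetical evidence from a single modality.
audio = NIG(gamma=12.0, nu=2.0, alpha=3.0, beta=4.0)

# Fusing the same evidence with itself adds no new information,
# yet the fused epistemic uncertainty shrinks.
redundant = fuse(audio, audio)

print(audio.epistemic(), redundant.epistemic())
```

With these illustrative values, fusing a prediction with an exact copy of itself leaves the mean unchanged but lowers the epistemic uncertainty, which is the kind of inflated confidence that separating shared from modality-specific evidence before fusion is meant to avoid.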