GRCF: Two-Stage Groupwise Ranking and Calibration Framework for Multimodal Sentiment Analysis

📅 2026-01-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses limitations of existing multimodal sentiment analysis methods: point-wise regression is vulnerable to label noise and overlooks the ordinal relationships among samples, while pairwise ranking approaches weight all comparisons uniformly and rely on static margins. To overcome these issues, the authors propose a Two-Stage Group-wise Ranking and Calibration Framework (GRCF). In the first stage, an advantage-weighted dynamic-margin ranking loss builds a fine-grained ordinal structure and adaptively emphasizes hard-to-rank samples. The second stage uses an MAE-driven objective to calibrate prediction magnitudes, so that both relative ranking and absolute score accuracy are addressed. The framework adapts the philosophy of Group Relative Policy Optimization (GRPO) to multimodal sentiment analysis and is extended from regression to classification tasks such as multimodal humor and sarcasm detection. The authors report state-of-the-art performance on core regression benchmarks and strong generalizability on the classification tasks.

📝 Abstract
Most Multimodal Sentiment Analysis research has focused on point-wise regression. While straightforward, this approach is sensitive to label noise and neglects whether one sample is more positive than another, resulting in unstable predictions and poor correlation alignment. Pairwise ordinal learning frameworks emerged to address this gap, capturing relative order by learning from comparisons. Yet, they introduce two new trade-offs: First, they assign uniform importance to all comparisons, failing to adaptively focus on hard-to-rank samples. Second, they employ static ranking margins, which fail to reflect the varying semantic distances between sentiment groups. To address this, we propose a Two-Stage Group-wise Ranking and Calibration Framework (GRCF) that adapts the philosophy of Group Relative Policy Optimization (GRPO). Our framework resolves these trade-offs by simultaneously preserving relative ordinal structure, ensuring absolute score calibration, and adaptively focusing on difficult samples. Specifically, Stage 1 introduces a GRPO-inspired Advantage-Weighted Dynamic Margin Ranking Loss to build a fine-grained ordinal structure. Stage 2 then employs an MAE-driven objective to align prediction magnitudes. To validate its generalizability, we extend GRCF to classification tasks, including multimodal humor detection and sarcasm detection. GRCF achieves state-of-the-art performance on core regression benchmarks, while also showing strong generalizability in classification tasks.
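The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: the abstract gives no equations, so the margin rule (proportional to the label gap, scaled by a hypothetical `alpha`), the group-normalized advantage (computed here from per-sample absolute error), and the pair weighting are all assumptions chosen only to show how a dynamic margin and an advantage weight might combine in a ranking loss, followed by a plain MAE calibration objective.

```python
from statistics import mean, pstdev


def group_advantages(preds, labels, eps=1e-8):
    """Group-normalized 'advantage' per sample (assumed definition).

    Uses absolute error standardized within the group, so samples that are
    harder than the group average receive a positive value.
    """
    errors = [abs(p - y) for p, y in zip(preds, labels)]
    mu, sigma = mean(errors), pstdev(errors)
    return [(e - mu) / (sigma + eps) for e in errors]


def stage1_ranking_loss(preds, labels, alpha=1.0):
    """Advantage-weighted hinge ranking loss with a label-dependent margin.

    Assumptions: margin = alpha * (label gap); pair weight grows with the
    mean advantage of the two samples and never drops below 1.
    """
    adv = group_advantages(preds, labels)
    total, n_pairs = 0.0, 0
    for i in range(len(preds)):
        for j in range(len(preds)):
            if labels[i] <= labels[j]:
                continue  # only pairs where sample i should rank above j
            margin = alpha * (labels[i] - labels[j])  # dynamic margin
            hinge = max(0.0, margin - (preds[i] - preds[j]))
            weight = 1.0 + max(0.0, 0.5 * (adv[i] + adv[j]))
            total += weight * hinge
            n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


def stage2_mae_loss(preds, labels):
    """Mean absolute error, used to align prediction magnitudes."""
    return mean(abs(p - y) for p, y in zip(preds, labels))
```

In this sketch, a group whose predictions already match the labels incurs zero loss in both stages, while a pair predicted in the wrong order is penalized more when its label gap is larger, which is the intuition behind replacing a static margin with a dynamic one.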
Problem

Research questions and friction points this paper is trying to address.

Multimodal Sentiment Analysis
Ordinal Learning
Label Noise
Relative Ranking
Sentiment Calibration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Groupwise Ranking
Dynamic Margin
Ordinal Learning
Calibration Framework
Multimodal Sentiment Analysis
Manning Gao
South China Normal University, Guangzhou 510631, China
Leheng Zhang
University of Electronic Science and Technology of China
image restoration
Shiqin Han
South China Normal University, Guangzhou 510631, China
Haifeng Hu
Sun Yat-sen University
Yuncheng Jiang
West China Hospital, Sichuan University
Computer Vision, Medical Image Analysis
Sijie Mai
South China Normal University, Guangzhou 510631, China