GRCF: Two-Stage Groupwise Ranking and Calibration Framework for Multimodal Sentiment Analysis

📅 2026-01-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work targets two weaknesses of existing multimodal sentiment analysis methods: point-wise regression is sensitive to label noise and ignores the ordinal relationships among samples, while pairwise ranking approaches weight all comparisons equally and rely on static margins. The authors propose GRCF, a two-stage Group-wise Ranking and Calibration Framework that adapts ideas from Group Relative Policy Optimization (GRPO). In the first stage, an advantage-weighted dynamic-margin ranking loss builds a fine-grained ordinal structure and adaptively emphasizes hard-to-rank samples. The second stage applies an MAE-driven objective to calibrate prediction magnitudes, so that both relative ranking and absolute score accuracy are addressed. The framework is also extended to classification tasks, namely multimodal humor and sarcasm detection; it achieves state-of-the-art performance on core regression benchmarks and generalizes well to these classification tasks.

📝 Abstract

Most Multimodal Sentiment Analysis research has focused on point-wise regression. While straightforward, this approach is sensitive to label noise and neglects whether one sample is more positive than another, resulting in unstable predictions and poor correlation alignment. Pairwise ordinal learning frameworks emerged to address this gap, capturing relative order by learning from comparisons. Yet they introduce two new trade-offs. First, they assign uniform importance to all comparisons, failing to adaptively focus on hard-to-rank samples. Second, they employ static ranking margins, which fail to reflect the varying semantic distances between sentiment groups. To address these issues, we propose a Two-Stage Group-wise Ranking and Calibration Framework (GRCF) that adapts the philosophy of Group Relative Policy Optimization (GRPO). Our framework resolves these trade-offs by simultaneously preserving relative ordinal structure, ensuring absolute score calibration, and adaptively focusing on difficult samples. Specifically, Stage 1 introduces a GRPO-inspired Advantage-Weighted Dynamic Margin Ranking Loss to build a fine-grained ordinal structure. Stage 2 then employs an MAE-driven objective to align prediction magnitudes. To validate its generalizability, we extend GRCF to classification tasks, including multimodal humor detection and sarcasm detection. GRCF achieves state-of-the-art performance on core regression benchmarks, while also showing strong generalizability in classification tasks.
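
The abstract gives no formulas, so the following is only a minimal sketch of how the two stages might look, not the authors' implementation. It assumes that the "advantage" is the group-normalized absolute error of each sample (in the spirit of GRPO's group-relative normalization), that the dynamic margin grows linearly with the label gap, and that a pair's weight is the average of its two sample weights. The function names and the `margin_scale` parameter are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def group_advantages(errors):
    """Standardize per-sample errors within one group (assumed GRPO-style)."""
    mu = mean(errors)
    sigma = pstdev(errors)
    return [(e - mu) / (sigma + 1e-8) for e in errors]


def stage1_ranking_loss(preds, labels, margin_scale=0.5):
    """Assumed Stage 1: weighted pairwise hinge loss with label-dependent margins."""
    errors = [abs(p - y) for p, y in zip(preds, labels)]
    adv = group_advantages(errors)
    # Assumption: samples with above-average error count as "hard" and get extra weight.
    weights = [1.0 + max(a, 0.0) for a in adv]

    total, norm = 0.0, 0.0
    for i, j in combinations(range(len(preds)), 2):
        gap = labels[i] - labels[j]
        if gap == 0:
            continue  # tied labels impose no ordering constraint
        sign = 1.0 if gap > 0 else -1.0
        margin = margin_scale * abs(gap)  # dynamic margin: larger label gap, larger margin
        hinge = max(0.0, margin - sign * (preds[i] - preds[j]))
        pair_weight = 0.5 * (weights[i] + weights[j])
        total += pair_weight * hinge
        norm += pair_weight
    return total / norm if norm else 0.0


def stage2_mae_loss(preds, labels):
    """Assumed Stage 2: mean absolute error to calibrate prediction magnitudes."""
    return mean(abs(p - y) for p, y in zip(preds, labels))


if __name__ == "__main__":
    labels = [-2.0, 0.0, 1.5, 3.0]
    shifted = [y + 1.0 for y in labels]   # correct order, wrong scale
    reversed_preds = labels[::-1]         # wrong order
    print(stage1_ranking_loss(shifted, labels), stage2_mae_loss(shifted, labels))
    print(stage1_ranking_loss(reversed_preds, labels))
```

In this sketch, predictions shifted by a constant keep a ranking loss of zero while their MAE is 1.0, which illustrates why a ranking stage alone does not calibrate absolute scores and a second, MAE-driven stage is useful.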

Problem

Research questions and friction points this paper is trying to address.

Multimodal Sentiment Analysis

Ordinal Learning

Label Noise

Relative Ranking

Sentiment Calibration

Innovation

Methods, ideas, or system contributions that make the work stand out.

Groupwise Ranking

Dynamic Margin

Ordinal Learning