🤖 AI Summary
This study addresses the risk of misaligned sentiment representations in cross-lingual (Bengali–English) AI systems, which can lead to erroneous intent interpretation and erode human–AI trust. The authors construct an evaluation framework over four multilingual Transformer models, including mDistilBERT and IndicBERT, to quantify for the first time phenomena such as the "Sentiment Inversion Rate" and "Asymmetric Empathy." Their analysis reveals significant representational gaps and safety risks when current alignment methods are applied to low-resource languages and dialectal variants. Notably, the compressed mDistilBERT model exhibits a Sentiment Inversion Rate of 28.7%, while IndicBERT shows a 57% increase in alignment error on formal (Sadhu) Bengali, underscoring how generic compression strategies compromise affective fidelity. The work advocates incorporating "Affective Stability" into alignment evaluation protocols and calls for culturally attuned sentiment alignment mechanisms.
📝 Abstract
The core theme of bidirectional alignment is ensuring that AI systems accurately understand human intent and that humans can trust AI behavior. This feedback loop, however, fractures significantly across language barriers. Our research addresses Cross-Lingual Sentiment Misalignment between Bengali and English by benchmarking four transformer architectures, revealing severe safety and representational failures in current alignment paradigms. We demonstrate that the compressed model (mDistilBERT) exhibits a 28.7% "Sentiment Inversion Rate," fundamentally misinterpreting positive user intent as negative (or vice versa). Furthermore, we identify systematic asymmetries affecting human–AI trust, including "Asymmetric Empathy," in which some models dampen and others amplify the affective weight of Bengali text relative to its English counterpart. Finally, we reveal a "Modern Bias" in the regional model (IndicBERT), which shows a 57% increase in alignment error when processing formal (Sadhu) Bengali. We argue that equitable human–AI co-evolution requires pluralistic, culturally grounded alignment that respects linguistic and dialectal diversity, rather than universal compression, which fails to preserve the emotional fidelity required for reciprocal human–AI trust. We recommend that alignment benchmarks incorporate "Affective Stability" metrics that explicitly penalize polarity inversions in low-resource and dialectal contexts.
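To make the central metric concrete, here is a minimal sketch of how a "Sentiment Inversion Rate" over parallel Bengali–English pairs could be computed. The function name, the three-way label scheme (−1 negative, 0 neutral, +1 positive), and the example data are illustrative assumptions, not the paper's actual implementation.

```python
def sentiment_inversion_rate(bengali_preds, english_preds):
    """Fraction of parallel sentence pairs whose predicted polarity
    flips sign (positive <-> negative) between the two languages.

    Labels are assumed to be -1 (negative), 0 (neutral), +1 (positive);
    neutral predictions never count as inversions.
    """
    assert len(bengali_preds) == len(english_preds)
    inversions = sum(
        1 for bn, en in zip(bengali_preds, english_preds)
        if bn * en < 0  # opposite nonzero polarities
    )
    return inversions / len(bengali_preds)

# Hypothetical example: 2 of 5 parallel pairs flip polarity.
bn = [1, -1, 1, 0, -1]
en = [1,  1, -1, 0, -1]
print(sentiment_inversion_rate(bn, en))  # 0.4
```

An "Affective Stability" benchmark in the spirit of the abstract's recommendation would penalize exactly these sign flips, rather than rewarding average accuracy that can mask them.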