🤖 AI Summary
Chain-of-Thought (CoT) explanations in multimodal moral reasoning have a double-edged effect: while they enhance model transparency, they can also reinforce users' confirmation bias, especially when outputs align with expectations or adopt confident linguistic framing, thereby impairing the detection of flawed reasoning. Method: We systematically perturb the reasoning chains of vision-language models (VLMs) and manipulate the epistemic certainty of their linguistic expression to isolate the factors that drive human trust and error identification. Contribution/Results: Experiments reveal that user trust is driven primarily by output consistency and rhetorical confidence, not by reasoning correctness. Critically, confidently phrased but logically defective explanations significantly reduce error-detection rates while sustaining high reliance. This work provides the first empirical evidence that CoT can exacerbate cognitive biases in multimodal moral judgment. It argues that explanation design should prioritize fostering critical evaluation over defaulting to persuasive acceptance, advocating interpretability frameworks that explicitly support skepticism and verification.
📝 Abstract
Explanations are often promoted as tools for transparency, but they can also foster confirmation bias: users may assume the reasoning is correct whenever the output looks acceptable. We study this double-edged role of Chain-of-Thought (CoT) explanations in multimodal moral scenarios by systematically perturbing reasoning chains and manipulating delivery tone. Specifically, we analyze reasoning errors in vision-language models (VLMs) and how they affect user trust and users' ability to detect those errors. Our findings reveal two key effects: (1) users often equate trust with outcome agreement, sustaining reliance even when the reasoning is flawed, and (2) a confident tone suppresses error detection while maintaining reliance, showing that delivery style can override correctness. These results highlight how CoT explanations can simultaneously clarify and mislead, underscoring the need for NLP systems to provide explanations that invite scrutiny and critical thinking rather than blind trust. All code will be released publicly.
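As a rough illustration of the two manipulations the abstract describes (perturbing a reasoning chain and varying its delivery tone), the sketch below crosses the two factors into four stimulus conditions. All function names, example text, and condition labels are hypothetical, not taken from the paper's released code.

```python
# Hypothetical sketch of the 2x2 design implied by the abstract:
# (correct vs. flawed reasoning chain) x (confident vs. hedged tone).
# Example content is invented for illustration only.

CORRECT_STEPS = [
    "The image shows a person taking an unattended wallet.",
    "Taking property without the owner's consent is theft.",
    "Therefore the action is morally wrong.",
]

def perturb_chain(steps, bad_step_idx, flawed_step):
    """Replace one intermediate step with a logically flawed one,
    keeping the final conclusion unchanged."""
    perturbed = list(steps)
    perturbed[bad_step_idx] = flawed_step
    return perturbed

def apply_tone(steps, tone):
    """Wrap a reasoning chain in a confident or hedged linguistic frame."""
    prefix = {
        "confident": "It is clear that:",
        "hedged": "It seems possible that:",
    }[tone]
    return prefix + "\n" + "\n".join(f"- {s}" for s in steps)

# Build the four experimental conditions.
flawed = perturb_chain(
    CORRECT_STEPS, 1,
    "Wallets left unattended belong to whoever finds them.",
)
conditions = {
    (chain_name, tone): apply_tone(chain, tone)
    for chain_name, chain in [("correct", CORRECT_STEPS), ("flawed", flawed)]
    for tone in ("confident", "hedged")
}
```

Note that the perturbation leaves the conclusion intact, so the "flawed" conditions differ from the "correct" ones only in an intermediate step, which is what allows the study to separate outcome agreement from reasoning correctness.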