🤖 AI Summary
This work addresses the pervasive issue of online clickbait headlines and the challenge posed by large language models' (LLMs) sycophantic tendencies, which hinder objective truthfulness assessment. To overcome this limitation, the authors propose a Self-renewal Opposing-stance Reasoning Generation (SORG) framework that strategically leverages LLMs' sycophancy to automatically produce contrasting reasoning pairs for news headlines. Building upon these opposing inferences, they develop an Opposing Reasoning-based Clickbait Detection (ORCD) model that requires no ground-truth labels. ORCD integrates prompt engineering, three BERT encoders, and soft-label contrastive learning to achieve robust clickbait identification. Evaluated on three benchmark datasets, ORCD significantly outperforms existing LLM prompting strategies, fine-tuned smaller models, and state-of-the-art detection baselines, demonstrating that LLM sycophancy can be repurposed as a valuable resource for generating multi-perspective reasoning.
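As a rough illustration of how sycophancy can be steered toward opposing-stance generation, one can prime the model with a stated belief before asking for reasoning, so that its tendency to agree with the user yields one argument per side. The template wording below is hypothetical, not the paper's actual SORG prompts:

```python
def build_stance_prompts(title: str) -> dict:
    """Build a pair of stance-primed prompts for a news headline.

    Asserting a stance up front exploits an LLM's sycophantic tendency
    to side with the stated belief, eliciting supporting reasoning for
    each of the two opposing perspectives. The exact phrasing here is
    illustrative only; SORG's real templates are not reproduced.
    """
    agree = (
        f"I believe the headline '{title}' is clickbait. "
        "Explain why this headline is deceptive or exaggerated."
    )
    disagree = (
        f"I believe the headline '{title}' is a faithful, non-clickbait "
        "headline. Explain why this headline is trustworthy."
    )
    return {"agree": agree, "disagree": disagree}
```

Sending each prompt to the same LLM would then produce the contrasting reasoning pair for a given title, without any ground-truth label being involved.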
📄 Abstract
The widespread proliferation of online content has intensified concerns about clickbait: deceptive or exaggerated headlines designed to attract attention. While Large Language Models (LLMs) offer a promising avenue for addressing this issue, their effectiveness is often hindered by sycophancy, a tendency to produce reasoning that matches users' beliefs rather than truthful reasoning, which deviates from instruction-following principles. Rather than treating sycophancy as a flaw to be eliminated, this work proposes a novel approach that harnesses this behavior to generate contrastive reasoning from opposing perspectives. Specifically, we design a Self-renewal Opposing-stance Reasoning Generation (SORG) framework that prompts LLMs to produce high-quality agree and disagree reasoning pairs for a given news title without requiring ground-truth labels. To utilize the generated reasoning, we develop a local Opposing Reasoning-based Clickbait Detection (ORCD) model that integrates three BERT encoders to represent the title and its associated reasoning. The model leverages contrastive learning, guided by soft labels derived from LLM-generated credibility scores, to enhance detection robustness. Experimental evaluations on three benchmark datasets demonstrate that our method consistently outperforms LLM prompting, fine-tuned smaller language models, and state-of-the-art clickbait detection baselines.