🤖 AI Summary
To address the growing sophistication of LLM-generated disinformation and the inability of existing detection methods to keep pace with its dynamic evolution, this paper proposes a symbolic adversarial learning framework. It features co-evolving generative and detection agents engaged in structured debate: the former constructs logically coherent yet deceptive narratives, while the latter identifies factual and logical inconsistencies via symbolic reasoning. Crucially, the framework represents model weights, loss, and gradients as natural-language symbols, replacing conventional parameter updates with interpretable, evolvable symbolic inference. Combined with multilingual prompt optimization, the framework significantly enhances both attack and defense capabilities across Chinese and English datasets: generated samples reduce the accuracy of mainstream detectors by up to 53.4% (Chinese) and 34.2% (English), while the refined detector achieves up to a 7.7% improvement in identifying refined disinformation.
📝 Abstract
Rapid advances in LLMs heighten fake news risks by enabling the automatic generation of increasingly sophisticated misinformation. Previous detection methods, whether fine-tuned small models or LLM-based detectors, often struggle with its dynamically evolving nature. In this work, we propose a novel framework, the Symbolic Adversarial Learning Framework (SALF), which implements an adversarial training paradigm through an agent symbolic learning optimization process rather than numerical updates. In SALF, a generation agent crafts deceptive narratives while a detection agent identifies their logical and factual flaws through structured debate, and the two agents iteratively refine themselves through these adversarial interactions. Unlike traditional neural updates, we represent agents using agent symbolic learning: learnable weights are defined by agent prompts, and back-propagation and gradient descent are simulated by operating on natural-language representations of weights, loss, and gradients. Experiments on two multilingual benchmark datasets demonstrate SALF's effectiveness: it generates sophisticated fake news that degrades state-of-the-art detection performance by up to 53.4% in Chinese and 34.2% in English on average, and it refines detectors, improving detection of refined content by up to 7.7%. We hope our work inspires further exploration of more robust, adaptable fake news detection systems.
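The symbolic learning loop described in the abstract, with prompts as learnable weights and LLM-generated critiques playing the roles of loss and gradients, can be sketched conceptually in Python. Everything below (class and function names, the `llm` stub, the exact update phrasing) is a hypothetical illustration rather than the paper's implementation; a real system would replace the `llm` stub with an actual model call and richer debate prompts.

```python
from dataclasses import dataclass, field


def llm(prompt: str) -> str:
    """Stub standing in for a real LLM API call (deterministic for illustration)."""
    return f"[LLM output for: {prompt[:40]}...]"


@dataclass
class SymbolicAgent:
    prompt: str                                   # the agent's "weights" in natural language
    history: list = field(default_factory=list)   # past prompt versions

    def forward(self, x: str) -> str:
        # Forward pass: apply the prompt ("weights") to an input.
        return llm(f"{self.prompt}\n\nInput: {x}")

    def backward(self, loss_text: str) -> str:
        # "Gradient": a textual critique of how the prompt led to the loss.
        return llm(f"Given the feedback '{loss_text}', suggest improvements to: {self.prompt}")

    def step(self, grad: str) -> None:
        # "Gradient descent": rewrite the prompt according to the textual gradient.
        self.history.append(self.prompt)
        self.prompt = llm(f"Rewrite the prompt '{self.prompt}' applying: {grad}")


def adversarial_round(generator: SymbolicAgent, detector: SymbolicAgent, topic: str) -> None:
    # One round of structured adversarial interaction.
    fake = generator.forward(topic)     # generator crafts a deceptive narrative
    verdict = detector.forward(fake)    # detector debates its factual/logical flaws
    # Each agent treats the opponent's output as its natural-language loss.
    generator.step(generator.backward(f"Detector verdict: {verdict}"))
    detector.step(detector.backward(f"Generator narrative: {fake}"))
```

Iterating `adversarial_round` co-evolves the two prompts, mirroring the generator/detector refinement the abstract describes, with no numerical parameter updates involved.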