Self-playing Adversarial Language Game Enhances LLM Reasoning

Date: 2024-04-16
Venue: arXiv.org
Citations: 11
Influential: 1
AI Summary
To address the limitations of large language models (LLMs) in deep semantic understanding and counterfactual reasoning on complex inference tasks, this paper proposes Adversarial Taboo, a two-player adversarial language game that sets up an implicit semantic contest under information constraints. Methodologically, it introduces a self-play adversarial training paradigm: an LLM plays both attacker and defender in multi-turn dialogues in which the target word is withheld from the defender, and the policy is optimized solely by reinforcement learning on win/loss signals, with no reliance on human annotations or external supervision. Experiments show uniform gains across a broad range of reasoning benchmarks, and iterating the self-play process continues to improve reasoning ability. The implementation is publicly available.

Abstract
We explore the potential of self-play training for large language models (LLMs) in a two-player adversarial language game called Adversarial Taboo. In this game, an attacker and a defender communicate around a target word only visible to the attacker. The attacker aims to induce the defender to speak the target word unconsciously, while the defender tries to infer the target word from the attacker's utterances. To win the game, both players must have sufficient knowledge about the target word and high-level reasoning ability to infer and express in this information-reserved conversation. Hence, we are curious about whether LLMs' reasoning ability can be further enhanced by Self-Playing this Adversarial language Game (SPAG). With this goal, we select several open-source LLMs and let each act as the attacker and play with a copy of itself as the defender on an extensive range of target words. Through reinforcement learning on the game outcomes, we observe that the LLMs' performances uniformly improve on a broad range of reasoning benchmarks. Furthermore, iteratively adopting this self-play process can continuously promote LLMs' reasoning abilities. The code is available at https://github.com/Linear95/SPAG.
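The game rules described above can be sketched as a small outcome judge. This is a minimal illustration, not the authors' implementation: the guess phrase, the tie when nobody wins, the whole-word matching, and the zero-sum +1/-1 rewards are all assumptions made for the example, and the names `judge_episode` and `outcome_rewards` are hypothetical. The abstract only states that the attacker wins by inducing the defender to say the target word unknowingly, that the defender wins by inferring it, and that reinforcement learning uses the game outcomes.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical phrase the defender uses to commit to a guess (an assumption).
GUESS_PATTERN = re.compile(r"i know the word[!.]? it is ['`]?([a-z-]+)", re.IGNORECASE)


def contains_word(utterance: str, word: str) -> bool:
    """True if `word` occurs as a whole word in `utterance` (case-insensitive)."""
    return re.search(rf"\b{re.escape(word)}\b", utterance, re.IGNORECASE) is not None


def judge_episode(target: str, turns: List[Tuple[str, str]]) -> str:
    """Return 'attacker', 'defender', or 'tie' for one dialogue.

    `turns` is a list of (role, utterance) pairs in speaking order.
    Attacker utterances are not checked here; any rule about the
    attacker saying the target word is left out of this sketch.
    """
    for role, utterance in turns:
        if role != "defender":
            continue
        guess = GUESS_PATTERN.search(utterance)
        if guess:
            # An explicit guess ends the game: correct wins, incorrect loses.
            return "defender" if guess.group(1).lower() == target.lower() else "attacker"
        if contains_word(utterance, target):
            # The defender said the target word without guessing it.
            return "attacker"
    # Nobody won within the given turns (assumed to count as a tie).
    return "tie"


def outcome_rewards(winner: str) -> Dict[str, float]:
    """Map a game outcome to assumed zero-sum scalar rewards per role."""
    if winner == "tie":
        return {"attacker": 0.0, "defender": 0.0}
    loser = "defender" if winner == "attacker" else "attacker"
    return {winner: 1.0, loser: -1.0}
```

In a self-play loop, the same model would generate both roles' utterances, and the rewards from `outcome_rewards` would be attached to each role's turns before the policy update.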
Problem

Research questions and friction points this paper is trying to address.

Self-play Training
Large Language Models (LLMs)
Adversarial Language Games
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-play Training
Adversarial Taboo
Reinforcement Learning
Authors

Pengyu Cheng, Alibaba Group (machine learning, natural language processing)
Tianhao Hu, Tencent AI Lab, Shenzhen
Han Xu, Tencent AI Lab, Shenzhen
Zhisong Zhang, City University of Hong Kong (natural language processing)
Yong Dai, Tencent AI Lab, Shenzhen
Lei Han, Tencent Robotics X Lab
Nan Du, Tencent AI Lab, Shenzhen