GIFT: Games as Informal Training for Generalizable LLMs

📅 2026-01-09
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the deficiency of large language models in practical wisdom—such as strategic creativity and social reasoning—typically acquired through informal learning. To bridge this gap, the authors propose leveraging games as informal learning environments, capitalizing on their intrinsic rewards and abstract complexity to cultivate generalizable intelligence. The core innovation lies in a nested training framework that enforces sequential task composition, compelling the model to jointly optimize multiple competencies through an explicit “AND”-style multi-task objective, thereby mitigating task interference. Using GRPO-based reinforcement learning, the model is trained across diverse games including Matrix Games, TicTacToe, and Who's the Spy. Evaluation on multiple capability-oriented benchmarks demonstrates significant improvements in generalization performance, substantiating the efficacy of gamified informal learning for advancing artificial intelligence.
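The summary's contrast between an implicit "OR" objective (naive task mixing) and the framework's explicit "AND" objective (sequential task composition) can be sketched with toy reward functions. This is a hypothetical illustration of the objective structure, not the paper's implementation; the function names and reward scheme are assumptions:

```python
import random

def or_style_reward(task_rewards):
    # Naive task mixing: each rollout is drawn from a single sampled
    # task, so the expected objective is an implicit "OR" -- the model
    # can score well by excelling at some tasks while neglecting others.
    task = random.choice(list(task_rewards))
    return task_rewards[task]

def and_style_reward(task_rewards):
    # Nested/sequential composition: one episode chains all tasks, and
    # the return is maximal only if every task succeeds ("AND").
    reward = 1.0
    for r in task_rewards.values():
        reward *= r  # a single failed task (r = 0) zeroes the return
    return reward

# Hypothetical per-task success rewards for one composed episode:
rewards = {"matrix_game": 1.0, "tictactoe": 0.0, "whos_the_spy": 1.0}
print(and_style_reward(rewards))  # 0.0: one weak competency caps the return
```

Under the multiplicative "AND" return, the model cannot trade one competency off against another, which is the mechanism the summary credits with mitigating task interference.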

📝 Abstract
While Large Language Models (LLMs) have achieved remarkable success in formal learning tasks such as mathematics and code generation, they still struggle with the "practical wisdom" and generalizable intelligence, such as strategic creativity and social reasoning, that characterize human cognition. This gap arises from a lack of informal learning, which thrives on interactive feedback rather than goal-oriented instruction. In this paper, we propose treating Games as a primary environment for LLM informal learning, leveraging their intrinsic reward signals and abstracted complexity to cultivate diverse competencies. To address the performance degradation observed in multi-task learning, we introduce a Nested Training Framework. Unlike naive task mixing optimizing an implicit "OR" objective, our framework employs sequential task composition to enforce an explicit "AND" objective, compelling the model to master multiple abilities simultaneously to achieve maximal rewards. Using GRPO-based reinforcement learning across Matrix Games, TicTacToe, and Who's the Spy games, we demonstrate that integrating game-based informal learning not only prevents task interference but also significantly bolsters the model's generalization across broad ability-oriented benchmarks. The framework and implementation are publicly available.
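The abstract's training method, GRPO (Group Relative Policy Optimization), replaces a learned value critic with a group baseline: several rollouts are sampled per prompt, and each reward is normalized against the group's mean and standard deviation. A minimal sketch of that advantage computation, under the standard GRPO formulation rather than this paper's specific code:

```python
def grpo_advantages(group_rewards, eps=1e-8):
    # Group-relative advantage: normalize each rollout's reward by the
    # mean and standard deviation of its sampling group, so rollouts
    # that beat their peers get positive advantage and vice versa.
    g = len(group_rewards)
    mean = sum(group_rewards) / g
    var = sum((r - mean) ** 2 for r in group_rewards) / g
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in group_rewards]

# Four rollouts of the same game episode, with binary win/lose rewards:
print(grpo_advantages([1.0, 0.0, 1.0, 0.0]))  # wins ~ +1, losses ~ -1
```

This critic-free baseline is what makes GRPO convenient for game environments with sparse terminal rewards, as used here across Matrix Games, TicTacToe, and Who's the Spy.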
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
informal learning
generalizable intelligence
practical wisdom
social reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

informal learning
nested training framework
game-based training
generalizable intelligence
multi-task reinforcement learning