Rule Synergy Analysis using LLMs: State of the Art and Implications

📅 2025-08-26

🤖 AI Summary

This work investigates the capacity of large language models (LLMs) to understand and reason about card synergies in dynamic rule environments. We construct the first annotated dataset for rule-based synergy analysis, grounded in *Slay the Spire*, covering positive, negative, and non-synergistic relationships. To diagnose model failures, we propose an error analysis framework that evaluates temporal reasoning, state modeling, and rule adherence. Experiments show that LLMs accurately identify non-synergistic interactions, but performance degrades substantially on positive synergies and most of all on negative synergies, where implicit rule conflicts produce adverse effects. This exposes a weakness in modeling latent rule interactions. The study provides the first quantitative characterization of these reasoning deficits in dynamic rule composition and establishes a benchmark and analytical approach for evaluating rule-aware AI systems.

📝 Abstract

Large language models (LLMs) have demonstrated strong performance across a variety of domains, including logical reasoning, mathematics, and more. In this paper, we investigate how well LLMs understand and reason about complex rule interactions in dynamic environments, such as card games. We introduce a dataset of card synergies from the game Slay the Spire, where pairs of cards are classified based on their positive, negative, or neutral interactions. Our evaluation shows that while LLMs excel at identifying non-synergistic pairs, they struggle with detecting positive and, particularly, negative synergies. We categorize common error types, including issues with timing, defining game states, and following game rules. Our findings suggest directions for future research to improve model performance in predicting the effect of rules and their interactions.
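The evaluation described above, in which card pairs carry a positive, negative, or neutral label and results are compared per class, might be sketched as follows. This is a minimal illustration and not the authors' code: the `CardPair` record, the `per_class_accuracy` helper, the label strings, and the placeholder card names and predictions are all assumptions introduced here, and none of the values are results from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical label names; the paper's actual label encoding is not given here.
LABELS = ("positive", "negative", "neutral")


@dataclass(frozen=True)
class CardPair:
    """One annotated pair of cards (hypothetical record layout)."""
    card_a: str
    card_b: str
    label: str  # gold label, expected to be one of LABELS


def per_class_accuracy(pairs, predictions):
    """Return {label: accuracy} for every gold label that occurs in `pairs`."""
    if len(pairs) != len(predictions):
        raise ValueError("pairs and predictions must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for pair, predicted in zip(pairs, predictions):
        total[pair.label] += 1
        if predicted == pair.label:
            correct[pair.label] += 1
    # Skip labels with no gold examples to avoid dividing by zero.
    return {label: correct[label] / total[label] for label in LABELS if total[label]}


# Placeholder data only: card names and predictions are made up for illustration.
pairs = [
    CardPair("card_1", "card_2", "positive"),
    CardPair("card_3", "card_4", "negative"),
    CardPair("card_5", "card_6", "neutral"),
    CardPair("card_7", "card_8", "negative"),
]
predictions = ["positive", "neutral", "neutral", "negative"]

print(per_class_accuracy(pairs, predictions))
```

Reporting accuracy per class rather than as one overall number is what lets a gap between neutral pairs and negative synergies become visible, since a model that predicts "neutral" too often can still score well in aggregate.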

Problem

Research questions and friction points this paper is trying to address.

Evaluating LLMs' understanding of complex rule interactions

Assessing LLM performance in detecting card synergies

Identifying error types in LLM reasoning about game rules

Innovation

Methods, ideas, or system contributions that make the work stand out.

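A dataset of Slay the Spire card pairs labeled as positive, negative, or neutral synergies

An evaluation of how well LLMs reason about rule interactions in a game setting

A categorization of common error types (timing, defining game states, following game rules) to guide future improvement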
