🤖 AI Summary
This work investigates whether large language and vision models (LLMs/VLMs) possess genuine abstract reasoning capabilities—beyond superficial pattern matching—by probing their ability to internalize and generalize formal rules.
Method: We introduce “Misleading Fine-Tuning” (MisFT), a novel paradigm that constructs counterfactual datasets violating mathematical axioms, thereby inducing models to internalize incorrect rules. We then systematically evaluate cross-task rule generalization on textual math word problems and image-based arithmetic expressions.
Contribution/Results: Experiments reveal consistent transfer of learned erroneous rules to unseen tasks in both modalities, indicating an implicit two-stage mechanism: abstraction of structural representations followed by rule application. Crucially, this is the first study to use counterfactual rule learning as a diagnostic probe, providing empirical evidence that LLMs/VLMs support abstract reasoning and rule-level generalization beyond surface-level statistical correlations. Our approach establishes a new methodological framework for characterizing the inferential nature of foundation models.
📝 Abstract
Large language models (LLMs) and vision-language models (VLMs) can perform a wide range of reasoning tasks across diverse scenarios, but are they truly engaging in task abstraction and rule-based reasoning, beyond mere memorization and pattern matching? To answer this question, we propose a novel experimental approach, Misleading Fine-Tuning (MisFT), which examines whether LLMs/VLMs perform abstract reasoning by altering their original understanding of fundamental rules. Specifically, we construct a dataset of math expressions that contradict correct arithmetic principles, fine-tune the model to learn those contradictory rules, and assess its generalization ability on different test domains. Through a series of experiments, we find that current LLMs/VLMs can effectively apply the contradictory rules to solve practical math word problems and math expressions represented as images, implying the presence of an internal mechanism that abstracts before reasoning.
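To make the dataset-construction idea concrete, here is a minimal sketch of how a counterfactual fine-tuning set might be generated. The specific rule shown (redefining `a + b` as `a + b + 1`) and the helper names `misleading_add` and `build_misft_dataset` are illustrative assumptions, not the paper's actual rules or code:

```python
import json
import random

def misleading_add(a: int, b: int) -> int:
    """Hypothetical counterfactual rule: '+' is redefined as a + b + 1,
    contradicting the standard addition axiom."""
    return a + b + 1

def build_misft_dataset(n: int, seed: int = 0) -> list[dict]:
    """Build prompt/completion pairs whose answers follow the contradictory
    rule; fine-tuning on such pairs would 'mislead' the model."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        a, b = rng.randint(0, 99), rng.randint(0, 99)
        data.append({
            "prompt": f"{a} + {b} = ",
            "completion": str(misleading_add(a, b)),
        })
    return data

dataset = build_misft_dataset(3)
print(json.dumps(dataset, indent=2))
```

Generalization would then be probed by testing whether the fine-tuned model applies the same altered rule in unseen formats, such as word problems or images of expressions.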