Reasoning Elicitation in Language Models via Counterfactual Feedback

📅 2024-10-02
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Large language models exhibit limited capability in causal reasoning tasks—particularly counterfactual question answering—due to inherent biases and insufficient grounding in causal mechanisms. Method: We propose a novel paradigm for enhancing causal reasoning: (1) introducing CausalQA-Balanced, the first evaluation metric that jointly measures factual and counterfactual accuracy to quantify reasoning bias; and (2) designing a causal-mechanism-inspired fine-tuning strategy integrating counterfactual question generation, multi-objective supervised fine-tuning, and feedback-driven optimization. Contribution/Results: Our approach significantly improves model accuracy on counterfactual QA and strengthens generalization across inductive, deductive, and cross-task causal reasoning. Extensive experiments demonstrate systematic superiority over state-of-the-art baselines across multiple real-world scenarios, establishing a new benchmark for causally grounded language understanding.

📝 Abstract
Despite the increasing effectiveness of language models, their reasoning capabilities remain underdeveloped. In particular, causal reasoning through counterfactual question answering is lacking. This work aims to bridge this gap. We first derive novel metrics that balance accuracy on factual and counterfactual questions, capturing a more complete view of the reasoning abilities of language models than traditional factual-only metrics. Second, we propose several fine-tuning approaches that aim to elicit better reasoning mechanisms, in the sense of the proposed metrics. Finally, we evaluate the performance of the fine-tuned language models in a variety of realistic scenarios. In particular, we investigate to what extent our fine-tuning approaches systematically achieve better generalization with respect to the base models in several problems that require, among others, inductive and deductive reasoning capabilities.
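The abstract describes metrics that balance factual and counterfactual accuracy rather than scoring factual questions alone. The paper's exact formulation is not given in this excerpt; the sketch below uses a harmonic mean as one illustrative way such a balanced metric could be computed, penalizing models that do well on only one question type. All function names here are hypothetical.

```python
def accuracy(preds, labels):
    """Fraction of predictions that exactly match the gold labels."""
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)

def balanced_reasoning_score(factual_preds, factual_labels,
                             cf_preds, cf_labels):
    """Illustrative balanced metric (assumption, not the paper's formula):
    harmonic mean of factual and counterfactual accuracy, so a model
    strong on only one question type scores low overall."""
    acc_f = accuracy(factual_preds, factual_labels)
    acc_cf = accuracy(cf_preds, cf_labels)
    if acc_f + acc_cf == 0:
        return 0.0
    return 2 * acc_f * acc_cf / (acc_f + acc_cf)

# Example: perfect factual accuracy but only 50% counterfactual accuracy
# yields a balanced score of 2/3, below the plain average of 0.75.
score = balanced_reasoning_score(["a", "b"], ["a", "b"],
                                 ["c", "d"], ["c", "x"])
```

A harmonic mean is only one choice; a minimum or a weighted average over the two accuracies would serve the same purpose of exposing reasoning bias between question types.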
Problem

Research questions and friction points this paper is trying to address.

Enhance reasoning in language models via counterfactual feedback.
Develop metrics for factual and counterfactual reasoning accuracy.
Improve generalization in inductive and deductive reasoning tasks.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Develops metrics for factual and counterfactual accuracy
Proposes fine-tuning methods for enhanced reasoning
Evaluates models in diverse realistic reasoning scenarios
👥 Authors
Alihan Hüyük (Google DeepMind)
Xinnuo Xu (Microsoft Research Cambridge)
Jacqueline Maasch (Cornell Tech)
Aditya V. Nori (Microsoft Research Cambridge)
Javier González (Microsoft Research Cambridge)