Robustness of Neurosymbolic Reasoners on First-Order Logic Problems

📅 2025-09-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the logical consistency vulnerabilities of large language models (LLMs) under first-order logic (FOL) counterfactual perturbations—such as predicate substitution or constant role swapping—revealing their overreliance on superficial patterns and fundamental deficits in symbolic reasoning. To address this, the authors propose NSCoT, a neuro-symbolic collaborative reasoning framework that tightly integrates LLMs with formal logic solvers, augmented by counterfactual data augmentation and chain-of-thought (CoT) prompting. Experiments demonstrate that NSCoT significantly improves logical robustness over purely neural baselines. Although its overall accuracy remains below standard CoT on unperturbed inputs, NSCoT is the first systematic validation of neuro-symbolic architectures for FOL counterfactual reasoning. It empirically exposes critical bottlenecks in LLMs’ logical generalization and establishes a novel pathway toward verifiable, interpretable neuro-symbolic inference.

📝 Abstract
Recent trends in NLP aim to improve reasoning capabilities in Large Language Models (LLMs), with a key focus on generalization and robustness to variations in tasks. Counterfactual task variants introduce minimal but semantically meaningful changes to otherwise valid first-order logic (FOL) problem instances, such as altering a single predicate or swapping the roles of constants, to probe whether a reasoning system can maintain logical consistency under perturbation. Previous studies showed that LLMs become brittle on counterfactual variations, suggesting that they often rely on spurious surface patterns to generate responses. In this work, we explore whether a neurosymbolic (NS) approach that integrates an LLM and a symbolic logic solver can mitigate this problem. Experiments across LLMs of varying sizes show that NS methods are more robust but perform worse overall than purely neural methods. We then propose NSCoT, which combines an NS method with Chain-of-Thought (CoT) prompting, and demonstrate that while it improves performance, NSCoT still lags behind standard CoT. Our analysis opens research directions for future work.
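The two counterfactual perturbations the abstract mentions (substituting a single predicate and swapping the roles of constants) can be sketched in a few lines. The FOL representation and the operator names below are illustrative assumptions for this summary, not the paper's actual perturbation pipeline.

```python
# Minimal sketch of FOL counterfactual perturbations (assumed representation,
# not the paper's implementation).
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Atom:
    predicate: str   # e.g. "Loves"
    args: tuple      # ground constants, e.g. ("alice", "bob")

def substitute_predicate(atom: Atom, old: str, new: str) -> Atom:
    """Counterfactual variant 1: rename a single predicate."""
    return replace(atom, predicate=new) if atom.predicate == old else atom

def swap_constants(atom: Atom, a: str, b: str) -> Atom:
    """Counterfactual variant 2: swap the roles of two constants."""
    swapped = tuple(b if x == a else a if x == b else x for x in atom.args)
    return replace(atom, args=swapped)

fact = Atom("Loves", ("alice", "bob"))
print(substitute_predicate(fact, "Loves", "Admires"))  # predicate substitution
print(swap_constants(fact, "alice", "bob"))            # constant role swap
```

A logically valid problem stays valid under either rewrite, which is what makes these variants a probe of reasoning rather than of surface-pattern recall.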
Problem

Research questions and friction points this paper is trying to address.

Improving robustness of reasoning systems on counterfactual logic problems
Addressing LLM brittleness when handling semantically altered FOL instances
Integrating neural and symbolic methods for consistent logical reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neurosymbolic integration of LLM and symbolic solver
Counterfactual task variants test logical consistency
NSCoT combines neurosymbolic approach with Chain-of-Thought
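The neurosymbolic loop listed above, in which an LLM translates the problem into a formal representation and a symbolic solver carries out the actual inference, can be sketched roughly as follows. The `llm_translate` stub and the toy forward-chaining solver are illustrative assumptions; they are not NSCoT's real solver interface.

```python
# Hedged sketch of an LLM + symbolic-solver pipeline (assumed design, not
# the paper's implementation).

def llm_translate(problem_text: str):
    """Stand-in for the neural step: an LLM would map text to facts/rules.
    Here we hard-code the output for a toy problem."""
    facts = {("Human", "socrates")}          # Human(socrates)
    rules = [("Human", "Mortal")]            # forall x. Human(x) -> Mortal(x)
    return facts, rules

def forward_chain(facts, rules):
    """Symbolic step: forward chaining over ground unary facts and
    single-premise rules until a fixed point is reached."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for pred, const in list(derived):
                if pred == premise and (conclusion, const) not in derived:
                    derived.add((conclusion, const))
                    changed = True
    return derived

facts, rules = llm_translate("Socrates is human. All humans are mortal.")
print(("Mortal", "socrates") in forward_chain(facts, rules))  # prints True
```

Because the deduction is done by the solver rather than the LLM, the conclusion is unaffected by counterfactual renamings of predicates or constants, which is the robustness property the paper measures.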