🤖 AI Summary
The choice of formal language, a critical yet previously overlooked factor, significantly impacts the reasoning performance of neuro-symbolic systems in which large language models (LLMs) translate natural language into symbolic-solver input, giving rise to the “intermediate language challenge.”
Method: We conduct systematic experiments across four formal languages, three logical reasoning benchmarks, and seven state-of-the-art LLMs, coupled with symbolic solver evaluation.
Contribution/Results: We demonstrate that both the syntactic structure and the semantic expressivity of formal languages substantially influence end-to-end reasoning accuracy, and that LLMs exhibit marked heterogeneity in their sensitivity to language choice. This work provides the first quantitative analysis of how formal language design affects the efficacy of neuro-symbolic integration. It yields reproducible, empirically grounded guidelines for formal language selection, thereby advancing the development of robust and efficient neuro-symbolic reasoning frameworks.
📝 Abstract
Large language models (LLMs) achieve astonishing results on a wide range of tasks, yet their formal reasoning ability still lags behind. A promising approach is neurosymbolic LLM reasoning, which uses LLMs as translators from natural to formal languages and symbolic solvers to derive correct results. Still, the factors contributing to the success of neurosymbolic LLM reasoning remain unclear. This paper demonstrates that one previously overlooked factor is the choice of the formal language. We introduce the intermediate language challenge: selecting a suitable formal language for neurosymbolic reasoning. By comparing four formal languages across three datasets and seven LLMs, we show that the choice of formal language affects both syntactic and semantic reasoning capabilities. We also discuss how these effects vary across LLMs.
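The two-stage pipeline described in the abstract can be sketched as follows. This is an illustrative toy, not the paper's implementation: in a real system an LLM would perform the natural-to-formal translation, and the formal language and solver (e.g. Prolog, SMT-LIB with Z3, or first-order logic with a theorem prover) are exactly the design choices the paper studies. Here the translation is hard-coded and the "solver" is a minimal forward-chainer over Horn clauses.

```python
# Sketch of a neurosymbolic reasoning pipeline (illustrative, assumed
# structure only). Stage 1: an LLM translates natural language into a
# formal language. Stage 2: a symbolic solver derives conclusions
# deterministically from that formal representation.

def forward_chain(facts, rules):
    """Toy symbolic solver: derive all facts reachable from `facts`
    via Horn-clause `rules` of the form (premises, conclusion)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

# Stage 1 (hard-coded stand-in for LLM output): the text
# "All men are mortal. Socrates is a man." translated into
# propositionalized Horn clauses.
facts = {"man(socrates)"}
rules = [(["man(socrates)"], "mortal(socrates)")]

# Stage 2: the solver answers the query "Is Socrates mortal?".
print("mortal(socrates)" in forward_chain(facts, rules))  # True
```

The point of the intermediate language challenge is that Stage 1's target representation is a free design parameter: the same natural-language premise set could instead be translated into SMT-LIB, Prolog, or full first-order logic, and the paper's experiments measure how that choice affects end-to-end accuracy.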