SituatedThinker: Grounding LLM Reasoning with Real-World through Situated Thinking

📅 2025-05-25
🤖 AI Summary
Current large language models (LLMs) reason within static, parametric knowledge, which limits their access to real-time information and their understanding of the physical world. To address this, the paper proposes SituatedThinker, a framework that grounds LLM reasoning in real-world contexts through *situated thinking*: reasoning that adaptively combines the model's internal knowledge with external information obtained via predefined interfaces, including knowledge bases, structured tables, and textual environments. Reinforcement learning incentivizes the model to deliberately query these interfaces and incorporate the resulting information and feedback during inference, allowing it to surpass its knowledge boundaries. Empirically, the approach achieves significant gains on multi-hop question-answering and mathematical reasoning benchmarks, and it generalizes to unseen tasks, including knowledge base QA (KBQA), table-based QA (TableQA), and text-based games, demonstrating real-world grounded reasoning capability.
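The interface-grounded reasoning loop described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the `<call …>` tag format, the interface registry, and the stub interface behavior are all assumptions made for the example.

```python
import re

# Hypothetical registry of predefined interfaces the model may query
# mid-reasoning (names and stub behavior are illustrative assumptions).
INTERFACES = {
    "search": lambda query: f"[top passages for: {query}]",
    "table": lambda query: f"[cells matching: {query}]",
}

# Assumed tag format for an interface call emitted by the model.
CALL_PATTERN = re.compile(r"<call\s+(\w+)>(.*?)</call>", re.DOTALL)

def situated_thinking(generate, prompt, max_rounds=8):
    """Alternate between LLM generation and interface calls.

    `generate` is any callable mapping the current context string to the
    model's next chunk of text; the loop ends when a chunk contains no
    interface call, i.e. the model has finished reasoning.
    """
    context = prompt
    for _ in range(max_rounds):
        chunk = generate(context)
        context += chunk
        match = CALL_PATTERN.search(chunk)
        if match is None:  # no external call: reasoning is complete
            return context
        name, query = match.group(1), match.group(2).strip()
        result = INTERFACES[name](query)  # ground the next step in feedback
        context += f"\n<result>{result}</result>\n"
    return context
```

The design point is that external information enters the reasoning trace as ordinary context, so the model can interleave internal knowledge with interface results as the abstract describes.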


📝 Abstract
Recent advances in large language models (LLMs) demonstrate their impressive reasoning capabilities. However, the reasoning confined to internal parametric space limits LLMs' access to real-time information and understanding of the physical world. To overcome this constraint, we introduce SituatedThinker, a novel framework that enables LLMs to ground their reasoning in real-world contexts through situated thinking, which adaptively combines both internal knowledge and external information with predefined interfaces. By utilizing reinforcement learning, SituatedThinker incentivizes deliberate reasoning with the real world to acquire information and feedback, allowing LLMs to surpass their knowledge boundaries and enhance reasoning. Experimental results demonstrate significant performance improvements on multi-hop question-answering and mathematical reasoning benchmarks. Furthermore, SituatedThinker demonstrates strong performance on unseen tasks, such as KBQA, TableQA, and text-based games, showcasing the generalizable real-world grounded reasoning capability. Our codes are available at https://github.com/jnanliu/SituatedThinker.
Problem

Research questions and friction points this paper is trying to address.

Reasoning confined to LLMs' internal parametric space is static and cannot access real-time information
LLMs lack grounding in, and feedback from, the physical world
Internal knowledge alone bounds what LLMs can reason about correctly
Innovation

Methods, ideas, or system contributions that make the work stand out.

Situated thinking: adaptively combines internal knowledge with external information via predefined interfaces
Reinforcement learning incentivizes deliberate interface use to acquire information and feedback
Generalizes beyond training tasks to KBQA, TableQA, and text-based games
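The reinforcement-learning incentive above can be illustrated with a toy outcome-based reward. The abstract states only that RL is used to incentivize reasoning with the real world; the exact reward design below (exact-match correctness plus a small penalty on excessive interface calls) is a hedged sketch, and the function and parameter names are hypothetical.

```python
def outcome_reward(trajectory, gold_answer, call_budget=4):
    """Toy outcome-based reward for RL fine-tuning (illustrative only).

    Rewards a reasoning trajectory whose final answer matches the
    reference, with a small penalty discouraging excessive interface
    calls beyond an assumed budget.
    """
    answer = trajectory.split("Answer:")[-1].strip()
    correct = float(answer == gold_answer)
    n_calls = trajectory.count("<call")  # assumed call-tag format
    penalty = 0.05 * max(0, n_calls - call_budget)
    return correct - penalty
```

A reward of this shape lets the policy learn when querying an interface is worth the cost, rather than hard-coding when to consult external information.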
Junnan Liu
Department of Data Science and AI, Faculty of Information Technology, Monash University, Australia

Linhao Luo
Monash University
Topics: Large Language Model, Knowledge Graph, Graph Data Mining, Machine Learning, Deep Learning

Thuy-Trang Vu
Monash University
Topics: Natural Language Processing, Machine Learning

Gholamreza Haffari
Department of Data Science and AI, Faculty of Information Technology, Monash University, Australia