🤖 AI Summary
This work addresses the challenge of multi-step reasoning for large language models on complex, long-horizon analytical tasks over unstructured tables, such as those with hierarchical layouts, bidirectional headers, or non-standard formatting. The authors propose a closed-loop agentic decision framework that maps natural language queries to an operation-level search space via a hierarchical meta graph. Strategic planning and low-level execution are decoupled yet coordinated through an expectation-aware selection policy that prioritizes high-utility execution paths and a siamese structured memory that combines parameterized updates with abstracted textual summaries. Experiments on multiple challenging unstructured table benchmarks support the method's effectiveness on long-horizon reasoning and highlight the importance of the decoupled design.
📝 Abstract
Large language models often struggle with complex, long-horizon analytical tasks over unstructured tables, which typically feature hierarchical and bidirectional headers and non-canonical layouts. We formalize this challenge as Deep Tabular Research (DTR), which requires multi-step reasoning over interdependent table regions. To address DTR, we propose a novel agentic framework that treats tabular reasoning as a closed-loop decision-making process. We couple query and table comprehension to support both path decision-making and operation execution. Specifically, (i) DTR first constructs a hierarchical meta graph to capture bidirectional semantics, mapping natural language queries into an operation-level search space; (ii) to navigate this space, we introduce an expectation-aware selection policy that prioritizes high-utility execution paths; and (iii) crucially, historical execution outcomes are synthesized into a siamese structured memory, i.e., parameterized updates and abstracted texts, enabling continual refinement. Extensive experiments on challenging unstructured tabular benchmarks verify the effectiveness of the framework and highlight the necessity of separating strategic planning from low-level execution for long-horizon tabular reasoning.
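The abstract does not give the scoring rule or the memory layout, so the following is only an illustrative sketch of how steps (ii) and (iii) could fit together, not the authors' implementation. It assumes that a candidate execution path is a tuple of operation names, that each executed path yields a scalar reward, and that "expectation-aware" can be approximated by a running mean reward plus a UCB-style exploration bonus. The names `PathMemory`, `expected_utility`, `select_path`, and the operation labels are all hypothetical.

```python
import math
from collections import defaultdict


class PathMemory:
    """Hypothetical two-part memory: numeric statistics plus text notes."""

    def __init__(self):
        self.counts = defaultdict(int)    # how often each operation was executed
        self.means = defaultdict(float)   # running mean reward per operation
        self.notes = []                   # abstracted textual summaries of past runs

    def update(self, path, reward, note):
        # Parameterized update: incremental mean for every operation on the path.
        for op in path:
            self.counts[op] += 1
            self.means[op] += (reward - self.means[op]) / self.counts[op]
        # Textual update: keep a short summary of what happened.
        self.notes.append(note)


def expected_utility(path, memory, bonus=1.0):
    """Average per-operation score: mean reward plus an exploration bonus (assumed form)."""
    total = sum(memory.counts.values()) + 1
    score = 0.0
    for op in path:
        n = memory.counts.get(op, 0)
        mean = memory.means.get(op, 0.0)
        score += mean + bonus * math.sqrt(math.log(total + 1) / (n + 1))
    return score / len(path)


def select_path(candidates, memory, bonus=1.0):
    """Pick the candidate path with the highest estimated utility."""
    return max(candidates, key=lambda p: expected_utility(p, memory, bonus))
```

In a closed loop, an agent would call `select_path` on the candidate paths proposed for a query, execute the chosen path, and feed the outcome back through `PathMemory.update` before the next step. Setting `bonus=0.0` makes the selection purely greedy with respect to past rewards.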