Pandora: A Code-Driven Large Language Model Agent for Unified Reasoning Across Diverse Structured Knowledge

📅 2025-04-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses unified reasoning over heterogeneous structured knowledge sources (tables, databases, and knowledge graphs) given natural language questions. We propose a code-driven cross-source reasoning framework. The framework uses executable Python code, written against the Pandas API, as a unified knowledge representation that aligns structured semantics with large language model (LLM) priors. It also introduces a memory-based cross-task example retrieval mechanism that transfers reasoning knowledge across source types. Evaluated on four benchmark datasets spanning three types of structured knowledge sources, our approach outperforms existing unified frameworks and is competitive with task-specific models. To our knowledge, this is the first method to combine code-based representation, LLM alignment, and cross-task knowledge transfer for unified structured reasoning.

📝 Abstract

Unified Structured Knowledge Reasoning (USKR) aims to answer natural language questions (NLQs) by using structured sources such as tables, databases, and knowledge graphs in a unified way. Existing USKR methods rely on either task-specific strategies or custom-defined representations. Both struggle to exploit knowledge transfer between different SKR tasks or to align with the priors of LLMs, which limits their performance. This paper proposes a novel USKR framework named Pandora, which uses Python's Pandas API to construct a unified knowledge representation that aligns with LLM pre-training. It employs an LLM to generate textual reasoning steps and executable Python code for each question. Demonstrations are drawn from a memory of training examples that cover various SKR tasks, facilitating knowledge transfer. Extensive experiments on four benchmarks involving three SKR tasks demonstrate that Pandora outperforms existing unified frameworks and competes effectively with task-specific methods.
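To make the idea of a Pandas-based unified representation concrete, here is a minimal sketch. It is not Pandora's actual encoding, which the abstract does not specify. It assumes each source type can be loaded as one or more DataFrames: a table as a single frame, a database as one frame per relation, and a knowledge graph as a frame of triples. All data, column names, and helper functions are hypothetical and invented for illustration.

```python
import pandas as pd

# Toy data, invented for illustration only.

# Table-style source: a single DataFrame.
sales = pd.DataFrame({"city": ["A", "B", "C"], "revenue": [120, 300, 90]})

# Database-style source: one DataFrame per relation (assumed layout).
employees = pd.DataFrame({"name": ["Ann", "Bob", "Cid"], "dept_id": [1, 1, 2]})
departments = pd.DataFrame({"dept_id": [1, 2], "dept": ["R&D", "Sales"]})

# Knowledge-graph-style source: one row per (subject, relation, object) triple.
triples = pd.DataFrame(
    [
        ("Ann", "works_on", "ProjectX"),
        ("Bob", "works_on", "ProjectY"),
        ("ProjectX", "funded_by", "GrantZ"),
    ],
    columns=["subject", "relation", "object"],
)


def top_city(sales_df):
    """Table question: which city has the highest revenue?"""
    return sales_df.loc[sales_df["revenue"].idxmax(), "city"]


def headcount_by_dept(employees_df, departments_df):
    """Database question: how many employees are in each department?"""
    joined = employees_df.merge(departments_df, on="dept_id")
    return joined.groupby("dept")["name"].count().to_dict()


def objects_of(triples_df, subject, relation):
    """Knowledge-graph question: follow one relation from a subject."""
    mask = (triples_df["subject"] == subject) & (triples_df["relation"] == relation)
    return triples_df.loc[mask, "object"].tolist()


table_answer = top_city(sales)
db_answer = headcount_by_dept(employees, departments)
kg_answer = objects_of(triples, "Ann", "works_on")
```

In this sketch all three question types reduce to the same small set of operations (filtering, joining, aggregating), which is the kind of uniformity that a code-based representation is meant to provide.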

Problem

Research questions and friction points this paper is trying to address.

Unified reasoning across diverse structured knowledge sources

Overcoming limitations of task-specific strategies in knowledge transfer

Aligning structured knowledge representation with LLM pre-training

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Python Pandas API for unified knowledge representation

Generates reasoning steps and executable Python code

Leverages diverse training examples for knowledge transfer
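The third point, drawing demonstrations from a shared memory of examples across tasks, can be sketched as follows. This is an illustrative approximation, not the paper's retrieval method: the similarity measure here is plain token-overlap (Jaccard), chosen only because it needs no external libraries, and the `Example` class, the memory contents, and the prompt layout are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Example:
    task: str      # which SKR task the example came from
    question: str  # natural language question
    code: str      # Pandas code that answered it


def tokens(text):
    """Lowercased whitespace tokens as a set."""
    return set(text.lower().split())


def jaccard(a, b):
    """Token-overlap similarity; a stand-in for a real retriever."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(memory, question, k=2):
    """Return the k most similar stored examples, regardless of task."""
    q = tokens(question)
    ranked = sorted(memory, key=lambda ex: jaccard(q, tokens(ex.question)), reverse=True)
    return ranked[:k]


def build_prompt(demos, question):
    """Concatenate demonstrations, then the new question (assumed layout)."""
    parts = [f"# Task: {ex.task}\n# Question: {ex.question}\n{ex.code}" for ex in demos]
    parts.append(f"# Question: {question}")
    return "\n\n".join(parts)


# Hypothetical memory mixing examples from three task types.
memory = [
    Example("table", "which city has the highest revenue",
            "sales.loc[sales['revenue'].idxmax(), 'city']"),
    Example("database", "how many employees work in each department",
            "employees.merge(departments, on='dept_id').groupby('dept')['name'].count()"),
    Example("kg", "which projects does ann work on",
            "triples[(triples.subject == 'Ann') & (triples.relation == 'works_on')]['object']"),
]

query = "which department has the highest number of employees"
demos = retrieve(memory, query, k=2)
prompt = build_prompt(demos, query)
```

Here the database-style query retrieves a table example first and a database example second, so the prompt mixes demonstrations from different tasks. That mixing is the mechanism through which knowledge transfer across tasks could occur.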
