ReCoQA: A Benchmark for Tool-Augmented and Multi-Step Reasoning in Real Estate Question and Answering

📅 2026-04-20

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

This work addresses the lack of benchmarks for hybrid reasoning workflows that combine database queries with external API calls, a gap that limits the evaluation and development of agents reasoning over multi-source, heterogeneous environments. To bridge this gap, the authors introduce ReCoQA, a large-scale benchmark of 29,270 multi-step question-answering instances in the real estate domain, with machine-verifiable supervision for intermediate reasoning steps: intent labels, SQL queries, and API invocations. They further propose HIRE-Agent, a hierarchical framework built on an understand-plan-execute architecture, which coordinates a front-end parser, a planning supervisor, and execution specialists to integrate evidence from heterogeneous sources. Experiments show that HIRE-Agent serves as a strong baseline on ReCoQA and support the need for hierarchical collaboration on complex real-world tasks.
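To make "machine-verifiable supervision" concrete, the following is a minimal sketch of how an instance carrying an intent label, a SQL query, and API calls could be checked automatically. It is illustrative only: the `Instance` and `ApiCall` field names, the `listings` table, the `nearby_schools` call, and the execution-match scoring are all assumptions, not the ReCoQA schema or its official metrics.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class ApiCall:
    # Hypothetical representation of one external API invocation.
    name: str
    args: dict


@dataclass
class Instance:
    # Hypothetical instance layout; the real benchmark schema may differ.
    question: str
    intent: str
    sql: str
    api_calls: list = field(default_factory=list)


def make_demo_db():
    """Build a tiny in-memory database standing in for a real-estate DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE listings (id INTEGER, district TEXT, price INTEGER)")
    conn.executemany(
        "INSERT INTO listings VALUES (?, ?, ?)",
        [(1, "Downtown", 450000), (2, "Downtown", 300000), (3, "Riverside", 280000)],
    )
    return conn


def sql_matches(conn, predicted_sql, gold_sql):
    """Execution match: both queries return the same rows, ignoring order."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a query that fails to run cannot match
    gold = conn.execute(gold_sql).fetchall()
    return sorted(predicted) == sorted(gold)


def score(conn, gold, pred_intent, pred_sql, pred_calls):
    """Check each intermediate step separately against the gold annotation."""
    return {
        "intent": pred_intent == gold.intent,
        "sql": sql_matches(conn, pred_sql, gold.sql),
        "api": pred_calls == gold.api_calls,  # dataclass equality, order-sensitive
    }
```

Scoring each step separately, rather than only the final answer, is what lets a failure be attributed to intent parsing, query generation, or tool use.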


📝 Abstract

Developing agents capable of navigating fragmented, multi-source information remains challenging, primarily due to the scarcity of benchmarks reflecting hybrid workflows combining database querying with external APIs. To bridge this gap, we introduce ReCoQA, a large-scale benchmark of 29,270 real-estate instances featuring machine-verifiable supervision for intermediate steps, including structured intent labels, SQL queries, and API calls. Complementarily, we propose HIRE-Agent, a hierarchical framework instantiating an understand-plan-execute architecture as a strong baseline. By orchestrating a Front-end parser, a planning Supervisor, and execution Specialists, HIRE-Agent effectively integrates heterogeneous evidence. Extensive experiments demonstrate that HIRE-Agent constitutes a strong baseline and substantiates the necessity of hierarchical collaboration for complex, real-world reasoning tasks.
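The understand-plan-execute flow described in the abstract can be sketched roughly as below. This is a toy illustration under stated assumptions, not the authors' implementation: the keyword-based parser, the fixed plan format, the `listings` table, and the stubbed `nearby_schools` lookup are all hypothetical stand-ins for components that would likely be LLM-driven or backed by real services.

```python
import sqlite3

DISTRICTS = ("Downtown", "Riverside")  # hypothetical vocabulary for the toy parser


def make_demo_db():
    """Tiny in-memory database standing in for a real-estate DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE listings (id INTEGER, district TEXT, price INTEGER)")
    conn.executemany(
        "INSERT INTO listings VALUES (?, ?, ?)",
        [(1, "Downtown", 450000), (2, "Downtown", 300000), (3, "Riverside", 280000)],
    )
    return conn


def front_end_parse(question):
    """Understand: extract a structured intent (keyword rules replace a model)."""
    text = question.lower()
    district = next((d for d in DISTRICTS if d.lower() in text), None)
    return {"district": district, "needs_poi": "school" in text}


def supervisor_plan(parsed):
    """Plan: decide which specialists to call and with what inputs."""
    plan = [(
        "sql",
        "SELECT id, price FROM listings WHERE district = ? ORDER BY price",
        (parsed["district"],),
    )]
    if parsed["needs_poi"]:
        plan.append(("api", "nearby_schools", {"district": parsed["district"]}))
    return plan


def sql_specialist(conn, query, params):
    """Execute: run a parameterized query against the structured source."""
    return conn.execute(query, params).fetchall()


def api_specialist(name, args):
    """Execute: stub for an external API; returns canned data for the demo."""
    canned = {"Downtown": ["Central Primary"], "Riverside": []}
    return canned.get(args["district"], [])


def run(conn, question):
    """Chain the three stages and collect evidence from both sources."""
    evidence = {}
    for kind, target, payload in supervisor_plan(front_end_parse(question)):
        if kind == "sql":
            evidence["listings"] = sql_specialist(conn, target, payload)
        elif kind == "api":
            evidence["schools"] = api_specialist(target, payload)
    return evidence
```

Keeping planning separate from execution means a specialist can be swapped (for example, a different database or API client) without touching the parser or the supervisor.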

Problem

Research questions and friction points this paper is trying to address.

tool-augmented reasoning

multi-step reasoning

real estate QA

benchmark

hybrid workflows

Innovation

Methods, ideas, or system contributions that make the work stand out.

tool-augmented reasoning

multi-step reasoning

hierarchical agent architecture