LOGIGEN: Logic-Driven Generation of Verifiable Agentic Tasks

📅 2026-02-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the scarcity of logically rigorous, verifiable training data for LLM agents operating in complex, stateful environments. To this end, the authors propose LOGIGEN, a framework that synthesizes high-fidelity task trajectories through a collaborative, logic-driven mechanism built on three agents. By integrating hard-compiled policy grounding, forward logical generation, and deterministic state verification, LOGIGEN ensures causal consistency and adherence to real-world constraints. Using 20,000 multi-domain trajectories generated by this approach, the LOGIGEN-32B(RL) model (fine-tuned via supervised learning and reinforced with state-based rewards) achieves a 79.5% task success rate on τ²-Bench, substantially outperforming the base model (40.7%). These results demonstrate the effectiveness of LOGIGEN for verifiable agent training.

📝 Abstract
The evolution of Large Language Models (LLMs) from static instruction-followers to autonomous agents necessitates operating within complex, stateful environments to achieve precise state-transition objectives. However, this paradigm is bottlenecked by data scarcity, as existing tool-centric reverse-synthesis pipelines fail to capture the rigorous logic of real-world applications. We introduce LOGIGEN, a logic-driven framework that synthesizes verifiable training data based on three core pillars: Hard-Compiled Policy Grounding, Logic-Driven Forward Synthesis, and Deterministic State Verification. Specifically, a Triple-Agent Orchestration is employed: the Architect compiles natural-language policy into database constraints to enforce hard rules; the Set Designer initializes boundary-adjacent states to trigger critical policy conflicts; and the Explorer searches this environment to discover causal solution paths. This framework yields a dataset of 20,000 complex tasks across 8 domains, where validity is strictly guaranteed by checking exact state equivalence. Furthermore, we propose a verification-based training protocol where Supervised Fine-Tuning (SFT) on verifiable trajectories establishes compliance with hard-compiled policy, while Reinforcement Learning (RL) guided by dense state rewards refines long-horizon goal achievement. On τ²-Bench, LOGIGEN-32B(RL) achieves a 79.5% success rate, substantially outperforming the base model (40.7%). These results demonstrate that logic-driven synthesis combined with verification-based training effectively constructs the causally valid trajectories needed for next-generation agents.
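The Deterministic State Verification pillar, accepting a synthesized trajectory only when deterministically replaying its tool calls reproduces the goal state exactly, can be sketched as below. Note that the paper's verifier is not public: the state shapes, tool names, and helper functions here (`canonical`, `verify_trajectory`, `apply_tool_call`) are illustrative assumptions, not the authors' implementation.

```python
import copy

def canonical(state):
    """Recursively normalize a state (nested dicts/lists/scalars) so that
    logically equivalent states compare equal regardless of key order."""
    if isinstance(state, dict):
        return tuple(sorted((k, canonical(v)) for k, v in state.items()))
    if isinstance(state, list):
        return tuple(canonical(v) for v in state)
    return state

def verify_trajectory(initial_state, apply_tool_call, trajectory, goal_state):
    """Deterministically replay a trajectory's tool calls from the initial
    state; accept only if the final state exactly matches the goal state."""
    state = initial_state
    for call in trajectory:
        state = apply_tool_call(state, call)  # pure, deterministic transition
    return canonical(state) == canonical(goal_state)

# Toy environment (assumed for illustration): one order a tool can cancel.
def apply_tool_call(state, call):
    s = copy.deepcopy(state)  # keep transitions side-effect free
    if call["tool"] == "cancel_order":
        for order in s["orders"]:
            if order["id"] == call["order_id"]:
                order["status"] = "cancelled"
    return s

init = {"orders": [{"id": 1, "status": "pending"}]}
goal = {"orders": [{"id": 1, "status": "cancelled"}]}
ok = verify_trajectory(init, apply_tool_call,
                       [{"tool": "cancel_order", "order_id": 1}], goal)
print(ok)  # True: the replayed trajectory reaches exactly the goal state
```

The exact-equivalence check is what makes validity deterministic: any trajectory whose replayed final state differs from the goal in even one field is rejected, with no model-based judging involved.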
Problem

Research questions and friction points this paper is trying to address.

data scarcity
stateful environments
logic-driven synthesis
verifiable tasks
autonomous agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

Logic-Driven Synthesis
Verifiable Agentic Tasks
Hard-Compiled Policy Grounding
Deterministic State Verification
Triple-Agent Orchestration
Yucheng Zeng
Baidu Inc.
Weipeng Lu
Baidu Inc.
Linyun Liu
Baidu Inc.
Shupeng Li
Baidu Inc.
Zitian Qu
Tsinghua University
Chenghao Zhu
University of Electronic Science and Technology of China
Shaofei Li
Peking University
Zhengdong Tan
Baidu Inc.
Mengyue Liu
Baidu Inc.
Haotian Zhao
Baidu Inc.
Zhe Zhou
Tsinghua University
Jianmin Wu
Baidu Inc.