AgentMark: Utility-Preserving Behavioral Watermarking for Agents

📅 2026-01-05
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of watermarking high-level planning behaviors—such as tool selection and subgoal decisions—of large language model agents in multi-step tasks, a capability lacking in existing content watermarking methods. To reconcile intellectual property protection with task utility under black-box deployment, we propose AgentMark, a novel framework that enables recoverable, multi-bit watermark embedding directly at the planning decision layer. By modeling the agent’s behavioral distribution and employing a distribution-preserving conditional sampling strategy, AgentMark supports robust watermark extraction and behavioral attribution in black-box API settings without compromising long-term task performance. Extensive experiments across embodied reasoning, tool-use, and social interaction scenarios demonstrate its high capacity, strong robustness, and consistent utility preservation. The code is publicly released.

📝 Abstract
LLM-based agents are increasingly deployed to autonomously solve complex tasks, raising urgent needs for IP protection and regulatory provenance. While content watermarking effectively attributes LLM-generated outputs, it fails to directly identify the high-level planning behaviors (e.g., tool and subgoal choices) that govern multi-step execution. Critically, watermarking at the planning-behavior layer faces unique challenges: minor distributional deviations in decision-making can compound during long-term agent operation, degrading utility, and many agents operate as black boxes that are difficult to intervene in directly. To bridge this gap, we propose AgentMark, a behavioral watermarking framework that embeds multi-bit identifiers into planning decisions while preserving utility. It operates by eliciting an explicit behavior distribution from the agent and applying distribution-preserving conditional sampling, enabling deployment under black-box APIs while remaining compatible with action-layer content watermarking. Experiments across embodied, tool-use, and social environments demonstrate practical multi-bit capacity, robust recovery from partial logs, and utility preservation. The code is available at https://github.com/Tooooa/AgentMark.
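The abstract's idea of embedding a multi-bit identifier through distribution-preserving sampling can be illustrated with a small sketch. This is not the AgentMark algorithm from the paper or its repository. It is a generic keyed-shift scheme chosen for illustration, and every name in it (`keyed_uniform`, `embed_step`, `decode`, the demo key, and the four-action distribution) is a hypothetical assumption. A keyed pseudorandom number is shifted by the message symbol before inverse-CDF sampling. Because the shifted value is still uniform on [0, 1), each action keeps the probability the agent assigned to it, while a key holder can recover the symbol by voting over logged steps.

```python
import hashlib
import hmac
from bisect import bisect_right
from itertools import accumulate


def keyed_uniform(key: bytes, step: int) -> float:
    """Pseudorandom value in [0, 1) derived from a secret key and the step index."""
    digest = hmac.new(key, str(step).encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def pick(probs, u):
    """Inverse-CDF sampling: index of the action whose CDF interval contains u."""
    cdf = list(accumulate(probs))
    # Clamp guards against the last CDF entry rounding slightly below 1.0.
    return min(bisect_right(cdf, u), len(probs) - 1)


def embed_step(probs, key, step, symbol, n_symbols):
    """Choose an action for this step while encoding `symbol` (0..n_symbols-1).

    The shift by symbol / n_symbols is taken modulo 1, so the sampling
    variable stays uniform and the action is still drawn according to `probs`.
    """
    u = (keyed_uniform(key, step) + symbol / n_symbols) % 1.0
    return pick(probs, u)


def decode(log, key, n_symbols):
    """Recover the symbol from (step, probs, action) records by majority vote."""
    votes = [0] * n_symbols
    for step, probs, action in log:
        base = keyed_uniform(key, step)
        for s in range(n_symbols):
            if pick(probs, (base + s / n_symbols) % 1.0) == action:
                votes[s] += 1
    return max(range(n_symbols), key=votes.__getitem__)


if __name__ == "__main__":
    key = b"demo-key"                # hypothetical secret key
    probs = [0.4, 0.3, 0.2, 0.1]     # hypothetical elicited behavior distribution
    log = [(t, probs, embed_step(probs, key, t, 2, 4)) for t in range(40)]
    print("decoded from full log:   ", decode(log, key, 4))
    print("decoded from partial log:", decode(log[::2], key, 4))
```

The embedded symbol matches the logged action at every step, whereas any other symbol matches only by chance, so the vote separates them even when some steps are missing from the log. The sketch assumes the decoder can see the per-step distribution, which is an illustrative simplification.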
Problem

Research questions and friction points this paper is trying to address.

behavioral watermarking
LLM-based agents
planning behavior
utility preservation
IP protection
Innovation

Methods, ideas, or system contributions that make the work stand out.

behavioral watermarking
utility preservation
black-box agents
distribution-preserving sampling
multi-bit identifier
Authors
Kaibo Huang
Beijing University of Posts and Telecommunications
Jin Tan
Principal Engineer, National Renewable Energy Laboratory
power systems stability and operation, renewables integrations
Yukun Wei
Beijing University of Posts and Telecommunications
Wanling Li
Beijing University of Posts and Telecommunications
Zipei Zhang
Beijing University of Posts and Telecommunications
Hui Tian
Huaqiao University
Zhongliang Yang
Associate Professor, Beijing University of Posts and Telecommunications
AI Security, FinTech
Linna Zhou
Beijing University of Posts and Telecommunications