SafeHarness: Lifecycle-Integrated Security Architecture for LLM-based Agent Deployment

📅 2026-04-15
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses security weaknesses in the execution harness of large language model (LLM) agents, whose structural centrality makes it a high-risk attack surface. Existing defenses lack awareness of harness-internal state and do not coordinate across the phases of agent operation. To close this gap, the paper proposes a security architecture aligned with the agent's lifecycle that integrates four defense layers: adversarial context filtering at input, tiered causal verification during decision-making, privilege-separated tool control at execution, and safe rollback with adaptive degradation during state updates. Cross-layer mechanisms coordinate these layers, escalating defenses when sustained anomalies are detected. Compared to an unprotected baseline, the approach reduces the unsafe behavior rate (UBR) by approximately 38% and the attack success rate (ASR) by approximately 42% on average across the evaluated attack scenarios, while preserving core task utility.

📝 Abstract
The performance of large language model (LLM) agents depends critically on the execution harness, the system layer that orchestrates tool use, context management, and state persistence. Yet this same architectural centrality makes the harness a high-value attack surface: a single compromise at the harness level can cascade through the entire execution pipeline. We observe that existing security approaches suffer from a structural mismatch, leaving them blind to harness-internal state and unable to coordinate across the different phases of agent operation. In this paper, we introduce SafeHarness, a security architecture in which four defense layers are woven directly into the agent lifecycle to address these limitations: adversarial context filtering at input processing, tiered causal verification at decision making, privilege-separated tool control at action execution, and safe rollback with adaptive degradation at state update. Cross-layer mechanisms tie these layers together, escalating verification rigor, triggering rollbacks, and tightening tool privileges whenever sustained anomalies are detected. We evaluate SafeHarness on benchmark datasets across diverse harness configurations, comparing against four security baselines under five attack scenarios spanning six threat categories. Compared to the unprotected baseline, SafeHarness reduces the unsafe behavior rate (UBR) by approximately 38% and the attack success rate (ASR) by approximately 42% on average, while preserving core task utility.
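The cross-layer coordination described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `EscalationController`, the threshold of three consecutive anomalies, and the integer tier and privilege levels are all assumptions chosen for illustration. The sketch only shows the general idea that a sustained run of anomalies escalates verification rigor, tightens tool privileges, and requests a rollback.

```python
from dataclasses import dataclass


@dataclass
class EscalationController:
    """Hypothetical cross-layer coordinator (all names and values are assumptions)."""

    sustain_threshold: int = 3   # consecutive anomalies treated as "sustained"
    max_tier: int = 2            # strictest verification tier
    verification_tier: int = 0   # 0 = cheapest checks, higher = stricter
    privilege_level: int = 2     # 2 = full tool access, 0 = most restricted
    consecutive_anomalies: int = 0
    rollback_requested: bool = False

    def report(self, anomalous: bool) -> None:
        """Record one observation from any defense layer."""
        if not anomalous:
            # A clean observation breaks the run of anomalies.
            self.consecutive_anomalies = 0
            return
        self.consecutive_anomalies += 1
        if self.consecutive_anomalies >= self.sustain_threshold:
            # Sustained anomaly: escalate all three responses together.
            self.verification_tier = min(self.verification_tier + 1, self.max_tier)
            self.privilege_level = max(self.privilege_level - 1, 0)
            self.rollback_requested = True
            self.consecutive_anomalies = 0


controller = EscalationController()
for flag in (True, True, True):
    controller.report(flag)
print(controller.verification_tier, controller.privilege_level, controller.rollback_requested)
```

A single shared controller is one plausible way to let the input, decision, execution, and state-update layers react to each other's signals; the paper may coordinate them differently.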
Problem

Research questions and friction points this paper is trying to address.

LLM-based agents
execution harness
security architecture
attack surface
lifecycle coordination
Innovation

Methods, ideas, or system contributions that make the work stand out.

lifecycle-integrated security
adversarial context filtering
causal verification
privilege-separated tool control
safe rollback with adaptive degradation
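As a rough illustration of the last item, the sketch below shows one way a state update could be checkpointed and rolled back when a safety check fails. It is a generic checkpoint-and-restore pattern, not the paper's mechanism: `CheckpointedState`, `guarded_update`, and the `is_safe` predicate are hypothetical names, and adaptive degradation is not modeled here.

```python
import copy


class CheckpointedState:
    """Hypothetical agent state store with a stack of checkpoints."""

    def __init__(self, initial: dict):
        self._state = copy.deepcopy(initial)
        self._checkpoints = []

    def checkpoint(self) -> None:
        # Save a deep copy so later mutations cannot alter the saved state.
        self._checkpoints.append(copy.deepcopy(self._state))

    def commit(self) -> None:
        # Discard the most recent checkpoint once the update is accepted.
        if self._checkpoints:
            self._checkpoints.pop()

    def rollback(self) -> None:
        # Restore the most recent checkpoint, if any.
        if self._checkpoints:
            self._state = self._checkpoints.pop()

    def update(self, key, value) -> None:
        self._state[key] = value

    def snapshot(self) -> dict:
        return copy.deepcopy(self._state)


def guarded_update(state: CheckpointedState, key, value, is_safe) -> bool:
    """Apply an update, then keep it only if `is_safe` accepts the new state."""
    state.checkpoint()
    state.update(key, value)
    if is_safe(state.snapshot()):
        state.commit()
        return True
    state.rollback()
    return False


state = CheckpointedState({"goal": "book flight"})
accepted = guarded_update(
    state, "goal", "exfiltrate credentials",
    is_safe=lambda s: "exfiltrate" not in s["goal"],  # toy safety predicate
)
print(accepted, state.snapshot())
```

In a fuller design, a rejected update would also feed back into the other layers, for example by counting as an anomaly.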
Authors
Xixun Lin
Institute of Information Engineering, Chinese Academy of Sciences
Data mining, Graph representation learning, Large language model
Yang Liu
University of Chinese Academy of Sciences
Self-supervised Learning, Video Analysis
Yancheng Chen
Academy of Mathematics and Systems Science, Chinese Academy of Sciences
Yongxuan Wu
Institute of Information Engineering, Chinese Academy of Sciences
Yucheng Ning
Institute of Information Engineering, Chinese Academy of Sciences
Yilong Liu
Institute of Information Engineering, Chinese Academy of Sciences
Nan Sun
University of New South Wales
Cybersecurity, Artificial Intelligence Applications
Shun Zhang
Institute of Applied Physics and Computational Mathematics
Bin Chong
Peking University
Chuan Zhou
Academy of Mathematics and Systems Science, Chinese Academy of Sciences
Social Computing, Graph Computing, Graph Machine Learning
Yanan Cao
Institute of Information Engineering, Chinese Academy of Sciences
Li Guo
Institute of Information Engineering, Chinese Academy of Sciences