🤖 AI Summary
This work addresses the vulnerability of self-hosted agents to host-level attacks, such as inducement via malicious messages, prompt injection, or control-flow manipulation, that exploit the agent's direct access to host resources and can trigger unsafe operations. To mitigate these risks, the authors propose an operation-centric, risk-based confinement model that retains standard functionality within a restricted regular execution environment (REE) while offloading critical security decisions and execution control to a trusted operation plane rooted in a hardware-based trusted execution environment (TEE). The study integrates cloud-native TEE technology (Intel TDX) into a self-hosted agent architecture, embedding remotely attestable trusted components into OpenClaw to enable context-aware operation authorization and auditable evidence generation. According to the evaluation, this approach blocks unsafe or policy-disallowed actions before execution while preserving legitimate functionality, with performance overhead that depends on the deployment.
📝 Abstract
Self-hosted computer-use agents (SHCUAs), such as OpenClaw, combine natural-language interaction with direct access to host-side resources, including browsers, files, scripts, system commands, and external communication channels. While useful for automating real tasks, this capability also creates a host-level abuse surface: a legitimately deployed agent may be steered toward unsafe operations through malicious messages, indirect prompt injection, unsafe skills, or tampering along the host-side control path. We argue that such risks cannot be addressed by ad hoc blocking rules alone, because the security criticality of an operation depends jointly on its action type, target object, execution context, and potential effect.
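To make the last point concrete, the following is a minimal sketch of why a rule keyed on action type alone falls short. Everything in it is an illustrative assumption rather than the paper's actual policy: the `Operation` fields simply mirror the four factors named above, and the action names, path prefixes, and scoring thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    """Hypothetical operation record with the four factors from the abstract."""
    action: str   # action type, e.g. "file.read" or "shell.exec"
    target: str   # target object, e.g. a file path
    context: str  # execution context: "user_request" or "untrusted_content"
    effect: str   # potential effect: "read_only", "modify", or "exfiltrate"


# Ad hoc rule: block by action type only (illustrative blocklist).
BLOCKED_ACTIONS = {"shell.exec"}

# Illustrative set of sensitive locations; not taken from the paper.
SENSITIVE_PREFIXES = ("/home/user/.ssh/", "/etc/")


def naive_rule(op: Operation) -> bool:
    """Return True if the action-type blocklist would allow the operation."""
    return op.action not in BLOCKED_ACTIONS


def joint_risk(op: Operation) -> str:
    """Score the operation on all four factors together (toy weights)."""
    score = 0
    if op.action == "shell.exec":
        score += 1
    if op.target.startswith(SENSITIVE_PREFIXES):
        score += 2
    if op.context == "untrusted_content":
        score += 1
    if op.effect in ("modify", "exfiltrate"):
        score += 1
    if score >= 3:
        return "high"
    return "medium" if score == 2 else "low"


benign = Operation("file.read", "/home/user/notes.txt", "user_request", "read_only")
risky = Operation("file.read", "/home/user/.ssh/id_rsa", "untrusted_content", "exfiltrate")

for op in (benign, risky):
    print(op.target, "| naive allows:", naive_rule(op), "| joint risk:", joint_risk(op))
```

Both operations share the action type `file.read`, so the blocklist treats them identically, while the joint score separates them once target, context, and effect are taken into account.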
This paper presents an operation-centric model for risk-based confinement of SHCUA operations. The proposed design keeps ordinary functionality on the constrained REE path, while protecting security-critical classification, authorization, binding, evidence generation, and selected execution-control decisions inside a cloud-native TEE-backed trusted operation plane. We instantiate the architecture on OpenClaw using Intel TDX as the primary trusted backend, with remote terminal-side trusted components verifying TDX-audited commands before constrained local execution. The evaluation shows that the design can block unsafe or policy-disallowed operations before execution, preserve ordinary functionality for allowed workloads, and provide auditable evidence with deployment-dependent overhead.
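The authorize, bind, verify, and record flow described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: an HMAC over a shared demo key stands in for whatever TDX-attested signing mechanism the real system uses, and the function names (`authorize`, `verify_and_gate`, `append_evidence`) and the hash-chained log format are hypothetical.

```python
import hashlib
import hmac
import json

# Assumption: stand-in for a key that would only be released to the trusted
# plane after remote attestation. Never hard-code keys in a real deployment.
KEY = b"demo-key-not-for-production"


def op_digest(op: dict) -> str:
    """Canonical SHA-256 digest of an operation, used to bind a decision to it."""
    return hashlib.sha256(json.dumps(op, sort_keys=True).encode()).hexdigest()


def authorize(op: dict, allowed_actions: set) -> dict:
    """Trusted-plane side: decide, then bind the decision to this exact operation."""
    decision = "allow" if op["action"] in allowed_actions else "deny"
    digest = op_digest(op)
    tag = hmac.new(KEY, f"{digest}|{decision}".encode(), hashlib.sha256).hexdigest()
    return {"digest": digest, "decision": decision, "tag": tag}


def verify_and_gate(op: dict, token: dict) -> bool:
    """Terminal side: execute only if the token is authentic, matches op, and allows."""
    payload = f"{op_digest(op)}|{token['decision']}".encode()
    expected = hmac.new(KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token["tag"]) and token["decision"] == "allow"


def append_evidence(log: list, token: dict) -> None:
    """Append the decision to a hash-chained log so later tampering is detectable."""
    prev = log[-1]["entry_hash"] if log else "0" * 64
    material = (prev + token["digest"] + token["decision"]).encode()
    log.append({**token, "prev": prev, "entry_hash": hashlib.sha256(material).hexdigest()})


evidence_log: list = []
op = {"action": "browser.open", "target": "https://example.com"}
token = authorize(op, allowed_actions={"browser.open"})
append_evidence(evidence_log, token)

tampered = dict(op, target="https://evil.example")
print("original allowed:", verify_and_gate(op, token))
print("tampered allowed:", verify_and_gate(tampered, token))
```

Because the tag covers the operation digest, a command altered on the host-side control path after authorization no longer verifies, which is the property the binding step is meant to provide.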