AI Summary
Building production-grade software engineering agents poses challenges in flexibility, security, and coordination across multiple interaction interfaces. This paper presents an agent architecture that integrates native sandboxed execution, fine-grained lifecycle control, model-agnostic routing across multiple LLMs, and an embedded security analysis module, yielding a modular and composable design. Technically, it unifies REST/WebSocket APIs, sandbox isolation, memory-aware scheduling, and heterogeneous toolchain integration, and it supports interfaces including VS Code, VNC, and the CLI, as well as mainstream LLMs. Evaluated on the SWE-Bench Verified and GAIA benchmarks, the approach shows strong performance on complex software engineering tasks.
Abstract
Agents are now widely used in software development, but building production-ready software engineering agents is a complex task. Deploying software agents effectively requires flexibility in implementation and experimentation, reliable and secure execution, and interfaces for users to interact with agents. In this paper, we present the OpenHands Software Agent SDK, a toolkit for implementing software development agents that satisfy these desiderata. This toolkit is a complete architectural redesign of the agent components of the popular OpenHands framework for software development agents, which has 64k+ GitHub stars. To achieve flexibility, we design a simple interface for implementing agents that requires only a few lines of code in the default case, but is easily extensible to more complex, full-featured agents with features such as custom tools and memory management. For security and reliability, it delivers seamless local-to-remote execution portability and integrated REST/WebSocket services. For interaction with human users, it can connect directly to a variety of interfaces, such as visual workspaces (VS Code, VNC, browser), command-line interfaces, and APIs. Compared with existing SDKs from OpenAI, Claude, and Google, OpenHands uniquely integrates native sandboxed execution, lifecycle control, model-agnostic multi-LLM routing, and built-in security analysis. Empirical results on the SWE-Bench Verified and GAIA benchmarks demonstrate strong performance. Put together, these elements allow the OpenHands Software Agent SDK to provide a practical foundation for prototyping, unlocking new classes of custom applications, and reliably deploying agents at scale.
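To make the phrase "model-agnostic multi-LLM routing" concrete, here is a minimal Python sketch of the general idea. It is not the OpenHands SDK API: the `Router` class, the `Backend` type, the backend names, and the length threshold are all hypothetical, and the backends are stubs rather than real model clients.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical type: a backend is any callable that maps a prompt to a reply.
# Because the router only depends on this signature, it is model-agnostic.
Backend = Callable[[str], str]


@dataclass
class Router:
    """Illustrative router that picks a backend per request (assumed design)."""
    backends: Dict[str, Backend]
    default: str
    long_prompt_chars: int = 2000  # illustrative threshold, not from the paper

    def pick(self, prompt: str) -> str:
        # Send long prompts to a long-context backend when one is registered.
        if len(prompt) > self.long_prompt_chars and "long_context" in self.backends:
            return "long_context"
        return self.default

    def complete(self, prompt: str) -> str:
        return self.backends[self.pick(prompt)](prompt)


def make_stub(name: str) -> Backend:
    """Build a stand-in backend that reports which model handled the prompt."""
    def stub(prompt: str) -> str:
        return f"[{name}] {len(prompt)} chars"
    return stub


router = Router(
    backends={"small": make_stub("small"), "long_context": make_stub("long_context")},
    default="small",
)
```

In this sketch, swapping a model means registering a different callable under the same name; the routing rule itself never refers to a specific provider.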