OpenHands: An Open Platform for AI Software Developers as Generalist Agents

📅 2024-07-23

📈 Citations: 81

✨ Influential: 2

🤖 AI Summary

Problem: AI agents do not yet perform end-to-end software development (writing code, interacting with a command line, and browsing the web) as proficiently as human developers. Method: This paper introduces OpenHands (formerly OpenDevin), an open-source platform for building AI agents that work in ways similar to a human developer. Built on large language models (LLMs), it supports the implementation of new agents, safe interaction with sandboxed environments for code execution, coordination between multiple agents, and the incorporation of evaluation benchmarks such as SWE-Bench and WebArena. Contribution/Results: Agents are evaluated on 15 challenging tasks drawn from the currently incorporated benchmarks, covering software engineering and web browsing, among others. Released under the MIT license, the project has received more than 2.1K contributions from over 188 contributors.

📝 Abstract

Software is one of the most powerful tools that we humans have at our disposal; it allows a skilled programmer to interact with the world in complex and profound ways. At the same time, thanks to improvements in large language models (LLMs), there has also been rapid development of AI agents that interact with and effect change in their surrounding environments. In this paper, we introduce OpenHands (f.k.a. OpenDevin), a platform for the development of powerful and flexible AI agents that interact with the world in ways similar to those of a human developer: by writing code, interacting with a command line, and browsing the web. We describe how the platform allows for the implementation of new agents, safe interaction with sandboxed environments for code execution, coordination between multiple agents, and incorporation of evaluation benchmarks. Based on our currently incorporated benchmarks, we perform an evaluation of agents over 15 challenging tasks, including software engineering (e.g., SWE-Bench) and web browsing (e.g., WebArena), among others. Released under the permissive MIT license, OpenHands is a community project spanning academia and industry with more than 2.1K contributions from over 188 contributors.
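
To make the abstract's description more concrete, the following is a minimal Python sketch of the general pattern it describes: an agent emits actions, a runtime executes them in an isolated workspace, and the resulting observations are fed back to the agent. This is not the OpenHands API. Every name here (CmdRunAction, FinishAction, Observation, TempDirRuntime, ScriptedAgent, run_episode, demo) is hypothetical and chosen for illustration only. The "sandbox" is just a temporary directory with a timeout, which is an assumption made to keep the example self-contained and is not real isolation.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass


@dataclass
class CmdRunAction:
    """Hypothetical action: run a command (argv list) in the workspace."""
    argv: list


@dataclass
class FinishAction:
    """Hypothetical action: the agent declares the task finished."""
    message: str


@dataclass
class Observation:
    """Result of executing an action, fed back to the agent."""
    exit_code: int
    output: str


class TempDirRuntime:
    """Stand-in for a sandbox: runs commands in a throwaway directory.

    Assumption for illustration only: a temp dir plus a timeout is NOT
    real isolation; a real platform would use something much stronger.
    """

    def __init__(self, timeout=10.0):
        self._dir = tempfile.TemporaryDirectory()
        self.timeout = timeout

    def run(self, action):
        proc = subprocess.run(
            action.argv,
            cwd=self._dir.name,
            capture_output=True,
            text=True,
            timeout=self.timeout,
        )
        return Observation(proc.returncode, proc.stdout + proc.stderr)

    def close(self):
        self._dir.cleanup()


class ScriptedAgent:
    """Replays a fixed plan; a real agent would query an LLM with the history."""

    def __init__(self, plan):
        self._plan = list(plan)

    def step(self, history):
        if self._plan:
            return self._plan.pop(0)
        return FinishAction("plan exhausted")


def run_episode(agent, runtime, max_steps=10):
    """Alternate agent steps and runtime execution until the agent finishes."""
    history = []
    for _ in range(max_steps):
        action = agent.step(history)
        if isinstance(action, FinishAction):
            history.append((action, None))
            break
        history.append((action, runtime.run(action)))
    return history


def demo():
    """Write a file in the workspace, then read it back in a second command."""
    runtime = TempDirRuntime()
    try:
        agent = ScriptedAgent([
            CmdRunAction([sys.executable, "-c", "open('hello.txt', 'w').write('hi')"]),
            CmdRunAction([sys.executable, "-c", "print(open('hello.txt').read())"]),
        ])
        return run_episode(agent, runtime)
    finally:
        runtime.close()
```

Calling demo() returns the list of (action, observation) pairs for the episode. Keeping the agent and the runtime behind separate interfaces is one way to let new agents be swapped in without touching the execution layer, which is the kind of separation the abstract alludes to.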

Problem

Research questions and friction points this paper is trying to address.

How to build a platform for developing flexible AI agents

How to execute agent-written code safely in sandboxed environments

How to evaluate agents on software engineering and web browsing tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Platform for AI agents writing code

Safe sandboxed code execution

Multi-agent coordination system
