Contributions to multiple high-impact open-source projects and benchmarks widely adopted by industry and academia
Research Experience
Senior Software Engineer at Microsoft, working on the SCOPE query optimizer, the core of Microsoft’s in-house data lake, which processes nearly one million jobs and exabytes of data daily
Co-founder and active maintainer of OpenHands (since 2024), the most popular open-source coding agent platform
Maintainer of Terminal-Bench, a benchmark for AI agents in terminal environments, adopted in the system cards of major LLMs such as Claude 4.5 and GLM 4.5
Maintainer of Harbor, a framework for agent evaluation and rollout used by Terminal-Bench, DataComp, and SkyRL
Core developer of TheAgentCompany, where he designed a reproducible and extensible evaluation framework and led a team of 10+ engineers
Member of the Technical Steering Committee for JanusGraph (since 2019), leading development of the most popular open-source distributed graph database