GitAgent: Facilitating Autonomous Agent with GitHub by Tool Extension

📅 2023-12-28

🏛️ arXiv.org

📈 Citations: 5

✨ Influential: 1

🤖 AI Summary

Current large language models (LLMs) perform poorly on open-domain tasks that require complex domain-specific computation or dynamic tool invocation, and standardized benchmarks for evaluating them on such tasks are lacking. This paper introduces GitAgent, an LLM-based autonomous agent capable of discovering, understanding, adapting, and integrating tools directly from GitHub repositories via an end-to-end closed-loop pipeline: tool discovery (via issue/PR retrieval), comprehension (repository-level semantic modeling), adaptation (automatic API generation), and integration (hybrid RAG and fine-tuning). Crucially, it is presented as the first approach that lets LLMs learn tools from authentic open-source collaboration data rather than from synthetic or manually annotated examples. Evaluated on 30 real-world, user-specified complex queries, GitAgent achieves an average success rate of 69.4%, substantially outperforming fixed-tool baselines. Additionally, the authors construct the first open-domain benchmark subset specifically designed for evaluating tool-augmented LLMs.

📝 Abstract

While Large Language Models (LLMs) like ChatGPT and GPT-4 have demonstrated exceptional proficiency in natural language processing, their efficacy in addressing complex, multifaceted tasks remains limited. A growing area of research focuses on LLM-based agents equipped with external tools capable of performing diverse tasks. However, existing LLM-based agents support only a limited set of tools, which cannot cover the diverse range of user queries, especially those involving specialized domains. Extending their tool sets autonomously in response to varied user queries remains a challenge for LLM-based agents. Because GitHub hosts a multitude of repositories that can serve as a resource for tools, a promising solution is for LLM-based agents to autonomously integrate GitHub repositories according to user queries, thereby extending their tool set. In this paper, we introduce GitAgent, an agent capable of autonomous tool extension from GitHub. GitAgent follows a four-phase procedure to incorporate repositories, and it can learn from human experience by consulting GitHub Issues/PRs to solve problems encountered during the procedure. Experimental evaluation on 30 user queries demonstrates GitAgent's effectiveness, with an average success rate of 69.4%.
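
The four-phase procedure with a fallback to GitHub Issues/PRs can be pictured roughly as the control loop below. This is a minimal sketch under stated assumptions, not the authors' implementation: the phase names are borrowed from the AI summary on this page, and `run_pipeline`, `RunLog`, `lookup_issue_hint`, and the toy phase functions are all hypothetical names introduced for illustration. No real GitHub API is called; the phases are stubs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Phase labels taken from the AI summary above (an assumption about naming).
PHASES = ["discovery", "comprehension", "adaptation", "integration"]

# A phase takes the user query and an optional hint, and reports success.
Phase = Callable[[str, Optional[str]], bool]


@dataclass
class RunLog:
    """Records which phases completed and which hints were consulted."""
    completed: List[str] = field(default_factory=list)
    hints_used: List[str] = field(default_factory=list)
    success: bool = False


def run_pipeline(
    query: str,
    phases: Dict[str, Phase],
    lookup_issue_hint: Callable[[str, str], Optional[str]],
) -> RunLog:
    """Run the phases in order; on failure, retry once with a hint.

    `lookup_issue_hint` stands in for searching Issues/PRs for a
    human-written workaround (hypothetical helper).
    """
    log = RunLog()
    for name in PHASES:
        step = phases[name]
        ok = step(query, None)
        if not ok:
            hint = lookup_issue_hint(name, query)
            if hint is not None:
                log.hints_used.append(hint)
                ok = step(query, hint)  # single retry using the hint
        if not ok:
            return log  # give up; success stays False
        log.completed.append(name)
    log.success = True
    return log


# Toy usage: the "adaptation" phase only succeeds once a hint is supplied.
def always_ok(query: str, hint: Optional[str]) -> bool:
    return True


def needs_hint(query: str, hint: Optional[str]) -> bool:
    return hint is not None


toy_phases = {
    "discovery": always_ok,
    "comprehension": always_ok,
    "adaptation": needs_hint,
    "integration": always_ok,
}


def toy_lookup(phase: str, query: str) -> Optional[str]:
    return f"workaround for {phase}"


log = run_pipeline("example query", toy_phases, toy_lookup)
print(log.success, log.hints_used)
```

In this toy run only the failing phase triggers a lookup, which mirrors the abstract's description of consulting Issues/PRs when problems arise during the procedure. How the actual system retrieves and applies that experience is not specified on this page.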

Problem

Research questions and friction points this paper is trying to address.

LLMs struggle with complex domain-specific calculations and simulations

Existing approaches lack flexibility for diverse open-domain queries

No dataset evaluates LLMs on tool-requiring open-domain tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Autonomous tool integration from GitHub

Hierarchical framework for specialized agents

Bi-level experience learning mechanism

🔎 Similar Papers

System for systematic literature review using multiple AI agents: Concept and an empirical evaluation