Prometheus: Unified Knowledge Graphs for Issue Resolution in Multilingual Codebases

📅 2025-07-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Existing language-model agents (e.g., SWE-agent, OpenHands) mostly target Python-only issues and depend on the pre-constructed SWE-bench containers with reproduced issues, which limits their use on real-world, multi-language repositories. Method: Prometheus is a multi-agent issue-resolution system that converts an entire repository into a unified knowledge graph of typed nodes, covering files, abstract syntax trees (ASTs), and natural-language text, connected by five general edge types. The graph is persisted in Neo4j and guides context retrieval, with DeepSeek-V3 as the underlying language model. Contribution/Results: Prometheus resolves 28.67% of issues on SWE-bench Lite and 13.7% on SWE-bench Multilingual, at an average API cost of $0.23 and $0.38 per issue, respectively. It resolves 10 unique issues not addressed by prior work, is reported as the first system to demonstrate effectiveness across seven programming languages, and also resolves real-world GitHub issues in the LangChain and OpenHands repositories. The implementation is open-sourced.
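The resolve rates and average costs above can be combined into a rough cost per resolved issue. The sketch below is only an illustration under one assumption that the summary does not state: that the reported average API cost is taken over all attempted issues, resolved or not. The helper name `cost_per_resolved` is hypothetical.

```python
# Figures as reported: (resolve rate, average API cost per issue in USD).
results = {
    "SWE-bench Lite": (0.2867, 0.23),
    "SWE-bench Multilingual": (0.137, 0.38),
}


def cost_per_resolved(resolve_rate: float, avg_cost: float) -> float:
    """Average spend per successfully resolved issue.

    Assumption: avg_cost is averaged over every attempted issue,
    so dividing by the resolve rate spreads the spend over successes only.
    """
    return avg_cost / resolve_rate


for name, (rate, cost) in results.items():
    print(f"{name}: ${cost_per_resolved(rate, cost):.2f} per resolved issue")
```

Under that assumption, a resolved issue costs roughly $0.80 on SWE-bench Lite and $2.77 on SWE-bench Multilingual.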

📝 Abstract
Language model (LM) agents, such as SWE-agent and OpenHands, have made progress toward automated issue resolution. However, existing approaches are often limited to Python-only issues and rely on pre-constructed containers in SWE-bench with reproduced issues, restricting their applicability to real-world, multi-language repositories. We present Prometheus, designed to resolve real-world issues beyond benchmark settings. Prometheus is a multi-agent system that transforms an entire code repository into a unified knowledge graph to guide context retrieval for issue resolution. Prometheus encodes files, abstract syntax trees, and natural language text into a graph of typed nodes and five general edge types to support multiple programming languages. Prometheus uses Neo4j for graph persistence, enabling scalable and structured reasoning over large codebases. Integrated with the DeepSeek-V3 model, Prometheus resolves 28.67% and 13.7% of issues on SWE-bench Lite and SWE-bench Multilingual, respectively, with an average API cost of $0.23 and $0.38 per issue. Prometheus resolves 10 unique issues not addressed by prior work and is the first to demonstrate effectiveness across seven programming languages. Moreover, it shows the ability to resolve real-world GitHub issues in the LangChain and OpenHands repositories. We have open-sourced Prometheus at: https://github.com/Pantheon-temple/Prometheus
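To make the graph description above more concrete, here is a minimal in-memory sketch of a typed graph that links a file to its AST nodes. It is not the Prometheus implementation: the edge-type names, the `Graph` class, and `add_python_file` are all hypothetical (the abstract says only that there are five general edge types), Python's standard `ast` module stands in for a multi-language parser, and a plain Python structure stands in for Neo4j.

```python
import ast
from dataclasses import dataclass, field

# Hypothetical placeholder names: the abstract does not name the five edge types.
EDGE_TYPES = {"HAS_FILE", "HAS_AST", "PARENT_OF", "HAS_TEXT", "NEXT_CHUNK"}


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)  # node id -> (label, properties)
    edges: list = field(default_factory=list)  # (source id, edge type, target id)

    def add_node(self, label, **props):
        node_id = len(self.nodes)
        self.nodes[node_id] = (label, props)
        return node_id

    def add_edge(self, src, edge_type, dst):
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        self.edges.append((src, edge_type, dst))

    def neighbors(self, src, edge_type):
        return [d for s, t, d in self.edges if s == src and t == edge_type]


def add_python_file(graph, dir_id, path, source):
    """Add a file node plus one node per AST node, linked by typed edges."""
    file_id = graph.add_node("File", path=path)
    graph.add_edge(dir_id, "HAS_FILE", file_id)

    def visit(py_node, parent_id, edge_type):
        node_id = graph.add_node(
            "ASTNode",
            kind=type(py_node).__name__,
            name=getattr(py_node, "name", None),  # set for defs and classes
        )
        graph.add_edge(parent_id, edge_type, node_id)
        for child in ast.iter_child_nodes(py_node):
            visit(child, node_id, "PARENT_OF")

    visit(ast.parse(source), file_id, "HAS_AST")
    return file_id


graph = Graph()
root = graph.add_node("File", path=".")
file_id = add_python_file(graph, root, "calc.py", "def add(a, b):\n    return a + b\n")

# Structured retrieval: find every function definition stored in the graph.
functions = [
    props["name"]
    for label, props in graph.nodes.values()
    if label == "ASTNode" and props["kind"] == "FunctionDef"
]
print(functions)
```

Because the node and edge types are generic rather than language-specific, the same lookup would work for any language whose parser can emit a tree, which is the property the abstract attributes to the unified graph.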

Problem

Research questions and friction points this paper is trying to address.

Existing LM agents are largely limited to Python-only issues
Existing approaches rely on pre-constructed SWE-bench containers with reproduced issues
Context retrieval must scale to large, real-world, multi-language repositories

Innovation

Methods, ideas, or system contributions that make the work stand out.

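Multi-agent system that turns an entire repository into a unified knowledge graph to guide context retrieval
Typed nodes and five general edge types that cover multiple programming languages
Neo4j for scalable graph persistence, with DeepSeek-V3 as the underlying language model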