DomAgent: Leveraging Knowledge Graphs and Case-Based Reasoning for Domain-Specific Code Generation

📅 2026-03-22

🤖 AI Summary

This work addresses the challenge that general-purpose large language models often fail to generate high-quality domain-specific code due to insufficient domain knowledge. To overcome this limitation, the authors propose DomAgent, an autonomous coding agent that integrates top-down knowledge graph reasoning with bottom-up case-based reasoning. Its core module, DomRetriever, can be deployed independently or embedded within any large language model, dynamically synthesizing domain concepts and concrete examples through a structured retrieval-and-synthesis mechanism to enable context-aware, domain-adapted code generation. Experimental results demonstrate that DomAgent significantly improves code quality on both the DS-1000 benchmark and real-world truck software development tasks, enabling small open-source models to approach the performance of large closed-source models in complex scenarios.

📝 Abstract

Large language models (LLMs) have shown impressive capabilities in code generation. However, because most LLMs are trained on public domain corpora, directly applying them to real-world software development often yields low success rates, as these scenarios frequently require domain-specific knowledge. In particular, domain-specific tasks usually demand highly specialized solutions, which are often underrepresented or entirely absent in the training data of generic LLMs. To address this challenge, we propose DomAgent, an autonomous coding agent that bridges this gap by enabling LLMs to generate domain-adapted code through structured reasoning and targeted retrieval. A core component of DomAgent is DomRetriever, a novel retrieval module that emulates how humans learn domain-specific knowledge, by combining conceptual understanding with experiential examples. It dynamically integrates top-down knowledge-graph reasoning with bottom-up case-based reasoning, enabling iterative retrieval and synthesis of structured knowledge and representative cases to ensure contextual relevance and broad task coverage. DomRetriever can operate as part of DomAgent or independently with any LLM for flexible domain adaptation. We evaluate DomAgent on an open benchmark dataset in the data science domain (DS-1000) and further apply it to real-world truck software development tasks. Experimental results show that DomAgent significantly enhances domain-specific code generation, enabling small open-source models to close much of the performance gap with large proprietary LLMs in complex, real-world applications. The code is available at: https://github.com/Wangshuaiia/DomAgent.
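
The combination of top-down knowledge-graph reasoning and bottom-up case-based reasoning described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation of DomRetriever: the graph, the case base, the scoring weights, and every function name below (`expand_concepts`, `retrieve_cases`, `build_context`) are hypothetical, and the sketch performs a single retrieval pass rather than the iterative retrieval and synthesis the paper describes. It only shows the general idea of expanding a task into related domain concepts and then using those concepts to select worked examples for the prompt.

```python
from collections import deque

# Hypothetical toy knowledge graph: concept -> related concepts (top-down view).
KNOWLEDGE_GRAPH = {
    "dataframe": ["groupby", "merge"],
    "groupby": ["aggregation"],
    "merge": [],
    "aggregation": [],
}

# Hypothetical case base: solved tasks tagged with concepts (bottom-up view).
CASE_BASE = [
    {"task": "sum sales per region",
     "concepts": {"groupby", "aggregation"},
     "code": "df.groupby('region')['sales'].sum()"},
    {"task": "join orders with customers",
     "concepts": {"merge"},
     "code": "orders.merge(customers, on='customer_id')"},
    {"task": "plot a histogram",
     "concepts": {"plotting"},
     "code": "df['x'].hist()"},
]


def expand_concepts(seeds, graph, max_hops=1):
    """Top-down step: breadth-first expansion of seed concepts in the graph."""
    seen = {s for s in seeds if s in graph}
    frontier = deque((s, 0) for s in sorted(seen))
    while frontier:
        concept, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for neighbor in graph.get(concept, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen


def retrieve_cases(query, concepts, cases, top_k=2):
    """Bottom-up step: rank cases by concept overlap plus query word overlap."""
    query_words = set(query.lower().split())
    scored = []
    for case in cases:
        concept_overlap = len(case["concepts"] & concepts)
        word_overlap = len(set(case["task"].lower().split()) & query_words)
        score = 2 * concept_overlap + word_overlap  # assumed weighting
        if score > 0:
            scored.append((score, case["task"], case))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [case for _, _, case in scored[:top_k]]


def build_context(query, graph, cases, max_hops=1, top_k=2):
    """Combine expanded concepts and retrieved cases into a prompt context."""
    seeds = [w for w in query.lower().split() if w in graph]
    concepts = expand_concepts(seeds, graph, max_hops)
    examples = retrieve_cases(query, concepts, cases, top_k)
    lines = ["Concepts: " + ", ".join(sorted(concepts))]
    lines += [f"Example ({c['task']}): {c['code']}" for c in examples]
    lines.append("Task: " + query)
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_context("groupby sales per region", KNOWLEDGE_GRAPH, CASE_BASE))
```

In this sketch the resulting context string would be prepended to the prompt of whichever LLM is in use, which mirrors the abstract's point that the retriever can work independently of a specific model.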

Problem

Research questions and friction points this paper is trying to address.

domain-specific code generation

large language models

knowledge gap

software development

specialized solutions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Graph

Case-Based Reasoning

Domain-Specific Code Generation