🤖 AI Summary
To address inefficient task planning and cascading hallucinations in multi-LLM collaborative programming, this paper proposes MaCTG (Multi-Agent Collaborative Thought Graph). MaCTG uses a dynamic graph structure to assign agent roles according to the programming requirements, refine task distribution through context-aware adjustments, and verify and integrate code at the project level. It adopts a hybrid LLM deployment in which proprietary models handle complex reasoning and open-source models handle routine coding and validation. Evaluated on traditional image-processing auto-programming tasks, MaCTG achieves a state-of-the-art accuracy of 83.33% while reducing operational costs by 89.09% compared with existing multi-agent frameworks. By reducing hallucination errors, it improves both the accuracy and the cost-effectiveness of multi-agent collaboration.
📝 Abstract
With the rapid advancement of Large Language Models (LLMs), LLM-based approaches have demonstrated strong problem-solving capabilities across various domains. However, in automatic programming, a single LLM is typically limited to function-level code generation, while multi-agent systems composed of multiple LLMs often suffer from inefficient task planning. This lack of structured coordination can lead to cascading hallucinations, where accumulated errors across agents result in suboptimal workflows and excessive computational costs. To overcome these challenges, we introduce MaCTG (Multi-Agent Collaborative Thought Graph), a novel multi-agent framework that employs a dynamic graph structure to facilitate precise task allocation and controlled collaboration among LLM agents. MaCTG autonomously assigns agent roles based on programming requirements, dynamically refines task distribution through context-aware adjustments, and systematically verifies and integrates project-level code, effectively reducing hallucination errors and improving overall accuracy. MaCTG enhances cost-effectiveness by implementing a hybrid LLM deployment, where proprietary models handle complex reasoning, while open-source models are used for routine coding and validation tasks. To evaluate MaCTG's effectiveness, we applied it to traditional image processing auto-programming tasks, achieving a state-of-the-art accuracy of 83.33%. Additionally, by leveraging its hybrid LLM configuration, MaCTG significantly reduced operational costs by 89.09% compared to existing multi-agent frameworks, demonstrating its efficiency, scalability, and real-world applicability.
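The abstract describes two mechanisms that are easy to picture in code: tasks arranged as a dependency graph, and a hybrid deployment that sends reasoning-heavy work to a proprietary model and routine coding and validation to an open-source one. The following is only a minimal sketch under stated assumptions, not MaCTG's actual implementation. The role names (`planner`, `coder`, `validator`), the tier labels, and the helpers `TaskNode`, `pick_model_tier`, and `schedule` are all hypothetical, and the graph here is static, whereas the paper describes dynamic refinement.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter

# Hypothetical: roles treated as "complex reasoning" in this sketch.
REASONING_ROLES = {"planner"}


@dataclass
class TaskNode:
    name: str
    role: str
    depends_on: list[str] = field(default_factory=list)


def pick_model_tier(node: TaskNode) -> str:
    """Route reasoning roles to a proprietary model, all others to an open-source one."""
    return "proprietary" if node.role in REASONING_ROLES else "open_source"


def schedule(nodes: list[TaskNode]) -> list[tuple[str, str]]:
    """Return (task name, model tier) pairs in an order that respects dependencies."""
    by_name = {n.name: n for n in nodes}
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    sorter = TopologicalSorter({n.name: set(n.depends_on) for n in nodes})
    return [(name, pick_model_tier(by_name[name])) for name in sorter.static_order()]


# Illustrative task chain for an image-processing request (names are made up).
tasks = [
    TaskNode("plan", "planner"),
    TaskNode("write_filter", "coder", ["plan"]),
    TaskNode("test_filter", "validator", ["write_filter"]),
]

for name, tier in schedule(tasks):
    print(f"{name}: {tier}")
```

In this sketch only the planning step reaches the proprietary tier, which is the intuition behind the reported cost reduction: most calls in a coding workflow are routine generation and validation that a cheaper model can serve.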