🤖 AI Summary
HPC unit testing faces challenges including parallel non-determinism, difficulty in detecting synchronization bugs, and hardware heterogeneity, leaving conventional approaches with insufficient coverage. This paper proposes the first automated test generation framework for HPC based on multi-agent large language models (LLMs). It introduces a collaborative architecture comprising a Recipe Agent and a Test Agent, integrated with a critique-feedback loop and dual verification (both compilation and functional correctness) to precisely model OpenMP/MPI parallel structures, communication patterns, and hierarchical concurrency. Experimental evaluation shows that the framework significantly improves test compilability (+32.7%) and functional correctness (+28.4%), uncovering fine-grained synchronization defects and data races that traditional tools miss. The approach thus enhances the reliability and maintainability of HPC software.
📝 Abstract
Unit testing in High-Performance Computing (HPC) is critical but challenged by parallelism, complex algorithms, and diverse hardware. Traditional methods often fail to address non-deterministic behavior and synchronization issues in HPC applications. This paper introduces HPCAgentTester, a novel multi-agent Large Language Model (LLM) framework designed to automate and enhance unit test generation for HPC software utilizing OpenMP and MPI. HPCAgentTester employs a unique collaborative workflow where specialized LLM agents (Recipe Agent and Test Agent) iteratively generate and refine test cases through a critique loop. This architecture enables the generation of context-aware unit tests that specifically target parallel execution constructs, complex communication patterns, and hierarchical parallelism. We demonstrate HPCAgentTester's ability to produce compilable and functionally correct tests for OpenMP and MPI primitives, effectively identifying subtle bugs that are often missed by conventional techniques. Our evaluation shows that HPCAgentTester significantly improves test compilation rates and correctness compared to standalone LLMs, offering a more robust and scalable solution for ensuring the reliability of parallel software systems.