Towards Feedback-to-Plan Decisions for Self-Evolving LLM Agents in CUDA Kernel Generation

📅 2026-05-26
🤖 AI Summary
Existing approaches cannot easily show how self-evolving large language model (LLM) agents attribute and combine heterogeneous feedback signals when making planning decisions during CUDA kernel generation, and conventional end-to-end ablations fail to separate feedback effects from trajectory-dependent drift. To address this, the work proposes CUDAnalyst, an analysis layer for controlled, generation-level attribution of planning decisions to feedback components. By freezing the trajectory and selectively injecting feedback, CUDAnalyst supports stable generation-level evaluation and coalitional-style attribution of feedback effects and their interactions. The results show that explicit planning is beneficial only when feedback is aligned, that effective planning emerges from structured multi-feedback interactions, and that high-level plans from stronger reasoning models can partially transfer to weaker ones. These trends hold across the reference backbones, representative workloads, and reference induction regimes studied.
📝 Abstract
Large language models (LLMs) have shown strong empirical gains as self-evolving agents for CUDA kernel generation, driven by feedback-conditioned planning across generations. However, how planning decisions attribute and combine heterogeneous feedback signals remains opaque. Standard end-to-end ablations fail to resolve this question, as iterative planning amplifies early perturbations and conflates feedback effects with trajectory-dependent drift. We introduce CUDAnalyst, a unified analysis layer for controlled, generation-level attribution of planning decisions to feedback components via trajectory freezing and selective feedback injection. CUDAnalyst enables stable generation-level evaluation and principled coalitional-style attribution of feedback effects and interactions. Our results show that explicit planning is beneficial only when feedback is aligned, that effective planning emerges from structured multi-feedback interactions, and that high-level plans from stronger reasoning models can partially transfer to weaker ones. These trends hold across reference backbones, representative workloads, and reference induction regimes, indicating that the identified feedback-to-plan structure is robust within the controlled axes studied.
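As a rough illustration of what coalitional-style attribution over feedback components could look like, the sketch below computes exact Shapley values for a toy scoring function. It is not the paper's implementation: Shapley values are only one standard coalitional scheme, and the component names (`compile_log`, `profiler`, `correctness`), the function `toy_plan_score`, and its numbers are all hypothetical. The scoring function stands in for evaluating a single next generation from a frozen trajectory with only the chosen feedback subset injected.

```python
from itertools import combinations
from math import factorial


def shapley_attribution(components, value_fn):
    """Exact Shapley values for a small set of feedback components.

    value_fn maps a frozenset of injected components to a score. Here it
    stands in for scoring one generation from a fixed (frozen) history.
    """
    n = len(components)
    phi = {c: 0.0 for c in components}
    for c in components:
        others = [x for x in components if x != c]
        for k in range(n):
            # Weight of a coalition of size k that does not contain c.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal effect of additionally injecting component c.
                phi[c] += weight * (value_fn(s | {c}) - value_fn(s))
    return phi


def toy_plan_score(injected):
    """Hypothetical score; a real setup would evaluate a generated kernel."""
    score = 0.0
    if "compile_log" in injected:
        score += 1.0
    if "profiler" in injected:
        score += 2.0
    # Assumed interaction: the two signals together help more than apart.
    if "compile_log" in injected and "profiler" in injected:
        score += 1.0
    return score


# Hypothetical feedback component names, chosen for illustration only.
FEEDBACK = ["compile_log", "profiler", "correctness"]
phi = shapley_attribution(FEEDBACK, toy_plan_score)

for name in FEEDBACK:
    print(f"{name}: {phi[name]:.2f}")
```

In this toy setup the interaction bonus is split evenly between the two components that produce it, and the attributions sum to the score obtained with all feedback injected. Exact enumeration is exponential in the number of components, so it is practical only for a handful of feedback signals.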
Problem

Research questions and friction points this paper is trying to address.

feedback attribution
planning decisions
CUDA kernel generation
self-evolving LLM agents
heterogeneous feedback
Innovation

Methods, ideas, or system contributions that make the work stand out.

feedback-to-plan attribution
trajectory freezing
selective feedback injection
self-evolving LLM agents
CUDA kernel generation
Yee Hin Chong
Department of Computer Science and Technology, Tsinghua University, Beijing, China
Jiaming Wu
Assistant Professor, Chalmers University of Technology
Modeling and optimization of intelligent transport systems
Youhui Zhang
Department of Computer Science and Technology, Tsinghua University, Beijing, China; Beijing National Research Center for Information Science and Technology, Beijing, China
Peng Qu
Department of Computer Science and Technology, Tsinghua University, Beijing, China; Beijing National Research Center for Information Science and Technology, Beijing, China