Large Language Model-Powered Evolutionary Code Optimization on a Phylogenetic Tree

📅 2026-01-20
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work proposes PhyloEvolve, a context-aware reinforcement learning-based LLM agent system designed to address the labor-intensive and inefficient iterative nature of manual optimization in modern GPU-accelerated scientific computing. The approach formulates algorithmic optimization as a sequential decision-making problem, leveraging code modifications and performance feedback as learning signals. It innovatively organizes the optimization history using a phylogenetic tree structure, enabling backtracking, cross-lineage knowledge transfer, and reproducibility. By integrating algorithm distillation with a prompt-driven Decision Transformer, PhyloEvolve achieves trajectory-conditioned experience reuse without requiring retraining. Evaluated on tasks including PDE solvers, manifold learning, and spectral graph algorithms, PhyloEvolve demonstrates significant improvements over baseline and conventional evolutionary methods in runtime efficiency, memory usage, and correctness.
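The "trajectory-conditioned experience reuse without requiring retraining" described above can be illustrated with a minimal sketch: past (modification, performance) pairs are serialized into the prompt so the model conditions on the optimization history in context. The function names (`build_prompt`, `propose_edit`) and the stubbed model call are hypothetical, not taken from the paper's implementation.

```python
# Hypothetical sketch of trajectory-conditioned prompting: the
# optimization history itself becomes the learning signal, serialized
# into the prompt rather than used to update model weights.

def build_prompt(task: str, trajectory: list) -> str:
    """Serialize (edit, runtime) pairs into an in-context history."""
    lines = [f"Task: {task}", "Optimization history:"]
    for step, (edit, runtime_ms) in enumerate(trajectory, 1):
        lines.append(f"  step {step}: {edit} -> {runtime_ms:.1f} ms")
    lines.append("Propose the next modification.")
    return "\n".join(lines)

def propose_edit(prompt: str) -> str:
    # Placeholder for the LLM call; a real system would query a model
    # and parse a code modification out of its response.
    return "try shared-memory tiling"

trajectory = [("vectorize inner loop", 120.0), ("unroll by 4", 95.0)]
prompt = build_prompt("GPU stencil kernel", trajectory)
print(propose_edit(prompt))
```

Because the history lives in the prompt, swapping in a different task's trajectory changes the agent's behavior with no gradient updates, which is the essence of the in-context reinforcement learning framing.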

๐Ÿ“ Abstract
Optimizing scientific computing algorithms for modern GPUs is a labor-intensive and iterative process involving repeated code modification, benchmarking, and tuning across complex hardware and software stacks. Recent work has explored large language model (LLM)-assisted evolutionary methods for automated code optimization, but these approaches primarily rely on outcome-based selection and random mutation, underutilizing the rich trajectory information generated during iterative optimization. We propose PhyloEvolve, an LLM-agent system that reframes GPU-oriented algorithm optimization as an In-Context Reinforcement Learning (ICRL) problem. This formulation enables trajectory-conditioned reuse of optimization experience without model retraining. PhyloEvolve integrates Algorithm Distillation and prompt-based Decision Transformers into an iterative workflow, treating sequences of algorithm modifications and performance feedback as first-class learning signals. To organize optimization history, we introduce a phylogenetic tree representation that captures inheritance, divergence, and recombination among algorithm variants, enabling backtracking, cross-lineage transfer, and reproducibility. The system combines elite trajectory pooling, multi-island parallel exploration, and containerized execution to balance exploration and exploitation across heterogeneous hardware. We evaluate PhyloEvolve on scientific computing workloads including PDE solvers, manifold learning, and spectral graph algorithms, demonstrating consistent improvements in runtime, memory efficiency, and correctness over baseline and evolutionary methods. Code is published at: https://github.com/annihi1ation/phylo_evolve
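The phylogenetic tree representation in the abstract — variants linked by inheritance, with backtracking and elite selection across lineages — can be sketched as a small tree of benchmarked variants. This is an illustrative data structure under assumed names (`Variant`, `spawn`, `elites`); the actual implementation is in the linked repository.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Variant:
    """One algorithm variant: a node in the phylogenetic tree."""
    code: str                           # source of this variant
    runtime_ms: float                   # measured performance feedback
    parent: Optional["Variant"] = None
    children: list = field(default_factory=list)

    def spawn(self, code: str, runtime_ms: float) -> "Variant":
        """Record a mutated child, preserving the inheritance edge."""
        child = Variant(code, runtime_ms, parent=self)
        self.children.append(child)
        return child

    def lineage(self) -> list:
        """Walk back to the root: the trajectory behind this variant."""
        node, path = self, []
        while node is not None:
            path.append(node)
            node = node.parent
        return path[::-1]

def elites(root: Variant, k: int) -> list:
    """Collect the k fastest variants anywhere in the tree."""
    stack, all_nodes = [root], []
    while stack:
        node = stack.pop()
        all_nodes.append(node)
        stack.extend(node.children)
    return sorted(all_nodes, key=lambda v: v.runtime_ms)[:k]

root = Variant("baseline kernel", 120.0)
a = root.spawn("tiled kernel", 80.0)
b = a.spawn("tiled + shared memory", 55.0)
root.spawn("fused kernel", 95.0)

print([v.runtime_ms for v in b.lineage()])
print([v.runtime_ms for v in elites(root, 2)])
```

Backtracking falls out of the parent links (restart mutation from any ancestor), and the elite pool selected across all branches is what a multi-island scheme would seed its islands with.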
Problem

Research questions and friction points this paper is trying to address.

GPU code optimization
scientific computing
evolutionary optimization
trajectory reuse
large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

PhyloEvolve
In-Context Reinforcement Learning
Algorithm Distillation
Phylogenetic Tree
LLM-Agent Optimization
Leyi Zhao
Department of Computer Science, Luddy School of Informatics, Indiana University, Bloomington
Weijie Huang
Department of Computer Science, Luddy School of Informatics, Indiana University, Bloomington
Yitong Guo
Department of Computer Science, Luddy School of Informatics, Indiana University, Bloomington
Jiang Bian
Regenstrief Institute; Indiana University; IU Health
data science · real-world data · ontology/semantic · eHealth/social media
Chenghong Wang
Duke University
Privacy · Cryptography · Database
Xuhong Zhang
Zhejiang University
LLM · VLM · VLA · Trustworthy AI