🤖 AI Summary
Existing LLM-guided evolutionary search for program synthesis optimizes each task independently. It ignores structure that related tasks share, which costs efficiency and hurts generalization. This work proposes EMO-STA (Shared-Then-Adapt), a two-stage framework that first evolves a shared archive of executable programs over a family of related tasks and then adapts selected candidates to individual target tasks, enabling cross-task knowledge transfer. By introducing multi-task optimization into LLM-guided program evolution for the first time, the approach uses this share-and-adapt mechanism to improve generalization, mitigate overfitting in few-shot settings, and allocate compute more effectively. Experiments across eight task families show improvements over matched-compute single-task baselines in most settings: STA Best-Local gives the strongest in-distribution adaptation, while STA Best-Shared transfers robustly to unseen tasks. Roughly balanced budgets for sharing and adaptation are often optimal.
📝 Abstract
Recent LLM-guided evolutionary search methods have shown that iterative program mutation can discover strong algorithms, but they typically optimize each task independently, even when related tasks share reusable structure. We introduce Evolutionary Multi-Task Optimization (EMO) for LLM-guided program discovery, and propose EMO-STA (Shared-Then-Adapt), a two-stage framework that first evolves a shared archive of executable programs across a task family and then adapts selected shared candidates to each target task. Within EMO-STA, we explore multiple adaptation strategies, including warm-starting from the shared archive, adapting the shared program with the best average performance, and adapting the shared program that performs best on each target task. Across eight task families spanning continuous optimization, geometric construction, modeling, and algorithmic optimization, EMO-STA improves over matched-compute single-task evolution in most settings, with STA Best-Local providing the strongest in-distribution adaptation and STA Best-Shared yielding robust transfer to unseen tasks. Compute-allocation experiments show that allocating a substantial fraction of the family-level budget to shared evolution is consistently beneficial, with roughly balanced shared and adaptation budgets often being optimal. Beyond compute efficiency, we show that shared evolution can mitigate overfitting in low-evidence settings (e.g., few training examples), including ARC tasks and time-series feature engineering, by favoring programs that generalize across all tasks rather than exploiting brittle task-specific artifacts.
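To make the selection step concrete, here is a minimal Python sketch of how a starting program might be chosen from a shared archive and how a family-level budget might be split. It is an illustration under stated assumptions, not the authors' implementation: the function names (`split_budget`, `best_shared`, `best_local`), the score table, and the reading of "Best-Shared" as highest mean score and "Best-Local" as highest per-task score are all hypothetical, inferred only from the abstract's wording. The LLM-driven mutation loop itself is omitted.

```python
from __future__ import annotations

from statistics import mean


def split_budget(total_evals: int, shared_fraction: float) -> tuple[int, int]:
    """Split a family-level evaluation budget into (shared, adaptation) parts."""
    if not 0.0 <= shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in [0, 1]")
    shared = round(total_evals * shared_fraction)
    return shared, total_evals - shared


def best_shared(scores: dict[str, dict[str, float]]) -> str:
    """Assumed 'Best-Shared' rule: the archive program with the highest
    mean score across all tasks in the family (higher is better)."""
    return max(scores, key=lambda prog: mean(scores[prog].values()))


def best_local(scores: dict[str, dict[str, float]], task: str) -> str:
    """Assumed 'Best-Local' rule: the archive program with the highest
    score on one specific target task."""
    return max(scores, key=lambda prog: scores[prog][task])


# Hypothetical scores of three shared-archive programs on two tasks.
archive_scores = {
    "prog_a": {"task_1": 0.9, "task_2": 0.2},
    "prog_b": {"task_1": 0.6, "task_2": 0.7},
    "prog_c": {"task_1": 0.3, "task_2": 0.8},
}

shared_budget, adapt_budget = split_budget(1000, shared_fraction=0.5)
generalist = best_shared(archive_scores)
specialists = {t: best_local(archive_scores, t) for t in ("task_1", "task_2")}
```

In this toy table the generalist (`prog_b`) is not the top scorer on either task, while each task has a different specialist. That is the kind of difference the two adaptation strategies would exploit: one starts adaptation from a program that does well on average, the other from whichever archive program already fits the target task best.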