TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration

📅 2026-04-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes the first end-to-end automated framework for fine-tuning large language models (LLMs), covering the entire lifecycle from requirement analysis to training and evaluation. The system pairs two modules in a multi-agent collaboration, a Researcher and an Executor, to autonomously parse requirements, retrieve open-domain literature and data, generate training strategies, construct data recipes, and evaluate models. It models iterative experiments as a tree-structured search space, which lets it plan exploration paths, reuse earlier results, and extract high-level insights. Experiments on FT-Bench, a newly introduced benchmark of ten tasks drawn from real-world scenarios, show consistent performance gains on both general and domain-specific tasks, supporting the practicality of fully automated LLM fine-tuning.

📝 Abstract

While Large Language Models (LLMs) have empowered AI research agents to perform isolated scientific tasks, automating complex, real-world workflows, such as LLM training, remains a significant challenge. In this paper, we introduce TREX, a multi-agent system that automates the entire LLM training lifecycle. By orchestrating collaboration between two core modules, the Researcher and the Executor, the system performs requirement analysis, open-domain literature and data research, formulation of training strategies, preparation of data recipes, and model training and evaluation. The multi-round experimental process is modeled as a search tree, enabling the system to efficiently plan exploration paths, reuse historical results, and distill high-level insights from iterative trials. To evaluate the capability of automated LLM training, we construct FT-Bench, a benchmark comprising 10 tasks derived from real-world scenarios, ranging from optimizing fundamental model capabilities to enhancing performance on domain-specific tasks. Experimental results demonstrate that the TREX agent consistently improves model performance on target tasks.
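
To make the search-tree idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the TREX implementation: the abstract does not describe the node contents, the caching scheme, or the selection policy, so `ExperimentNode`, `ExperimentTree`, `freeze`, and `toy_evaluate` are all hypothetical names. Each node holds one training configuration and its score, a child differs from its parent by a small change, and a cache keyed on the configuration stands in for "reusing historical results".

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Config = Tuple[Tuple[str, object], ...]  # hashable form of a config dict


def freeze(config: dict) -> Config:
    """Turn a config dict into a hashable cache key (hypothetical helper)."""
    return tuple(sorted(config.items()))


@dataclass
class ExperimentNode:
    """One experiment in the tree: a configuration and its measured score."""
    config: dict
    score: Optional[float] = None
    parent: Optional["ExperimentNode"] = None
    children: List["ExperimentNode"] = field(default_factory=list)


class ExperimentTree:
    def __init__(self, root_config: dict, evaluate: Callable[[dict], float]):
        self.evaluate = evaluate            # stand-in for train + evaluate
        self.cache: Dict[Config, float] = {}
        self.evaluations = 0                # number of real (uncached) runs
        self.root = ExperimentNode(dict(root_config))
        self.root.score = self._score(self.root.config)
        self.nodes = [self.root]

    def _score(self, config: dict) -> float:
        # Reuse an earlier result when the same configuration was already run.
        key = freeze(config)
        if key not in self.cache:
            self.cache[key] = self.evaluate(config)
            self.evaluations += 1
        return self.cache[key]

    def expand(self, node: ExperimentNode, change: dict) -> ExperimentNode:
        # A child experiment is the parent's configuration plus one change.
        child_config = {**node.config, **change}
        child = ExperimentNode(child_config, self._score(child_config), node)
        node.children.append(child)
        self.nodes.append(child)
        return child

    def best(self) -> ExperimentNode:
        return max(self.nodes, key=lambda n: n.score)

    def path_to(self, node: ExperimentNode) -> List[dict]:
        # Sequence of configurations from the root down to this node.
        path = []
        while node is not None:
            path.append(node.config)
            node = node.parent
        return path[::-1]


def toy_evaluate(config: dict) -> float:
    # Assumption: a toy scoring function replacing real training runs.
    return config["epochs"] * 0.1 - abs(config["lr"] - 2e-5) * 1e4


tree = ExperimentTree({"lr": 1e-5, "epochs": 1}, toy_evaluate)
a = tree.expand(tree.root, {"lr": 2e-5})
b = tree.expand(a, {"epochs": 2})
c = tree.expand(tree.root, {"lr": 2e-5})  # same config as `a`: served from cache
```

In this sketch, `tree.path_to(tree.best())` recovers the chain of changes that led to the strongest configuration, which is one simple way a system could summarize what worked across rounds. A real system would also need a policy for choosing which node to expand next; that choice is left open here.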

Problem

Research questions and friction points this paper is trying to address.

LLM fine-tuning

workflow automation

scientific research agent

model training lifecycle

real-world scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-agent system

automated LLM fine-tuning

tree-based exploration