Agentic Design of Compositional Machines

📅 2025-10-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work investigates whether large language models (LLMs) can design and assemble compositional machines that fulfill specific physical functions, such as locomotion or manipulation. We introduce BesiegeField, a testbed built on the machine-building game Besiege that supports part-based construction, physical simulation, and reward-driven evaluation, and use it to benchmark state-of-the-art LLMs with agentic workflows. Compositional machine design serves as a lens on LLMs' spatial reasoning, strategic assembly, and instruction following. Because current open-source models fall short, we curate a cold-start dataset and run reinforcement learning (RL) finetuning experiments as a path to improvement. The study highlights open challenges at the intersection of language, machine design, and physical reasoning.

📝 Abstract

The design of complex machines stands as both a marker of human intelligence and a foundation of engineering practice. Given recent advances in large language models (LLMs), we ask whether they, too, can learn to create. We approach this question through the lens of compositional machine design: a task in which machines are assembled from standardized components to meet functional demands like locomotion or manipulation in a simulated physical environment. To support this investigation, we introduce BesiegeField, a testbed built on the machine-building game Besiege, which enables part-based construction, physical simulation and reward-driven evaluation. Using BesiegeField, we benchmark state-of-the-art LLMs with agentic workflows and identify key capabilities required for success, including spatial reasoning, strategic assembly, and instruction-following. As current open-source models fall short, we explore reinforcement learning (RL) as a path to improvement: we curate a cold-start dataset, conduct RL finetuning experiments, and highlight open challenges at the intersection of language, machine design, and physical reasoning.
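To make "assembled from standardized components" concrete, here is a minimal sketch of a part-based machine description with a structural check. It is an illustration only and not BesiegeField's actual format: the `Part` fields, the part kinds, the face names, and the `validate` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Part:
    part_id: int
    kind: str                 # hypothetical part type, e.g. "block" or "wheel"
    parent_id: Optional[int]  # None only for the root part
    face: Optional[str]       # hypothetical attachment face on the parent


def validate(parts: List[Part]) -> List[str]:
    """Return structural errors; an empty list means the assembly is well formed."""
    errors = []
    seen = set()        # ids of parts placed so far
    used_faces = set()  # (parent_id, face) slots already occupied
    for p in parts:
        if p.part_id in seen:
            errors.append(f"duplicate id {p.part_id}")
        if p.parent_id is None:
            if seen:
                errors.append(f"part {p.part_id} is a second root")
        else:
            if p.parent_id not in seen:
                errors.append(
                    f"part {p.part_id} attaches to unknown parent {p.parent_id}"
                )
            slot = (p.parent_id, p.face)
            if slot in used_faces:
                errors.append(
                    f"face {p.face} of part {p.parent_id} is already occupied"
                )
            used_faces.add(slot)
        seen.add(p.part_id)
    return errors


# A toy two-wheeled machine: one root block with a wheel on each side.
car = [
    Part(0, "block", None, None),
    Part(1, "wheel", 0, "left"),
    Part(2, "wheel", 0, "right"),
]
print(validate(car))
```

A check of this kind only catches malformed assemblies. Whether a well-formed machine actually moves or manipulates anything is what the physical simulation and reward are for.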

Problem

Research questions and friction points this paper is trying to address.

Assessing LLMs' ability to design functional machines from components

Developing a simulation testbed for evaluating machine-building agents

Improving models' spatial reasoning and physical design capabilities

Innovation

Methods, ideas, or system contributions that make the work stand out.

Agentic workflows for compositional machine design

Reinforcement learning finetuning for spatial reasoning

BesiegeField testbed enabling part-based physical simulation
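One way to picture an agentic workflow with reward-driven evaluation is a propose, simulate, refine loop. The sketch below is an assumption-laden toy, not the paper's method: `refine`, `toy_propose`, and `toy_simulate` are hypothetical names, with `toy_propose` standing in for an LLM call and `toy_simulate` standing in for a physics rollout that returns a scalar reward.

```python
from typing import Callable, List, Optional, Tuple

Design = List[str]  # hypothetical: a design is just a list of part kinds


def refine(
    propose: Callable[[Optional[Design], Optional[float]], Design],
    simulate: Callable[[Design], float],
    rounds: int,
) -> Tuple[Design, float]:
    """Iteratively propose designs, score them, and keep the best one."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    best: Optional[Design] = None
    best_score = float("-inf")
    design: Optional[Design] = None
    score: Optional[float] = None
    for _ in range(rounds):
        design = propose(design, score)  # feedback: previous design and reward
        score = simulate(design)
        if score > best_score:
            best, best_score = design, score
    assert best is not None
    return best, best_score


def toy_propose(prev: Optional[Design], prev_score: Optional[float]) -> Design:
    # Stand-in for an LLM: start from one block, then keep adding wheels.
    if prev is None:
        return ["block"]
    return prev + ["wheel"]


def toy_simulate(design: Design) -> float:
    # Stand-in for a simulator: reward grows up to 4 wheels, then is penalized.
    wheels = design.count("wheel")
    return float(min(wheels, 4)) - 0.5 * max(0, wheels - 4)


best, best_score = refine(toy_propose, toy_simulate, rounds=6)
print(best, best_score)
```

Keeping the best design separately from the latest one matters because a later revision can score worse than an earlier one, as the sixth round does in this toy.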

🔎 Similar Papers

Automated Design of Agentic Systems