BrickCraft: Visuomotor Skill Composition with Situated Manual Guidance for Long-Horizon Interlocking Brick Assembly

📅 2026-05-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of coordinating task-level reasoning, spatial localization, and fine manipulation when robots autonomously assemble interlocking bricks over long horizons. It proposes BrickCraft, a framework that pairs a relative assembly representation, in which each step is anchored to a reference brick, with situated manuals that project the high-level task plan onto real-time visual observations. This enables spatially anchored, composable visuomotor skill execution. Using a compositional execution pipeline, the system learns primitive skills from only a few demonstrations and generalizes to unseen structures. Experimental results show that BrickCraft achieves strong compositional generalization and high assembly efficiency under limited demonstration conditions.
📝 Abstract
Autonomous robotic assembly of interlocking bricks demands seamless integration of long-horizon task reasoning, spatial grounding, and fine-grained manipulation. This paper presents BrickCraft, a compositional framework designed for long-horizon and generalizable interlocking brick assembly. BrickCraft models the assembly process using a relative formulation, where each step is anchored to a reference brick within the partial structure, thereby decomposing complex tasks into a finite set of reusable primitive skills. BrickCraft bridges the gap between high-level assembly plans and physical execution through situated manuals, which provide explicit spatial guidance for learned visuomotor skills by projecting the assembly intent onto real-time robot observations. Finally, BrickCraft employs a compositional execution pipeline that chains these spatially grounded skills to accomplish long-horizon assembly tasks. Extensive experimental validations demonstrate that BrickCraft acquires proficient assembly skills from a limited set of demonstrations and exhibits strong compositional generalization to unseen structures. The project website is available at https://intelligent-control-lab.github.io/BrickCraft.
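The relative formulation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the grid coordinates, the nearest-brick rule for choosing a reference, and the names `Brick`, `RelativeStep`, `to_relative_steps`, and `skill_vocabulary` are all assumptions made for illustration. The sketch only shows why anchoring each step to a reference brick can reduce a long plan to a small set of reusable primitives.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

Offset = Tuple[int, int, int]


@dataclass(frozen=True)
class Brick:
    """A brick at an absolute grid position (hypothetical representation)."""
    name: str
    x: int
    y: int
    z: int


@dataclass(frozen=True)
class RelativeStep:
    """One assembly step expressed relative to an already-placed brick."""
    brick: str
    reference: str
    offset: Offset


def to_relative_steps(plan: List[Brick]) -> List[RelativeStep]:
    """Convert an absolute plan into reference-anchored steps.

    Assumption: the reference is the nearest already-placed brick by
    Manhattan distance. The first brick in the plan is treated as given.
    """
    placed = [plan[0]]
    steps = []
    for brick in plan[1:]:
        ref = min(
            placed,
            key=lambda p: abs(p.x - brick.x) + abs(p.y - brick.y) + abs(p.z - brick.z),
        )
        offset = (brick.x - ref.x, brick.y - ref.y, brick.z - ref.z)
        steps.append(RelativeStep(brick.name, ref.name, offset))
        placed.append(brick)
    return steps


def skill_vocabulary(steps: List[RelativeStep]) -> Set[Offset]:
    """Distinct relative offsets, i.e. the primitive skills the plan needs."""
    return {step.offset for step in steps}


# A four-brick staircase: every step has the same relative offset.
staircase = [Brick(f"b{i}", i, 0, i) for i in range(4)]
steps = to_relative_steps(staircase)
primitives = skill_vocabulary(steps)
```

In this toy staircase, three placement steps collapse to one relative primitive, which is the kind of reuse the relative formulation is meant to enable. A compositional pipeline would then chain one learned skill per step in order.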
Problem

Research questions and friction points this paper is trying to address.

long-horizon assembly
interlocking brick
visuomotor skills
spatial grounding
compositional generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

compositional framework
situated manual guidance
visuomotor skill
spatial grounding
long-horizon assembly