MotionBricks: Scalable Real-Time Motions with Modular Latent Generative Model and Smart Primitives

📅 2026-04-27

🤖 AI Summary
This work proposes a unified motion generation framework that aims to bridge the gap between generative motion synthesis and industrial applications by combining large-scale scalability, real-time performance, and fine-grained multimodal control. Built on a modular latent generative backbone and smart primitives, the method models over 350,000 motion clips with a single model and supports multimodal conditioning, including velocity commands, style selection, and keyframes, enabling plug-and-play motion authoring for navigation and object interaction. Experiments report state-of-the-art motion quality on open-source and proprietary datasets of various scales, with a throughput of 15,000 FPS and a latency of 2 ms. The system has also been deployed on the Unitree G1 humanoid robot to demonstrate its flexibility for real-time robotic control beyond animation.
📝 Abstract
Despite transformative advances in generative motion synthesis, real-time interactive motion control remains dominated by traditional techniques. In this work, we identify two key challenges in bridging research and production: 1) Real-time scalability: Industry applications demand real-time generation of a vast repertoire of motion skills, while generative methods exhibit significant degradation in quality and scalability under real-time computation constraints, and 2) Integration: Industry applications demand fine-grained multi-modal control involving velocity commands, style selection, and precise keyframes, a need largely unmet by existing text- or tag-driven models. To overcome these limitations, we introduce MotionBricks: a large-scale, real-time generative framework with a two-fold solution. First, we propose a large-scale modular latent generative backbone tailored for robust real-time motion generation, effectively modeling a dataset of over 350,000 motion clips with a single model. Second, we introduce smart primitives that provide a unified, robust, and intuitive interface for authoring both navigation and object interaction. Applications can be designed in a plug-and-play manner like assembling bricks without expert animation knowledge. Quantitatively, we show that MotionBricks produces state-of-the-art motion quality on open-source and proprietary datasets of various scales, while also achieving a real-time throughput of 15,000 FPS with 2 ms latency. We demonstrate the flexibility and robustness of MotionBricks in a complete production-level animation demo, covering navigation and object-scene interaction across various styles with a unified model. To showcase our framework's application beyond animation, we deploy MotionBricks on the Unitree G1 humanoid robot to demonstrate its flexibility and generalization for real-time robotic control.
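To put the reported throughput and latency in context, the short sketch below works through the arithmetic. Only the 15,000 FPS and 2 ms figures come from the abstract; the 30 and 60 FPS playback rates and all function names are assumptions chosen for illustration, and the abstract does not say how the two numbers were measured or how they relate to each other.

```python
# Back-of-the-envelope check of the reported figures (15,000 FPS, 2 ms latency).
# The playback rates used below are assumed for illustration; they are not
# taken from the paper.

THROUGHPUT_FPS = 15_000   # reported throughput: frames generated per second
LATENCY_MS = 2.0          # reported latency in milliseconds


def amortized_ms_per_frame(throughput_fps: float) -> float:
    """Average generation time per frame, in milliseconds."""
    return 1000.0 / throughput_fps


def frame_budget_ms(playback_fps: float) -> float:
    """Wall-clock time available per displayed frame, in milliseconds."""
    return 1000.0 / playback_fps


def realtime_factor(throughput_fps: float, playback_fps: float) -> float:
    """How many times faster than playback the generator produces frames."""
    return throughput_fps / playback_fps


if __name__ == "__main__":
    cost = amortized_ms_per_frame(THROUGHPUT_FPS)
    print(f"amortized cost: {cost:.4f} ms/frame")
    for playback_fps in (30, 60):  # assumed playback rates
        budget = frame_budget_ms(playback_fps)
        factor = realtime_factor(THROUGHPUT_FPS, playback_fps)
        print(
            f"{playback_fps} FPS playback: budget {budget:.2f} ms, "
            f"latency share {LATENCY_MS / budget:.0%}, "
            f"real-time factor {factor:.0f}x"
        )
```

Under these assumptions, generation runs 500 times faster than 30 FPS playback, and a 2 ms latency takes up roughly 6% of a 30 FPS frame budget or 12% of a 60 FPS one.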
Problem

Research questions and friction points this paper is trying to address.

real-time motion generation
scalability
multi-modal control
motion synthesis
interactive animation
Innovation

Methods, ideas, or system contributions that make the work stand out.

modular latent generative model
real-time motion synthesis
smart primitives
scalable animation
multi-modal motion control