PINO: Person-Interaction Noise Optimization for Long-Duration and Customizable Motion Generation of Arbitrary-Sized Groups

📅 2025-07-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing group motion generation methods struggle to achieve scalability, physical plausibility, and fine-grained control at the same time: conditional diffusion models that generate characters incrementally from a single shared prompt tend to oversimplify interactions and do not explicitly model orientation, velocity, or spatial relationships. This paper proposes PINO, a training-free noise optimization framework that decomposes complex group motion into semantically relevant pairwise interactions and composes them incrementally with a pretrained two-person interaction diffusion model. Physics-based penalties applied during noise optimization suppress artifacts such as overlapping or penetration between characters, and the approach lets users customize character orientation, speed, and spatial relationships. The reported evaluations indicate visually realistic, physically coherent, and controllable interactions for long-duration motion of arbitrary-sized groups, with no additional model training.

📝 Abstract

Generating realistic group interactions involving multiple characters remains challenging due to increasing complexity as group size expands. While existing conditional diffusion models incrementally generate motions by conditioning on previously generated characters, they rely on single shared prompts, limiting nuanced control and leading to overly simplified interactions. In this paper, we introduce Person-Interaction Noise Optimization (PINO), a novel, training-free framework designed for generating realistic and customizable interactions among groups of arbitrary size. PINO decomposes complex group interactions into semantically relevant pairwise interactions, and leverages pretrained two-person interaction diffusion models to incrementally compose group interactions. To ensure physical plausibility and avoid common artifacts such as overlapping or penetration between characters, PINO employs physics-based penalties during noise optimization. This approach allows precise user control over character orientation, speed, and spatial relationships without additional training. Comprehensive evaluations demonstrate that PINO generates visually realistic, physically coherent, and adaptable multi-person interactions suitable for diverse animation, gaming, and robotics applications.
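To make the decomposition idea concrete, the following is a minimal sketch of how a group could be turned into an ordered list of pairwise interactions, so that each new character is generated against one that already exists. It is an illustration under stated assumptions, not the paper's algorithm: the function name `pairwise_schedule`, the breadth-first ordering, the choice of a root character, and the example prompts are all hypothetical.

```python
from collections import deque


def pairwise_schedule(interactions, root):
    """Order pairwise interactions so that every new character is paired
    with an already-generated one (breadth-first traversal from `root`).

    `interactions` is a list of (character_a, character_b, prompt) tuples.
    This is a hypothetical helper, not part of any published PINO code.
    """
    # Undirected adjacency map that keeps each pair's own text prompt.
    neighbours = {}
    for a, b, prompt in interactions:
        neighbours.setdefault(a, []).append((b, prompt))
        neighbours.setdefault(b, []).append((a, prompt))

    generated = {root}
    queue = deque([root])
    schedule = []  # entries: (conditioning character, new character, prompt)
    while queue:
        anchor = queue.popleft()
        for other, prompt in neighbours.get(anchor, []):
            if other not in generated:
                generated.add(other)
                schedule.append((anchor, other, prompt))
                queue.append(other)
    return schedule


# Assumed example group: four characters, three pairwise prompts.
interactions = [
    ("A", "B", "two people shake hands"),
    ("B", "C", "one person waves at the other"),
    ("A", "D", "two people walk side by side"),
]
schedule = pairwise_schedule(interactions, root="A")
for anchor, new, prompt in schedule:
    print(f"generate {new} conditioned on {anchor}: {prompt}")
```

Each entry in `schedule` corresponds to one call to a two-person model with its own prompt, which is the contrast with a single shared prompt for the whole group.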

Problem

Research questions and friction points this paper is trying to address.

Generating realistic interactions for groups of arbitrary size

Overcoming limitations of single shared prompts in motion generation

Ensuring physical plausibility in multi-character interactions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decomposes group interactions into pairwise interactions

Uses pretrained two-person diffusion models

Employs physics-based penalties for plausibility
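The third point can be illustrated with a toy sketch of noise optimization against a penetration penalty. Everything here is an assumption made for illustration: the `decode` function is a stand-in for a frozen pretrained generator (it simply adds the noise to a base path), characters are reduced to 2D root positions, and the penalty is a squared hinge on a minimum distance `min_dist`. With a real diffusion model the gradient would usually come from automatic differentiation through the sampler rather than from a hand-written formula.

```python
import numpy as np


def penetration_penalty(new_xy, other_xy, min_dist=0.5):
    """Sum of squared violations of a minimum root-to-root distance."""
    dist = np.linalg.norm(new_xy - other_xy, axis=1)
    return float(np.sum(np.maximum(0.0, min_dist - dist) ** 2))


def penalty_grad(new_xy, other_xy, min_dist=0.5, eps=1e-8):
    """Analytic gradient of the penalty with respect to new_xy."""
    diff = new_xy - other_xy
    dist = np.linalg.norm(diff, axis=1, keepdims=True)
    violation = np.maximum(0.0, min_dist - dist)
    return -2.0 * violation * diff / (dist + eps)


def decode(noise, base_xy):
    """Toy stand-in for a frozen generator: base path plus noise."""
    return base_xy + noise


def optimise_noise(noise, base_xy, other_xy, steps=100, lr=0.25):
    """Gradient descent on the noise only; `decode` itself is never changed."""
    noise = noise.copy()
    for _ in range(steps):
        # decode is the identity in the noise, so the gradient with respect
        # to the noise equals the gradient with respect to the decoded path.
        noise -= lr * penalty_grad(decode(noise, base_xy), other_xy)
    return noise


frames = 20
rng = np.random.default_rng(0)
# Existing character walks along the x-axis; the new one starts 0.1 away.
other_xy = np.stack([np.linspace(0.0, 2.0, frames), np.zeros(frames)], axis=1)
base_xy = other_xy + np.array([0.0, 0.1])
noise0 = 0.01 * rng.standard_normal((frames, 2))

before = penetration_penalty(decode(noise0, base_xy), other_xy)
noise1 = optimise_noise(noise0, base_xy, other_xy)
after = penetration_penalty(decode(noise1, base_xy), other_xy)
print(f"penalty before: {before:.4f}, after: {after:.6f}")
```

The design point the sketch tries to capture is that only the noise input is updated while the generator stays fixed, which is what makes this kind of approach training-free. Further penalty terms (for example on speed or facing direction) could be added to the same objective in the same way.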

🔎 Similar Papers

CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation