🤖 AI Summary
Embodied multi-agent systems face challenges in harmonizing local perception with global understanding and suffer from limited scalability. Method: This paper proposes a decentralized, RGB-only approach to constructing globally consistent 3D Gaussian splatting fields. Leveraging a Gaussian-image co-representation mechanism, it is the first to introduce attribute redistribution of 3D Gaussians into multi-agent collaboration, enabling distributed scene reconstruction and task-relevant feature sharing solely from monocular RGB inputs. The method integrates multi-view geometric fusion, distributed feature reprojection, and diffusion-policy-driven imitation learning, requiring no additional sensors. Results: Evaluated on the RoboFactory benchmark, the approach matches point-cloud-based baselines and substantially outperforms existing pure-image methods. Crucially, it scales well as the agent count grows, validating its efficacy for large-scale embodied multi-agent coordination.
📝 Abstract
Effective coordination in embodied multi-agent systems remains a fundamental challenge, particularly when agents must balance individual perspectives with global environmental awareness. Existing approaches often trade fine-grained local control against comprehensive scene understanding, resulting in limited scalability and degraded collaboration quality. In this paper, we present GauDP, a novel Gaussian-image synergistic representation that enables scalable, perception-aware imitation learning in multi-agent collaborative systems. Specifically, GauDP constructs a globally consistent 3D Gaussian field from decentralized RGB observations and dynamically redistributes the 3D Gaussian attributes to each agent's local perspective, allowing every agent to adaptively query task-critical features from the shared scene representation while retaining its individual viewpoint. This design yields both fine-grained control and globally coherent behavior without requiring additional sensing modalities (e.g., 3D point clouds). We evaluate GauDP on the RoboFactory benchmark, which includes diverse multi-arm manipulation tasks. Our method outperforms existing image-based methods and approaches the effectiveness of point-cloud-driven ones, while maintaining strong scalability as the number of agents increases.
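To make the redistribution idea concrete, here is a minimal geometric sketch in NumPy: shared 3D Gaussian centers and their per-Gaussian feature vectors are projected into one agent's pinhole camera, and only the attributes that fall inside that agent's view are handed back to it. Everything here (the function name `redistribute_features`, the camera parameters, the visibility test) is an illustrative assumption; GauDP's actual redistribution mechanism is learned and is not specified at this level of detail in the abstract.

```python
import numpy as np

def redistribute_features(means, feats, K, w2c, img_hw):
    """Illustrative sketch: project shared 3D Gaussian centers into one
    agent's camera and return the subset of features visible in its view.

    means  : (N, 3) Gaussian centers in world coordinates
    feats  : (N, D) per-Gaussian feature vectors
    K      : (3, 3) camera intrinsics
    w2c    : (4, 4) world-to-camera extrinsics
    img_hw : (H, W) image size in pixels
    """
    H, W = img_hw
    # World -> camera coordinates via a homogeneous transform.
    pts_h = np.concatenate([means, np.ones((len(means), 1))], axis=1)
    cam = (w2c @ pts_h.T).T[:, :3]
    in_front = cam[:, 2] > 1e-6
    # Pinhole projection to pixel coordinates.
    uv = (K @ cam.T).T
    uv = uv[:, :2] / np.clip(uv[:, 2:3], 1e-6, None)
    # Keep only Gaussians that land inside this agent's image plane.
    visible = (in_front
               & (uv[:, 0] >= 0) & (uv[:, 0] < W)
               & (uv[:, 1] >= 0) & (uv[:, 1] < H))
    return uv[visible], feats[visible]
```

In this simplified picture, each agent queries the shared field with its own camera pose and receives only the task-relevant attributes it can see, which is the intuition behind combining a global representation with per-agent local perspectives.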