🤖 AI Summary
This work addresses a limitation of traditional auction-based consensus algorithms for multi-robot task allocation, such as the Consensus-Based Bundle Algorithm (CBBA): they rely on handcrafted greedy scoring functions and struggle to reach high-quality global solutions. The paper introduces, for the first time, a reinforcement learning approach within the auction-consensus framework, replacing CBBA's deterministic bidding mechanism with a neural network policy that learns better bidding strategies from local observations while preserving decentralization. Using the centralized training with decentralized execution paradigm and Proximal Policy Optimization (PPO), the study evaluates several architectures: a Neural Additive Model, an LSTM, and a Set Transformer. Experiments show that the learned bidding policies improve solution quality over CBBA across robot teams of varying sizes while remaining scalable and fully decentralized.
📝 Abstract
Multi-Robot Task Allocation (MRTA) is a central challenge in decentralized multi-agent systems, where teams of robots must cooperatively assign and execute tasks under limited communication while optimizing global performance objectives. Auction-consensus algorithms, such as the Consensus-Based Bundle Algorithm (CBBA), provide scalable decentralized coordination with provable convergence, but rely on hand-crafted greedy scoring functions that often lead to suboptimal task allocations. This paper proposes a learning-enhanced auction-consensus framework in which CBBA's deterministic bidding mechanism is replaced by a neural bidding policy trained using reinforcement learning. Under a centralized training and decentralized execution paradigm, agents learn to compute task bids from partial local observations while retaining the standard auction and consensus phases for decentralized coordination. The bidding policy is trained using Proximal Policy Optimization (PPO), with rewards shaped by proximity to globally optimal solutions obtained via mixed-integer linear programming. Multiple neural architectures are evaluated, including a Neural Additive Model, a Long Short-Term Memory (LSTM) network, and a Set Transformer. Experimental results across varying swarm sizes demonstrate that learned bidding policies can improve solution quality over classical CBBA while preserving decentralized execution. The proposed approach highlights the effectiveness of integrating reinforcement learning with classical distributed coordination algorithms, offering a scalable pathway toward higher-quality decentralized multi-robot task allocation.
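
To make the idea of a swappable bidding rule concrete, the following is a minimal illustrative sketch and not the paper's implementation. It assumes a heavily simplified setting: one task per robot, 2D positions, travel distance as cost, and a consensus phase collapsed to a global maximum over bids. All names (`greedy_bid`, `run_auction`, `optimality_reward`) are hypothetical, and a brute-force search stands in for the mixed-integer linear program. The point is only that the auction loop takes the bid function as a parameter, so a hand-crafted score could be replaced by a learned policy's output, and that a reward can be defined as the ratio of optimal cost to achieved cost.

```python
import math
from itertools import permutations
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
BidFn = Callable[[Point, Point], float]  # (robot position, task position) -> bid


def greedy_bid(robot: Point, task: Point) -> float:
    """Hand-crafted score: nearer tasks receive higher bids."""
    return 1.0 / (1.0 + math.dist(robot, task))


def run_auction(robots: List[Point], tasks: List[Point], bid_fn: BidFn) -> Dict[int, int]:
    """Return {task index: robot index}, with at most one task per robot."""
    assignment: Dict[int, int] = {}
    free_robots = set(range(len(robots)))
    open_tasks = set(range(len(tasks)))
    while free_robots and open_tasks:
        # Bidding: each free robot scores each open task from its own position only.
        bids = [(bid_fn(robots[r], tasks[t]), r, t)
                for r in sorted(free_robots) for t in sorted(open_tasks)]
        # Consensus, collapsed here to a global max: the highest bid wins the task.
        _, r, t = max(bids)
        assignment[t] = r
        free_robots.remove(r)
        open_tasks.remove(t)
    return assignment


def total_cost(robots: List[Point], tasks: List[Point], assignment: Dict[int, int]) -> float:
    """Sum of robot-to-task distances for an allocation."""
    return sum(math.dist(robots[r], tasks[t]) for t, r in assignment.items())


def optimal_cost(robots: List[Point], tasks: List[Point]) -> float:
    """Brute-force optimum (stand-in for a MILP); assumes len(robots) >= len(tasks)."""
    return min(
        sum(math.dist(robots[r], tasks[t]) for t, r in enumerate(perm))
        for perm in permutations(range(len(robots)), len(tasks))
    )


def optimality_reward(robots: List[Point], tasks: List[Point], assignment: Dict[int, int]) -> float:
    """Ratio of optimal to achieved cost; 1.0 means optimal. Assumes nonzero achieved cost."""
    return optimal_cost(robots, tasks) / total_cost(robots, tasks, assignment)


robots = [(0.0, 0.0), (3.0, 0.0)]
tasks = [(2.0, 0.0), (5.0, 0.0)]
allocation = run_auction(robots, tasks, greedy_bid)
reward = optimality_reward(robots, tasks, allocation)
print(allocation, round(reward, 3))  # {0: 1, 1: 0} 0.667
```

In this toy instance the greedy score lets the second robot claim the nearer task, which forces the first robot onto the far one (total cost 6 against an optimum of 4). Any callable with the same signature, such as a trained network's forward pass, could be passed in place of `greedy_bid`, and a reward of this shape is one plausible reading of "proximity to globally optimal solutions".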