🤖 AI Summary
This study investigates the self-organizing mechanisms underlying collective foraging behavior in non-cooperative multi-agent systems operating under partial observability, and how such group-level dynamics relate to agents’ internal physiological states.
Method: We employ continuous-time recurrent neural networks (CTRNNs) as velocity controllers for individual agents and optimize a shared policy via evolutionary strategies in a resource-patch environment.
Contribution/Results: We identify a consistent negative correlation between individual internal resource reserves and collective aggregation strength: agents form denser clusters when reserves are low and disperse when reserves are abundant, revealing an implicit risk-sensitive foraging strategy. Clamping experiments on the controller's hidden states further provide evidence of a causal regulatory role of latent internal states in emergent group cohesion. These findings offer a mechanistic account of endogenous-state-driven swarm behavior, bridging individual physiology and collective decision-making in decentralized multi-agent systems.
📝 Abstract
Active particles are entities that sustain persistent out-of-equilibrium motion by consuming energy. Under certain conditions, they tend to self-organize through coordinated movement, for example by swarming via aggregation. In non-cooperative foraging tasks, the emergence of such swarming behavior among foragers, which are themselves active particles, has been attributed to partial observability of the environment: the presence of another forager can serve as a proxy signal for the potential presence of a food source or resource patch. In this paper, we validate this phenomenon by simulating multiple self-propelled foragers that forage from multiple resource patches in a non-cooperative manner. These foragers operate in a continuous two-dimensional space with stochastic position updates and partial observability. We evolve a shared policy in the form of a continuous-time recurrent neural network (CTRNN) that serves as a velocity controller for the foragers. To this end, we use an evolution strategies algorithm in which the different samples drawn from the policy distribution are evaluated in the same rollout. We first show that agents learn to forage adaptively in the environment. We then show the emergence of swarming, in the form of aggregation among the foragers, when resource patches are absent. The strength of this swarming behavior appears to be inversely proportional to the amount of resource stored in the foragers, which supports claims of risk-sensitive foraging. Empirical analysis of the learned controller's hidden states in minimal test runs reveals their sensitivity to the amount of resource stored in a forager, and clamping these hidden states to represent a lower resource level hastens the forager's learned aggregation behavior.
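To make the controller concrete, the following is a minimal sketch of a CTRNN used as a velocity controller, with standard leaky-integrator dynamics under forward-Euler integration. The network sizes, weight initialization, time constants, and the 4-dimensional observation are hypothetical placeholders; the paper's actual architecture and inputs are not specified here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class CTRNN:
    """Sketch of a continuous-time recurrent neural network controller.

    Dynamics: tau_i * dy_i/dt = -y_i + sum_j W_ij * sigmoid(y_j + theta_j) + I_i
    """

    def __init__(self, n_neurons, n_inputs, n_outputs, seed=0):
        rng = np.random.default_rng(seed)
        self.tau = np.ones(n_neurons)                          # membrane time constants
        self.theta = np.zeros(n_neurons)                       # neuron biases
        self.W = rng.normal(0.0, 0.5, (n_neurons, n_neurons))  # recurrent weights
        self.W_in = rng.normal(0.0, 0.5, (n_neurons, n_inputs))   # input weights
        self.W_out = rng.normal(0.0, 0.5, (n_outputs, n_neurons))  # readout weights
        self.y = np.zeros(n_neurons)                           # hidden (neuron) states

    def step(self, obs, dt=0.1):
        """Integrate one Euler step and return a 2-D velocity command."""
        I = self.W_in @ obs                                    # external drive from observation
        dydt = (-self.y + self.W @ sigmoid(self.y + self.theta) + I) / self.tau
        self.y = self.y + dt * dydt
        return self.W_out @ sigmoid(self.y + self.theta)

# Hypothetical usage: one partial observation mapped to a velocity command.
ctrnn = CTRNN(n_neurons=8, n_inputs=4, n_outputs=2, seed=0)
obs = np.array([0.2, -0.1, 0.0, 0.5])  # placeholder partial observation
velocity = ctrnn.step(obs)             # 2-D velocity for one forager
```

Because the hidden states `y` persist across steps, they can accumulate information over time, which is what makes the paper's analysis of hidden-state sensitivity to stored resource (and the clamping intervention) possible in the first place.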