PhysicsMinions: Winning Gold Medals in the Latest Physics Olympiads with a Coevolutionary Multimodal Multi-Agent System

📅 2025-09-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing AI approaches to Physics Olympiad problems rely predominantly on a single model, and open-source multimodal large language models (MLLMs) rarely reach gold-medal-level performance. Method: The paper proposes PhysicsMinions, a coevolutionary multi-agent system for Physics Olympiads built from three studios: a Visual Studio that interprets diagrams, a Logic Studio that formulates solutions, and a Review Studio that performs dual-stage verification. Feedback from the Review Studio drives an iterative refinement loop in which the Logic Studio revises its solution, so the system self-corrects over successive rounds. The framework improves both open-source and closed-source models of different sizes over their single-model baselines. Contribution/Results: Evaluated on the HiPhO benchmark, which spans the 7 latest Physics Olympiads, the system raises the number of gold medals won by open-source models from 1-2 to 6 and achieves the first open-source gold medal on the latest International Physics Olympiad (IPhO) under the average-score metric. Its open-source Pass@32 score of 26.8/30 on the latest IPhO ranks 4th of 406 contestants, well above the top single-model score of 22.7 (ranked 22nd). The authors present the framework as generalizable to Olympiad-level problem solving in other disciplines.

📝 Abstract

Physics is central to understanding and shaping the real world, and the ability to solve physics problems is a key indicator of real-world physical intelligence. Physics Olympiads, renowned as the crown of competitive physics, provide a rigorous testbed requiring complex reasoning and deep multimodal understanding, yet they remain largely underexplored in AI research. Existing approaches are predominantly single-model based, and open-source MLLMs rarely reach gold-medal-level performance. To address this gap, we propose PhysicsMinions, a coevolutionary multi-agent system for Physics Olympiad. Its architecture features three synergistic studios: a Visual Studio to interpret diagrams, a Logic Studio to formulate solutions, and a Review Studio to perform dual-stage verification. The system coevolves through an iterative refinement loop where feedback from the Review Studio continuously guides the Logic Studio, enabling the system to self-correct and converge towards the ground truth. Evaluated on the HiPhO benchmark spanning 7 latest physics Olympiads, PhysicsMinions delivers three major breakthroughs: (i) Strong generalization: it consistently improves both open-source and closed-source models of different sizes, delivering clear benefits over their single-model baselines; (ii) Historic breakthroughs: it elevates open-source models from only 1-2 to 6 gold medals across 7 Olympiads, achieving the first-ever open-source gold medal in the latest International Physics Olympiad (IPhO) under the average-score metric; and (iii) Scaling to human expert: it further advances the open-source Pass@32 score to 26.8/30 points on the latest IPhO, ranking 4th of 406 contestants and far surpassing the top single-model score of 22.7 (ranked 22nd). Generally, PhysicsMinions offers a generalizable framework for Olympiad-level problem solving, with the potential to extend across disciplines.

Problem

Research questions and friction points this paper is trying to address.

Addressing gold-medal-level performance gaps in Physics Olympiad AI systems

Overcoming limitations of single-model approaches in complex physics reasoning

Developing multimodal coevolutionary agents for rigorous physics problem solving

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent system with three synergistic studios

Coevolutionary loop enabling self-correction and convergence

Generalizable framework for Olympiad-level problem solving
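
The loop described in the abstract (diagram interpretation, solution drafting, dual-stage review, feedback-driven revision) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `coevolve`, `Verdict`, `Result`, and `max_rounds`, the stopping rule, and the toy stand-in functions are all hypothetical, and a real system would replace each stub with a model call.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Verdict:
    """Outcome of one verification stage (hypothetical structure)."""
    passed: bool
    feedback: str = ""


@dataclass
class Result:
    solution: str
    rounds: int      # number of drafts produced
    accepted: bool   # True if every verification stage passed


# Hypothetical callable shapes; each would wrap a model in a real system.
VisionFn = Callable[[Sequence[bytes]], str]
LogicFn = Callable[[str, str, Optional[str], Optional[str]], str]
VerifyFn = Callable[[str, str], Verdict]


def review(problem: str, solution: str, stages: Sequence[VerifyFn]) -> Verdict:
    """Run verification stages in order and stop at the first failure."""
    for stage in stages:
        verdict = stage(problem, solution)
        if not verdict.passed:
            return verdict
    return Verdict(passed=True)


def coevolve(problem: str, images: Sequence[bytes], vision: VisionFn,
             logic: LogicFn, stages: Sequence[VerifyFn],
             max_rounds: int = 4) -> Result:
    """Draft, review, and revise until the review passes or rounds run out."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    notes = vision(images)            # "Visual Studio": diagrams to text
    solution: Optional[str] = None
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        # "Logic Studio": first draft, or a revision guided by feedback
        solution = logic(problem, notes, solution, feedback)
        # "Review Studio": two checks here, standing in for dual-stage review
        verdict = review(problem, solution, stages)
        if verdict.passed:
            return Result(solution, round_no, True)
        feedback = verdict.feedback
    return Result(solution, max_rounds, False)


# Toy stand-ins (assumptions for illustration only).
def toy_vision(images: Sequence[bytes]) -> str:
    return f"{len(images)} diagram(s) described"


def toy_logic(problem, notes, previous, feedback):
    # The first draft omits units; the revision appends them.
    return "a = 4.9" if feedback is None else previous + " m/s^2"


def check_units(problem: str, solution: str) -> Verdict:
    ok = solution.endswith("m/s^2")
    return Verdict(ok, "" if ok else "add units")


def check_value(problem: str, solution: str) -> Verdict:
    ok = "4.9" in solution
    return Verdict(ok, "" if ok else "recompute the value")


result = coevolve("Block on a 30 degree incline", [b"img"],
                  toy_vision, toy_logic, [check_units, check_value])
print(result)
```

In this toy run the first draft fails the units check, the feedback is passed back to the drafting step, and the second draft is accepted. Returning the last draft with `accepted=False` when the round budget runs out is a design choice of this sketch; the source does not say how the system behaves in that case.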

🔎 Similar Papers

OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI