PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

📅 2026-01-09
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the limitation of current language models, which are constrained by fixed context windows and struggle to effectively scale test-time compute (TTC). We propose Parallel Coordinated Reasoning (PaCoRe), a novel framework that introduces the first message-passing-based multi-round parallel reasoning architecture. By enabling parallel inference trajectories coupled with context-aware message compression and synthesis, PaCoRe extends effective TTC to millions of tokens without exceeding context length limits. The model is trained end-to-end via reinforcement learning to master information coordination and synthesis. Evaluated on the HMMT 2025 mathematical reasoning benchmark, our method achieves 94.5% accuracy, surpassing GPT-5 (93.2%). We publicly release the model, data, and full reasoning pipeline.

Technology Category

Application Category

📝 Abstract
We introduce Parallel Coordinated Reasoning (PaCoRe), a training-and-inference framework designed to overcome a central limitation of contemporary language models: their inability to scale test-time compute (TTC) far beyond sequential reasoning under a fixed context window. PaCoRe departs from the traditional sequential paradigm by driving TTC through massive parallel exploration coordinated via a message-passing architecture in multiple rounds. Each round launches many parallel reasoning trajectories, compacts their findings into context-bounded messages, and synthesizes these messages to guide the next round and ultimately produce the final answer. Trained end-to-end with large-scale, outcome-based reinforcement learning, the model masters the synthesis abilities required by PaCoRe and scales to multi-million-token effective TTC without exceeding context limits. The approach yields strong improvements across diverse domains, and notably pushes reasoning beyond frontier systems in mathematics: an 8B model reaches 94.5% on HMMT 2025, surpassing GPT-5's 93.2% by scaling effective TTC to roughly two million tokens. We open-source model checkpoints, training data, and the full inference pipeline to accelerate follow-up work.
Problem

Research questions and friction points this paper is trying to address.

test-time compute
language models
reasoning scalability
context window
sequential reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel Coordinated Reasoning
Test-Time Compute
Message-Passing Architecture
Reinforcement Learning
Scalable Inference
🔎 Similar Papers