FERA: Uncertainty-Aware Federated Reasoning for Large Language Models

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes FERA, a training-free federated inference framework that improves the multi-step reasoning of large language models in settings where data cannot be centralized. The server and clients collaborate iteratively: Uncertainty-Aware Self-Critique Aggregation (UA-SCA) combines query-dependent trust weighting with structured cross-client verification to evaluate and fuse client reasoning traces without exchanging raw data. Rather than discarding low-quality traces, it revises flawed reasoning steps to recover useful information. Theoretical analysis shows that the iterative protocol converges and that uncertainty-aware weighting accelerates convergence. Experiments on multiple reasoning benchmarks show that FERA consistently outperforms both federated training and training-free baselines, with accuracy improving across communication rounds while communication and computation remain efficient.

📝 Abstract

Large language models (LLMs) exhibit strong reasoning capabilities when guided by high-quality demonstrations, yet such data is often distributed across organizations that cannot centralize it due to regulatory, proprietary, or institutional constraints. We study federated reasoning, where a server improves multi-step reasoning by coordinating with heterogeneous clients holding private demonstrations, without centralized training or raw data sharing. The key challenge is that client reliability is query-dependent, while the server cannot inspect client data to determine which contributions are trustworthy. To address this, we propose Uncertainty-Aware Federated Reasoning (FERA), a training-free framework based on iterative server-client co-refinement. Across communication rounds, clients generate reasoning traces with lightweight uncertainty estimates, and the server synthesizes them into improved reasoning that is redistributed as context for the next round, progressively improving both server outputs and client-side reasoning. Within each round, Uncertainty-Aware Self-Critique Aggregation (UA-SCA) resolves conflicts among heterogeneous client traces through query-dependent trust weighting and structured cross-client verification. Rather than simply discarding low-quality traces, UA-SCA revises flawed reasoning steps to recover useful information. We provide theoretical guarantees showing that the proposed iterative protocol converges and that uncertainty-aware weighting accelerates convergence. Experiments on multiple reasoning benchmarks show that FERA consistently outperforms both federated training and training-free baselines, achieving progressively higher accuracy across rounds while maintaining communication and computational efficiency.
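The round structure described in the abstract can be illustrated with a small toy sketch. This is not the paper's UA-SCA: the abstract does not specify the weighting rule or the verification procedure, so the sketch assumes a softmax over negative uncertainty as the trust weight and a weighted vote over final answers, and it omits cross-client verification and step revision entirely. All names (`ClientFn`, `trust_weights`, `aggregate`, `co_refine`, `temperature`) are hypothetical.

```python
import math
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical client interface (an assumption, not from the paper):
# (query, shared_context) -> (answer, uncertainty), lower uncertainty = more confident.
ClientFn = Callable[[str, Optional[str]], Tuple[str, float]]


def trust_weights(uncertainties: List[float], temperature: float = 1.0) -> List[float]:
    """Assumed weighting rule: softmax over negative uncertainty."""
    scores = [-u / temperature for u in uncertainties]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(responses: List[Tuple[str, float]]) -> str:
    """Return the answer whose supporters carry the largest total trust weight."""
    weights = trust_weights([u for _, u in responses])
    totals: Dict[str, float] = defaultdict(float)
    for (answer, _), weight in zip(responses, weights):
        totals[answer] += weight
    return max(totals, key=lambda a: totals[a])


def co_refine(query: str, clients: List[ClientFn], rounds: int = 3) -> Optional[str]:
    """Toy server loop: collect client outputs, aggregate, redistribute as context."""
    context: Optional[str] = None
    for _ in range(rounds):
        responses = [client(query, context) for client in clients]
        context = aggregate(responses)  # fed back to clients in the next round
    return context
```

In this toy setup, a single low-uncertainty client can outweigh two high-uncertainty clients that agree with each other, which is the behaviour that separates trust weighting from a plain majority vote. Because the weights are recomputed per query, the same client can dominate on one query and be down-weighted on another.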

Problem

Research questions and friction points this paper is trying to address.

federated reasoning

large language models

uncertainty-aware

private demonstrations

query-dependent reliability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated Reasoning

Uncertainty Estimation

Training-Free Framework