Identifying Heterogeneity in Distributed Learning

📅 2025-06-19
🤖 AI Summary
This paper addresses the identification of heterogeneous parameter components in distributed M-estimation. It proposes a communication-efficient framework that combines a re-normalized Wald test with an extreme contrast test (ECT). The ECT uses sample splitting to avoid the bias that accumulates from the M-estimation procedures, and it remains consistent under sparse heterogeneity even when the number of data blocks $K$ is much larger than the local sample size. The re-normalized Wald test is consistent under dense heterogeneity, provided $K$ is of smaller order than the minimum block sample size. Combining the two tests gives more robust power across sparsity levels. Numerical experiments compare the family-wise error rate (FWER) and the power of the proposed methods, and a case study illustrates their implementation and validity.

📝 Abstract
We study methods for identifying heterogeneous parameter components in distributed M-estimation with minimal data transmission. The first is based on a re-normalized Wald test, which is shown to be consistent as long as the number of distributed data blocks $K$ is of smaller order than the minimum block sample size and the heterogeneity is dense. The second is an extreme contrast test (ECT) based on the difference between the largest and smallest component-wise estimated parameters among the data blocks. By introducing a sample splitting procedure, the ECT avoids the bias accumulation arising from the M-estimation procedures and is consistent when $K$ is much larger than the sample size and the heterogeneity is sparse. The ECT procedure is easy to operate and communication-efficient. A combination of the Wald and extreme contrast tests is formulated to attain more robust power under varying levels of sparsity of the heterogeneity. We also conduct extensive numerical experiments to compare the family-wise error rate (FWER) and the power of the proposed methods. Additionally, we conduct a case study to demonstrate the implementation and validity of the proposed methods.
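To make the extreme-contrast idea concrete, here is a minimal sketch for a single scalar component, assuming the simplest M-estimator (the sample mean) and one plausible reading of the sample-splitting step: each block uses one half of its data to decide which blocks look largest and smallest, and the other half to estimate the contrast between them. The function name `extreme_contrast_test`, the half-and-half split, and the normal approximation are illustrative assumptions, not the paper's exact statistic.

```python
import math
import numpy as np

def extreme_contrast_test(blocks, alpha=0.05):
    """Sketch of an extreme contrast test for one scalar parameter.

    blocks: list of 1-D arrays, one per data block (at least 4 points each).
    Assumption: the local M-estimator is the sample mean.
    """
    sel_means, est_means, est_vars = [], [], []
    for x in blocks:
        x = np.asarray(x, dtype=float)
        half = x.size // 2
        sel, est = x[:half], x[half:]            # selection half, estimation half
        sel_means.append(sel.mean())
        est_means.append(est.mean())
        est_vars.append(est.var(ddof=1) / est.size)  # variance of the local mean

    # Pick the extreme blocks using the selection halves only.
    k_max = int(np.argmax(sel_means))
    k_min = int(np.argmin(sel_means))

    # Estimate the contrast on the independent halves, so the selection
    # step does not bias the contrast under homogeneity.
    contrast = est_means[k_max] - est_means[k_min]
    se = math.sqrt(est_vars[k_max] + est_vars[k_min])
    z = contrast / se

    # One-sided upper-tail p-value from the standard normal approximation.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return float(z), p_value, p_value < alpha
```

In this sketch each block only needs to send three numbers (two half-sample means and one variance estimate) to the coordinator, which is in the spirit of the communication efficiency the abstract describes.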
Problem

Research questions and friction points this paper is trying to address.

Identify heterogeneous parameters in distributed M-estimation
Achieve consistent testing with minimal data transmission
Combine Wald and extreme contrast tests for robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Renormalized Wald test for dense heterogeneity
Extreme contrast test for sparse heterogeneity
Combined Wald and ECT for robust power
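The two-test combination listed above can be illustrated with a short sketch. It assumes a Cochran-Q-style Wald statistic for one scalar component, centered and scaled by its approximate chi-square mean and standard deviation so that it stays comparable as $K$ grows, and it assumes a Bonferroni rule for merging the Wald and ECT p-values. The names `renormalized_wald_test` and `combine_bonferroni` are hypothetical, and the paper's actual normalization and combination rule may differ.

```python
import math
import numpy as np

def renormalized_wald_test(estimates, variances, alpha=0.05):
    """Sketch: standardized Cochran-Q-type statistic across K blocks.

    estimates: local estimates of one scalar parameter, one per block.
    variances: estimated variances of those local estimates.
    """
    est = np.asarray(estimates, dtype=float)
    var = np.asarray(variances, dtype=float)
    w = 1.0 / var
    pooled = np.sum(w * est) / np.sum(w)      # precision-weighted pooled estimate
    wald = np.sum(w * (est - pooled) ** 2)    # roughly chi-square with K-1 df under homogeneity
    df = est.size - 1
    t = (wald - df) / math.sqrt(2.0 * df)     # center and scale by chi-square mean and sd
    p_value = 0.5 * math.erfc(t / math.sqrt(2))  # upper-tail normal approximation
    return float(t), p_value, p_value < alpha

def combine_bonferroni(p_wald, p_ect):
    """Assumed combination rule: Bonferroni-adjusted minimum of two p-values."""
    return min(1.0, 2.0 * min(p_wald, p_ect))
```

The intuition behind pairing the tests is that a sum-of-squares statistic gains power when many blocks deviate a little (dense heterogeneity), while a max-minus-min contrast gains power when a few blocks deviate a lot (sparse heterogeneity); the Bonferroni merge keeps the overall level at `alpha` at the cost of some conservativeness.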
Zelin Xiao
Center for Statistical Science, Peking University
Jia Gu
Center for Data Science, Zhejiang University
Song Xi Chen
Iowa State University and Peking University
Statistics and Econometrics