On the Optimality of Hierarchical Secure Aggregation with Arbitrary Heterogeneous Data Assignment

📅 2026-04-14

🤖 AI Summary
This work addresses secure aggregation in three-layer hierarchical networks (users, relays, and a server) under arbitrary heterogeneous data assignment, unpredictable user dropouts, and potential collusion between the server or any relay and a subset of users. The paper proposes an information-theoretically secure aggregation scheme spanning both communication layers: users mask their local gradients before sending them to their relays, and the relays aggregate and forward the results so that the server can recover only the global sum while individual contributions remain perfectly concealed. The summary describes this approach as the first to simultaneously guarantee security against both the server and the relays under heterogeneous data assignment and user dropouts, while achieving the information-theoretically optimal communication load on both the user-to-relay and relay-to-server layers. By combining secure coding, masked transmission, and a hierarchical aggregation architecture, the scheme attains these optimal loads without weakening the security guarantees, providing theoretical foundations for federated learning with heterogeneous data.

📝 Abstract
This paper studies the information-theoretic secure aggregation problem in a three-layer hierarchical network, where clustered users communicate with an aggregation server through an intermediate layer of relays. We consider a general setting with arbitrary heterogeneous data assignment across users, where "arbitrary" means that the data assignment is given in advance and "heterogeneous" means that the users may hold different numbers of datasets. Each user locally computes partially aggregated gradients from its assigned datasets as its input and transmits a masked version of this input to its associated relay. The relays then forward aggregated messages to the server, which aims to recover the sum of the gradients. Some users may drop out unpredictably during this process, and the server must still correctly recover the desired aggregation from the surviving users. Moreover, the server or any relay may collude with a subset of users. We impose the following security constraints: (i) server security, requiring that the server learn only the sum of gradients and gain no additional information about individual inputs; and (ii) relay security, requiring that each relay learn nothing about users' inputs. Under these constraints, we propose an aggregation scheme that guarantees information-theoretic security and achieves the optimal communication loads on both layers.
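The masked user-to-relay-to-server flow described in the abstract can be illustrated with a minimal toy sketch. This is not the paper's scheme: it assumes no dropouts and no collusion, uses plain zero-sum additive masks over a prime field, and makes no claim about communication optimality. The modulus `P` and the helper names `zero_sum_masks`, `add_vec`, `aggregate`, and `run_round` are hypothetical and chosen only for illustration.

```python
import secrets

P = 2**31 - 1  # prime modulus of the toy finite field (an assumption)


def zero_sum_masks(num_users, dim):
    """Draw one mask per user, random except that all masks sum to zero mod P."""
    masks = [[secrets.randbelow(P) for _ in range(dim)]
             for _ in range(num_users - 1)]
    # The last mask cancels the sum of all the others, coordinate by coordinate.
    last = [(-sum(m[j] for m in masks)) % P for j in range(dim)]
    return masks + [last]


def add_vec(a, b):
    """Coordinate-wise addition mod P."""
    return [(x + y) % P for x, y in zip(a, b)]


def aggregate(vectors, dim):
    """Sum a list of vectors mod P."""
    total = [0] * dim
    for v in vectors:
        total = add_vec(total, v)
    return total


def run_round(clusters, dim):
    """clusters: one list of user input vectors per relay.

    Each user sends input + mask to its relay, each relay sums what it
    receives, and the server sums the relay messages. Because the masks
    sum to zero, the server ends up with the sum of all inputs.
    """
    num_users = sum(len(c) for c in clusters)
    masks = iter(zero_sum_masks(num_users, dim))
    relay_msgs = []
    for cluster in clusters:
        masked = [add_vec(x, next(masks)) for x in cluster]  # user -> relay
        relay_msgs.append(aggregate(masked, dim))            # relay -> server
    return aggregate(relay_msgs, dim)                        # server output


# Two relays: the first serves two users, the second serves one.
clusters = [[[1, 2], [3, 4]], [[5, 6]]]
recovered = run_round(clusters, dim=2)
print(recovered)  # [9, 12]
```

In this toy version the masks are generated in one place for brevity; a real protocol would need a key-distribution step and additional structure to tolerate dropouts and collusion, which is the harder problem the paper addresses.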
Problem

Research questions and friction points this paper is trying to address.

secure aggregation
hierarchical network
heterogeneous data assignment
information-theoretic security
user dropout
Innovation

Methods, ideas, or system contributions that make the work stand out.

secure aggregation
hierarchical network
heterogeneous data assignment
information-theoretic security
optimal communication load