A Survey on Federated Analytics: Taxonomy, Enabling Techniques, Applications and Open Issues

📅 2024-04-19
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Problem: The surge in edge data and growing privacy concerns make conventional centralized data analytics unsustainable. Method: Federated analytics (FA) is a paradigm for collaborative statistical analysis, frequency computation, database querying, FL-assisting tasks, and wireless network applications, all without sharing raw data. The paper presents a comprehensive FA taxonomy and delineates its conceptual boundaries from related paradigms such as federated learning. Drawing on privacy-preserving computation techniques, including differential privacy, secure multi-party computation, and trusted execution environments, it synthesizes core methodologies and practical bottlenecks from over 100 works. Contribution/Results: The study distills key technical challenges and open research questions and outlines future directions for FA, serving as a reference for both academic research and industrial deployment.

📝 Abstract
The escalating influx of data generated by networked edge devices, coupled with the growing awareness of data privacy, has restricted the traditional data analytics workflow, where the edge data are gathered by a centralized server to be further utilized by data analysts. To continue leveraging vast edge data to support various data-intensive applications, computing paradigms have promoted a transformative shift from centralized data processing to privacy-preserving distributed data processing. The need to perform data analytics on private edge data motivates federated analytics (FA), an emerging technique to support collaborative data analytics among diverse data owners without centralizing the raw data. Despite the wide applications of FA in industry and academia, a comprehensive examination of existing research efforts in FA has been notably absent. This survey aims to bridge this gap by first providing an overview of FA, elucidating key concepts, and discussing its relationship with similar concepts. We then thoroughly examine FA, including its key challenges, taxonomy, and enabling techniques. Diverse FA applications, including statistical metrics, frequency-related applications, database query operations, FL-assisting FA tasks, and other wireless network applications, are then carefully reviewed. We complete the survey with several open research issues, future directions, and a comprehensive lessons-learned section. This survey intends to provide a holistic understanding of the emerging FA techniques and foster the continued evolution of privacy-preserving distributed data processing in the emerging networked society.

Problem

Research questions and friction points this paper is trying to address.

Surveying federated analytics techniques for privacy-preserving data processing
Examining challenges and taxonomy in federated analytics applications
Identifying open research issues in distributed collaborative data analytics

Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated analytics enables privacy-preserving distributed data processing
FA supports collaborative analytics without centralizing raw data
Survey covers FA taxonomy, techniques, applications, and challenges
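
As a rough illustration of what "collaborative analytics without centralizing raw data" can look like for a statistical metric, the sketch below has each data owner reduce its values to three aggregate numbers, which a server then combines into a global mean and variance. This is a minimal sketch under stated assumptions, not a method taken from the survey: the names `LocalSummary`, `summarize_locally`, and `combine`, and the sample data, are all hypothetical. It also omits the protections the summary mentions (differential privacy, secure multi-party computation, trusted execution environments), so the shared aggregates could still leak information in practice.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class LocalSummary:
    """Aggregate statistics a client shares instead of its raw values (hypothetical type)."""
    count: int
    total: float
    total_sq: float


def summarize_locally(values: Iterable[float]) -> LocalSummary:
    """Runs on the client: reduce raw data to count, sum, and sum of squares."""
    vals = list(values)
    return LocalSummary(
        count=len(vals),
        total=sum(vals),
        total_sq=sum(v * v for v in vals),
    )


def combine(summaries: List[LocalSummary]) -> Tuple[float, float]:
    """Runs on the server: global mean and population variance from summaries only."""
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no data reported by any client")
    mean = sum(s.total for s in summaries) / n
    variance = sum(s.total_sq for s in summaries) / n - mean * mean
    # Clamp tiny negative values caused by floating-point rounding.
    return mean, max(variance, 0.0)


# Hypothetical per-client data; in this sketch it never leaves the client.
client_data = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
summaries = [summarize_locally(d) for d in client_data]
mean, variance = combine(summaries)
```

The server sees only one `LocalSummary` per client, yet the combined result matches what a centralized computation over all six values would give.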
Zibo Wang
UM-SJTU Joint Institute, Shanghai Jiao Tong University, Shanghai, China
Haichao Ji
UM-SJTU Joint Institute, Shanghai Jiao Tong University, Shanghai, China
Yifei Zhu
Shanghai Jiao Tong University
Edge computing, multimedia networking, distributed ML systems
Dan Wang
Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China
Zhu Han
Department of Electrical and Computer Engineering, University of Houston, Houston, USA