FedMobileAgent: Training Mobile Agents Using Decentralized Self-Sourced Data from Diverse Users

📅 2025-02-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper proposes FedMobileAgent, a decentralized collaborative framework that addresses three obstacles to training mobile agents: scarce high-quality instruction data, privacy sensitivity, and non-independent-and-identically-distributed (non-IID) user data, which degrades federated learning performance. The framework automatically mines instruction data from users' daily smartphone interactions through Auto-Annotation, an automatic labeling technique that requires no human intervention. To mitigate non-IID effects, it uses an adapted aggregation mechanism that incorporates both episode-level and step-level distributions. Furthermore, differential privacy and secure aggregation are integrated so that raw data never leaves local devices. Experiments show that the approach performs on par with centralized, human-annotated models while preserving privacy, at less than 0.02% of the cost of the centralized baseline. The work presents a federated training paradigm for mobile agents that combines high data quality, very low cost, and strong privacy guarantees.

📝 Abstract

The advancement of mobile agents has opened new opportunities for automating tasks on mobile devices. Training these agents requires large-scale, high-quality data, which is costly to collect with human labor. Given the vast number of mobile phone users worldwide, if automated data collection from them were feasible, the resulting data volume and the mobile agents trained on it could reach unprecedented levels. Nevertheless, two major challenges arise: (1) extracting high-level and low-level user instructions without involving humans and (2) utilizing distributed data from diverse users while preserving privacy. To tackle these challenges, we propose FedMobileAgent, a collaborative framework that trains mobile agents using self-sourced data from diverse users. Specifically, it includes two techniques. First, we propose Auto-Annotation, which enables the automatic collection of high-quality datasets during users' routine phone usage at minimal cost. Second, we introduce adapted aggregation to improve federated training of mobile agents on non-IID user data by incorporating both episode- and step-level distributions. In distributed settings, FedMobileAgent achieves performance comparable to centralized human-annotated models at less than 0.02% of the cost, highlighting its potential for real-world applications.
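The abstract does not give the formula behind adapted aggregation, so the following is only a minimal sketch of one way a server could weight client updates by both episode-level and step-level data shares. The function names `mixing_coefficients` and `aggregate`, the blend parameter `lam`, and the convex-combination form are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List

# A client's model update: parameter name -> flat list of values (hypothetical layout).
Weights = Dict[str, List[float]]


def mixing_coefficients(episodes: List[int], steps: List[int], lam: float = 0.5) -> List[float]:
    """Blend each client's share of episodes with its share of steps.

    `lam` is a hypothetical knob: 1.0 weights by episode count only,
    0.0 weights by step count only. The result sums to 1.
    """
    if not episodes or len(episodes) != len(steps):
        raise ValueError("episodes and steps must be non-empty and equally long")
    total_episodes = sum(episodes)
    total_steps = sum(steps)
    if total_episodes <= 0 or total_steps <= 0:
        raise ValueError("total episode and step counts must be positive")
    return [
        lam * e / total_episodes + (1.0 - lam) * s / total_steps
        for e, s in zip(episodes, steps)
    ]


def aggregate(client_weights: List[Weights], coeffs: List[float]) -> Weights:
    """Weighted average of client parameters using the given coefficients."""
    merged: Weights = {}
    for name, first in client_weights[0].items():
        merged[name] = [
            sum(c * w[name][i] for c, w in zip(coeffs, client_weights))
            for i in range(len(first))
        ]
    return merged


# Two clients: the second has more episodes, but both have the same number of steps.
coeffs = mixing_coefficients(episodes=[10, 30], steps=[100, 100], lam=0.5)
global_weights = aggregate([{"w": [0.0, 1.0]}, {"w": [1.0, 1.0]}], coeffs)
```

With episode-only weighting the second client would receive a coefficient of 0.75; blending in the equal step shares pulls it toward 0.5, which illustrates why counting at both levels can matter when users produce episodes of very different lengths.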

Problem

Research questions and friction points this paper is trying to address.

Automating data collection for mobile agents

Training agents with decentralized user data

Ensuring privacy in distributed data usage

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decentralized self-sourced data collection

Auto-Annotation for automatic dataset generation

Adapted aggregation for federated training

🔎 Similar Papers

Benchmarking Mobile Device Control Agents across Diverse Configurations