🤖 AI Summary
To address the low efficiency and high latency of multi-task federated fine-tuning in edge-assisted Internet of Vehicles (IoV) systems, caused by high vehicle mobility, resource heterogeneity, intermittent connectivity, and energy constraints, this paper proposes a hierarchical federated fine-tuning framework in which roadside units (RSUs) and vehicles collaborate on multi-task Low-Rank Adaptation (LoRA). We design a decentralized, energy-aware rank scheduling mechanism, formulate it as a constrained multi-armed bandit problem, and propose the UCB-DUAL algorithm to balance exploration and exploitation under dynamic per-task energy constraints while achieving sublinear regret guarantees. Extensive evaluations on a large-scale IoV simulation platform driven by real-world trajectories show that our approach improves average accuracy by over 2.5% and reduces end-to-end latency by 24% compared with state-of-the-art baselines, significantly improving the accuracy–energy-efficiency trade-off.
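The summary does not spell out the exact UCB-DUAL update rules, but a common way to combine a UCB index with a per-task energy budget is a primal-dual scheme: score each candidate rank by its optimistic reward estimate penalized by a Lagrange multiplier on energy, then update the multiplier by projected gradient ascent on the budget violation. The Python sketch below is a minimal illustration under that assumption; the arm set of candidate LoRA ranks, the reward and energy models, the confidence bonus, and the step sizes are all hypothetical and not taken from the paper.

```python
# Hedged sketch of a UCB-style rank scheduler with a dual variable enforcing
# an energy budget. Reward/cost models and step sizes are illustrative
# assumptions, not the authors' UCB-DUAL implementation.
import math
import random


class UCBDualRankScheduler:
    def __init__(self, ranks, energy_budget, dual_step=0.05):
        self.ranks = ranks                      # candidate LoRA ranks (arms)
        self.energy_budget = energy_budget      # assumed per-round energy budget
        self.dual_step = dual_step              # assumed dual-ascent step size
        self.counts = {r: 0 for r in ranks}     # pulls per arm
        self.reward_sum = {r: 0.0 for r in ranks}
        self.cost_sum = {r: 0.0 for r in ranks}
        self.dual = 0.0                         # Lagrange multiplier on energy
        self.t = 0

    def select_rank(self):
        """Pick the rank maximizing a UCB index on (reward - dual * energy)."""
        self.t += 1
        for r in self.ranks:                    # play each arm once first
            if self.counts[r] == 0:
                return r

        def index(r):
            n = self.counts[r]
            mean_reward = self.reward_sum[r] / n
            mean_cost = self.cost_sum[r] / n
            bonus = math.sqrt(2.0 * math.log(self.t) / n)
            return (mean_reward - self.dual * mean_cost) + bonus

        return max(self.ranks, key=index)

    def update(self, rank, reward, energy_cost):
        """Record observed accuracy gain and energy, then take a dual step."""
        self.counts[rank] += 1
        self.reward_sum[rank] += reward
        self.cost_sum[rank] += energy_cost
        # Projected gradient ascent on the budget violation.
        self.dual = max(0.0, self.dual + self.dual_step * (energy_cost - self.energy_budget))


# Toy usage with synthetic feedback: larger ranks give higher accuracy gains
# but cost more energy (purely illustrative numbers).
if __name__ == "__main__":
    scheduler = UCBDualRankScheduler(ranks=[2, 4, 8, 16], energy_budget=1.0)
    for _ in range(500):
        r = scheduler.select_rank()
        reward = 0.5 + 0.05 * math.log2(r) + random.gauss(0, 0.05)
        energy = 0.2 * math.log2(r) + random.gauss(0, 0.02)
        scheduler.update(r, reward, energy)
```

The dual variable lets the scheduler trade accuracy against energy adaptively: when recent choices overspend the budget, the multiplier grows and cheaper (lower-rank) arms become more attractive.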
📝 Abstract
Federated fine-tuning has emerged as a promising approach for adapting foundation models (FMs) to diverse downstream tasks in edge environments. In Internet of Vehicles (IoV) systems, enabling efficient and low-latency multi-task adaptation is particularly challenging due to client mobility, heterogeneous resources, and intermittent connectivity. This paper proposes a hierarchical federated fine-tuning framework that coordinates roadside units (RSUs) and vehicles to support resource-aware and mobility-resilient learning across dynamic IoV scenarios. Leveraging Low-Rank Adaptation (LoRA), we introduce a decentralized, energy-aware rank adaptation mechanism formulated as a constrained multi-armed bandit problem. A novel UCB-DUAL algorithm is developed to enable adaptive exploration under per-task energy budgets, achieving provable sublinear regret. To evaluate our method, we construct a large-scale IoV simulator based on real-world trajectories, capturing dynamic participation, RSU handoffs, and communication variability. Extensive experiments show that our approach achieves the best accuracy-efficiency trade-off among all baselines, reducing latency by over 24% and improving average accuracy by more than 2.5%.
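For concreteness, the sketch below shows a standard LoRA adapter wrapped around a frozen linear layer and how the chosen rank determines the number of trainable parameters a vehicle would fine-tune and upload to its RSU. The class, argument names, and dimensions are illustrative assumptions; the paper's per-task adapter layout is not specified in the abstract.

```python
# Minimal LoRA sketch: the pretrained weight is frozen and only the low-rank
# factors A and B are trained, so the rank r directly controls on-vehicle
# compute and the size of the adapter sent to the RSU.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():        # freeze the pretrained weights
            p.requires_grad = False
        self.rank = rank
        self.scaling = alpha / rank
        # Low-rank update W + (alpha/r) * B @ A; only A and B are trainable.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)


# A smaller rank means fewer trainable parameters, hence less local compute
# and a smaller adapter update to transmit over the vehicle-RSU link.
if __name__ == "__main__":
    base = nn.Linear(768, 768)
    for r in (2, 8, 32):
        lora = LoRALinear(base, rank=r)
        n_trainable = sum(p.numel() for p in lora.parameters() if p.requires_grad)
        print(f"rank={r:>2}: trainable adapter params = {n_trainable}")
```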