Certifying the Right to Be Forgotten: Primal-Dual Optimization for Sample and Label Unlearning in Vertical Federated Learning

📅 2025-12-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the “right to be forgotten” for data subjects in vertical federated learning (VFL), this paper proposes FedORA—the first VFL unlearning framework leveraging primal-dual optimization. It formulates sample- and label-level unlearning as a constrained optimization problem. FedORA introduces a novel classification-uncertainty-based unlearning loss, coupled with asymmetric mini-batching and adaptive step sizing, enabling theoretically grounded approximation of full retraining performance. Experiments on tabular and image datasets demonstrate that FedORA matches retraining in both unlearning accuracy and model utility, while reducing communication overhead by up to 42% and computational cost by 37%, significantly outperforming existing VFL unlearning approaches.

📝 Abstract
Federated unlearning has become an attractive approach to addressing privacy concerns in collaborative machine learning, in situations where sensitive data is memorized by models during training. It enables the removal of specific data influences from trained models, aligning with the growing emphasis on the "right to be forgotten." While extensively studied in horizontal federated learning, unlearning in vertical federated learning (VFL) remains challenging due to the distributed feature architecture. VFL unlearning includes sample unlearning, which removes specific data points' influence, and label unlearning, which removes entire classes. Since different parties hold complementary features of the same samples, unlearning tasks require cross-party coordination, creating computational overhead and complexities from feature interdependencies. To address these challenges, we propose FedORA (Federated Optimization for data Removal via primal-dual Algorithm), designed for sample and label unlearning in VFL. FedORA formulates the removal of certain samples or labels as a constrained optimization problem solved using a primal-dual framework. Our approach introduces a new unlearning loss function that promotes classification uncertainty rather than misclassification. An adaptive step size enhances stability, while an asymmetric batch design, which accounts for the prior influence of the retained data on the model, handles unlearning and retained data differently to efficiently reduce computational costs. We provide theoretical analysis proving that the model difference between FedORA and Train-from-scratch is bounded, establishing guarantees for unlearning effectiveness. Experiments on tabular and image datasets demonstrate that FedORA achieves unlearning effectiveness and utility preservation comparable to Train-from-scratch with reduced computation and communication overhead.
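The abstract frames removal as a constrained optimization problem solved with a primal-dual method. One plausible way to write that down (the symbols D_r, D_f, ε, and λ are illustrative placeholders, not notation taken from the paper):

```latex
% Retained data D_r, forget set D_f; epsilon caps the residual
% influence of the forget set. All notation here is an assumption.
\min_{\theta} \; \mathcal{L}_{\mathrm{retain}}(\theta; D_r)
\quad \text{s.t.} \quad
\mathcal{L}_{\mathrm{unlearn}}(\theta; D_f) \le \epsilon

% The associated Lagrangian, solved by alternating primal descent
% in theta with dual ascent in the multiplier lambda >= 0:
L(\theta, \lambda) = \mathcal{L}_{\mathrm{retain}}(\theta; D_r)
  + \lambda \bigl( \mathcal{L}_{\mathrm{unlearn}}(\theta; D_f) - \epsilon \bigr)
```

Under this template, driving the dual variable upward penalizes models that still fit the forget set, while the primal step preserves utility on the retained data.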
Problem

Research questions and friction points this paper is trying to address.

Addresses sample and label unlearning challenges in vertical federated learning
Proposes a primal-dual optimization method to remove specific data influences efficiently
Ensures unlearning effectiveness with bounded model difference and reduced overhead
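The primal-dual method named above alternates a descent step on the Lagrangian with a projected ascent step on the multiplier. A minimal sketch of that template on a toy scalar problem (the functions `f` and `g` are stand-ins, not FedORA's actual losses):

```python
# Dual-ascent sketch for min_x f(x) s.t. g(x) <= 0 -- the general
# template a primal-dual unlearning formulation falls into.
def primal_dual(f_grad, g, g_grad, x0, steps=2000, eta_x=0.05, eta_lam=0.05):
    x, lam = x0, 0.0
    for _ in range(steps):
        # primal descent on the Lagrangian L(x, lam) = f(x) + lam * g(x)
        x -= eta_x * (f_grad(x) + lam * g_grad(x))
        # dual ascent, projected to keep the multiplier nonnegative
        lam = max(0.0, lam + eta_lam * g(x))
    return x, lam

# Toy instance: min x^2 subject to x >= 1, i.e. g(x) = 1 - x <= 0.
# The constrained optimum is x = 1 with multiplier lam = 2.
x_star, lam_star = primal_dual(
    f_grad=lambda x: 2 * x,
    g=lambda x: 1 - x,
    g_grad=lambda x: -1.0,
    x0=3.0,
)
print(round(x_star, 2))  # -> 1.0
```

The same alternation scales to model parameters in place of the scalar `x`; the paper additionally adapts the step size and batches the forget and retained sets asymmetrically, which this sketch omits.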
Innovation

Methods, ideas, or system contributions that make the work stand out.

Primal-dual optimization for sample and label unlearning
New unlearning loss promoting classification uncertainty
Asymmetric batch design reduces computational costs
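The "classification uncertainty" loss is described but not given in this summary. One common way to realize such a loss is to pull predictions on forget-set samples toward the uniform distribution (rather than toward a wrong class); the KL form below is an assumption for illustration, not the paper's exact definition:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def uncertainty_loss(logits):
    """KL(p || uniform): zero when the model is maximally uncertain.

    Minimizing this on the forget set pushes predictions toward
    "don't know" instead of deliberate misclassification. Treat the
    exact form as an assumption; FedORA's loss is not specified here.
    """
    p = softmax(logits)
    k = logits.shape[-1]
    return float(np.mean(np.sum(p * np.log(p * k + 1e-12), axis=-1)))

# A confident prediction incurs a large loss; near-uniform logits
# incur a near-zero loss.
confident = np.array([[8.0, 0.0, 0.0]])
uniform = np.array([[0.1, 0.1, 0.1]])
print(uncertainty_loss(confident) > uncertainty_loss(uniform))  # -> True
```

Targeting uncertainty rather than misclassification avoids actively corrupting the model on forget-set inputs, which is one reason utility on the retained data can be preserved.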
🔎 Similar Papers
2024-05-24 · International Joint Conference on Artificial Intelligence · Citations: 5
Yu Jiang
College of Computing and Data Science (CCDS), Nanyang Technological University, Singapore; Digital Trust Centre (DTC), Singapore
Xindi Tong
College of Computing and Data Science (CCDS), Nanyang Technological University, Singapore
Ziyao Liu
Digital Trust Centre (DTC), Singapore
Xiaoxi Zhang
School of Computer Science and Engineering, Sun Yat-sen University
Machine learning systems · Resource-efficient machine learning · Reinforcement learning
Kwok-Yan Lam
Nanyang Technological University
Cybersecurity · Privacy-Preserving Technologies · Digital Trust · Distributed Systems · LegalTech
Chee Wei Tan
Nanyang Technological University, Singapore
Networks · Distributed Optimization · Gen AI