Temperature Scaling Attack Disrupting Model Confidence in Federated Learning

📅 2026-02-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work identifies a previously overlooked threat in federated learning: the integrity of prediction confidence calibration, which is critical for high-stakes decision-making yet ignored by existing attacks that target model accuracy or implant backdoors. The paper introduces the Temperature Scaling Attack (TSA), the first method to explicitly compromise calibration fidelity, treating it as a new attack surface. TSA dynamically couples temperature scaling with the local learning rate during client training, significantly degrading calibration (increasing expected calibration error by up to 145%) while preserving model accuracy (within a 2% drop). Notably, under non-IID data distributions, TSA keeps malicious updates indistinguishable from benign client behavior, thereby evading mainstream defenses. In safety-critical applications such as healthcare and autonomous driving, TSA amplifies critical false negatives or false positives by up to 7.2× and remains effective against robust aggregation rules and post-hoc calibration techniques.
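
To make the mechanism concrete, the sketch below shows what such a calibration-poisoning client update could look like in PyTorch. It is a minimal illustration under stated assumptions, not the paper's implementation: the linear schedule `T = 1 + coupling * lr` and the names `tsa_local_update` and `coupling` are hypothetical, since the summary states only that the temperature is coupled to the local learning rate.

```python
import torch
import torch.nn.functional as F

def tsa_local_update(model, loader, lr, coupling=5.0, epochs=1):
    """Illustrative malicious client round: train on temperature-scaled
    logits so the uploaded update distorts the confidence the model learns.

    The linear coupling T = 1 + coupling * lr is a hypothetical schedule;
    the source states only that temperature and learning rate are coupled.
    """
    temperature = 1.0 + coupling * lr
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for inputs, targets in loader:
            optimizer.zero_grad()
            logits = model(inputs)
            # Dividing logits by T > 1 preserves their ordering, so the
            # predicted class (and hence accuracy) is unchanged, while the
            # softmax is flattened and the learned confidence is distorted.
            loss = F.cross_entropy(logits / temperature, targets)
            loss.backward()
            optimizer.step()
    return model.state_dict()  # uploaded like any benign client update
```

Because tempered cross-entropy only rescales and softens the gradient signal, the resulting updates stay close to benign optimization behavior, which is consistent with the indistinguishability claim above.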

📝 Abstract
Predictive confidence serves as a foundational control signal in mission-critical systems, directly governing risk-aware logic such as escalation, abstention, and conservative fallback. While prior federated learning attacks predominantly target accuracy or implant backdoors, we identify confidence calibration as a distinct attack objective. We present the Temperature Scaling Attack (TSA), a training-time attack that degrades calibration while preserving accuracy. By injecting temperature scaling with learning-rate-temperature coupling during local training, malicious updates maintain benign-like optimization behavior, evading accuracy-based monitoring and similarity-based detection. We provide a convergence analysis under non-IID settings, showing that this coupling preserves standard convergence bounds while systematically distorting confidence. Across three benchmarks, TSA substantially shifts calibration (e.g., a 145% error increase on CIFAR-100) with <2% accuracy change, and remains effective under robust aggregation and post-hoc calibration defenses. Case studies further show that confidence manipulation can cause up to 7.2× increases in missed critical cases (healthcare) or false alarms (autonomous driving), even when accuracy is unchanged. Overall, our results establish calibration integrity as a critical attack surface in federated learning.
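
For reference, the calibration metric behind figures like the 145% increase is expected calibration error (ECE). The implementation below is a standard textbook version, not code from the paper; the 15-bin default is a common convention rather than a detail of this work.

```python
import numpy as np

def expected_calibration_error(confidences, predictions, labels, n_bins=15):
    """ECE: bin samples by confidence (max softmax probability), then take
    the bin-size-weighted mean |accuracy - average confidence| gap."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if not in_bin.any():
            continue
        bin_acc = (predictions[in_bin] == labels[in_bin]).mean()
        bin_conf = confidences[in_bin].mean()
        ece += in_bin.mean() * abs(bin_conf - bin_acc)
    return ece
```

A well-calibrated model keeps this value near zero; the attack's reported effect is to inflate it while leaving top-1 predictions, and hence accuracy, nearly intact.
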
Problem

Research questions and friction points this paper is trying to address.

confidence calibration
federated learning
adversarial attack
model reliability
predictive uncertainty
Innovation

Methods, ideas, or system contributions that make the work stand out.

Temperature Scaling Attack
Confidence Calibration
Federated Learning
Training-time Attack
Non-IID Convergence

👥 Authors

Kichang Lee
Ph.D. Student, Yonsei University
Machine Learning · Deep Learning · Medical AI · Mobile Computing · Security

Jaeho Jin
College of Computing, Yonsei University

JaeYeon Park
Department of Mobile Systems Engineering, Dankook University

Songkuk Kim
College of Computing, Yonsei University

JeongGil Ko
Associate Professor, Yonsei University
Mobile Computing · Internet of Things · Intelligent Cyber Physical Systems · Wireless Healthcare · Low-Power Embedded Systems