RCR-AF: Enhancing Model Generalization via Rademacher Complexity Reduction Activation Function

📅 2025-07-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deep neural networks are vulnerable to adversarial attacks, posing serious risks in security-critical applications. This paper identifies the activation function as a key yet underexplored lever for model robustness and proposes RCR-AF, a novel activation function grounded in Rademacher complexity theory, the first such integration in activation design. RCR-AF combines the smoothness of GELU with the monotonicity of ReLU and introduces a dual-parameter (α, γ) clipping mechanism that explicitly controls model capacity, sparsity, and generalization. Evaluated under both standard and adversarial training across CIFAR-10/100 and ImageNet, on architectures including ResNet and ViT, RCR-AF consistently outperforms ReLU, GELU, and Swish, maintaining high clean accuracy while significantly improving robustness against strong adversarial attacks such as PGD and AutoAttack.
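The paper's closed-form definition of RCR-AF is not reproduced in this summary, so the following is a minimal illustrative sketch only, assuming a GELU-based response with an input scale α and a hard upper clip at γ. The class name `RCRAF` and the default values are placeholders, not the authors' implementation; note also that plain GELU is very slightly non-monotone for negative inputs, whereas the actual RCR-AF is designed to be monotone.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RCRAF(nn.Module):
    """Hypothetical RCR-AF-style activation: smooth base + (alpha, gamma) clipping.

    A sketch of the mechanism described in the summary, not the authors'
    exact formula.
    """
    def __init__(self, alpha: float = 1.0, gamma: float = 6.0):
        super().__init__()
        # Input scale: larger alpha drives negative pre-activations toward
        # zero response faster, increasing sparsity.
        self.alpha = alpha
        # Upper clip: bounds the output range, limiting effective capacity.
        self.gamma = gamma

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Smooth GELU-style response on the scaled input, retaining some
        # negative information...
        y = F.gelu(self.alpha * x)
        # ...then a hard ceiling at gamma, which bounds the function class
        # (the lever the paper ties to Rademacher complexity).
        return torch.clamp(y, max=self.gamma)
```

In this toy version, shrinking γ tightens the output range (lower capacity) while growing α pushes more units into the near-zero regime (higher sparsity), mirroring the roles the summary attributes to the two hyperparameters.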

📝 Abstract
Despite their widespread success, deep neural networks remain critically vulnerable to adversarial attacks, posing significant risks in safety-sensitive applications. This paper investigates activation functions as a crucial yet underexplored component for enhancing model robustness. We propose a Rademacher Complexity Reduction Activation Function (RCR-AF), a novel activation function designed to improve both generalization and adversarial resilience. RCR-AF uniquely combines the advantages of GELU (including smoothness, gradient stability, and negative information retention) with ReLU's desirable monotonicity, while simultaneously controlling both model sparsity and capacity through built-in clipping mechanisms governed by two hyperparameters, $α$ and $γ$. Our theoretical analysis, grounded in Rademacher complexity, demonstrates that these parameters directly modulate the model's Rademacher complexity, offering a principled approach to enhance robustness. Comprehensive empirical evaluations show that RCR-AF consistently outperforms widely-used alternatives (ReLU, GELU, and Swish) in both clean accuracy under standard training and in adversarial robustness within adversarial training paradigms.
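For context, the empirical Rademacher complexity that the analysis builds on is the standard textbook quantity (this definition is general, not specific to the paper):

$$
\hat{\mathfrak{R}}_S(\mathcal{F}) = \mathbb{E}_{\sigma}\!\left[\, \sup_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \sigma_i\, f(x_i) \right],
$$

where $\sigma_1, \dots, \sigma_n$ are i.i.d. signs uniform on $\{-1, +1\}$ and $S = \{x_1, \dots, x_n\}$ is the sample. Smaller complexity yields tighter generalization bounds, which is why shrinking it via $α$ and $γ$ is a principled route to robustness.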
Problem

Research questions and friction points this paper is trying to address.

Enhancing model generalization against adversarial attacks
Designing activation functions for improved robustness
Controlling model sparsity and capacity via hyperparameters
Innovation

Methods, ideas, or system contributions that make the work stand out.

RCR-AF combines GELU and ReLU advantages
Built-in clipping controls sparsity and capacity
Hyperparameters α and γ provably reduce Rademacher complexity (a usage sketch follows this list)
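One plausible way to trial such an activation is to swap it into an off-the-shelf ResNet, as in the paper's ResNet experiments. The helper below and the (α, γ) values are illustrative, and `RCRAF` refers to the hypothetical module sketched earlier.

```python
import torch.nn as nn
from torchvision.models import resnet18

def replace_relu(module: nn.Module, make_act) -> None:
    """Recursively replace every nn.ReLU in `module` with a fresh activation."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, make_act())
        else:
            replace_relu(child, make_act)

# CIFAR-10-sized head; the (alpha, gamma) values are placeholders, not tuned.
model = resnet18(num_classes=10)
replace_relu(model, lambda: RCRAF(alpha=1.0, gamma=6.0))
```

After the swap, the model trains as usual under standard or adversarial training; only the activation modules differ from the stock network.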
Authors
Yunrui Yu (Tsinghua University)
Kafeng Wang (Tsinghua University)
Hang Su (Computer Science, Tsinghua University)
Jun Zhu (Computer Science, Tsinghua University)