LipNeXt: Scaling up Lipschitz-based Certified Robustness to Billion-parameter Models

📅 2026-01-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing Lipschitz-based certification methods struggle to scale in terms of model size, training efficiency, and ImageNet performance. This work proposes LipNeXt—the first constraint-free, convolution-free 1-Lipschitz architecture—enabling efficient deterministic robustness certification through orthogonal manifold optimization, a Spatial Shift module, a β-Abs activation, L2 pooling, and orthogonal projection. LipNeXt supports models at the billion-parameter scale while maintaining training stability under low-precision arithmetic. It achieves state-of-the-art clean and certified accuracy on CIFAR-10/100 and Tiny-ImageNet, and notably advances certified robust accuracy (CRA) on ImageNet by up to 8% at ε=1, marking the first successful extension of certified robustness to 1–2-billion-parameter models.
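The "orthogonal manifold optimization" the summary mentions can be illustrated with a standard Riemannian SGD step: project the Euclidean gradient onto the tangent space of the orthogonal manifold, step, then retract with a QR decomposition. This is a generic sketch of the technique, not the paper's exact procedure; the function name and learning-rate choice are illustrative.

```python
import numpy as np

def riemannian_sgd_step(W, grad, lr=0.1):
    """One Riemannian SGD step on the orthogonal manifold O(n).

    Projects the Euclidean gradient onto the tangent space at W
    (skew-symmetric part of W^T grad), takes a step, then retracts
    back onto the manifold with a QR decomposition. The weight
    stays exactly orthogonal, so the linear map x -> W @ x is
    1-Lipschitz by construction.
    """
    A = W.T @ grad
    rgrad = W @ (A - A.T) / 2.0           # tangent-space projection
    Q, R = np.linalg.qr(W - lr * rgrad)   # Euclidean step + QR retraction
    Q = Q * np.sign(np.diag(R))           # sign fix: make the retraction unique
    return Q
```

Because the update never leaves the manifold, no spectral-norm penalty or post-hoc projection of the whole network is needed to preserve the Lipschitz bound during training.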

📝 Abstract
Lipschitz-based certification offers efficient, deterministic robustness guarantees but has struggled to scale in model size, training efficiency, and ImageNet performance. We introduce \emph{LipNeXt}, the first \emph{constraint-free} and \emph{convolution-free} 1-Lipschitz architecture for certified robustness. LipNeXt is built on two techniques: (1) a manifold optimization procedure that updates parameters directly on the orthogonal manifold and (2) a \emph{Spatial Shift Module} that models spatial patterns without convolutions. The full network uses orthogonal projections, spatial shifts, a simple 1-Lipschitz $\beta$-Abs nonlinearity, and $L_2$ spatial pooling to maintain tight Lipschitz control while enabling expressive feature mixing. Across CIFAR-10/100 and Tiny-ImageNet, LipNeXt achieves state-of-the-art clean and certified robust accuracy (CRA), and on ImageNet it scales to 1--2B-parameter models, improving CRA over prior Lipschitz models (e.g., up to $+8\%$ at $\varepsilon{=}1$) while retaining efficient, stable low-precision training. These results demonstrate that Lipschitz-based certification can benefit from modern scaling trends without sacrificing determinism or efficiency.
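The building blocks the abstract names all preserve a Lipschitz bound of 1 under the $L_2$ norm, which is what makes the end-to-end certificate deterministic. A minimal numpy sketch of three of them — a channel-group spatial shift, a $\beta$-Abs-style nonlinearity, and $L_2$ pooling — under assumed definitions (the paper's exact parameterizations may differ, e.g. the four-way shift layout and the $\beta$ blend below are illustrative):

```python
import numpy as np

def spatial_shift(x):
    """Shift four channel groups in different directions (zero-padded).

    A shift is a partial permutation of entries, so it never increases
    the L2 norm of a perturbation -- hence 1-Lipschitz.
    x: array of shape (C, H, W) with C divisible by 4 (illustrative layout).
    """
    y = np.zeros_like(x)
    c = x.shape[0] // 4
    y[:c, 1:, :] = x[:c, :-1, :]              # shift down
    y[c:2*c, :-1, :] = x[c:2*c, 1:, :]        # shift up
    y[2*c:3*c, :, 1:] = x[2*c:3*c, :, :-1]    # shift right
    y[3*c:, :, :-1] = x[3*c:, :, 1:]          # shift left
    return y

def beta_abs(x, beta=0.5):
    """Illustrative beta-Abs: a blend of identity and |x|.

    For beta in [0, 1] the slope lies in [-1, 1] everywhere,
    so the map is 1-Lipschitz.
    """
    return beta * np.abs(x) + (1.0 - beta) * x

def l2_pool(x, k=2):
    """L2 pooling over non-overlapping k x k windows.

    Taking the L2 norm of disjoint blocks is 1-Lipschitz
    (reverse triangle inequality per block, then sum of squares).
    """
    C, H, W = x.shape
    x = x[:, :H - H % k, :W - W % k]
    blocks = x.reshape(C, H // k, k, W // k, k)
    return np.sqrt((blocks ** 2).sum(axis=(2, 4)))
```

Since each stage is 1-Lipschitz and orthogonal linear layers preserve norms exactly, the composition has Lipschitz constant at most 1, and a margin of $m$ at the logits certifies robustness within an $L_2$ ball of radius $m/\sqrt{2}$ in the standard way.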

Problem

Research questions and friction points this paper is trying to address.

Lipschitz-based certification
scalability
certified robustness
large models
ImageNet performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lipschitz certification
orthogonal manifold optimization
convolution-free architecture
Spatial Shift Module
scalable robustness