Wasserstein Distributionally Robust Shallow Convex Neural Networks

📅 2024-07-23

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Problem: Data contamination and distributional shifts in domains such as energy systems degrade model robustness. Method: This paper proposes Wasserstein distributionally robust shallow convex neural networks (WaDiRo-SCNNs). Training of a shallow ReLU network is cast as a new convex program, which admits an exact, tractable reformulation of its order-1 Wasserstein distributionally robust optimization (DRO) counterpart. The framework allows hard convex physical constraints to be enforced during training and supports post-training stability verification via a mixed-integer convex program. Training is solvable with open-source solvers. Results: The method is evaluated on a synthetic experiment, on benchmark datasets, and on a real-world task, the prediction of non-residential buildings' hourly energy consumption in the context of virtual power plants. The paper provides out-of-sample performance guarantees and describes the training procedure as scalable to large industrial deployments.
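
To make the Wasserstein part concrete: for two one-dimensional empirical samples of equal size, the order-1 Wasserstein distance reduces to the mean absolute difference between the sorted samples. The sketch below is an illustration only. The function name `wasserstein1_1d` and the toy data are hypothetical and are not taken from the paper.

```python
def wasserstein1_1d(a, b):
    """Order-1 Wasserstein distance between two equal-size 1-D empirical samples."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension the optimal transport plan pairs the sorted points in order,
    # so the distance is the average gap between matched points.
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


# Toy data (hypothetical): a clean sample, a uniformly shifted copy,
# and a copy in which a single point has been corrupted.
clean = [1.0, 2.0, 3.0, 4.0]
shifted = [x + 0.5 for x in clean]
corrupted = [1.0, 2.0, 3.0, 14.0]

print(wasserstein1_1d(clean, shifted))    # 0.5
print(wasserstein1_1d(clean, corrupted))  # 2.5
```

A Wasserstein ball of radius $\epsilon$ around the training sample contains every distribution within this distance, which is why a single large outlier or a small systematic shift can both be covered by a suitable choice of radius.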

📝 Abstract

In this work, we propose Wasserstein distributionally robust shallow convex neural networks (WaDiRo-SCNNs) to provide reliable nonlinear predictions when subject to adverse and corrupted datasets. Our approach is based on a new convex training program for $ReLU$-based shallow neural networks which allows us to cast the problem as an exact, tractable reformulation of its order-1 Wasserstein distributionally robust counterpart. Our training procedure is conservative, has low stochasticity, is solvable with open-source solvers, and is scalable to large industrial deployments. We provide out-of-sample performance guarantees, show that hard convex physical constraints can be enforced in the training program, and propose a mixed-integer convex post-training verification program to evaluate model stability. WaDiRo-SCNN aims to make neural networks safer for critical applications, such as in the energy sector. Finally, we numerically demonstrate the performance of our model on a synthetic experiment, a real-world power system application, i.e., the prediction of non-residential buildings' hourly energy consumption in the context of virtual power plants, and on benchmark datasets. The experimental results are convincing and showcase the strengths of the proposed model.
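
As a hedged illustration of what an exact order-1 Wasserstein reformulation can look like, consider the simpler case of linear regression with absolute loss. Assuming unbounded support and an $\ell_1$ transport cost on $(x, y)$, a standard result states that the worst-case expected loss over a Wasserstein ball of radius $\epsilon$ equals the empirical loss plus $\epsilon$ times the $\ell_\infty$ norm of $(w, -1)$. The sketch below only evaluates that objective. The names `robust_objective` and `eps` and the toy data are hypothetical, and this is not the paper's SCNN training program.

```python
def robust_objective(w, X, y, eps):
    """Empirical absolute loss plus the order-1 Wasserstein penalty.

    Assumes a linear model y ~ w.x, an l1 transport cost on (x, y),
    and unbounded support, so the penalty is eps * ||(w, -1)||_inf.
    """
    n = len(y)
    # Mean absolute residual on the training sample.
    residuals = [
        abs(yi - sum(wj * xj for wj, xj in zip(w, xi)))
        for xi, yi in zip(X, y)
    ]
    empirical = sum(residuals) / n
    # Dual norm of the l1 norm is the l-infinity norm, taken over (w, -1).
    dual_norm = max([abs(wj) for wj in w] + [1.0])
    return empirical + eps * dual_norm


# Toy data (hypothetical): y = 2x exactly.
X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.0]

print(robust_objective([2.0], X, y, eps=0.1))  # perfect fit, larger penalty
print(robust_objective([0.5], X, y, eps=0.1))  # poor fit, minimal penalty
```

Under these assumptions the penalty is convex in `w`, so the robust problem stays convex whenever the underlying regression is convex.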

Problem

Research questions and friction points this paper is trying to address.

Neural Network Robustness

Data Quality

Energy Sector

Innovation

Methods, ideas, or system contributions that make the work stand out.

Wasserstein Distance

SCNNs (Shallow Convex Neural Networks)

Energy Industry Applications

🔎 Similar Papers

Data-Driven Lipschitz Continuity: A Cost-Effective Approach to Improve Adversarial Robustness