DC is all you need: describing ReLU from a signal processing standpoint

๐Ÿ“… 2024-07-23
๐Ÿ›๏ธ arXiv.org
๐Ÿ“ˆ Citations: 1
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
This work investigates the frequency-domain characteristics of the ReLU activation function and their impact on CNN representation learning, from a signal processing perspective. The authors derive a closed-form spectral representation of ReLU, showing that it inherently introduces high-frequency oscillations while also generating a dominant direct-current (DC) component. Contrary to the conventional emphasis on nonlinear high-frequency effects, they establish the DC component as a critical mechanism for both stability and spectral sensitivity in feature extraction. Methodologically, the work combines Taylor-based spectral analysis, numerical frequency-response modeling, feature visualization, and multi-scale ablation studies, including validation on real CNNs. Numerical experiments corroborate the theory: the DC component significantly enhances the network's responsiveness to input frequency content and steers weight optimization toward stable solutions near initialization.
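The qualitative claim (a DC term plus higher harmonics) can be illustrated with the classical half-wave rectification identity for a sinusoidal input. This is a standard Fourier-series result, not the paper's own derivation, which proceeds via a Taylor expansion of ReLU:

```latex
\mathrm{ReLU}(\sin x) = \frac{\sin x + |\sin x|}{2}
  = \frac{1}{\pi} + \frac{\sin x}{2}
  - \frac{2}{\pi}\sum_{k=1}^{\infty}\frac{\cos(2kx)}{4k^2 - 1}
```

The constant term \(1/\pi\) is the DC component, and the cosine series contributes even harmonics absent from the input, matching the two spectral effects the summary describes.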

๐Ÿ“ Abstract
Non-linear activation functions are crucial in Convolutional Neural Networks. However, until now they have not been well described in the frequency domain. In this work, we study the spectral behavior of ReLU, a popular activation function. We use the ReLU's Taylor expansion to derive its frequency domain behavior. We demonstrate that ReLU introduces higher frequency oscillations in the signal and a constant DC component. Furthermore, we investigate the importance of this DC component, where we demonstrate that it helps the model extract meaningful features related to the input frequency content. We accompany our theoretical derivations with experiments and real-world examples. First, we numerically validate our frequency response model. Then we observe ReLU's spectral behavior on two example models and a real-world one. Finally, we experimentally investigate the role of the DC component introduced by ReLU in the CNN's representations. Our results indicate that the DC helps to converge to a weight configuration that is close to the initial random weights.
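The numerical validation the abstract describes can be sketched in a few lines: pass a unit sine through ReLU, take its DFT, and check the DC bin and the first harmonics against the known half-wave rectification values. This is an illustrative reproduction of the effect, not the paper's actual experiment:

```python
import numpy as np

# Sample one period of a unit sine and pass it through ReLU.
N = 1024
t = np.arange(N) / N                 # one period, N samples
x = np.sin(2 * np.pi * t)
y = np.maximum(x, 0.0)               # ReLU

# Normalized DFT: bin 0 is the signal mean (DC), bin k the k-th harmonic.
Y = np.fft.rfft(y) / N
dc = Y[0].real                       # half-wave rectification gives 1/pi
fundamental = 2 * np.abs(Y[1])       # amplitude 1/2
second_harmonic = 2 * np.abs(Y[2])   # amplitude 2/(3*pi), new content created by ReLU

print(dc, fundamental, second_harmonic)
```

The DC bin comes out near 1/π ≈ 0.318 and a second harmonic appears that the input signal did not contain, which is exactly the pair of effects (constant DC component plus higher-frequency oscillations) the abstract attributes to ReLU.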
Problem

Research questions and friction points this paper is trying to address.

Describing ReLU's spectral behavior in frequency domain
Investigating DC component's role in feature extraction
Validating ReLU's frequency impact via experiments
Innovation

Methods, ideas, or system contributions that make the work stand out.

ReLU's Taylor expansion for frequency analysis
ReLU introduces DC component and oscillations
DC component aids feature extraction convergence
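An ablation of the kind these bullets point to can be sketched by comparing standard ReLU with a hypothetical variant whose output mean (DC) is removed. This is our illustrative construction; the paper's exact ablation procedure may differ:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def relu_no_dc(x):
    # Hypothetical ablation: subtract the mean of the ReLU output along
    # the last axis, suppressing the DC component it introduces.
    y = np.maximum(x, 0.0)
    return y - y.mean(axis=-1, keepdims=True)

t = np.arange(256) / 256
x = np.sin(2 * np.pi * t)
print(relu(x).mean(), relu_no_dc(x).mean())  # the second mean is ~0
```

Comparing a CNN trained with `relu` against one trained with `relu_no_dc` would isolate the DC component's contribution to feature extraction and convergence, which is the role the paper's experiments investigate.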
๐Ÿ”Ž Similar Papers
No similar papers found.