🤖 AI Summary
Traditional neural networks suffer from structural redundancy and computational inefficiency due to the strict separation of linear transformations and nonlinear activations. To address this, we propose APTx Neuron—a unified, fully trainable neural computation unit that integrates linear mapping and nonlinear activation (a tanh-weighted linear combination) into a single, differentiable, parameterized expression, eliminating the need for separate activation layers. This design enables end-to-end joint optimization of activation and linear computation—the first such approach—thereby simplifying network architecture, reducing module coupling, and advancing neuron modeling paradigms. Evaluated on MNIST, the APTx-based model achieves 96.69% test accuracy within only 20 training epochs, with approximately 332,000 parameters, demonstrating both high computational efficiency and strong representational capacity.
📝 Abstract
We propose the APTx Neuron, a novel, unified neural computation unit that integrates non-linear activation and linear transformation into a single trainable expression. The APTx Neuron is derived from the APTx activation function, thereby eliminating the need for separate activation layers and making the architecture both computationally efficient and elegant. The proposed neuron follows the functional form $y = \sum_{i=1}^{n} \left( (\alpha_i + \tanh(\beta_i x_i)) \cdot \gamma_i x_i \right) + \delta$, where all parameters $\alpha_i$, $\beta_i$, $\gamma_i$, and $\delta$ are trainable. We validate our APTx Neuron-based architecture on the MNIST dataset, achieving up to 96.69% test accuracy in just 20 epochs using approximately 332K trainable parameters. The results highlight the superior expressiveness and computational efficiency of the APTx Neuron compared to traditional neurons, pointing toward a new paradigm in unified neuron design and the architectures built upon it.
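The functional form above can be sketched directly as a forward pass. The following is a minimal illustration, not the authors' implementation: parameter names mirror the symbols in the abstract, and the `aptx_neuron` helper is hypothetical.

```python
import numpy as np

def aptx_neuron(x, alpha, beta, gamma, delta):
    """Forward pass of a single APTx Neuron:
        y = sum_i (alpha_i + tanh(beta_i * x_i)) * gamma_i * x_i + delta
    In the full model alpha, beta, gamma, and delta would all be trainable;
    here they are plain NumPy arrays for illustration only.
    """
    return float(np.sum((alpha + np.tanh(beta * x)) * gamma * x) + delta)

# Example: with beta = 0 the tanh term vanishes and the unit reduces to
# an ordinary linear neuron, y = sum_i alpha_i * gamma_i * x_i + delta.
x = np.array([1.0, 2.0])
y = aptx_neuron(x, alpha=np.ones(2), beta=np.zeros(2),
                gamma=np.ones(2), delta=0.0)  # → 3.0
```

Note that the per-input $\beta_i$ lets each coordinate learn its own blend of linear and tanh-gated behavior, which is what removes the need for a separate activation layer.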