AI Summary
Problem: Classical sigmoid-based Restricted Boltzmann Machines (RBMs) exhibit limited representational capacity and poor biological interpretability when modeling high-dimensional, heterogeneous biological data, such as brain electrophysiology, protein sequences, and T-cell receptor (TCR) repertoires.
Method: We systematically investigate the impact of activation functions on RBM performance and interpretability, proposing novel RBM variants featuring continuous or integer-valued hidden units and non-standard activations (including softplus, tanh, and custom biology-inspired functions), combined with rigorous probabilistic modeling and theoretical analysis.
Contribution/Results: Through multimodal empirical evaluation, including protein structure prediction and TCR-antigen specificity identification, we demonstrate that activation function choice critically governs both model accuracy and the biological meaning of latent variables. Our variants outperform classical sigmoid-RBMs on both tasks and endow hidden units with explicit, biologically grounded interpretations. This work establishes a new design paradigm for interpretable deep generative models in computational biology.
Abstract
Restricted Boltzmann Machines are simple yet powerful neural networks. They can be used for learning structure in data, and they serve as a building block of more complex neural architectures. At the same time, their simplicity makes them easy to use and amenable to theoretical analysis, yielding interpretable models in applications. Here, we review the role that activation functions, which describe the input-output relationship of single neurons in an RBM, play in the functionality of these models. We discuss recent theoretical results on the benefits and limitations of different activation functions. We also review applications to biological data analysis: to neural data analysis, where RBM units are mostly taken to be binary with sigmoid activation functions, and to protein data analysis and immunology, where non-binary units and non-sigmoid activation functions have recently been shown to yield important insights into the data. Finally, we discuss open problems whose resolution can shed light on broader issues in neural network research.
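The abstract's central object, the activation function as the input-output relationship of a single hidden unit, can be made concrete with a minimal sketch. Below, a tiny RBM-style layer computes each hidden unit's input I_j = W_j · v + b_j, and two common choices from the text are compared: a sigmoid (the conditional mean of a binary hidden unit) and a softplus (a smooth rectified response, as used for non-binary units). The weight values, layer sizes, and variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    """Conditional mean of a binary (Bernoulli) hidden unit: p(h_j = 1 | v)."""
    return 1.0 / (1.0 + np.exp(-x))

def softplus(x):
    """Smooth rectifier log(1 + e^x); a non-sigmoid activation for non-binary units."""
    return np.log1p(np.exp(x))

# Hypothetical tiny RBM: 4 visible units, 3 hidden units (sizes chosen for illustration).
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(3, 4))  # hidden-by-visible coupling matrix
b = np.zeros(3)                          # hidden biases
v = np.array([1.0, 0.0, 1.0, 1.0])       # one binary visible configuration

I = W @ v + b                  # input to each hidden unit given v
mean_binary = sigmoid(I)       # sigmoid activation: mean of binary hidden units
mean_rectified = softplus(I)   # non-sigmoid activation: rectified, unbounded mean
```

The two outputs differ qualitatively: the sigmoid saturates in (0, 1), while the softplus response grows linearly with strong inputs, which is one reason activation choice shapes what the latent variables can represent.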