🤖 AI Summary
This work addresses the challenge of training thermodynamic computing hardware, whose dynamics is driven solely by thermal noise, to perform a target computation (e.g., image classification) within a fixed observation time. We propose a gradient-descent-based parameter optimization method framed as a teacher–student scheme: a deterministic teacher network generates idealized neural-activation trajectories, while a stochastic student system models the tunable thermodynamic hardware (e.g., bistable units with adjustable energy barriers and coupling strengths). The physical parameters are optimized end-to-end by backpropagating through the stochastic dynamics to minimize the divergence between student and teacher trajectories, as sketched below. To our knowledge, this is the first direct application of gradient descent to end-to-end training of a physical thermodynamic computing substrate. Experiments demonstrate robust classification performance on MNIST, with an estimated energy cost more than seven orders of magnitude below that of a conventional digital implementation. The approach opens a route to ultra-low-power, brain-inspired computing grounded in nonequilibrium thermodynamics.
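To make the training loop concrete, here is a minimal PyTorch sketch, not the authors' code: it assumes a toy bistable student energy with barrier heights `b_i` and pairwise couplings `J_ij`, simulates overdamped Langevin dynamics with an Euler–Maruyama step, and backpropagates a trajectory-divergence loss through the thermal noise. The teacher trajectory here is a placeholder where a trained network's activations would go.

```python
# Minimal sketch of teacher-student training, assuming (not from the paper):
# - student energy E(x) = sum_i b_i (x_i^2 - 1)^2 - sum_{ij} J_ij x_i x_j,
# - overdamped Langevin dynamics with Euler-Maruyama discretization,
# - mean-squared divergence from the teacher's activation trajectory.
import torch

torch.manual_seed(0)
n_units, n_steps, dt, temp = 8, 200, 1e-2, 0.1

# Tunable physical parameters: barrier heights b_i and couplings J_ij.
barriers = torch.ones(n_units, requires_grad=True)
couplings = torch.zeros(n_units, n_units, requires_grad=True)

def force(x):
    """Deterministic drift -dE/dx for the toy bistable energy above."""
    onsite = -4.0 * barriers * x * (x ** 2 - 1.0)
    pairwise = (couplings + couplings.T) @ x
    return onsite + pairwise

# Placeholder idealized trajectory; in the paper's scheme this would
# reproduce the activations of a network trained on the target task.
target = torch.tanh(torch.linspace(0.0, 3.0, n_steps)).unsqueeze(1).repeat(1, n_units)

opt = torch.optim.Adam([barriers, couplings], lr=1e-2)
for epoch in range(100):
    x = -torch.ones(n_units)          # start every unit in the "down" well
    loss = torch.zeros(())
    for t in range(n_steps):
        noise = torch.randn(n_units)  # thermal noise, sampled outside the
        x = x + force(x) * dt + (2.0 * temp * dt) ** 0.5 * noise  # parameters
        loss = loss + ((x - target[t]) ** 2).mean()
    opt.zero_grad()
    (loss / n_steps).backward()
    opt.step()
    with torch.no_grad():
        barriers.clamp_(min=1e-3)     # keep barrier heights physical
```

Sampling the noise independently of the parameterized drift is what lets gradients flow through the stochastic rollout, the same reparameterization trick familiar from variational autoencoders.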
📝 Abstract
We show how to adjust the parameters of a thermodynamic computer by gradient descent in order to perform a desired computation at a specified observation time. Within a digital simulation of a thermodynamic computer, training proceeds by maximizing the probability with which the computer would generate an idealized dynamical trajectory. The idealized trajectory is designed to reproduce the activations of a neural network trained to perform the desired computation. This teacher-student scheme results in a thermodynamic computer whose finite-time dynamics enacts a computation analogous to that of the neural network. The parameters identified in this way can be implemented in the hardware realization of the thermodynamic computer, which will perform the desired computation automatically, driven by thermal noise. We demonstrate the method on a standard image-classification task, and estimate the thermodynamic advantage -- the ratio of energy costs of the digital and thermodynamic implementations -- to exceed seven orders of magnitude. Our results establish gradient descent as a viable training method for thermodynamic computing, enabling application of the core methodology of machine learning to this emerging field.
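One way to make the abstract's objective concrete, under assumptions that belong to this sketch rather than the paper: for overdamped Langevin dynamics with an Euler–Maruyama discretization, the log-probability of a fixed idealized trajectory is, up to a parameter-independent constant, a sum of Gaussian log-densities, so maximizing it by gradient descent reduces to regressing the tunable drift onto the trajectory increments. The linear drift `f(x) = Wx` below is a hypothetical stand-in for the thermodynamic computer's force law.

```python
# Sketch of the path-probability objective, assuming dynamics of the form
# x_{t+1} = x_t + f(x_t; theta) * dt + sqrt(2 * T * dt) * noise.
import torch

def trajectory_log_prob(traj, drift_fn, dt, temp):
    """log p(traj | theta) for an Euler-Maruyama chain, up to a constant."""
    x_now, x_next = traj[:-1], traj[1:]
    residual = x_next - x_now - drift_fn(x_now) * dt
    return -(residual ** 2).sum() / (4.0 * temp * dt)

n_units, n_steps, dt, temp = 8, 500, 1e-2, 0.1

# Hypothetical tunable parameters: a linear force f(x) = W x.
W = torch.zeros(n_units, n_units, requires_grad=True)
drift = lambda x: x @ W.T

# Placeholder idealized trajectory; in the paper this would be designed to
# reproduce the activations of a trained neural network.
ideal = torch.cumsum(0.05 * torch.randn(n_steps, n_units), dim=0)

opt = torch.optim.Adam([W], lr=1e-2)
for step in range(200):
    loss = -trajectory_log_prob(ideal, drift, dt, temp)  # maximize log p
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the discretized noise is Gaussian, the path probability factorizes over time steps, which is why ordinary gradient descent on its logarithm applies directly to the physical parameters.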