🤖 AI Summary
This work investigates whether discrete logic circuits can replace continuous neural networks for learning continuous control policies. To this end, we propose the Differentiable Weightless Controller (DWC): a hardware-friendly architecture comprising thermometer-encoded inputs, sparsely connected Boolean lookup-table (LUT) layers, and a lightweight action head, trained end-to-end via gradient relaxation. DWC is the first weightless logic-circuit controller to support backpropagation, offering strong interpretability, nanojoule-level per-step energy cost, native FPGA synthesizability, and ultra-low inference latency. Evaluated on five MuJoCo benchmarks, DWC matches full-precision and quantized neural network policies on four tasks, including high-dimensional Humanoid, with network capacity isolated as the limiting factor on HalfCheetah. Our approach establishes a new paradigm for trustworthy, resource-efficient control in embedded and energy-constrained settings.
📝 Abstract
We investigate whether continuous-control policies can be represented and learned as discrete logic circuits instead of continuous neural networks. We introduce Differentiable Weightless Controllers (DWCs), a symbolic-differentiable architecture that maps real-valued observations to actions using thermometer-encoded inputs, sparsely connected Boolean lookup-table layers, and lightweight action heads. DWCs can be trained end-to-end by gradient-based techniques, yet compile directly into FPGA-compatible circuits with few- or even single-clock-cycle latency and nanojoule-level energy cost per action. Across five MuJoCo benchmarks, including high-dimensional Humanoid, DWCs achieve returns competitive with weight-based policies (full-precision or quantized neural networks), matching performance on four tasks and isolating network capacity as the key limiting factor on HalfCheetah. Furthermore, DWCs exhibit structurally sparse and interpretable connectivity patterns, enabling direct inspection of which input thresholds influence control decisions.
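To make the two ingredients named in the abstract concrete, here is a minimal NumPy sketch of (a) thermometer encoding of a real-valued observation into threshold bits, and (b) a differentiable "soft" LUT read, where relaxed bits in [0, 1] weight each table entry by the probability of selecting its address. The threshold placement, the product-of-probabilities relaxation, and all names here are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def thermometer_encode(x, lo, hi, n_bits):
    """Thermometer-encode a scalar: bit i is 1 iff x clears the i-th
    threshold. Thresholds are evenly spaced in (lo, hi) here; the paper
    may place them differently."""
    thresholds = np.linspace(lo, hi, n_bits + 2)[1:-1]  # interior thresholds
    return (x >= thresholds).astype(float)

def soft_lut(bits, table):
    """Differentiable LUT read: each of the 2^k entries is weighted by the
    probability that the relaxed input bits select its address. With hard
    0/1 bits this reduces to an ordinary table lookup, so the trained
    circuit can be compiled directly to hardware LUTs."""
    k = len(bits)
    out = 0.0
    for addr in range(2 ** k):
        p = 1.0
        for i in range(k):
            a_i = (addr >> i) & 1  # i-th address bit of this entry
            p *= bits[i] if a_i else (1.0 - bits[i])
        out += p * table[addr]
    return out
```

Because the soft read is a polynomial in the (relaxed) input bits and linear in the table entries, gradients flow to both, which is what allows end-to-end training before discretizing the circuit.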