🤖 AI Summary
Problem: Evaluating linear partial differential equation (PDE) operators with nested reverse-mode automatic differentiation (AD) incurs computational overhead that grows exponentially with derivative order, which limits its use in scientific machine learning; forward schemes such as Taylor-mode AD mitigate this but leave room for further optimization.
Method: We propose the first computation-graph-level “derivative collapse” mechanism: it rewrites the computational graph to aggregate the propagation paths of forward-mode higher-order derivatives, reducing the asymptotic complexity from exponential to linear. The method requires no user-code modification, is natively compatible with mainstream AD frameworks, and supports general linear PDE operators as well as randomized Taylor expansions. We further integrate machine learning compiler optimizations to improve efficiency.
Results: Experiments demonstrate significantly faster derivative evaluation on canonical PDE operators compared to conventional nested reverse-mode AD. Our approach establishes a new paradigm for efficient high-order differential operator implementation in scientific machine learning.
📝 Abstract
Computing partial differential equation (PDE) operators via nested backpropagation is expensive yet popular, and this cost severely restricts their utility for scientific machine learning. Recent advances, such as the forward Laplacian and randomized Taylor mode automatic differentiation (AD), propose forward schemes to address this. We introduce an optimization technique for Taylor mode that 'collapses' derivatives by rewriting the computational graph, and demonstrate how to apply it to general linear PDE operators and to randomized Taylor mode. The modifications simply require propagating a sum up the computational graph, which could -- or should -- be done by a machine learning compiler, without exposing complexity to users. We implement our collapsing procedure and evaluate it on popular PDE operators, confirming that it accelerates Taylor mode and outperforms nested backpropagation.
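To make the "propagating a sum up the computational graph" idea concrete, here is a minimal sketch in plain Python, in the spirit of the forward Laplacian. Instead of carrying a full Hessian (or all mixed Taylor coefficients) per node, each node propagates only its value, gradient, and the *sum* of its second derivatives. The class and function names (`Collapsed`, `sin`) are illustrative assumptions, not the paper's actual API.

```python
import math

class Collapsed:
    """Forward-mode value carrying (value, gradient, Laplacian).

    Only the collapsed sum of second derivatives is pushed through
    the graph, rather than a full Hessian per node -- a toy version
    of the derivative-collapse idea for the Laplace operator.
    """
    def __init__(self, val, grad, lap=0.0):
        self.val, self.grad, self.lap = val, list(grad), lap

    def __mul__(self, other):
        # Product rule for value, gradient, and collapsed second derivative:
        # lap(f*g) = f*lap(g) + g*lap(f) + 2*<grad f, grad g>
        dot = sum(a * b for a, b in zip(self.grad, other.grad))
        return Collapsed(
            self.val * other.val,
            [self.val * b + other.val * a
             for a, b in zip(self.grad, other.grad)],
            self.val * other.lap + other.val * self.lap + 2.0 * dot,
        )

def sin(x):
    # Chain rule for a unary nonlinearity:
    # lap(sin u) = cos(u)*lap(u) - sin(u)*|grad u|^2
    sq = sum(g * g for g in x.grad)
    return Collapsed(math.sin(x.val),
                     [math.cos(x.val) * g for g in x.grad],
                     math.cos(x.val) * x.lap - math.sin(x.val) * sq)

# Laplacian of f(x, y) = sin(x * y) at (1.0, 2.0) in a single forward pass.
x = Collapsed(1.0, [1.0, 0.0])   # seed d/dx
y = Collapsed(2.0, [0.0, 1.0])   # seed d/dy
out = sin(x * y)
# Analytic check: Δ sin(xy) = -(x^2 + y^2) * sin(xy)
```

A nested reverse-mode implementation would instead materialize the full Hessian and take its trace; here each node's extra state is a single scalar, which is what makes the per-operator cost linear rather than exponential in this toy setting.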