🤖 AI Summary
Problem: Evaluating linear partial differential equation (PDE) operators with nested reverse-mode automatic differentiation (AD) incurs computational overhead that grows exponentially with derivative order, which limits its use in scientific machine learning; forward schemes such as Taylor-mode AD mitigate this but leave room for further optimization.
Method: We propose the first computation-graph-level “derivative collapse” mechanism: it rewrites the computational graph to aggregate the propagation paths of forward-mode higher-order derivatives, reducing the asymptotic complexity from exponential to linear. The method requires no user-code modification, is natively compatible with mainstream AD frameworks, and supports general linear PDE operators as well as randomized Taylor expansions. We further integrate machine learning compiler optimizations to improve efficiency.
Results: Experiments demonstrate significantly faster derivative evaluation on canonical PDE operators compared to conventional nested reverse-mode AD. Our approach establishes a new paradigm for efficient high-order differential operator implementation in scientific machine learning.
📝 Abstract
Computing partial differential equation (PDE) operators via nested backpropagation is expensive yet popular, and this cost severely restricts their utility for scientific machine learning. Recent advances, such as the forward Laplacian and randomized Taylor mode automatic differentiation (AD), propose forward schemes to address this. We introduce an optimization technique for Taylor mode that 'collapses' derivatives by rewriting the computational graph, and demonstrate how to apply it to general linear PDE operators and to randomized Taylor mode. The modifications simply require propagating a sum up the computational graph, which could -- or should -- be done by a machine learning compiler, without exposing complexity to users. We implement our collapsing procedure and evaluate it on popular PDE operators, confirming that it accelerates Taylor mode and outperforms nested backpropagation.
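To make the "propagating a sum up the computational graph" idea concrete, here is a minimal sketch in plain Python, in the spirit of the forward Laplacian. Instead of carrying a full Hessian (or all mixed Taylor coefficients) per node, each node propagates only its value, gradient, and the *sum* of its second derivatives. The class and function names (`Collapsed`, `sin`) are illustrative assumptions, not the paper's actual API.

```python
import math

class Collapsed:
    """Forward-mode value carrying (value, gradient, Laplacian).

    Only the collapsed sum of second derivatives is pushed through
    the graph, rather than a full Hessian per node -- a toy version
    of the derivative-collapse idea for the Laplace operator.
    """
    def __init__(self, val, grad, lap=0.0):
        self.val, self.grad, self.lap = val, list(grad), lap

    def __mul__(self, other):
        # Product rule for value, gradient, and collapsed second derivative:
        # lap(f*g) = f*lap(g) + g*lap(f) + 2*<grad f, grad g>
        dot = sum(a * b for a, b in zip(self.grad, other.grad))
        return Collapsed(
            self.val * other.val,
            [self.val * b + other.val * a
             for a, b in zip(self.grad, other.grad)],
            self.val * other.lap + other.val * self.lap + 2.0 * dot,
        )

def sin(x):
    # Chain rule for a unary nonlinearity:
    # lap(sin u) = cos(u)*lap(u) - sin(u)*|grad u|^2
    sq = sum(g * g for g in x.grad)
    return Collapsed(math.sin(x.val),
                     [math.cos(x.val) * g for g in x.grad],
                     math.cos(x.val) * x.lap - math.sin(x.val) * sq)

# Laplacian of f(x, y) = sin(x * y) at (1.0, 2.0) in a single forward pass.
x = Collapsed(1.0, [1.0, 0.0])   # seed d/dx
y = Collapsed(2.0, [0.0, 1.0])   # seed d/dy
out = sin(x * y)
# Analytic check: Δ sin(xy) = -(x^2 + y^2) * sin(xy)
```

A nested reverse-mode implementation would instead materialize the full Hessian and take its trace; here each node's extra state is a single scalar, which is what makes the per-operator cost linear rather than exponential in this toy setting.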