Optimal and Diffusion Transports in Machine Learning

📅 2025-12-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This survey addresses a common challenge in machine learning: the design and analysis of time-evolving probability distributions, spanning diffusion-based sampling, the dynamics of neural network weights during optimization, and the propagation of token distributions across layers of large language models. It presents a unified view that switches between the Eulerian representation of densities and the Lagrangian representation of particles advected by vector fields. To resolve the non-uniqueness of Lagrangian vector fields, it contrasts two complementary constructions: diffusion methods built on stochastic interpolation processes, which underpin modern generative AI, and optimal transport, which selects the interpolation minimizing displacement cost. Both constructions yield density evolution paths with favorable regularity, stability, and computational tractability, and are illustrated across three domains: generative sampling, the analysis of neural optimization trajectories, and the modeling of token flow inside transformers.
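The Eulerian–Lagrangian correspondence at the heart of the summary can be written compactly. The following is a sketch in standard notation (not necessarily the paper's own symbols): the continuity equation links a density path to an advecting velocity field, and optimal transport picks out the field of minimal displacement cost via the Benamou–Brenier dynamic formulation.

```latex
% Eulerian view: the density \rho_t evolves under the continuity equation
\partial_t \rho_t + \nabla \cdot (\rho_t \, v_t) = 0.

% Lagrangian view: particles are advected by the velocity field v_t
\dot{X}_t = v_t(X_t), \qquad X_0 \sim \rho_0 \;\Longrightarrow\; X_t \sim \rho_t.

% Many fields v_t realize the same density path; optimal transport selects
% the one minimizing kinetic energy (Benamou--Brenier formulation):
W_2^2(\rho_0, \rho_1) \;=\; \min_{(\rho_t, v_t)} \int_0^1 \!\!\int \|v_t(x)\|^2 \, \rho_t(x) \, dx \, dt.
```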

📝 Abstract
Several problems in machine learning are naturally expressed as the design and analysis of time-evolving probability distributions. This includes sampling via diffusion methods, optimizing the weights of neural networks, and analyzing the evolution of token distributions across layers of large language models. While the targeted applications differ (samples, weights, tokens), their mathematical descriptions share a common structure. A key idea is to switch from the Eulerian representation of densities to their Lagrangian counterpart through vector fields that advect particles. This dual view introduces challenges, notably the non-uniqueness of Lagrangian vector fields, but also opportunities to craft density evolutions and flows with favorable properties in terms of regularity, stability, and computational tractability. This survey presents an overview of these methods, with emphasis on two complementary approaches: diffusion methods, which rely on stochastic interpolation processes and underpin modern generative AI, and optimal transport, which defines interpolation by minimizing displacement cost. We illustrate how both approaches appear in applications ranging from sampling and neural network optimization to modeling the dynamics of transformers for large language models.
Problem

Research questions and friction points this paper is trying to address.

Design time-evolving probability distributions for machine learning tasks
Address non-uniqueness of Lagrangian vector fields in density evolution
Apply diffusion and optimal transport methods to sampling and optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Switching from Eulerian to Lagrangian representation via vector fields
Using diffusion methods with stochastic interpolation for generative AI
Applying optimal transport by minimizing displacement cost for interpolation
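To make the stochastic-interpolation idea concrete, here is a minimal sketch of a linear stochastic interpolant between a source and a target sample cloud. The specific interpolant (linear path with a bridge-style noise term that vanishes at both endpoints) is a common textbook choice and an assumption here, not the paper's exact construction.

```python
import numpy as np

def stochastic_interpolant(x0, x1, t, rng):
    """Linear stochastic interpolant between paired samples x0 and x1.

    x_t = (1 - t) * x0 + t * x1 + sqrt(t * (1 - t)) * z,  z ~ N(0, I).
    The noise coefficient vanishes at t = 0 and t = 1, so the path
    matches the source and target distributions exactly at its endpoints.
    """
    z = rng.standard_normal(x0.shape)
    return (1.0 - t) * x0 + t * x1 + np.sqrt(t * (1.0 - t)) * z

rng = np.random.default_rng(0)
x0 = rng.standard_normal((1000, 2))        # source cloud: standard Gaussian
x1 = rng.standard_normal((1000, 2)) + 5.0  # target cloud: Gaussian shifted by 5

# The mean of the interpolated cloud moves linearly from ~0 to ~5.
for t in (0.0, 0.5, 1.0):
    xt = stochastic_interpolant(x0, x1, t, rng)
    print(f"t={t}: mean ~ {xt.mean(axis=0).round(1)}")
```

Learning a velocity field that reproduces these paths (e.g. by regressing the time derivative of x_t) is what turns such an interpolant into a generative sampler; the optimal-transport alternative instead picks the straight-line coupling of minimal displacement cost.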