Multiple importance sampling for stochastic gradient estimation

📅 2024-07-22
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of low-variance gradient estimation under noisy gradients, this paper proposes the Dynamic Adaptive Multiple Importance Sampling (DAMIS) framework. DAMIS jointly integrates multiple heterogeneous importance distributions: each distribution evolves dynamically based on gradient sensitivity, and sample contributions are combined using theoretically derived optimal weights, enabling efficient and stable estimation of vector-valued gradients. Unlike conventional approaches that rely on a single importance distribution or a static mixture, DAMIS introduces a metric-driven, cooperative multi-distribution mechanism. Theoretical analysis guarantees convergence under mild assumptions. Empirical evaluation on image classification and point-cloud regression tasks shows that DAMIS reduces gradient variance by up to 47%, cuts the iterations required for convergence by roughly 30%, and concurrently improves generalization performance.

📝 Abstract
We introduce a theoretical and practical framework for efficient importance sampling of mini-batch samples for gradient estimation from single and multiple probability distributions. To handle noisy gradients, our framework dynamically evolves the importance distribution during training by utilizing a self-adaptive metric. Our framework combines multiple, diverse sampling distributions, each tailored to specific parameter gradients. This approach facilitates importance sampling of vector-valued gradient estimation. Rather than naively combining multiple distributions, our framework optimally weights each sample's contribution across the distributions. This adaptive combination of multiple importance distributions yields superior gradient estimates, leading to faster training convergence. We demonstrate the effectiveness of our approach through empirical evaluations across a range of optimization tasks, such as classification and regression, on both image and point cloud datasets.
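The paper's own adaptive metric and optimal weights are not reproduced on this page, but the core mechanism the abstract describes, combining mini-batch index samples drawn from several importance distributions into one unbiased gradient estimate, can be sketched with the standard balance-heuristic weighting from multiple importance sampling. The function name, the synthetic gradient table, and the two example proposals below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def balance_heuristic_mis(g, proposals, counts, rng):
    """Estimate the mean gradient (1/N) * sum_i g[i] by drawing indices
    from several importance distributions over the data and combining
    them with balance-heuristic weights (unbiased for any proposals
    that cover the support)."""
    N = g.shape[0]
    estimate = np.zeros(g.shape[1])
    for q, n_k in zip(proposals, counts):
        idx = rng.choice(N, size=n_k, p=q)
        for i in idx:
            # Balance heuristic: w_k(i) = n_k q_k(i) / sum_j n_j q_j(i)
            denom = sum(n_j * q_j[i] for q_j, n_j in zip(proposals, counts))
            w = n_k * q[i] / denom
            # Importance-weighted contribution of data point i
            estimate += w * g[i] / (n_k * q[i])
    return estimate / N

# Demo with two proposals over a synthetic per-sample gradient table
# (a real training loop would recompute g and the proposals each step).
rng = np.random.default_rng(0)
g = rng.normal(size=(1000, 4))          # stand-in per-sample gradients
q_uniform = np.full(1000, 1.0 / 1000)   # plain mini-batch sampling
mags = np.linalg.norm(g, axis=1)
q_magnitude = mags / mags.sum()         # gradient-magnitude proposal
est = balance_heuristic_mis(g, [q_uniform, q_magnitude], [2000, 2000], rng)
exact = g.mean(axis=0)
```

Because the balance-heuristic weights of each index sum to one across the proposals, the combined estimator stays unbiased regardless of how poorly any single proposal matches the gradients, which is what makes mixing diverse, per-parameter distributions safe.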
Problem

Research questions and friction points this paper is trying to address.

Stochastic Gradient Estimation
Learning Efficiency
Sample Selection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Sampling Strategy
Gradient Estimation Optimization
Machine Learning Efficiency
Corentin Salaün
Max Planck Institute for Informatics, Saarland University, Saarbrücken, Germany
Xingchang Huang
Max Planck Institute for Informatics, Saarland University, Saarbrücken, Germany
Iliyan Georgiev
Adobe Research
Computer Graphics, Global Illumination, Ray Tracing, Monte Carlo, Stochastic Sampling
N. Mitra
Adobe Research, London, United Kingdom; Department of Computer Science, University College London, London, United Kingdom
Gurprit Singh
Advanced Micro Devices (AMD)
Generative Models, MCMC, Generative Rendering