Discrete Variational Autoencoding via Policy Search

📅 2025-09-29
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Discrete variational autoencoders (VAEs) suffer from non-differentiable discrete latent variables, which force biased or high-variance gradient approximations such as Gumbel-Softmax or REINFORCE and limit high-fidelity image reconstruction. To address this, the paper proposes a reparameterization-free training framework: a nonparametric encoder serves as a proxy that guides the optimization of a parametric, Transformer-based encoder via natural gradient updates, and combined with an automatic step-size adaptation mechanism the framework trains end to end. The key contribution is the first application of natural gradient optimization, originally developed for policy search, to discrete VAEs, circumventing the bias–variance trade-off inherent in conventional estimators. On ImageNet 256, the method achieves a 20% improvement in Fréchet Inception Distance (FID) over strong baselines, including vector-quantized VAEs and Gumbel-Softmax VAEs, demonstrating that high-quality image reconstruction is feasible even from highly compact discrete latent representations.

📝 Abstract
Discrete latent bottlenecks in variational autoencoders (VAEs) offer high bit efficiency and can be modeled with autoregressive discrete distributions, enabling parameter-efficient multimodal search with transformers. However, discrete random variables do not allow for exact differentiable parameterization; therefore, discrete VAEs typically rely on approximations, such as Gumbel-Softmax reparameterization or straight-through gradient estimates, or employ high-variance gradient-free methods such as REINFORCE that have had limited success on high-dimensional tasks such as image reconstruction. Inspired by popular techniques in policy search, we propose a training framework for discrete VAEs that leverages the natural gradient of a non-parametric encoder to update the parametric encoder without requiring reparameterization. Our method, combined with automatic step size adaptation and a transformer-based encoder, scales to challenging datasets such as ImageNet and outperforms both approximate reparameterization methods and quantization-based discrete autoencoders in reconstructing high-dimensional data from compact latent spaces, achieving a 20% improvement on FID Score for ImageNet 256.
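The abstract contrasts biased relaxations (Gumbel-Softmax, straight-through) with the unbiased but high-variance REINFORCE estimator. As a minimal illustration of the latter, here is a toy numpy sketch (the objective values and sample counts are illustrative, not from the paper) of the score-function gradient for a single 3-way categorical latent:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Toy setup: maximize E_{z ~ Cat(p)}[f(z)] over the logits of a 3-way categorical.
f = np.array([1.0, 2.0, 0.5])          # illustrative per-category objective values
logits = np.array([0.2, -0.1, 0.3])
p = softmax(logits)

# Exact gradient (softmax Jacobian applied to f): dE/dlogits_k = p_k * (f_k - E[f]).
exact_grad = p * (f - p @ f)

# REINFORCE (score-function) estimator: f(z) * d log p(z) / d logits
# = f(z) * (onehot(z) - p). Unbiased, but each individual sample is noisy.
def reinforce_grad(n_samples):
    zs = rng.choice(len(p), size=n_samples, p=p)
    onehot = np.eye(len(p))[zs]
    return (f[zs][:, None] * (onehot - p)).mean(axis=0)

estimate = reinforce_grad(200_000)  # many samples needed to tame the variance
```

Averaging hundreds of thousands of samples brings the estimate close to the exact gradient; with only a handful of samples per training step, the noise dominates, which is the variance problem motivating the paper's reparameterization-free alternative.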
Problem

Research questions and friction points this paper is trying to address.

Overcoming non-differentiability in discrete variational autoencoder training
Improving gradient estimation for high-dimensional image reconstruction tasks
Enhancing reconstruction quality from compact discrete latent spaces
Innovation

Methods, ideas, or system contributions that make the work stand out.

Non-parametric encoder guides natural gradient updates
Policy search techniques replace reparameterization
Transformer encoder scales to ImageNet reconstruction
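The natural gradient update named in the bullets above can be sketched for a single categorical latent (a hedged illustration, not the paper's full algorithm): the Fisher information of a softmax-parameterized categorical is F = diag(p) - p pᵀ, and the natural gradient preconditions the vanilla gradient by a (damped) inverse of F.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Illustrative values; f stands in for a per-category objective.
logits = np.array([0.2, -0.1, 0.3])
f = np.array([1.0, 2.0, 0.5])
p = softmax(logits)

# Vanilla gradient of E_{z ~ Cat(p)}[f(z)] w.r.t. the logits.
grad = p * (f - p @ f)

# Fisher information of the softmax-parameterized categorical:
# F = E[(onehot(z) - p)(onehot(z) - p)^T] = diag(p) - p p^T.
F = np.diag(p) - np.outer(p, p)

# F is singular (constant shifts of the logits leave p unchanged), so damp it.
nat_grad = np.linalg.solve(F + 1e-6 * np.eye(len(p)), grad)

# Fixed step size here; the paper adapts it automatically.
new_logits = logits + 0.1 * nat_grad
```

Preconditioning by the Fisher makes the update follow the steepest direction in distribution space rather than raw parameter space, which is what the policy-search literature exploits and what this paper transfers to discrete VAE encoders.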
Michael Drolet
Intelligent Autonomous Systems Lab, TU Darmstadt, Germany
Firas Al-Hafez
Ph.D. Student at IAS Technische Universität Darmstadt
robot learning, reinforcement learning, control, robotics
Aditya Bhatt
Intelligent Autonomous Systems Lab, German Research Center for AI (DFKI), Centre for Cognitive Science, Hessian.AI, TU Darmstadt, Germany
Jan Peters
Intelligent Autonomous Systems Lab, German Research Center for AI (DFKI), Centre for Cognitive Science, Hessian.AI, TU Darmstadt, Germany
Oleg Arenz
Postdoctoral Researcher, Technische Universitaet Darmstadt
Autonomous Robots, Inverse Reinforcement Learning, Variational Inference, Reinforcement Learning