Siddarth Venkatraman

Google Scholar ID: j9l0rg4AAAAJ

Mila, University of Montreal

Artificial Intelligence, Robotics

Citations & Impact

Citations (all-time): 191

Contact

Email: siddarth.venkatraman@mila.quebec

Publications

5 items

A Comedy of Estimators: On KL Regularization in RL Training of LLMs

2025

Recursive Self-Aggregation Unlocks Deep Thinking in Large Language Models

2025

Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training

2025

Solving Bayesian inverse problems with diffusion priors and off-policy RL

2025

Amortizing intractable inference in diffusion models for vision, language, and control

arXiv.org · 2024

Resume (English only)

Academic Achievements

- Publications:
* Recursive Self-Aggregation Unlocks Deep Thinking in Large Language Models (Preprint)
* Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training (NeurIPS 2025)
* Outsourced diffusion sampling: Efficient posterior inference in latent spaces of generative models (ICML 2025)
* Amortizing intractable inference in diffusion models for vision, language, and control (NeurIPS 2024)
* Reasoning with Latent Diffusion in Offline Reinforcement Learning (ICLR 2024)
* Learning Temporally Abstract World Models without Online Experimentation (ICML 2023)
* Multi-Alpha Soft Actor-Critic: Overcoming Stochastic Biases in Maximum Entropy Reinforcement Learning (ICRA 2023)
* MLNav: Learning to Safely Navigate on Martian Terrains (RAL+ICRA 2022)
* Machine Learning Based Path Planning for Improved Rover Navigation (IEEE Aerospace Conference 2021)

Research Experience

- PhD Student at Mila, Quebec AI Institute
- Academic Collaborator at LawZero - Safe AI for Humanity
- Intern and Academic Collaborator at Lawrence Livermore National Laboratory (LLNL)
- Intern at Valence Labs, working on training flow bridges for molecular systems
- Intern at NASA Jet Propulsion Laboratory (JPL), working on more efficient Mars Rover motion planning

Education

- PhD: Mila, Quebec AI Institute, Université de Montréal, Supervisors: Glen Berseth, Nikolay Malkin
- Master's: Robotics, Carnegie Mellon University, Advisor: Dr. Jeff Schneider
- Bachelor's: Computer Science, Manipal Institute of Technology

Background

- Research Interests: reinforcement learning, reasoning, and probabilistic inference
- Professional Field: Artificial Intelligence, Machine Learning
- Brief Introduction: Currently a PhD student at Mila, Quebec AI Institute, affiliated with Université de Montréal and co-supervised by Glen Berseth and Nikolay Malkin. Works closely with Yoshua Bengio and is an academic collaborator at LawZero - Safe AI for Humanity and at Lawrence Livermore National Laboratory (LLNL).

Miscellany

- Personal Interests: Solving fundamental problems with LLMs, such as the long-context problem
- Email, CV, Google Scholar, GitHub, X, LinkedIn

Co-authors

2 total

Shivam Agarwal

University of Illinois at Urbana Champaign