Scholar

Gunshi Gupta

Google Scholar ID: w4UK_9kAAAAJ

Ph.D. student at the University of Oxford

Reinforcement Learning, Multi-modal Learning, Meta+Continual Learning

Citations & Impact

All-time

Citations

305

Publications

4 items

Resume (English only)

Academic Achievements

- Published “FindingDory: A Benchmark to Evaluate Memory in Embodied Agents”
- Published “Memo: Training Memory-Efficient Embodied Agents with Reinforcement Learning”
- Published “Recurrent Attention-based Token Selection for Efficient Streaming Video-LLMs”

Research Experience

- Deep Learning Intern (RL/IL) at Microsoft Research, Apr 2023 – Jul 2023: Contributed to a NeurIPS submission titled “WHAM: World and Human Action Modelling in a Modern Xbox Game” and developed an evaluation suite for mechanistic interpretability of transformer representations.
- Deep Learning Researcher at Wayve, Jul 2020 – Sep 2021: Explored algorithms that learn robustly and sample-efficiently with the aid of expert demonstrations.
- Graduate Research Assistant at Robotics Research Center, IIITH, Feb 2017 – Apr 2018: Developed a multi-robot visual SLAM framework and published “View-Invariant Intersection Recognition from Videos using Deep Network Ensembles” at IROS 2018.
- Software Developer at Microsoft, Jun 2016 – Feb 2017: Built prediction and summarization modules for employee performance feedback and organized workshops on ‘Machine Learning Fundamentals’ for Microsoft employees.

Education

- DPhil in Machine Learning (AIMS CDT), 2024, University of Oxford
- Research Master's in Machine Learning, 2020, Montreal Institute for Learning Algorithms (Mila)
- B.Tech in Maths and Computing (Applied Mathematics), 2016, Delhi Technological University (DTU/DCE)

Background

Machine Learning DPhil student in the OATML group at the University of Oxford, supervised by Prof. Yarin Gal, and soon to graduate. Currently designing methods, architectures, and benchmarks that enable transformer-based agents to perform long-horizon tasks by creating and accessing memories, through large-scale RL. Research topics include leveraging advances in visual diffusion modeling for robotics, mechanistic interpretability in transformer-based world models, training generative world models for video games and robotics, and causally correct, sample-efficient learning from imbalanced data.

Miscellany