‘Diffusion for World Modeling: Visual Details Matter in Atari’, NeurIPS 2024 (Spotlight): Introduced DIAMOND, an RL agent trained in a diffusion world model.
‘Aligning Agents like Large Language Models’, RLBRew Workshop at RLC 2024: Investigated LLM-style agent training via unsupervised pre-training, supervised fine-tuning, and RLHF.
‘Efficient Offline Reinforcement Learning: The Critic is Critical’, ARLET Workshop at ICML 2024: Proposed first learning the behavior policy and its values via supervised learning, before RL improvement.
‘Contrastive Meta-Learning for Partially Observable Few-Shot Learning’, ICLR 2023: Developed contrastive meta-learning under partial observability for RL agent representation learning.
‘Deep Ocean Learning of Small Scale Turbulence’, GRL 2022: Demonstrated ML-based inference of turbulent mixing from routine oceanographic observations.
Research Experience
Apr 2025 – Present: Research / Co-Founder at General Intuition, focusing on foundation models with deep spatial and temporal reasoning.
May 2021 – Jul 2025: PhD Candidate at University of Edinburgh, researching efficient reinforcement learning and world models.
Jun 2023 – Sep 2023: Research Scientist Intern at Microsoft Research Cambridge, developing a preference alignment pipeline for the Xbox game Bleeding Edge to study the capabilities and limitations of RLHF.
Jul 2020 – Apr 2021: Lead Data Scientist at Dataiku (London), leading a team of 6 data scientists across the UK and Northern Europe.
Apr 2019 – Jul 2020: Data Scientist at Dataiku (London), delivering client-facing data science projects.