- Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks (ICML 2025)
- Rethinking Evaluation of Sparse Autoencoders through the Representation of Polysemous Words (ICLR 2025)
- Geometric-Averaged Preference Optimization for Soft Preference Labels (NeurIPS 2024)
- A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts (ICML 2024)
- Open X-Embodiment: Robotic Learning Datasets and RT-X Models (ICRA 2024, Best Conference Paper Award)
- A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis (ICLR 2024, Oral, 1.2% acceptance rate)
- Multimodal Web Navigation with Instruction-Finetuned Foundation Models (ICLR 2024)
- A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation (ICLR 2023, Notable-top-25%, 8% acceptance rate)
Research Experience
Research Scientist at Google DeepMind, working primarily on multimodal AI agents and alignment for diffusion models. Formerly a Student Researcher at Google DeepMind, hosted by Heiga Zen and Izzeddin Gur.
Education
Ph.D.: The University of Tokyo, Advisor: Yutaka Matsuo; BEng and MEng: The University of Tokyo, Advisors: Yutaka Matsuo and Shixiang Shane Gu.
Background
Research Interests: Multimodal AI agents, alignment for diffusion models, and mechanistic interpretability of LLMs. Professional field: Artificial Intelligence.