- Published papers such as 'Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long context, and Next Generation Agentic Capabilities';
- Contributed to various research projects, including 'Multimodal Few-Shot Learning with Frozen Language Models', 'Representation Learning Without Labels', and 'Neural Scene Representation and Rendering';
- Delivered talks or participated in tutorials at multiple international conferences.
Research Experience
- At Google DeepMind, contributes to Gemini's search, agentic, and reasoning capabilities, as well as its research strategy;
- Led the DL:X team (focusing on generative models, self-supervised learning, and multimodal large language models);
- Headed the Quantum Chemistry and Materials team (mainly density functional theory (DFT) and some downstream applications);
- Created the 'Neural Google' prototype which evolved into Google Search AI Mode.
Education
- PhD: University of Edinburgh, supervised by Christopher Williams;
- Postdoctoral research: Microsoft Research Cambridge, with John Winn;
- Visiting researcher at Oxford University, collaborating with Andrew Zisserman.
Background
Research interests include artificial intelligence, generative models, self-supervised learning, and multimodal large language models. Currently a Director and Principal Research Scientist at Google DeepMind.