Academic Achievements
HAMSTER accepted at ICLR 2025; STOW accepted at CoRL 2023; DeepIM accepted as an oral at ECCV 2018 and selected as one of the 12 best papers, with an extended version published in IJCV; Deformable ConvNets accepted at ICCV 2017; FCIS accepted as a spotlight at CVPR 2017; R-FCN accepted at NeurIPS 2016; Inst-FCN accepted at ECCV 2016; first prize in the MS COCO segmentation challenge 2016; awarded the National Scholarship (top 0.5% nationwide).
Research Experience
Interned at NVIDIA Seattle Robotics Lab, contributing to three projects under the mentorship of Dr. Ankit Goyal, Dr. Kaichun Mo, and Dr. Arsalan Mousavian, advancing multimodal reasoning, vision-language-action (VLA) models, and policy generalization. Also collaborated with Dr. Jifeng Dai and Dr. Kaiming He at Microsoft Research Asia on deep learning and computer vision research.
Education
Ph.D. from the University of Washington, advised by Prof. Dieter Fox; M.S. from Tsinghua University, advised by Prof. Xiangyang Ji; B.S. from Tsinghua University.
Background
Currently a Member of Technical Staff at xAI, working on the Omni model. Research focuses on developing multimodal foundation models capable of understanding and interacting with the real world.
Miscellany
Invited to give talks on the HAMSTER project at OpenAI, Google DeepMind, and ByteDance Seed; invited speaker on humanoid robot cognition, hosted by Bernstein.