2023: DFN, the best large-scale CLIP image encoder on the market
2022: Co-designed and co-led the Google Robotics SayCan effort, which initiated the new field of Foundational Models for Robotics Decision Making and received a best paper award at CoRL 2022
2017-2021: Initiated and co-led the robot navigation effort (10+ FTEs) within Google Robotics, resulting in systems and published work across Object-driven Robot Navigation, Social Robot Navigation, and Mobile Manipulation
2015: Co-authored the Show and Tell paper, which initiated the new field of Vision-Language models; one of the most cited papers ever from CVPR 2015
2013-2014: Co-authored the DeepPose and Object Detection papers, the first deep neural network work on localization in images; among the most cited papers from CVPR 2014
Background
Research interests include multimodal foundational models and embodied AI. Serves as a Research Scientist and Manager at Apple ML Research.