First author of 'Demystifying CLIP Data (MetaCLIP)', an ICLR 2024 Spotlight paper; core contributor to 'The Llama 3 Herd of Models' project; led the development of 'MoDE: CLIP Data Experts via Clustering', presented at CVPR 2024.
Research Experience
Serves as a research scientist at Facebook AI Research (FAIR). During his Ph.D., he interned at Facebook AI, Amazon AI, and WeChat AI lab. He also won the Yelp Dataset Challenge.
Education
Received a master's degree in microelectronics from Peking University; obtained a Ph.D. in computer science from the University of Illinois at Chicago, advised by Prof. Philip S. Yu and Prof. Bing Liu.
Background
Research interests include multi-modal pre-training (CLIP and unCLIP), self-supervised learning, and Internet-scale data curation. Specializes in Lifelong Representation Learning, Open-world Learning, Sentiment Analysis, Question Answering, and Conversational AI.
Miscellany
Open to giving invited talks on his research. His team is looking for research interns in multi-modal pre-training. Defended his Ph.D. thesis in February 2020 and open-sourced it; submitted a Ph.D. proposal on lifelong representation learning for NLP in September 2019.