Exploring Classification Equilibrium in Long-Tailed Object Detection (ICCV 2021)
PromptDet: Towards Open-vocabulary Detection using Uncurated Images (ECCV 2022)
AeDet: Azimuth-invariant Multi-view 3D Object Detection (CVPR 2023)
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset (CVPR 2024)
InstructVEdit: A Holistic Approach for Instructional Video Editing (Preprint, 2025)
P3Nav: A Unified Framework for Embodied Navigation Integrating Perception, Planning, and Prediction (Preprint, 2025)
DriveMM: All-in-One Large Multimodal Model for Autonomous Driving (Preprint, 2024)
RoboMM: All-in-One Multimodal Large Model for Robotic Manipulation (Preprint, 2024)
Research Experience
Researcher at Meituan Inc.
Background
Currently a researcher at Meituan Inc. Primary research interests include large multimodal models, diffusion models, autonomous driving, embodied AI, object detection, and domain adaptation. Particularly interested in applying large multimodal models and diffusion models to improve daily life.