Paper 'VLA reinforced fine-tuning via consistency policy in real-world environments (ConRFT)' accepted to RSS 2025; paper 'Generalizing consistency policy for visual RL tasks (CP3ER)' accepted to NeurIPS 2024; paper 'Consistency policy for robotic RL tasks (CPQL)' accepted to AAMAS 2024. Two further papers are under review: 'QDepth-VLA: Quantized Depth Prediction as Auxiliary Supervision for Vision-Language-Action Models' and 'Survey of Vision-Language-Action Models for Embodied Manipulation'.
Research Experience
Before joining CASIA, he worked as an MCU embedded engineer at DJI. Currently, he is conducting research on VLA (Vision-Language-Action) models.
Education
He is currently a third-year Ph.D. student in Control Theory and Engineering at the Institute of Automation, Chinese Academy of Sciences (CASIA), supervised by Prof. Dongbin Zhao and Prof. Haoran Li. He earned his B.Eng. degree in Information Engineering from the Beijing Institute of Technology (BIT) and in Electrical Communication Engineering from the Australian National University (ANU).
Background
Research interests include reinforcement learning and robot learning, with a focus on building large-scale foundation models that tightly couple vision, language, and action. The goal is to develop foundation models for embodied AI agents to accelerate scientific research.