- 'ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding', ICLR2025 Oral (1.8%).
- 'Aria: An Open Multimodal Native Mixture-of-Experts Model', Technical Report.
- 'Exploring AIGC Video Quality: A Focus on Visual Harmony, Video-Text Consistency and Domain Distribution Gap', CVPRW2024.
- 'Bringing Textual Prompt to AI-Generated Image Quality Assessment', ICME2024.
News
- 2025.02: ChartMoE selected for an Oral presentation at ICLR2025.
- 2025.01: ChartMoE accepted by ICLR2025.
- 2024.10: Released Aria, a native LMM excelling in text, code, image, video, PDF, and more.
- 2024.09: Released ChartMoE, an MLLM with MoE connector for advanced chart understanding, replot, editing, highlighting, and transformation.
Research Experience
Participated in several MLLM research projects:
- 2024.05 - 2024.12: 01.ai & Rhymes.ai, Multimodal Team, supervised by Junnan Li, working closely with Dongxu Li and Haoning Wu.
- 2024.02 - 2024.07: IDEA Research, working closely with Zhengzhuo Xu, Yiyan Qi, and Chengjin Xu.
Education
An MPhil candidate at the School of Electronic and Computer Engineering (SECE), Peking University (PKU) since Fall 2022. Previously received an Honours B.E. degree from the School of Electronic Information and Communications (EIC), Huazhong University of Science and Technology (HUST) in June 2022.
Background
Research interests include Vision-Language Models and MLLM Reasoning.
Miscellany
Enjoys open-sourcing. The logo of his website is his pet cat, Baka.