GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices, Preprint, 2024.
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI, ICML, 2024.
OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM, CVPR, 2024.
ChartAssisstant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning, ACL Findings, 2024.
Research Experience
Conducting research at Shanghai Jiao Tong University and Shanghai Artificial Intelligence Laboratory.
Background
A first-year Ph.D. student at Shanghai Jiao Tong University, with research interests in multimodal applications, time series analysis, and quantitative investing. Supervised by Prof. Ping Luo and working closely with Dr. Wenqi Shao.