arXiv 2025: A High-Quality Dataset and Reliable Evaluation for Interleaved Image-Text Generation
arXiv 2025: ARMOR: Empowering Autoregressive Multimodal Understanding Model with Interleaved Multimodal Generation via Asymmetric Synergy
arXiv 2025: SridBench: Benchmark of Scientific Research Illustration Drawing of Image Generation Model
AAAI 2026: MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models
NeurIPS 2025: Sekai: A Video Dataset towards World Exploration
EMNLP 2025: InMind: Evaluating LLMs in Capturing
Research Experience
Researcher, Shanghai AI Lab
Background
Research interests: vision-and-language, image/video generation, internet-augmented generation, and compositional generalization. Currently a researcher at Shanghai AI Lab, collaborating closely with Dr. Kaipeng Zhang and Dr. Wenqi Shao.