
Fangxun Shu

Google Scholar ID: 8Fq3EFkAAAAJ

ByteDance

Multimodal


Citations & Impact

All-time citations: 280


Contact

GitHub

Publications

7 items

SAIL-RL: Guiding MLLMs in When and How to Think via Dual-Reward RL Tuning

2025


SAIL-VL2 Technical Report

2025


Fast-Slow Thinking for Large Vision-Language Model Reasoning

2025


CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation

2025


MINT: Multi-modal Chain of Thought in Unified Generative Models for Enhanced Image Generation

2025


Streaming Video Question-Answering with In-context Video KV-Cache Retrieval

2025


T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts

arXiv.org · 2024


Resume

Academic Achievements

May 2025: 1 paper accepted to ACL'25 (T2I-FactualBench)
Jan. 2025: 3 papers accepted to ICLR'25 (LLaVA-MoD, ReKV, ARM)
Dec. 2024: 2 papers accepted to AAAI'25 (MARS, HSA-DPO)
May 2024: 1 paper accepted to TMM'24 (MAC)
Reviewer for CVPR, ICLR, NeurIPS, and ICML

Co-authors

4 total

Cihang Xie

Assistant Professor, University of California, Santa Cruz

Si Liu

Beihang University

Jinqiao Wang 王金桥

Professor, Institute of Automation, Chinese Academy of Sciences

Hongsheng Li (李鸿升)

The Chinese University of Hong Kong