Publications
1. SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training - ICML 2025
2. Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs - Preprint
3. Emergence of Segmentation with Minimalistic White-Box Transformers - CPAL 2024 (Oral), NeurIPS 2023 XAI Workshop (Oral)
4. Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models - ICLR 2024
5. White-Box Transformers via Sparse Rate Reduction - NeurIPS 2023
6. White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is? - JMLR 2024
Research Experience
1. Visiting researcher at UC Berkeley, working on representation learning and interpretability
2. Teaching Assistant for DATA8014
Education
1. Ph.D. student at HKU IDS, advised by Prof. Yi Ma (since Fall 2024)
2. B.Eng. in Computer Science (Honors) from ShanghaiTech University
Background
Research Interests: Data, architecture, training, and understanding of foundation models; particularly interested in the empirical understanding of training (M)LLMs with reinforcement learning. Conducted representation learning and interpretability research during a visit to UC Berkeley.
Miscellany
From Suzhou; enjoys seasoning dishes with a touch of sweetness, and is easily annoyed by cooking philosophies opposed to his own.