- 2025: Show-o: One Single Transformer to Unify Multimodal Understanding and Generation (ICLR, PREMIA Best Student Paper Award 2025)
- 2025: CLIMS++: Cross Language Image Matching with Automatic Context Discovery for Weakly Supervised Semantic Segmentation (IJCV)
- 2025: Faster Diffusion via Temporal Attention Decomposition (TMLR)
- 2024: Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models (arXiv)
- Awards:
- 2025: PREMIA Best Student Paper Award
- Projects:
- Development and release of Show-o and Show-o2 models
Research Experience
- Work Experience:
- Show Lab, National University of Singapore, PhD Student (2023 to present)
- Research Projects:
- Development and training of Show-o and Show-o2 models
- Research on unified models for multimodal understanding and generation
- Position: PhD Student
Education
- Degree: PhD
- University: National University of Singapore
- Advisor: Prof. Mike Shou
- Duration: 2023 to present
- Major: Computer Science
Background
- Research Interests: Label-efficient learning, weakly-supervised object localization, semantic segmentation, visual prompt learning, controllable image synthesis, multimodal understanding and generation
- Professional Field: Computer Vision, Machine Learning
- Brief Introduction: Third-year PhD student at Show Lab, National University of Singapore, advised by Prof. Mike Shou. Research focuses on developing unified models for multimodal understanding and generation.