Paper 'InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback' accepted to EMNLP 2025 Findings and presented at ICLR 2025 Bi-Align Workshop (Oral).
Paper 'WorldGUI: An Interactive Benchmark for Desktop GUI Automation from Any Starting Point' accepted to the REALM Workshop at ACL 2025.
Paper 'LOVA3: Learning to Visual Question Answering, Asking and Assessment' accepted to NeurIPS 2024.
Paper 'Genixer: Empowering Multimodal Large Language Models as a Powerful Data Generator' accepted to ECCV 2024.
Paper 'SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels' accepted to IJCV 2023.
Paper 'Evaluating the Generalization Ability of Super-Resolution Networks' accepted to TPAMI 2023.
Paper 'ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic' accepted to CVPR 2021.
Paper 'Efficient Image Super-Resolution Using Pixel Attention' accepted to ECCVW 2020, with over 400 citations.
Research Experience
Worked on multiple research projects, including WorldGUI, Genixer, LOVA3, and InterFeedback.
Education
PhD - National University of Singapore, Advisor: Prof. Mike Zheng Shou
Background
PhD student in the Show Lab at the National University of Singapore, advised by Prof. Mike Zheng Shou. Research interests include training Large Multimodal Models (LMMs) and developing agents and benchmarks for GUI automation and chart-to-code generation.