IEEE International Conference on Computer Vision · 2022
Citations: 320
Resume (English only)
Academic Achievements
Publications:
- Step1X-Edit: A Practical Framework for General Image Editing, arXiv, 2025
- EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts, arXiv, 2024
- AppAgent: Multimodal Agents as Smartphone Users, arXiv, 2023
- ChartLlama: A Multimodal LLM for Chart Understanding and Generation, arXiv, 2023
- Dual-Perspective Knowledge Enrichment for Semi-Supervised 3D Object Detection, AAAI, 2024
- Prompt-aligned Gradient for Prompt Tuning, ICCV, 2023
Research Experience
Contributed to several research projects, including Step1X-Edit, EMMA, AppAgent, and ChartLlama.
Education
Graduated from Tsinghua University in 2021. Currently a fourth-year Ph.D. student at Nanyang Technological University, advised by Professor Hanwang Zhang. Recently worked as an intern under the guidance of Gang Yu.
Background
Research interests: generative AI, image editing, multimodal large language models, 3D object detection, prompt learning, and video summarization.