Published multiple papers at top-tier conferences and journals including AAAI, CVPR, ACM MM, IJCAI, and ACL, covering diffusion models, image generation/editing, image captioning, speech understanding, and efficient vision Transformers
2024: Published technical reports such as 'Diffusion-RWKV', 'Music Consistency Models', and 'Scalable Diffusion Models with State Space Backbone'; co-authored 'Tuning-Free Inversion-Enhanced Control for Consistent Image Editing', accepted at AAAI 2024
2023: Published 'A-JEPA: Joint-Embedding Predictive Architecture Can Listen', 'Gradient-Free Textual Inversion' (ACM MM 2023), 'Incorporating Unlikely Negative Cues for Distinctive Image Captioning' (IJCAI 2023), and 'Masked Auto-Encoders Meet Generative Adversarial Networks and Beyond' (CVPR 2023)
2022: Published 'Efficient Modeling of Future Context for Image Captioning' (ACM MM 2022), 'DeeCap: Dynamic Early Exiting for Efficient Image Captioning' (CVPR 2022), and 'Selecting Stickers in Open-Domain Dialogue through Multitask Learning' (Findings of ACL 2022)
Contributed to DSTC10 (10th Dialog System Technology Challenge) Track 1; co-authored the overview paper published in IEEE/ACM Transactions on Audio, Speech, and Language Processing
Awarded first place in multiple competitions: DeeCamp 2020 Innovation Track (sci-fi novel generation), Huawei Cloud AI Application Innovation Competition 2020 (AI-assisted writing), and the 6th CAAI National Youth AI Innovation & Entrepreneurship Conference (adviser: Ming Zhou)