May 2025: Paper 'Flat-LoRA: Low-Rank Adaptation over a Flat Loss Landscape' accepted to ICML 2025.
March 2025: Paper 'Distraction is All You Need for Multimodal Large Language Model Jailbreaking' accepted to CVPR 2025 as highlight.
September 2024: Paper 'Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement' accepted to NeurIPS 2024 as spotlight.
September 2024: Paper 'PromptIntern: Saving Inference Costs by Internalizing Recurrent Prompt during Large Language Model Fine-tuning' accepted to EMNLP 2024 Findings.
September 2024: Paper 'Low-Dimensional Gradient Helps Out-of-Distribution Detection' accepted to TPAMI 2024.
July 2024: Paper 'Learning Scalable Model Soup on a Single GPU: An Efficient Subspace Training Strategy' accepted to ECCV 2024.
April 2024: Paper 'Online Continual Learning via Logit Adjusted Softmax' accepted to TMLR 2024.
March 2024: Paper 'Revisiting Random Weight Perturbation for Efficiently Improving Generalization' accepted to TMLR 2024. A short version was presented at the NeurIPS 2023 Workshop on Optimization for Machine Learning.
February 2024: Paper 'Friendly Sharpness-Aware Minimization' accepted to CVPR 2024.
Research Experience
Worked at ByteDance Seed, conducting research on large-scale multimodal models.
Education
Ph.D. student at Shanghai Jiao Tong University, advised by Prof. Xiaolin Huang. Visiting student in the Department of Computer Science and Engineering at The Hong Kong University of Science and Technology, advised by Prof. James Kwok.
Background
Currently a researcher at ByteDance Seed, focusing on the development of large multimodal models. Research interests include machine learning and optimization, particularly the efficiency, robustness, and generalization of optimization algorithms in the era of large language models.
Miscellany
Proficient in data structures and algorithms, with experience in several competitive programming contests. Open to discussions and potential collaborations; feel free to reach out via email or WeChat.