1. IJCV 2025: High-Quality Sound Separation Across Diverse Categories via Visually-Guided Generative Modeling
2. NeurIPS 2025: ZeroSep: Separate Anything in Audio with Zero Training
3. NeurIPS 2025: Harnessing the Computation Redundancy in ViTs to Boost Adversarial Transferability
4. NeurIPS 2025 D&B Track: MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness
5. arXiv preprint 2025: Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models
7. ICML 2025: BinauralFlow: A Causal and Streamable Approach for High-Quality Binaural Speech Synthesis with Flow Matching Models
8. CVPR 2025: VIDCOMPOSITION: Can MLLMs Analyze Compositions in Compiled Videos?
9. ICLR 2025: Rethinking Audio-Visual Adversarial Vulnerability from Temporal and Modality Perspectives
10. arXiv preprint: Generative AI for Cel-Animation: A Survey
Research Experience
1. Conducting Ph.D. research in Prof. Chenliang Xu's lab (2022-present)
2. Worked in Prof. Shiguang Shan's group for one and a half years (2020-2021)
3. Closely collaborated with Prof. Ming-Hsuan Yang
Education
1. Ph.D. student, Department of Computer Science, University of Rochester, Advisor: Prof. Chenliang Xu (2022-present)
2. Bachelor's degree, Computer Science, University of Chinese Academy of Sciences, Advisor: Prof. Shiguang Shan (2020-2021)
Background
Research Interests: Computer Vision and Deep Learning, especially audio-visual learning, implicit neural fields, multi-modal learning, and trustworthy AI.
Miscellany
The English name Susan is simply the pinyin of the given name in the Chinese name 梁苏叁, so people often assume its bearer is female. The film Johnny English Reborn has an amusing clip about how Susan is pronounced.