Susan Liang

Google Scholar ID: x3HBE2gAAAAJ

University of Rochester

Computer Vision

Citations & Impact

All-time

Citations

447

H-index

i10-index

Publications

Co-authors

list available

Contact

Email: sliang22@ur.rochester.edu

Publications

21 items

AdaTurn: Budget-Aware Test-Time Scaling for Active Visual Perception Agents

2026

SurgAtlas: A Large-Scale Surgical Video-Language Dataset with 2,391 Hours of Open and Minimally Invasive Surgery

2026

Can VLMs Truly Forget? Benchmarking Training-Free Visual Concept Unlearning

2026

TDMM-LM: Bridging Facial Understanding and Animation via Language Models

2026

Omni-Judge: Can Omni-LLMs Serve as Human-Aligned Judges for Text-Conditioned Audio-Video Generation?

2026

When to Think and When to Look: Uncertainty-Guided Lookback

2025

Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models

2025

High-Quality Sound Separation Across Diverse Categories via Visually-Guided Generative Modeling

2025

Resume (English only)

Academic Achievements

1. IJCV 2025: High-Quality Sound Separation Across Diverse Categories via Visually-Guided Generative Modeling
2. NeurIPS 2025: ZeroSep: Separate Anything in Audio with Zero Training
3. NeurIPS 2025: Harnessing the Computation Redundancy in ViTs to Boost Adversarial Transferability
4. NeurIPS 2025 D&B Track: MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness
5. arXiv preprint 2025: Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models
6. ICCV 2025: π-AVAS: Can Physics-Integrated Audio-Visual Modeling Boost Neural Acoustic Synthesis?
7. ICML 2025: BinauralFlow: A Causal and Streamable Approach for High-Quality Binaural Speech Synthesis with Flow Matching Models
8. CVPR 2025: VIDCOMPOSITION: Can MLLMs Analyze Compositions in Compiled Videos?
9. ICLR 2025: Rethinking Audio-Visual Adversarial Vulnerability from Temporal and Modality Perspectives
10. arXiv preprint: Generative AI for Cel-Animation: A Survey

Research Experience

1. Conducting Ph.D. research in Prof. Chenliang Xu's lab (2022-present)
2. Worked in Prof. Shiguang Shan's group for one and a half years (2020-2021)
3. Closely collaborated with Prof. Ming-Hsuan Yang

Education

1. Ph.D. student, Department of Computer Science, University of Rochester, Advisor: Prof. Chenliang Xu (2022-present)
2. Bachelor's degree, Computer Science, University of Chinese Academy of Sciences, Advisor: Prof. Shiguang Shan (2020-2021)

Background

Research Interests: Computer Vision and Deep Learning, especially audio-visual learning, implicit neural fields, multi-modal learning, and trustworthy AI.

Miscellany

The English name Susan is the pinyin of the given name in 梁苏叁 (Liang Susan), so people often assume the name belongs to a woman. There is an interesting clip about the pronunciation of Susan in the film Johnny English Reborn.

Co-authors

9 total