Scholar
Yongyuan Liang
Google Scholar ID: GQToORIAAAAJ
University of Maryland, College Park
Large Language Models
Large Multimodal Models
Reinforcement Learning
Homepage
Google Scholar
Citations & Impact (all-time)
Citations: 715
H-index: 13
i10-index: 14
Publications: 20
Co-authors: 0
Contact
Email: charlotte9762@gmail.com
Twitter
GitHub
Publications
8 of 15 items shown
Anticipatory Planning for Multimodal AI Agents (2026) · Citations: 0
Learning Situated Awareness in the Real World (2026) · Citations: 0
Failure-Aware RL: Reliable Offline-to-Online Reinforcement Learning with Self-Recovery for Real-World Manipulation (2026) · Citations: 0
MomaGraph: State-Aware Unified Scene Graphs with Vision-Language Model for Embodied Task Planning (2025) · Citations: 0
Lemon: A Unified and Scalable 3D Multimodal Model for Universal Spatial Understanding (2025) · Citations: 0
TraceGen: World Modeling in 3D Trace Space Enables Learning from Cross-Embodiment Videos (2025) · Citations: 0
WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation (2025) · Citations: 0
ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation (2025) · Citations: 0
Resume (English only)
Academic Achievements
- Multiple papers accepted at top-tier conferences, including NeurIPS 2025, CVPR 2025, ICLR 2025, and ICML 2024
- Awarded the Dean's Fellowship in 2024
- Paper 'ACE' selected for a long oral presentation at ICML 2024
- Three papers accepted at ICLR 2024, including two spotlights
- Leads or contributes to open-source initiatives such as Awesome-Generalist-Agents, Magma, and Make-An-Agent
- Notable works include ROVER (a benchmark for multimodal reasoning), LEMON (a 3D multimodal model), Avocado (a multi-objective alignment framework), and TraceVLA (an embodied agent policy model)
Background
Research focuses on developing foundation models and intelligent agents
Actively explores both theoretical frameworks and empirical methods, with specific interests in:
- Large Multimodal Models: Unified models for 2D/3D virtual and physical agentic tasks
- Alignment: Human preference alignment and cross-modality alignment in post-training
- Previous research includes Reinforcement Learning, Representations, and Robustness
Co-authors
0 total (list not available)