Scholar

Tingle Li

Google Scholar ID: UGpC1zgAAAAJ

PhD Student, UC Berkeley

Multimodal Learning, Auditory Perception, Speech Processing, Computer Vision


Citations & Impact

All-time

Citations

444

H-index

i10-index

Publications

Co-authors

list available

Contact

Email: tingle@eecs.berkeley.edu

Publications

10 items

Benchmarking Single-Factor Physical Video-to-Audio Generation

2026


Conversational Behavior Modeling Foundation Model With Multi-Level Perception

2026


Enabling Conversational Behavior Reasoning Capabilities in Full-Duplex Speech

2025


Schrodinger Audio-Visual Editor: Object-Level Audiovisual Removal

2025


AV-EMO-Reasoning: Benchmarking Emotional Reasoning Capabilities in Omni-modal LLMs with Audio-visual Cues

2025


EMO-Reasoning: Benchmarking Emotional Reasoning Capabilities in Spoken Dialogue Systems

2025


MultiGen: Using Multimodal Generation in Simulation to Learn Multimodal Policies in Real

2025


Sounding that Object: Interactive Object-Aware Image to Audio Generation

2025


Resume (English only)

Academic Achievements

Published papers include 'The Sound of Simulation: Learning Multimodal Sim-to-Real Robot Policies with Generative Audio' (CoRL 2025, Best Paper Finalist), 'Sounding that Object: Interactive Object-Aware Image to Audio Generation' (ICML 2025), and 'Audio Texture Manipulation by Exemplar-Based Analogy' (ICASSP 2025). Received the Sony Research Award.

Research Experience

Member of the Berkeley Artificial Intelligence Research (BAIR) Lab, working on projects including multimodal sim-to-real robot policy learning and interactive object-aware image-to-audio generation.

Education

UC Berkeley, Ph.D. in Computer Science, advisor: Gopala Anumanchipalli; IIIS, Tsinghua University, collaborated with Hang Zhao; Duke University, collaborated with Ming Li.

Background

Currently a fourth-year CS Ph.D. student at UC Berkeley, exploring how we acquire physical common sense by leveraging sound, a neglected source of physical truth, to understand the substance of objects beyond what vision reveals. Advised by Gopala Anumanchipalli, in collaboration with Andrew Owens. Previously worked with Hang Zhao at IIIS, Tsinghua University, and Ming Li at Duke University.

Co-authors

17 total