Guangzhi Sun

Google Scholar ID: PzPAzf8AAAAJ

University of Cambridge

Speech and language technology, conversational AI

Citations & Impact

All-time citations: 1,672

Publications

25 items

- SLVMBench: Skill Learning from Video Memory (2026)
- video-SALMONN-R$^3$: Learning to ReWatch, ReAsk, and ReAnswer for Efficient Video Understanding (2026)
- Uncertainty-based Debiasing and Unlearning for Decontamination (2026)
- Measuring the Redundancy of Decoder Layers in SpeechLLMs (2026)
- Who can we trust? LLM-as-a-jury for Comparative Assessment (2026)
- Scaling Open Discrete Audio Foundation Models with Interleaved Semantic, Acoustic, and Text Tokens (2026)
- OCR-Enhanced Multimodal ASR Can Read While Listening (2026)
- Protecting Bystander Privacy via Selective Hearing in LALMs (2025)

Resume (English only)

Academic Achievements

- Paper 'Unlearning vs. Obfuscation: Are We Truly Removing Knowledge?' accepted at EMNLP 2025 main conference
- Paper 'SkillAggregation' accepted at ACL 2025 main conference
- Papers 'video-SALMONN-o1', 'CASE-Bench', and 'F-16' accepted at ICML 2025
- 1 paper accepted at ICLR 2025, and 1 paper accepted at NAACL 2025
- 1 paper accepted at ICASSP 2025
- Paper 'CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models' accepted at NeurIPS 2024 Workshop
- Journal article 'Large Language Models Surpass Human Experts in Predicting Neuroscience Results' published in Nature Human Behaviour
- Won Best Short Paper Award at CUI 2024
- 4 papers accepted at Interspeech 2024
- Paper 'Building Better AI Agents: A Provocation on the Utilisation of Persona in LLM-based Conversational Agents' accepted at CUI 2024
- Paper 'av-SALMONN: Speech-Enhanced Audio-Visual Large Language Models' accepted at ICML 2024
- 4 papers accepted at ICASSP 2024
- Journal article 'Graph Neural Networks for Contextual ASR with the Tree-Constrained Pointer Generator' published
- Paper 'SALMONN: Towards Generic Hearing Abilities for Large Language Models' accepted at ICLR 2024

Research Experience

- Junior Research Fellow at Trinity College, University of Cambridge, since October 2024
- Research Associate at the Machine Intelligence Laboratory, University of Cambridge, working with Prof. Phil Woodland
- Closely collaborating with Prof. Chao Zhang at Tsinghua University
- Research Internship at Google Brain with Dr Yu Zhang in 2019
- Research Internship at ByteDance with Dr Wei Li in 2023
- Collaborated with Poly AI Ltd, working with Dr Ivan Vulić and Dr Paweł Budzianowski, in 2023

Education

- Ph.D., June 2023, University of Cambridge, supervised by Prof. Phil Woodland (advisor Prof. Mark Gales)
- B.A. and M.Eng, 2019, Trinity College, University of Cambridge

Background

- Research Interests: Controllable and reliable multimodal conversational AI, including multimodal contextual knowledge integration, reliability, hallucination reduction, and multimodal contextualised AI safety
- Professional Fields: Speaker diarisation, language modelling, and speech synthesis
