Yang Liu

Google Scholar ID: HxTr-CtMdrsC
Microsoft
natural language processing, text summarization, text generation
Citations & Impact (all-time)
Citations: 8,792
H-index: 30
i10-index: 42
Publications: 20
Co-authors: 5 (list available)
Resume (English only)
Academic Achievements
  • Mar 2025: Released KodCode, the largest verified synthetic coding dataset for Code LLM training
  • Jul 2024: Introduced Samba, a powerful hybrid LLM
  • May 2024: Built GPT-4 Japanese
  • Mar 2023: Proposed G-Eval: NLG evaluation using GPT-4 with better human alignment
  • Nov 2022: Released UniSumm, a state-of-the-art few-shot summarization model
  • Oct 2022: Five papers accepted at EMNLP 2022
  • Mar 2022: Two papers accepted at ACL 2022
  • Mar 2021: Three papers (two long, one short) accepted at NAACL 2021
  • Jan 2021: RE-T5 model ranked 1st in CommonGen competition
  • Oct 2020: Ranked 1st in FEVER competition
  • Published numerous papers at top-tier conferences including NeurIPS 2023, EMNLP, ACL, AAAI, and NAACL
  • Notable works include DialogLM (pre-trained model for dialogue understanding and summarization), MediaSum (large-scale media interview summarization dataset), and DialogSum (real-life dialogue summarization dataset)
  • Multiple papers include open-source code or public datasets (marked with [code] or [dataset])