Zeyu Huang
Google Scholar ID: EWU88_YAAAAJ
The University of Edinburgh
Large Language Models
Homepage
Google Scholar
Citations & Impact (all-time)
Citations: 417
H-index: 9
i10-index: 8
Publications: 17
Co-authors: 6
Contact
Email: zeroy.huang@gmail.com
CV
Twitter
GitHub
LinkedIn
Publications
1 item
A Unified View of Attention and Residual Sinks: Outlier-Driven Rescaling is Essential for Transformer Training, 2026 (0 citations)
Resume (English only)
Academic Achievements
Blending Supervised and Reinforcement Fine-Tuning with Prefix Sampling, arXiv 2025
A Controllable Examination for Long-Context Language Models, NeurIPS 2025 Datasets & Benchmarks Track, Spotlight (56/1995 submissions)
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free, NeurIPS 2025, Oral (77/21575 submissions)
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models, ACL 2025 Main
Post-hoc Reward Calibration: A Case Study on Length Bias, ICLR 2025
Layerwise Recurrent Router for Mixture-of-Experts, ICLR 2025
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training, NeurIPS 2024, Spotlight (325/15671 submissions)
Unlocking Emergent Modularity in Large Language Models, NAACL 2024, Outstanding Paper (6/2604 submissions)
Transformer-Patcher: One Mistake Worth One Neuron, ICLR 2023
Co-authors
6 total
Zihan Qiu
Qwen Team, Alibaba Group & IIIS, Tsinghua University
Xiaofeng Zhang
Mila
Yikang Shen
xAI
Ivan Titov
University of Edinburgh / University of Amsterdam
Zili Wang
LLM Researcher, StepFun & M-A-P
Edoardo M. Ponti
University of Edinburgh & NVIDIA