Dongyang Fan
Google Scholar ID: U7yzfCkAAAAJ
EPFL
machine learning, LLMs
Citations & Impact (all-time)
  • Citations: 36
  • H-index: 3
  • i10-index: 2
  • Publications: 11
  • Co-authors: 10
Resume (English only)
Academic Achievements
  • Published multiple papers, including 'Apertus: Democratizing Open and Compliant LLMs for Global Language Environments', 'TiMoE: Time-Aware Mixture of Language Experts', 'URLs Help, Topics Guide: Understanding Metadata Utility in LLM Training', and 'Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs'. Received an oral presentation award at COLM 2025.
Research Experience
  • Contributed to multiple research projects, including the Apertus team's pretraining work, TiMoE: Time-Aware Mixture of Language Experts, and URLs Help, Topics Guide: Understanding Metadata Utility in LLM Training.
Education
  • 4th-year PhD student at Machine Learning and Optimization Lab at EPFL, supervised by Prof. Martin Jaggi.
Background
  • Research interests include: data-efficient language modeling; Mixture-of-Experts architectures; decentralized training methods; accelerating LLM pretraining through metadata conditioning; responsible language modeling; data-compliant pretraining that respects owners' opt-out choices; compensation frameworks for data contributors; understanding and mitigating model hallucinations.
Miscellany
  • Enjoys the arts and culture, as well as outdoor activities such as hiking, skiing, and sailing; also paints scenes from hiking trips.