Zhengyan Shi

Google Scholar ID: TF8l2ZEAAAAJ
Microsoft Research
Natural Language Processing · Language Models
Citations & Impact (all-time)
  • Citations: 430
  • H-index: 10
  • i10-index: 11
  • Publications: 20
  • Co-authors: 18
Academic Achievements
Publications:
  • Understanding Likelihood Over-optimisation in Direct Alignment Algorithms, Preprint, 2024
  • Instruction Tuning With Loss Over Instructions, NeurIPS, 2024
  • DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning, ICLR, 2024
  • Don't Stop Pretraining? Make Prompt-based Fine-tuning Powerful Learner, NeurIPS, 2023
  • Rethinking Semi-supervised Learning with Language Models, Findings of ACL, 2023
Research Experience
  • Microsoft Research: current research focuses on teaching language models to code
  • Cohere (London): Research Intern
  • Amazon (London & Seattle): Research Intern
Education
  • PhD in Computer Science, University College London (UCL)
  • MSc in Data Science (Statistics) with Distinction, UCL
  • BSc in Mathematics with First Class Honours, University of Liverpool and Xi'an Jiaotong-Liverpool University
Background
  • I am a Senior Researcher at Microsoft Research (MSR). My current research focuses on teaching language models (LMs) to code. I build learning loops in which LMs not only act but also reason within scalable, self-evolving environments.
Miscellany
  • Links: Google Scholar / Twitter / Github / LinkedIn / Email