Developed the Coding Assistant Task Orchestrator (CATO), the first SLA-aware LLM serving algorithm for coding tasks; deployed internally at Huawei Cloud on Ascend NPU clusters, achieving up to 41.4% higher goodput than model-centric approaches such as Ray Serve
Published 'SLA-Awareness for AI-assisted coding' (arXiv:2503.19876)
Co-authored 'Rethinking Software Engineering in the Era of Foundation Models' (FSE 2024)
Co-authored 'Rethinking Software Engineering in the Foundation Model Era: From Task-Driven AI Copilots to Goal-Driven AI Pair Programmers' (arXiv:2404.10225)
Published 'Towards training reproducible deep learning models' (ICSE 2022)