Keqing He

Google Scholar ID: 811USNoAAAAJ
Affiliation: unknown
Research interest: LLM
Citations & Impact (all-time)
  • Citations: 1,398
  • H-index: 20
  • i10-index: 37
  • Publications: 20
  • Co-authors: 0
Resume (English only)
Academic Achievements
  • ICLR 2025: "AgentRefine: Enhancing Agent Generalization through Refinement Tuning"
  • ACL 2024: "DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning"
  • EMNLP 2024: "How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with Really Good Data"
  • COLING 2024: 2 papers accepted
  • ACL 2024: 1 paper accepted
  • ICLR 2023: 1 paper accepted
  • EMNLP 2023: 4 papers accepted
  • ACL 2023: 4 papers accepted
  • EMNLP 2022: 4 papers accepted
  • COLING 2022: 3 papers accepted
  • CIKM 2022: 1 paper accepted
  • NAACL 2022: 2 papers accepted
  • SIGIR 2022: 1 paper accepted
  • ACL 2022: 1 paper accepted
  • arXiv: "SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild"
Research Experience
  • Mar 2023–Present: Full-time Researcher at Meituan LLM Group, focusing on reasoning models, MoE, and LLM alignment
  • Jun 2021–Mar 2023: Full-time Researcher at Meituan NLP Group, working on dialogue systems and dialogue pretraining
  • Jun 2020–Oct 2020: Research Intern at Alibaba DAMO Academy, focusing on recommendation systems
  • Mar 2020–Jun 2020: Research Intern at Tencent WeChat AI Lab, working on zero-shot learning and slot filling
  • Oct 2019–Mar 2020: Research Intern at Meituan NLP Group, researching GCN and dialogue systems
Background
  • Currently working at Meituan LLM Team, with research focus on reasoning models (e.g., o1), Mixture of Experts (MoE), and LLM alignment.
  • Research interests center on three key areas of Large Language Models (LLMs): complex reasoning, reinforcement learning in real-world settings, and LLM alignment.
  • In complex reasoning, focuses on the evolution of foundation models and the optimization of long-CoT RL, aiming to build new technical pipelines spanning pre-training to post-training.
  • In real-world reinforcement learning, explores LLM-driven end-to-end agent systems (e.g., DeepResearch, GUI Agent, Embodied Agent) to push intelligence boundaries through interaction with dynamic environments.
  • In LLM alignment, works on scalable alignment learning, including data evaluation/optimization and preference learning algorithms, to ensure models are both powerful and aligned with human values.