Xinpeng Wang

Google Scholar ID: QcNNM2YAAAAJ
PhD Student, LMU Munich
Alignment
Citations & Impact (all-time)
  • Citations: 412
  • H-index: 9
  • i10-index: 9
  • Publications: 12
  • Co-authors: 14
Resume
Academic Achievements
  • Is It Thinking or Cheating? Detecting Implicit Reward Hacking by Measuring Reasoning Effort, preprint, 2025
  • Refusal Direction is Universal Across Safety-Aligned Languages, NeurIPS, 2025
  • Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation, ICLR, 2025
  • Look at the Text: Instruction-Tuned Language Models are More Robust Multiple Choice Selectors than You Think, COLM, 2024
  • "My Answer is C": First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models, ACL Findings, 2024
  • ACTOR: Active Learning with Annotator-specific Classification Heads to Embrace Human Label Variation, EMNLP, 2023
  • How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives, ACL, 2023
  • Sceneformer: Indoor Scene Generation with Transformers, 3DV, 2021
Research Experience
  • PhD Student, MaiNLP Lab, LMU Munich
  • Visiting Researcher, New York University
  • Student Researcher, TUM
Education
  • PhD, LMU Munich, Supervisor: Prof. Barbara Plank
  • M.Sc., Robotics, Cognition, Intelligence, Technical University of Munich (TUM), Research: Indoor scene synthesis
Background
  • Research interests: alignment. Xinpeng Wang is currently a PhD student at the MaiNLP lab at LMU Munich, supervised by Prof. Barbara Plank, and a visiting researcher at New York University, advised by Prof. He He.