Wei Jie Yeo

Google Scholar ID: DcUMc_IAAAAJ
PhD candidate, Nanyang Technological University
Natural Language Processing, Explainable AI
Citations & Impact (All-time)
Citations: 172
H-index: 5
i10-index: 3
Publications: 10
Co-authors: 0
Resume (English only)
Academic Achievements
  • - Publications:
  • - "Understanding Refusal in Language Models with Sparse Autoencoders" (Preprint, 2025)
  • - "Debiasing CLIP: Interpreting and Correcting Bias in Attention Heads" (Preprint, 2025)
  • - "A comprehensive review on financial explainable AI" (AIRE Journal, 2025)
  • - "Self-training Large Language Models through Knowledge Detection" (EMNLP, 2024)
  • - "How Interpretable are Reasoning Explanations from Prompting Large Language Models?" (NAACL, 2024)
  • - "Plausible Extractive Rationalization through Semi-Supervised Entailment Signal" (ACL, 2024)
Research Experience
  • - PhD Research Project: Focuses on the intersection of NLP and interpretability, particularly as applied to AI safety
  • - Position: PhD Student
Education
  • - Degree: PhD
  • - University: Nanyang Technological University (NTU), Singapore
  • - Advisor: Prof. Erik Cambria
  • - Time: Expected to graduate by the end of 2025
  • - Major: Artificial Intelligence
Background
  • - Research Interests: Natural Language Processing and Interpretability, AI Safety
  • - Professional Field: AI research, particularly improving the understanding of how AI systems model complex behaviors
  • - Introduction: Currently a PhD student at Nanyang Technological University, Singapore, focusing on using interpretability to address AI safety issues such as jailbreak and prompt injection attacks