Publications: 'Rote Learning Considered Useful: Generalizing over Memorized Data in LLMs' (ICML 2025 Workshop), 'Towards Reliable Latent Knowledge Estimation in LLMs: Zero-Prompt Many-Shot Based Factual Knowledge Extraction' (WSDM 2025). Other research includes privacy and security in LLMs, neuroscience-inspired modeling, and LLM systems and optimization.
Research Experience
As a PhD student at MPI-SWS, collaborates closely with Evimaria Terzi (Boston University), Mariya Toneva (MPI-SWS), and Muhammad Bilal Zafar (Ruhr University Bochum). Also served as a teaching assistant for a seminar course on LLM training at Saarland University.
Education
PhD: Max Planck Institute for Software Systems (MPI-SWS), through the CS@Max Planck doctoral program, advised by Krishna Gummadi. Bachelor's: University of Electronic Science and Technology of China (UESTC), with a major in mathematics-physics.
Background
Research Interests: How large language models (LLMs) internalize, represent, and utilize knowledge, with the goal of enhancing their reliability, interpretability, and safety. The work centers on the interplay between internal learning (acquired during training) and external adaptation (via prompts, retrieval, or tool use). Ultimately, the research aims to understand and improve the loop of how LLMs learn, remember, refer, and act, toward more trustworthy and cognitively grounded AI systems.