Published several papers, including 'ManagerBench: Evaluating the Safety-Pragmatism Trade-Off in Autonomous LLMs' and 'Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs'.
Research Experience
Interned at Meta AI, studying attention dynamics in translation models. Co-organized the GEM workshop at ACL 2025. Active contributor to the EvalEval coalition, which aims to standardize and compare evaluation outputs across frameworks.
Education
PhD candidate at the Technion, co-advised by Yonatan Belinkov (Technion) and Gabriel Stanovsky (the Hebrew University of Jerusalem). Completed an M.Sc. at Tel Aviv University under Omer Levy, investigating how token-level spelling information is encoded in embedding matrices.
Background
Interested in evaluating and interpreting large language models (LLMs), with a particular focus on their reasoning and decision-making processes, including failure modes that reveal human-like cognitive biases. Combines behavioral and representational analyses to understand model tendencies and how pretraining and fine-tuning shape them.
Miscellany
Open to collaboration and always happy to discuss research, language models, and everything in between. Feel free to reach out via email!