Publications: 'Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?', ICLR 2025; 'ASIDE: Architectural Separation of Instructions and Data in Language Models', ICLR 2025 Workshop on Building Trust in Language Models and Applications (Oral) and EurIPS 2025 Salon des Refusés; Contributed to the LLMail-Inject dataset project.
Research Experience
Gave a talk on instruction-data separation at the IBM T.J. Watson Research Center; Attended the CISPA-ELLIS Summer School 2025 on Trustworthy AI; Presented the SEP and ASIDE papers at ICLR 2025 and gave a talk on ASIDE at the BuildTrust workshop; Gave an invited talk on ASIDE at ETH.
Education
PhD: ISTA, Austria, under the supervision of Christoph Lampert; B.S. (Hons) in Applied Math and CS from the Yandex Department of Data Analysis at the Moscow Institute of Physics and Technology.
Background
Research interests: AI safety and security, with a particular focus on improving LLM security through architectural design. Previous work includes formalizing the problem of instruction-data separation and proposing a method that increases such separation through architectural changes.
Miscellany
Interests: Making games, creating apps, writing poems, learning the ukulele.