Anej Svete

Google Scholar ID: 9ezEOeUAAAAJ
ETH Zurich
computer science · natural language processing · machine learning
Citations & Impact (all-time)
  • Citations: 181
  • h-index: 8
  • i10-index: 7
  • Publications: 20
  • Co-authors: 37
Academic Achievements
  • Published multiple papers at top-tier venues (ACL, ICLR, EMNLP, NAACL, and NeurIPS), including:
  • “The Exact Expressive Power of Fixed-Precision Looped Padded Transformers” (arXiv)
  • “Information Locality as an Inductive Bias for Neural Language Models” (ACL 2025)
  • “Unique Hard Attention: A Tale of Two Sides” (ACL 2025)
  • “Gumbel Counterfactual Generation From Language Models” (ICLR 2025)
  • “Training Neural Networks as Recognizers of Formal Languages” (ICLR 2025)
  • “A Probability-Quality Trade-off in Aligned Language Models and its Relation to Sampling Adaptors” (EMNLP 2024)
  • “Can Transformers Learn n-gram Language Models?” (ACL 2024)
  • “On Efficiently Representing Regular Languages as RNNs” (ACL 2024 Findings)
  • “An L* Algorithm for Deterministic Weighted Regular Languages” (ACL 2024)
  • “On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning” (ACL 2024)
  • “On Affine Homotopy between Language Encoders” (NeurIPS 2024)
  • “What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages” (ACL 2024)
  • “Transformers Can Represent n-gram Language Models” (NAACL 2024)
  • “Lower Bounds on the Expressivity of Recurrent Neural Language Models” (NAACL 2024)
  • “The Role of n-gram Smoothing in the Age of Neural Networks” (NAACL 2024)
  • Invited talk at NeurIPS 2025 Workshop on Principles of Generative Modeling (December 2025)
  • Organized the tutorial “The Underlying Logic of Language Models” at ICML 2025 (July 2025)
  • Organized the tutorial “Computational Expressivity of Neural Language Models” at ACL 2024 (August 2024)
Background
  • PhD Student in Natural Language Processing at ETH Zurich
  • Research at the intersection of formal language theory and modern language models
  • Investigates the capabilities and limitations of neural networks (e.g., Transformers): what problems they can solve, which aspects of language they capture, and whether they can truly “reason”
  • Student Researcher at the Allen Institute for AI (Ai2) since Summer 2025
  • Collaborates with Ashish Sabharwal on reasoning and problem-solving in language models
  • Co-advised by Prof. Ryan Cotterell and Prof. Valentina Boeva
  • Co-organizes the Formal Languages and Neural Networks (FLaNN) Seminar