Publications & Tools
Several papers accepted and published, including 'Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning' (NeurIPS 2025), 'Separating Tongue from Thought: Activation Patching Reveals Language-Agnostic Concept Representations in Transformers' (ACL 2025), and 'Narrow Finetuning Leaves Clearly Readable Traces in the Activation Differences' (NeurIPS 2025 Mechanistic Interpretability Workshop). Also developed open-source tools such as nnterp and tiny-dashboard.
Research Experience
Currently working with Julian Minder on evaluating different model diffing methods. Previously focused on interpretability during a research internship at EPFL DLAB, following up on the 'Do Llamas Work in English?' paper; explored the emergence of XOR features in large language models and the RAX hypothesis proposed by Sam Marks; participated in SPAR with Walter Laurito; and worked on non-maximizing training objectives for RL agents with Jobst Heitzig.
Education
Completing the MVA MSc (Mathematics, Vision, Learning) at École Normale Supérieure Paris-Saclay. Previously completed a research internship at EPFL DLAB under the supervision of Chris Wendler and Bob West.
Background
MATS 7.0 (Winter 2025) Scholar with Neel Nanda. Main research interest is technical AI alignment.
Miscellany
Interested in evolutionary biology and its manifestation in artificial life simulations; improviser at the ENS improv theater club Lika.