Scholar
Francis Rhys Ward
Google Scholar ID: i98avZYAAAAJ
Imperial College London
AI alignment
deception
safety evaluations
Homepage
Google Scholar
Citations & Impact
All-time
Citations
206
H-index
7
i10-index
5
Publications
17
Co-authors
15
Contact
No contact links provided.
Publications
8 items
How does information access affect LLM monitors' ability to detect sabotage?
2026
Cited
0
Password-Activated Shutdown Protocols for Misaligned Frontier Agents
2025
Cited
0
Reasoning Under Pressure: How do Training Incentives Influence Chain-of-Thought Monitorability?
2025
Cited
0
CTRL-ALT-DECEIT: Sabotage Evaluations for Automated AI R&D
2025
Cited
0
Higher-Order Belief in Incomplete Information MAIDs
2025
Cited
0
The Elicitation Game: Evaluating Capability Elicitation Techniques
2025
Cited
0
Towards a Theory of AI Personhood
2025
Cited
0
AI Sandbagging: Language Models can Strategically Underperform on Evaluations
arXiv.org · 2024
Cited
11
Co-authors
15 total
Francesco Belardinelli
Imperial College London
Francesca Toni
Imperial College London
Teun van der Weij
Research Scientist
Tom Everitt
Staff Research Scientist at Google DeepMind
Samuel F. Brown
Unknown affiliation
Matt MacDermott
Imperial College London/Mila/LawZero
Ibrahim Habli
Professor of Safety-Critical Systems at the University of York
Loic Le Folgoc
Associate Professor, LTCI, Télécom Paris, France