Hannah Cyberey

Google Scholar ID: vXAeCu8AAAAJ
University of Virginia
Natural Language Processing, Evaluation, Flipping Frenemies
Citations & Impact
All-time
Citations: 42
H-index: 3
i10-index: 1
Publications: 8
Co-authors: 2 (list available)
Resume (English only)
Academic Achievements
  • Aug 20, 2025: Our paper “Unsupervised Concept Vector Extraction for Bias Control in LLMs” is accepted to EMNLP 2025 (Main Conference)!
  • Jul 21, 2025: I successfully defended my PhD. I’m officially Dr. Cyberey!
  • Jul 08, 2025: Our paper “Steering the CensorShip: Uncovering Representation Vectors for LLM ‘Thought’ Control” is accepted to COLM 2025!
  • Apr 24, 2025: Our paper “Do Prevalent Bias Metrics Capture Allocational Harms from LLMs?” is accepted to the Workshop on Insights from Negative Results in NLP.
  • May 09, 2024: I passed my PhD dissertation proposal defense!
Research Experience
  • Currently a Postdoctoral Research Associate in the School of Data Science at the University of Virginia, supervised by Alex Gates. Research focuses on trustworthy natural language processing (NLP), addressing issues related to robustness and fairness of language models, and exploring representation engineering methods for mitigating bias and countering censorship.
Education
  • Received a PhD in Computer Science from the University of Virginia, advised by Prof. David Evans and Yangfeng Ji.
Background
  • Postdoctoral Research Associate in the School of Data Science at the University of Virginia, with research interests in AI safety and ethics. Current research focuses on the functional backbone of AI progress—how AI capabilities emerge, interconnect, and evolve within the broader research ecosystem, as part of UVA’s National Security Data and Policy Institute.
Miscellany
  • Latest blog posts include “Steering the CensorShip: Uncovering Representation Vectors for LLM ‘Thought’ Control” and more.