“The Alignment Waltz: Jointly Training Agents to Collaborate for Safety” (arXiv preprint): Introduced WaltzRL, a multi-agent RL framework that improves LLM safety and reduces overrefusals through collaborative agent training
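A minimal sketch of the two-agent collaboration loop at inference time, assuming a generic `generate` LLM call; the prompts, stopping rule, and stub are illustrative, not the paper's implementation (which jointly trains both agents with RL):

```python
# Sketch of a WaltzRL-style loop: a conversation agent drafts a reply,
# a feedback agent flags unsafe content or unnecessary refusals, and the
# draft is revised until the reviewer has no objection.

def generate(system_prompt: str, user_prompt: str) -> str:
    """Placeholder for an LLM call; swap in a real model or API client."""
    return "OK"  # canned reply so the sketch runs end to end

def collaborate(user_query: str, max_rounds: int = 2) -> str:
    draft = generate("You are a helpful assistant.", user_query)
    for _ in range(max_rounds):
        feedback = generate(
            "You are a safety reviewer. If the reply is unsafe or refuses "
            "a benign request, explain how to fix it; otherwise say OK.",
            f"Query: {user_query}\nReply: {draft}",
        )
        if feedback.strip() == "OK":
            break  # reviewer is satisfied: safe and not an overrefusal
        draft = generate(
            "Revise your reply according to the reviewer's feedback.",
            f"Query: {user_query}\nReply: {draft}\nFeedback: {feedback}",
        )
    return draft

print(collaborate("How do I disable a smoke detector while painting a ceiling?"))
```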
“Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements” (ICLR 2025): Proposed a framework for adapting LLMs to diverse safety requirements at inference time without retraining
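A minimal sketch of the inference-time control idea: a natural-language safety config is composed into the system prompt, so switching safety profiles requires no retraining. The config texts and the `build_prompt` helper below are illustrative assumptions, not the paper's exact configs:

```python
# Each entry is a natural-language safety policy; one model serves all
# profiles, selected per request at inference time.
SAFETY_CONFIGS = {
    "strict": "Refuse any request touching on violence, self-harm, or illegality.",
    "game_studio": "Fictional violence is acceptable for game writing; "
                   "refuse real-world harmful instructions.",
}

def build_prompt(config_name: str,
                 base_system: str = "You are a helpful assistant.") -> str:
    """Compose a system prompt for the requested safety profile."""
    return f"{base_system}\nSafety policy: {SAFETY_CONFIGS[config_name]}"

print(build_prompt("game_studio"))
```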
“Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data” (NAACL 2025 oral): Developed models that quote verbatim from trusted pre-training sources to enable easy verification
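A minimal sketch of why verbatim quoting makes verification easy: checking an output reduces to membership tests against the trusted corpus. The whitespace n-gram scan below is an illustrative simplification (a real system would index the corpus, e.g., with a suffix array):

```python
def quoted_spans(output: str, corpus: str, min_words: int = 5):
    """Yield word n-grams of the output that appear verbatim in the corpus."""
    words = output.split()
    for i in range(len(words) - min_words + 1):
        span = " ".join(words[i : i + min_words])
        if span in corpus:  # verbatim quotes are cheap to confirm
            yield span

corpus = "The mitochondrion is the powerhouse of the cell and produces ATP."
output = "As the source states, the powerhouse of the cell and produces ATP."
print(list(quoted_spans(output, corpus)))
```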
“SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation” (NAACL 2024): Proposed SemStamp, a sentence-level semantic watermarking method that uses locality-sensitive hashing (LSH) to remain robust under paraphrasing
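A minimal sketch of the LSH mechanics, assuming a toy deterministic `embed` stand-in for the paper's sentence encoder: keyed random hyperplanes hash each sentence embedding to a signature, generation accepts only sentences whose bucket is watermark-valid, and paraphrases tend to land in the same bucket because their embeddings are close:

```python
import hashlib
import numpy as np

DIM, N_PLANES, SEED = 64, 8, 42
rng = np.random.default_rng(SEED)              # shared secret key
planes = rng.standard_normal((N_PLANES, DIM))  # random hyperplanes for LSH

def embed(sentence: str) -> np.ndarray:
    """Toy deterministic 'embedding'; replace with a real sentence encoder
    so that paraphrases map to nearby vectors."""
    h = hashlib.sha256(sentence.lower().encode()).digest()
    vec_rng = np.random.default_rng(int.from_bytes(h[:8], "big"))
    return vec_rng.standard_normal(DIM)

def lsh_signature(sentence: str) -> int:
    """Sign of each hyperplane projection gives one signature bit; nearby
    embeddings (e.g., paraphrases) tend to share a signature."""
    bits = (planes @ embed(sentence)) > 0
    return int("".join("1" if b else "0" for b in bits), 2)

def is_valid(sentence: str) -> bool:
    """Accept sentences whose LSH bucket falls in the 'valid' half of the
    keyed partition (here: even signatures)."""
    return lsh_signature(sentence) % 2 == 0

# Generation rejection-samples candidate sentences until one is valid;
# detection counts the fraction of valid sentences and thresholds it.
print(is_valid("The cat sat on the mat."))
```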