Josef Dai
Scholar

Google Scholar ID: eRmX5AsAAAAJ
Zhejiang University
Alignment
Citations & Impact (all-time)
Citations: 3,007
H-index: 12
i10-index: 14
Publications: 20
Co-authors: 5
Publications
6 items
Debate with Images: Detecting Deceptive Behaviors in Multimodal Large Language Models (2025), cited 0
SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Safe Reinforcement Learning (2025), cited 0
Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback (2024), cited 0
PKU-SafeRLHF: Towards Multi-Level Safety Alignment for LLMs with Human Preference (2024), cited 50
Language Models Resist Alignment: Evidence From Data Compression (2024), cited 2
Reward Generalization in RLHF: A Topological Perspective (2024), cited 4
Co-authors
5 total
Jiaming Ji (吉嘉铭)
Peking University
Boyuan Chen
Peking University
Jiayi Zhou
Peking University, Ph.D. student
Tianyi Alex Qiu
Peking University
Donghai Hong
Peking University
