Donghai Hong

Google Scholar ID: JQx-_5gAAAAJ
Peking University
AI Safety, AI Alignment, Multi-Modal Model
Citations & Impact
All-time
Citations: 778
H-index: 8
i10-index: 6
Publications: 12
Co-authors: 10 (list available)
Contact
No contact links provided.
Publications
10 items
Debate with Images: Detecting Deceptive Behaviors in Multimodal Large Language Models. 2025. Cited: 0
AI Deception: Risks, Dynamics, and Controls. 2025. Cited: 0
InterMT: Multi-Turn Interleaved Preference Alignment with Human Feedback. 2025. Cited: 0
Mitigating Deceptive Alignment via Self-Monitoring. 2025. Cited: 0
Generative RLHF-V: Learning Principles from Multi-modal Human Preference. 2025. Cited: 0
Safe RLHF-V: Safe Reinforcement Learning from Human Feedback in Multimodal Large Language Models. 2025. Cited: 0
ThinkPatterns-21k: A Systematic Study on the Impact of Thinking Patterns in LLMs. 2025. Cited: 0
Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability. 2024. Cited: 0
Co-authors
10 total
Yaodong Yang, Boya (博雅) Assistant Professor at Peking University
Jiaming Ji (吉嘉铭), Peking University
Boyuan Chen, Peking University
Josef Dai, Zhejiang University
Yike Guo, Dept of CSE, The Hong Kong University of Science and Technology
Jiayi Zhou, Peking University Ph.D Student
Tianyi Alex Qiu, Peking University
Hantao Lou, Peking University
