Scholar
Donghai Hong
Google Scholar ID: JQx-_5gAAAAJ
Peking University
AI Safety
AI Alignment
Multi-Modal Model
Citations & Impact
All-time
Citations
778
H-index
8
i10-index
6
Publications
12
Co-authors
10
list available
Publications
10 items
Debate with Images: Detecting Deceptive Behaviors in Multimodal Large Language Models
2025
Cited
0
AI Deception: Risks, Dynamics, and Controls
2025
Cited
0
InterMT: Multi-Turn Interleaved Preference Alignment with Human Feedback
2025
Cited
0
Mitigating Deceptive Alignment via Self-Monitoring
2025
Cited
0
Generative RLHF-V: Learning Principles from Multi-modal Human Preference
2025
Cited
0
Safe RLHF-V: Safe Reinforcement Learning from Human Feedback in Multimodal Large Language Models
2025
Cited
0
ThinkPatterns-21k: A Systematic Study on the Impact of Thinking Patterns in LLMs
2025
Cited
0
Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability
2024
Cited
0
Co-authors
10 total
Yaodong Yang
Boya (博雅) Assistant Professor at Peking University
Jiaming Ji (吉嘉铭)
Peking University
Boyuan Chen
Peking University
Josef Dai
Zhejiang University
Yike Guo
Dept of CSE, The Hong Kong University of Science and Technology
Jiayi Zhou
Peking University Ph.D Student
Tianyi Alex Qiu
Peking University
Hantao Lou
Peking University