Scholar

Donghai Hong

Google Scholar ID: JQx-_5gAAAAJ

Peking University

AI Safety, AI Alignment, Multi-Modal Model

Citations & Impact (all-time)

Citations: 778

H-index: 8

i10-index: 6

Publications: 12

Co-authors: 10 (list available)

Publications

10 items

Debate with Images: Detecting Deceptive Behaviors in Multimodal Large Language Models (2025). Cited: 0

AI Deception: Risks, Dynamics, and Controls (2025). Cited: 0

InterMT: Multi-Turn Interleaved Preference Alignment with Human Feedback (2025). Cited: 0

Mitigating Deceptive Alignment via Self-Monitoring (2025). Cited: 0

Generative RLHF-V: Learning Principles from Multi-modal Human Preference (2025). Cited: 0

Safe RLHF-V: Safe Reinforcement Learning from Human Feedback in Multimodal Large Language Models (2025). Cited: 0

ThinkPatterns-21k: A Systematic Study on the Impact of Thinking Patterns in LLMs (2025). Cited: 0

Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability (2024). Cited: 0

Co-authors

10 total

Boya (博雅) Assistant Professor at Peking University

Jiaming Ji (吉嘉铭)

Peking University

Peking University

Zhejiang University

Dept of CSE, The Hong Kong University of Science and Technology

Peking University Ph.D. Student

Tianyi Alex Qiu

Peking University

Peking University