Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions

📅 2024-06-13
🏛️ arXiv.org
📈 Citations: 38
Influential: 2
🤖 AI Summary
Current AI alignment research is hindered by conceptual ambiguity and a unidirectional, static paradigm (AI→human), impeding interdisciplinary collaboration and practical progress. To address this, we conduct a systematic literature review (SLR) of over 400 cross-disciplinary publications from 2019–2024. Our analysis yields the first “bidirectional human-AI alignment” framework: it formalizes not only AI’s adaptation to human goals and values but also humans’ cognitive recalibration to AI capabilities and limitations, emphasizing dynamism, interaction, and long-term co-evolution. As a human-centered, integrative conceptual model, it bridges theoretical divides across HCI, NLP, and ML; clarifies definitional boundaries of alignment; identifies critical research gaps and core value dimensions; and synthesizes three key future challenges with actionable pathways. This work establishes a unified discourse and a foundational research roadmap for the alignment field.

📝 Abstract
Recent advancements in general-purpose AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment. However, the lack of clear definitions and scopes of human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve this alignment. In particular, ML- and philosophy-oriented alignment research often views AI alignment as a static, unidirectional process (i.e., aiming to ensure that AI systems' objectives match humans') rather than an ongoing, mutual alignment problem. This perspective largely neglects the long-term interaction and dynamic changes of alignment. To understand these gaps, we introduce a systematic review of over 400 papers published between 2019 and January 2024, spanning multiple domains such as Human-Computer Interaction (HCI), Natural Language Processing (NLP), and Machine Learning (ML). We characterize, define, and scope human-AI alignment. From this, we present a conceptual framework of "Bidirectional Human-AI Alignment" to organize the literature from a human-centered perspective. This framework encompasses both 1) conventional studies of aligning AI to humans, which ensure AI produces the intended outcomes determined by humans, and 2) a proposed concept of aligning humans to AI, which aims to help individuals and society adjust to AI advancements both cognitively and behaviorally. Additionally, we articulate the key findings derived from literature analysis, including literature gaps and trends, human values, and interaction techniques. To pave the way for future studies, we envision three key challenges and give recommendations for future research.
Problem

Research questions and friction points this paper is trying to address.

Defining and clarifying the concept of AI alignment
Examining bidirectional human-AI relationships and gaps
Addressing human adaptation to advancing AI technologies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces Bidirectional Human-AI Alignment framework
Aligns AI with human values and vice versa
Supports human adaptation to advancing AI technologies
👥 Authors
Hua Shen
Assistant Professor, NYU Shanghai / New York University
bidirectional human-AI alignment, human-AI interaction, AI/LLM interpretability and evaluation
Tiffany Knearem
Google
Reshmi Ghosh
Microsoft
Kenan Alkiek
Carnegie Mellon University
Kundan Krishna
Research Scientist @ Apple
Computer Science, Artificial Intelligence, Natural Language Processing
Yachuan Liu
Carnegie Mellon University
Ziqiao Ma
University of Michigan
Machine Learning, Computational Linguistics
Savvas Petridis
Senior Research Scientist, Google DeepMind
Human-AI Interaction, Artificial Intelligence, Natural Language Processing
Yi-Hao Peng
Carnegie Mellon University
human-computer interaction, machine learning, accessibility
Li Qiwei
Carnegie Mellon University
Sushrita Rakshit
Undergraduate Researcher, University of Michigan
HCI, NLP, Responsible AI
Chenglei Si
Stanford University
Large Language Models, AI Scientist
Yutong Xie
Carnegie Mellon University
Jeffrey P. Bigham
Carnegie Mellon University & Apple
human-computer interaction, human-AI interaction, responsible AI, accessibility
Frank Bentley
Google
Joyce Chai
University of Michigan
Zachary Lipton
Carnegie Mellon University
Qiaozhu Mei
Professor, University of Michigan
AI, data mining, information retrieval, natural language processing, health informatics
Rada Mihalcea
Professor of Computer Science, University of Michigan
Natural Language Processing, Computational Social Science, Multimodal Interaction
Michael Terry
Google
Diyi Yang
Stanford University
Computational Social Science, Natural Language Processing, Machine Learning
Meredith Ringel Morris
Director, Human-AI Interaction Research, Google DeepMind
Human Computer Interaction, Social Computing, Accessibility, Human-Centered AI, Responsible AI
Paul Resnick
University of Michigan
David Jurgens
Associate Professor, School of Information and Dept. of Computer Science, University of Michigan
Natural Language Processing, Computational Social Science, Computational Sociolinguistics