Resource Rational Contractualism Should Guide AI Alignment

📅 2025-06-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the challenge of aligning AI systems efficiently in environments where stakeholders hold plural, diverging goals and values. It proposes Resource-Rational Contractualism (RRC), a framework that combines contractualist ethics with resource-rational cognitive modeling: an AI system uses cognitively inspired heuristics to approximate, under bounded computational resources, the agreements that rational parties would find mutually acceptable. Compared with securing explicit agreement among stakeholders at scale, RRC is intended to reduce the computational and coordination overhead of reaching consensus and to let AI systems adapt dynamically to heterogeneous value landscapes. The framework draws on normative modeling, computational game theory, bounded-rational decision-making, and social preference inference, balancing normative robustness against computational feasibility. The authors argue that an RRC-aligned agent would operate efficiently and be equipped to adapt to and interpret a changing human social world.

📝 Abstract

AI systems will soon have to navigate human environments and make decisions that affect people and other AI agents whose goals and values diverge. Contractualist alignment proposes grounding those decisions in agreements that diverse stakeholders would endorse under the right conditions, yet securing such agreement at scale remains costly and slow -- even for advanced AI. We therefore propose Resource-Rational Contractualism (RRC): a framework where AI systems approximate the agreements rational parties would form by drawing on a toolbox of normatively-grounded, cognitively-inspired heuristics that trade effort for accuracy. An RRC-aligned agent would not only operate efficiently, but also be equipped to dynamically adapt to and interpret the ever-changing human social world.
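
The effort-for-accuracy trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the heuristic names, accuracy estimates, cost units, and the scoring rule (stakes times expected accuracy, minus cost) are all hypothetical assumptions chosen only to show how a resource-rational agent might pick a cheaper approximation when little is at stake and a costlier one when much is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Heuristic:
    """One entry in a hypothetical toolbox of agreement-approximating strategies."""
    name: str
    expected_accuracy: float  # assumed estimate in [0, 1] of closeness to the ideal agreement
    cost: float               # assumed effort units (compute, time, consultation)


def select_heuristic(toolbox, stakes, budget):
    """Return the affordable heuristic with the highest net value.

    Net value is an illustrative assumption: stakes * expected_accuracy - cost.
    """
    affordable = [h for h in toolbox if h.cost <= budget]
    if not affordable:
        raise ValueError("no heuristic fits within the effort budget")
    return max(affordable, key=lambda h: stakes * h.expected_accuracy - h.cost)


# Hypothetical toolbox, ordered from cheapest to most expensive.
TOOLBOX = [
    Heuristic("follow_cached_rule", expected_accuracy=0.70, cost=1.0),
    Heuristic("quick_stakeholder_check", expected_accuracy=0.85, cost=5.0),
    Heuristic("full_deliberation", expected_accuracy=0.95, cost=20.0),
]

if __name__ == "__main__":
    for stakes in (10, 100, 1000):
        choice = select_heuristic(TOOLBOX, stakes=stakes, budget=50.0)
        print(f"stakes={stakes}: {choice.name}")
```

Under these assumed numbers, low stakes favor the cached rule, moderate stakes favor the quick check, and high stakes justify full deliberation, unless the budget rules it out. A real system would need to estimate accuracy and cost rather than hard-code them.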

Problem

Research questions and friction points this paper is trying to address.

AI systems must make decisions that affect people and other AI agents whose goals and values diverge

Securing stakeholder agreement at scale is costly and slow, even for advanced AI

How AI systems can approximate such agreements efficiently and adapt as social conditions change

Innovation

Methods, ideas, or system contributions that make the work stand out.

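A toolbox of normatively grounded, cognitively inspired heuristics for approximating the agreements rational parties would form

Heuristics that trade effort for accuracy under bounded computational resources

Dynamic adaptation to, and interpretation of, a changing human social world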

🔎 Similar Papers

How Ethical Should AI Be? How AI Alignment Shapes the Risk Preferences of LLMs