🤖 AI Summary
This paper studies what proportionality means in constitutional AI, where citizens collectively select ethical principles by casting both upvotes and downvotes.
Method: We propose two conceptually distinct proportionality frameworks: joint (support and opposition are treated as comparable and yield combined guarantees) and separate (veto power is guaranteed independently of traditional proportionality). We formalize axioms for each framework and adapt Phragmén’s rule, Proportional Approval Voting (PAV), and the Method of Equal Shares to bidirectional voting.
Contribution/Results: We examine which of the suitably adapted classical rules satisfy the new axioms. The result is a formal treatment of proportionality for democratic alignment with explicit veto rights, connecting social choice theory with value-aligned AI governance.
📝 Abstract
Consider the decision-making setting where agents elect a panel by expressing both positive and negative preferences. Prominently, in constitutional AI, citizens democratically select a slate of ethical preferences on which a foundation model is to be trained. There, in practice, agents may both approve and disapprove of different ethical principles. Proportionality has been well studied in computational social choice for approval ballots, but its meaning remains unclear when negative sentiments are also considered. In this work, we propose two conceptually distinct approaches to interpreting proportionality in the presence of up and down votes. The first approach treats the satisfaction from electing candidates and the impact of vetoing them as comparable, leading to combined proportionality guarantees. The second approach considers veto power separately, introducing guarantees distinct from traditional proportionality. We formalize axioms for each perspective and examine their satisfiability by suitable adaptations of Phragmén's rule, the Proportional Approval Voting rule, and the Method of Equal Shares.
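For orientation, the classical approval-only form of Proportional Approval Voting, which the abstract takes as a starting point, can be sketched as follows. This is the standard rule without downvotes, not the adaptation studied in the paper; the function names `pav_score` and `pav_winners` and the toy ballots are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import combinations


def pav_score(committee, ballots):
    """Classical PAV score of a committee.

    Each voter contributes 1 + 1/2 + ... + 1/j, where j is the number of
    committee members that the voter approves.
    """
    total = Fraction(0)
    for approved in ballots:
        j = len(set(committee) & set(approved))
        total += sum(Fraction(1, t) for t in range(1, j + 1))
    return total


def pav_winners(candidates, ballots, k):
    """Brute-force PAV: a size-k committee with maximum PAV score.

    Exponential in k, so only suitable for tiny illustrative instances.
    Ties are broken in favor of the lexicographically first committee.
    """
    return max(
        combinations(sorted(candidates), k),
        key=lambda committee: pav_score(committee, ballots),
    )


# Hypothetical toy profile: two voters approve {a, b, c}, one approves {d}.
ballots = [{"a", "b", "c"}, {"a", "b", "c"}, {"d"}]
winners = pav_winners({"a", "b", "c", "d"}, ballots, k=3)
print(winners)  # ('a', 'b', 'd')
```

In this toy profile the committee (a, b, c) scores 11/3 while (a, b, d) scores 4, so the lone voter supporting d is represented. How downvotes should enter such a score is exactly the question the two approaches in the abstract answer differently, and the sketch deliberately leaves it open.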