🤖 AI Summary
This paper investigates the recursive evolution of user-preference alignment in self-consuming generative models—where models continuously retrain on their own outputs—transforming alignment from a one-shot optimization into a path-dependent, power-structured dynamic process. We propose a theoretical framework grounded in a two-stage Bradley–Terry model, integrating dynamic social choice theory, game-theoretic analysis, and path-dependence formalism to establish, for the first time, a rigorous foundation for long-term alignment under recursive training. We characterize three fundamental convergence regimes: consensus collapse, shared optimal compromise, and asymmetric refinement. Moreover, we prove an impossibility theorem demonstrating that diversity preservation, symmetric influence between stakeholders, and initial-condition independence are mutually incompatible under recursive alignment. These results shift AI alignment from a static-objective paradigm to an evolutionary-equilibrium paradigm, providing a novel theoretical basis for sustainable alignment.
📝 Abstract
In self-consuming generative models that train on their own outputs, alignment with user preferences becomes a recursive rather than one-time process. We provide the first formal foundation for analyzing the long-term effects of such recursive retraining on alignment. Under a two-stage curation mechanism based on the Bradley–Terry (BT) model, we model alignment as an interaction between two factions: the Model Owner, who filters which outputs the model learns from, and the Public User, who determines which outputs are ultimately shared and retained through interactions with the model. Our analysis reveals three structural convergence regimes depending on the degree of preference alignment: consensus collapse, compromise on shared optima, and asymmetric refinement. We prove a fundamental impossibility theorem: no recursive BT-based curation mechanism can simultaneously preserve diversity, ensure symmetric influence, and eliminate dependence on initialization. Framing the process as dynamic social choice, we show that alignment is not a static goal but an evolving equilibrium, shaped by both power asymmetries and path dependence.
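The two-stage curation loop described above can be sketched as a toy simulation. This is a minimal illustration under strong simplifying assumptions: the one-dimensional output space, the quadratic utilities, the specific filtering and retraining rules, and all names and parameters below are our own illustrative choices, not the paper's formulation.

```python
import math
import random

random.seed(0)

def bt_prefer(score_a, score_b):
    """Bradley-Terry probability that output a is preferred over output b."""
    return math.exp(score_a) / (math.exp(score_a) + math.exp(score_b))

def curate(pool, utility, n_keep):
    """One BT curation stage: sample pairs and keep the stochastic winners."""
    kept = []
    for _ in range(n_keep):
        a, b = random.sample(pool, 2)
        kept.append(a if random.random() < bt_prefer(utility(a), utility(b)) else b)
    return kept

# Toy "model": a Gaussian over a 1-D output space, retrained each round on
# the mean of the retained outputs (the self-consuming loop).
owner_opt, user_opt = 0.3, 0.7              # the two factions' preferred outputs
owner_u = lambda x: -(x - owner_opt) ** 2   # Model Owner utility (assumed)
user_u = lambda x: -(x - user_opt) ** 2     # Public User utility (assumed)

mean, spread = 0.0, 1.0
for _ in range(50):
    outputs = [random.gauss(mean, spread) for _ in range(200)]
    stage1 = curate(outputs, owner_u, 100)  # stage 1: Owner filters training data
    stage2 = curate(stage1, user_u, 50)     # stage 2: User retains shared outputs
    mean = sum(stage2) / len(stage2)        # retrain on the curated outputs
    spread *= 0.98                          # diversity contracts under recursion

print(mean, spread)
```

In this sketch the model's mean tends to settle between the two factions' optima (a compromise regime), while the shrinking spread illustrates the diversity loss and initialization dependence that the impossibility theorem says cannot all be avoided at once.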