Online Minimization of Polarization and Disagreement via Low-Rank Matrix Bandits

📅 2025-10-01
🤖 AI Summary
This paper studies online intervention in Friedkin-Johnsen opinion dynamics, with the goal of minimizing polarization and disagreement when agents' innate opinions are unknown and must be learned from sequential observations. The problem is formalized as regret minimization in a low-rank matrix bandit with scalar feedback, which the summary describes as the first formulation of social opinion intervention as a low-rank matrix bandit problem. The paper proposes a two-stage algorithm: the first stage estimates the low-dimensional subspace underlying the innate opinions, and the second stage runs a linear bandit policy within the compressed representation derived from that subspace. The authors prove a cumulative regret bound of Õ(√T) over any time horizon T. Experiments show that the method significantly outperforms a linear bandit baseline in both cumulative regret and running time.

📝 Abstract
We study the problem of minimizing polarization and disagreement in the Friedkin-Johnsen opinion dynamics model under incomplete information. Unlike prior work that assumes a static setting with full knowledge of users' innate opinions, we address the more realistic online setting where innate opinions are unknown and must be learned through sequential observations. This novel setting, which naturally mirrors periodic interventions on social media platforms, is formulated as a regret minimization problem, establishing a key connection between algorithmic interventions on social media platforms and the theory of multi-armed bandits. In our formulation, a learner observes only scalar feedback on the overall polarization and disagreement after an intervention. For this novel bandit problem, we propose a two-stage algorithm based on low-rank matrix bandits. The algorithm first performs subspace estimation to identify an underlying low-dimensional structure, and then employs a linear bandit algorithm within the compact dimensional representation derived from the estimated subspace. We prove that our algorithm achieves an $\widetilde{O}(\sqrt{T})$ cumulative regret over any time horizon $T$. Empirical results validate that our algorithm significantly outperforms a linear bandit baseline in terms of both cumulative regret and running time.
Problem

Research questions and friction points this paper is trying to address.

Minimizing polarization and disagreement in opinion dynamics
Learning unknown innate opinions through sequential observations
Formulating online interventions as a regret minimization problem
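
To make the objective listed above concrete, here is a minimal sketch of the Friedkin-Johnsen update and of the two quantities being minimized. It assumes the definitions common in the static literature: polarization is the sum of squared mean-centered equilibrium opinions, and disagreement is the weighted sum of squared opinion differences across edges. The paper's exact normalization may differ. The function names and the three-node path graph are hypothetical and chosen only for illustration.

```python
def fj_equilibrium(weights, innate, iters=500):
    """Approximate the Friedkin-Johnsen equilibrium by repeated averaging.

    weights: symmetric n x n list of non-negative edge weights (zero diagonal).
    innate:  list of n innate opinions.
    Each agent repeatedly averages its innate opinion with its neighbours'
    expressed opinions, weighted by the edge weights.
    """
    n = len(innate)
    z = list(innate)  # expressed opinions start at the innate ones
    for _ in range(iters):
        z = [
            (innate[i] + sum(weights[i][j] * z[j] for j in range(n)))
            / (1.0 + sum(weights[i]))
            for i in range(n)
        ]
    return z


def polarization_disagreement(weights, innate):
    """Return (polarization, disagreement) at equilibrium (assumed definitions)."""
    n = len(innate)
    mean = sum(innate) / n
    centered = [s - mean for s in innate]  # mean-center the innate opinions
    z = fj_equilibrium(weights, centered)
    polarization = sum(v * v for v in z)
    disagreement = sum(
        weights[i][j] * (z[i] - z[j]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return polarization, disagreement


if __name__ == "__main__":
    # Hypothetical example: a path graph 0 - 1 - 2 with unit edge weights.
    path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    print(polarization_disagreement(path, [-1.0, 0.0, 1.0]))
```

On this path graph the equilibrium opinions are approximately (-0.5, 0, 0.5), so polarization and disagreement are each about 0.5. With no edges at all, expressed opinions equal innate ones and polarization alone would be 2, which is the kind of gap an intervention on the graph can close. In the online setting described here, the learner would see only a noisy scalar version of the combined value, not the innate opinions themselves.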
Innovation

Methods, ideas, or system contributions that make the work stand out.

Low-rank matrix bandits for opinion dynamics
Subspace estimation for low-dimensional structure
Linear bandits in compact representation space
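
One way to see why a low-rank matrix bandit fits this problem, under the assumption that the feedback is a quadratic form in the unknown innate opinion vector s: a quadratic form s^T A s equals the matrix inner product between the chosen matrix A and the rank-one matrix s s^T. The sketch below checks that identity and shows how a known one-dimensional subspace collapses each arm to a single scalar feature. All names are hypothetical, and the paper's actual subspace estimator and reduced representation may differ.

```python
import math


def quadratic_form(a, s):
    """Compute s^T A s for a square matrix a (list of rows) and vector s."""
    n = len(s)
    return sum(s[i] * a[i][j] * s[j] for i in range(n) for j in range(n))


def outer(s):
    """Rank-one matrix s s^T."""
    return [[si * sj for sj in s] for si in s]


def frobenius_inner(a, b):
    """Matrix inner product <A, B> = sum_ij A_ij * B_ij."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def reduced_feature(a, u):
    """Scalar feature of arm A once a unit direction u is known: u^T A u."""
    return quadratic_form(a, u)


if __name__ == "__main__":
    arm = [[2.0, 1.0], [1.0, 3.0]]   # hypothetical action matrix
    s = [1.0, -2.0]                  # hypothetical innate opinions
    norm = math.sqrt(sum(v * v for v in s))
    u = [v / norm for v in s]        # direction spanning the rank-one subspace

    # The same reward written three ways.
    print(quadratic_form(arm, s))
    print(frobenius_inner(arm, outer(s)))
    print(norm ** 2 * reduced_feature(arm, u))
```

All three printed values agree (10 for this toy input, up to floating-point rounding). The point of the second stage is visible in the last line: once the subspace is estimated, the unknown parameter shrinks from a full n x n matrix to a handful of coefficients, so a standard linear bandit can operate in far fewer dimensions.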