LLMs Can Infer Political Alignment from Online Conversations

πŸ“… 2026-03-11
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This study addresses the privacy risks posed by seemingly neutral linguistic preferences of online users, which may implicitly reveal political orientations. It presents the first systematic evaluation of large language models' (LLMs) ability to infer users' political leanings from non-explicitly political text. Leveraging discussion data from Debate.org and Reddit, the work aggregates multiple text-level inferences into user-level predictions to improve accuracy. Results demonstrate that LLMs significantly outperform traditional machine learning approaches, particularly on texts from more strongly political contexts. Moreover, LLMs effectively identify culturally grounded, non-explicit political vocabulary, underscoring their capacity to extract subtle sociocultural signals and perform sophisticated privacy-sensitive inferences.

πŸ“ Abstract
Because traits such as identities, cultures, and political attitudes are correlated, seemingly innocuous preferences, such as following a band or using a specific slang term, can reveal private traits. This possibility, especially when combined with massive public social data and advanced computational methods, poses a fundamental privacy risk. As our growing online data exposure and the rapid advancement of AI increase the potential for misuse of this risk, it is critical to understand the capacity of large language models (LLMs) to exploit it. Here, using online discussions on Debate.org and Reddit, we show that LLMs can reliably infer hidden political alignment, significantly outperforming traditional machine learning models. Prediction accuracy improves further as we aggregate multiple text-level inferences into a user-level prediction, and as we use more politics-adjacent domains. We demonstrate that LLMs leverage words that are highly predictive of political alignment while not being explicitly political. Our findings underscore the capacity and risks of LLMs in exploiting socio-cultural correlates.
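The abstract notes that accuracy improves when multiple text-level inferences are aggregated into a single user-level prediction. The paper does not specify the aggregation rule here; a minimal sketch of one plausible scheme, majority voting over per-text labels (all function names hypothetical), could look like this:

```python
from collections import Counter

def aggregate_user_prediction(text_predictions: list[str]) -> str:
    """Combine per-text political-alignment labels into one
    user-level label by majority vote.

    NOTE: This is an illustrative assumption, not the paper's
    reported method; other schemes (e.g., averaging model
    confidence scores) are equally possible.
    """
    counts = Counter(text_predictions)
    # most_common(1) returns [(label, count)] for the top label
    return counts.most_common(1)[0][0]

# Example: five per-text inferences for one hypothetical user
labels = ["liberal", "conservative", "liberal", "liberal", "conservative"]
print(aggregate_user_prediction(labels))  # -> liberal
```

Intuitively, individual texts are noisy signals, so pooling several of them reduces the chance that one atypical utterance dominates the user-level prediction.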
Problem

Research questions and friction points this paper is trying to address.

political alignment
privacy risk
large language models
socio-cultural correlates
online conversations
Innovation

Methods, ideas, or system contributions that make the work stand out.

large language models
political alignment inference
privacy risk
socio-cultural correlates
user-level prediction
πŸ”Ž Similar Papers
No similar papers found.