🤖 AI Summary
Facing declining participation in traditional opinion polling and credibility problems in social media data, this paper studies how to assess the credibility of crowdsourced data using only the structure of social interaction graphs, not the content being shared. Methodologically, it applies AI-based graph analysis to simulated propagation of a polling task among honest and dishonest participants, varying the level and type of dishonest behavior, the eligibility criteria, and the participation patterns. Evaluated on real-world social network datasets, the approach detects ineligible participation with accuracy exceeding 90% in some configurations, although structural differences between interaction graphs introduce some performance variability. The results suggest that graph structure alone carries a useful signal for detecting ineligible participation, pointing toward more credible collection of opinion data over social networks.
📝 Abstract
The emergence of crowdsourced data has significantly reshaped social science, enabling extensive exploration of collective human actions, viewpoints, and societal dynamics. However, ensuring safe, fair, and reliable participation remains a persistent challenge. Traditional polling methods have seen a notable decline in engagement over recent decades, raising concerns about the credibility of collected data. Meanwhile, social and peer-to-peer networks have become increasingly widespread, but data from these platforms can suffer from credibility issues due to fraudulent or ineligible participation. In this paper, we explore how social interactions can help restore credibility in crowdsourced data collected over social networks. We present an empirical study to detect ineligible participation in a polling task through AI-based graph analysis of social interactions among imperfect participants composed of honest and dishonest actors. Our approach focuses solely on the structure of social interaction graphs, without relying on the content being shared. We simulate different levels and types of dishonest behavior among participants who attempt to propagate the task within their social networks. We conduct experiments on real-world social network datasets, using different eligibility criteria and modeling diverse participation patterns. Although structural differences in social interaction graphs introduce some performance variability, our study achieves promising results in detecting ineligibility across diverse social and behavioral profiles, with accuracy exceeding 90% in some configurations.
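The abstract does not specify the model, the features, or the dishonesty types used, so the following is only a toy sketch of the general idea: simulate how a polling task spreads through a social graph, then look for dishonest behavior using structural counts alone. Everything in it is an assumption made for illustration, including the behavior model (honest participants invite only eligible neighbors, dishonest ones invite everyone), the function names `propagate` and `flag_indiscriminate_forwarders`, the `min_candidates` threshold, and the eight-node network.

```python
from collections import deque


def propagate(social, eligible, dishonest, seed):
    """Spread a polling task breadth-first over an undirected social graph.

    Assumed behavior model (hypothetical, not taken from the paper):
    honest participants invite only eligible neighbors, while dishonest
    participants invite every neighbor who has not been invited yet.
    """
    invited_by = {seed: None}   # participant -> who invited them
    stats = {}                  # participant -> (invites sent, candidates seen)
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        candidates = [v for v in social[u] if v not in invited_by]
        sent = [v for v in candidates if u in dishonest or eligible[v]]
        stats[u] = (len(sent), len(candidates))
        for v in sent:
            invited_by[v] = u
            queue.append(v)
    return invited_by, stats


def flag_indiscriminate_forwarders(stats, min_candidates=2):
    """Flag participants who invited every available neighbor.

    Uses only counts derived from graph structure, never eligibility labels
    or shared content. The threshold is an illustrative assumption.
    """
    return {u for u, (sent, cand) in stats.items()
            if cand >= min_candidates and sent == cand}


# Toy tree-shaped network: nodes 3, 5, and 7 are ineligible; node 1 is dishonest.
social = {0: [1, 2, 3], 1: [0, 4, 5], 2: [0, 6, 7],
          3: [0], 4: [1], 5: [1], 6: [2], 7: [2]}
eligible = {n: n not in {3, 5, 7} for n in social}

invited_by, stats = propagate(social, eligible, dishonest={1}, seed=0)
flagged = flag_indiscriminate_forwarders(stats)

# Ground truth for evaluation only; the detector above never reads `eligible`.
ineligible_participants = {n for n in invited_by if not eligible[n]}
```

In this toy run the only flagged participant is node 1, and the one ineligible participant who joined (node 5) was invited by that node. The heuristic is deliberately crude: an honest participant whose neighbors all happen to be eligible would be flagged as well, which is one simple way graph structure can cause detection quality to vary.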