🤖 AI Summary
Prior studies infer special interest group (SIG) positions on legislation indirectly, or observe them for only a few groups and bills, which limits accuracy and scale. This work addresses the challenge of directly measuring SIG positions (Support, Oppose, and Engage, which covers Amend and Monitor) on U.S. legislative bills. Method: We build a scalable pipeline that combines large language models (LLMs) with graph neural networks (GNNs) to extract and predict bill positions from lobbying activities, described as the first large-scale effort of its kind. Contribution/Results: We introduce SIG-LEGIS, a dataset covering the 111th to 117th Congresses with about 42k bills annotated with 279k bill positions from 12k SIGs. Empirical analysis shows that SIG positions are associated with a bill's legislative stage, with firm size, and with bill subject, and that policy preferences vary across industries. The framework offers a basis for studying lobbying strategies and how interest groups shape policy.
📝 Abstract
Special interest groups (SIGs) in the U.S. participate in a range of political activities, such as lobbying and making campaign donations, to influence policy decisions in the legislative and executive branches. The competing interests of these SIGs have profound implications for global issues such as international trade policy, immigration, climate change, and global health. Although understanding SIGs' policy positions is important, these positions are hard to observe, so researchers have often relied on indirect measurements or focused on a select few SIGs that publicly support or oppose a limited range of legislation. This study introduces the first large-scale effort to directly measure and predict a wide range of bill positions (Support, Oppose, and Engage, which covers Amend and Monitor) across all legislative bills introduced from the 111th to the 117th Congresses. We use an AI framework that combines large language models (LLMs) and graph neural networks (GNNs) to develop a scalable pipeline that automatically extracts these positions from lobbying activities, resulting in a dataset of 42k bills annotated with 279k bill positions of 12k SIGs. With this large-scale dataset, we reveal (i) a strong correlation between a bill's progression through the stages of the legislative process and the positions taken by interest groups, (ii) a significant relationship between firm size and lobbying positions, (iii) notable differences in the distribution of lobbying positions by bill subject, and (iv) heterogeneity in the distribution of policy preferences across industries. We introduce a novel framework for examining lobbying strategies and offer opportunities to explore how interest groups shape the political landscape.
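The position taxonomy named in the abstract (Support, Oppose, and Engage, with Engage covering Amend and Monitor) can be pictured as a small label hierarchy. The Python sketch below is only an illustration under stated assumptions: the class name `Position`, the helper names, and the record layout `(sig_id, bill_id, position)` are hypothetical and are not taken from the paper or its dataset, and the sample records are made up.

```python
from collections import Counter
from enum import Enum


class Position(Enum):
    """Fine-grained bill positions named in the abstract."""
    SUPPORT = "Support"
    OPPOSE = "Oppose"
    AMEND = "Amend"
    MONITOR = "Monitor"


# The abstract groups Amend and Monitor under a single "Engage" category.
ENGAGE = frozenset({Position.AMEND, Position.MONITOR})


def coarse_label(position: Position) -> str:
    """Map a fine-grained position to Support, Oppose, or Engage."""
    return "Engage" if position in ENGAGE else position.value


def position_counts(records):
    """Count coarse labels over (sig_id, bill_id, position) records."""
    return Counter(coarse_label(position) for _, _, position in records)


# Hypothetical records with invented identifiers, for illustration only.
sample_records = [
    ("sig_a", "bill_1", Position.SUPPORT),
    ("sig_b", "bill_1", Position.OPPOSE),
    ("sig_c", "bill_1", Position.AMEND),
    ("sig_a", "bill_2", Position.MONITOR),
]

if __name__ == "__main__":
    print(position_counts(sample_records))
```

Keeping the fine-grained labels and deriving the coarse category from them means an analysis could switch between the two levels without relabeling; whether the actual dataset stores positions this way is not stated in the abstract.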