🤖 AI Summary
Questions in political interviews and congressional hearings serve two functions, eliciting information and constructing partisan narratives, yet systematic analysis of strategic questioning remains limited because large-scale, structured datasets are lacking. To address this gap, we introduce C-QUERI, a novel, temporally annotated dataset of question-answer pairs from U.S. congressional committee hearings across the 108th–117th Congresses. We propose an automated NLP pipeline that extracts and annotates Q&A pairs from unstructured hearing transcripts. We then show that a questioner's partisan affiliation can be inferred from question text alone. Empirical analysis reveals systematic cross-party differences in interrogative focus, rhetorical framing, and accountability intensity. C-QUERI and our analytical framework provide infrastructure for scalable, comparative, computational research on political discourse.
📝 Abstract
Questions in political interviews and hearings serve strategic purposes beyond information gathering, including advancing partisan narratives and shaping public perceptions. However, these strategic aspects remain understudied because large-scale datasets of such discourse are lacking. Congressional hearings provide an especially rich and tractable site for studying political questioning: interactions are structured by formal rules, witnesses are obliged to respond, and members with different political affiliations are guaranteed opportunities to ask questions, enabling comparisons of behavior across the political spectrum.
We develop a pipeline to extract question-answer pairs from unstructured hearing transcripts and construct a novel dataset of committee hearings from the 108th–117th Congresses. Our analysis reveals systematic differences in questioning strategies across parties: the party affiliation of questioners can be predicted from their questions alone. Our dataset and methods not only advance the study of congressional politics but also provide a general framework for analyzing question-answering across interview-like settings.
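
To make the extraction step concrete, the following is a minimal illustrative sketch of pairing a member's question with the witness turn that follows it. It is not the pipeline described above. It assumes a simplified, hypothetical transcript layout in which each speaker turn starts a line as "Title Surname. text", and it assumes the set of committee members is known in advance. The names `TURN_RE`, `QAPair`, `split_turns`, and `extract_qa_pairs` are invented for this example.

```python
import re
from dataclasses import dataclass

# Assumed (hypothetical) turn format: "Mr. Smith. Some text" at the start of a line.
TURN_RE = re.compile(
    r"^(Mr\.|Ms\.|Mrs\.|Dr\.|Chairman|Senator) ([A-Z][A-Za-z'-]+)\. (.*)$"
)


@dataclass
class QAPair:
    questioner: str
    witness: str
    question: str
    answer: str


def split_turns(transcript):
    """Split raw text into (speaker, text) turns, joining wrapped lines."""
    turns = []
    for line in transcript.splitlines():
        line = line.strip()
        if not line:
            continue
        match = TURN_RE.match(line)
        if match:
            turns.append((match.group(2), match.group(3)))
        elif turns:
            # A line without a speaker label continues the previous turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, text + " " + line)
    return turns


def extract_qa_pairs(transcript, members):
    """Pair each member turn containing '?' with the next non-member turn."""
    turns = split_turns(transcript)
    pairs = []
    for (speaker, text), (next_speaker, next_text) in zip(turns, turns[1:]):
        if speaker in members and next_speaker not in members and "?" in text:
            pairs.append(QAPair(speaker, next_speaker, text, next_text))
    return pairs
```

A heuristic like this would miss multi-turn exchanges and questions phrased without a question mark, so it should be read only as an illustration of the task of turning unstructured transcripts into question-answer pairs.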