AI Summary
Political scientists often lack programming expertise yet require scalable time-series analysis of political texts. Method: This study develops PoliCorp, an open web platform enabling structured, interactive access to a 76-year (1949–2025) corpus of German Bundestag debates, the first such resource for this domain. Integrating NLP-based preprocessing, multi-field Boolean search, and efficient indexing, PoliCorp supports code-free advanced querying, dynamic subcorpus construction, and JSON export. Contribution/Results: Its core innovation lies in transforming long-term political discourse archives into a ready-to-use, social-science-oriented analytical infrastructure, substantially lowering barriers to qualitative text analysis. Hosted publicly (https://demo-pollux.gesis.org/), the platform provides reproducible, extensible data support for research on discursive change, policy agenda dynamics, and ideological evolution.
Abstract
In this work, we present PoliCorp (https://demo-pollux.gesis.org/), a web portal designed to facilitate the search and analysis of political text corpora. PoliCorp provides researchers with access to rich textual data, enabling in-depth analysis of parliamentary discourse over time. The platform currently features a collection of transcripts of debates in the German parliament, spanning 76 years of proceedings. Its advanced search lets researchers combine multiple fields and apply logical operators to include or exclude search criteria, making it easier to filter large volumes of parliamentary debate data and to uncover complex patterns within them. Additional data processing steps were performed to enable web-based search and to support extra features. A key feature that differentiates PoliCorp is its intuitive web-based interface, which lets users query processed political texts without programming skills. Users can create custom subcorpora via search parameters and download them freely in JSON format for further analysis.
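For readers who want to continue working with a downloaded subcorpus, the following is a minimal sketch of what downstream analysis might look like. It is not PoliCorp's implementation or its documented export schema: the field names (`date`, `party`, `text`), the sample records, and the helper functions are all hypothetical. The Boolean filter uses naive substring matching purely to illustrate combining and excluding criteria across fields.

```python
import json
from collections import Counter

# Hypothetical export: a JSON array of speech records. The field names
# ("date", "party", "text") are assumptions, not PoliCorp's documented schema.
SAMPLE_EXPORT = """
[
  {"date": "1990-10-04", "party": "CDU/CSU", "text": "Die Einheit ist vollendet."},
  {"date": "1990-11-15", "party": "SPD", "text": "Wir brauchen soziale Einheit."},
  {"date": "2015-09-09", "party": "SPD", "text": "Europa steht vor einer Aufgabe."}
]
"""


def load_subcorpus(raw):
    """Parse an exported subcorpus (JSON text) into a list of dicts."""
    return json.loads(raw)


def matches(record, all_of=(), none_of=(), party=None):
    """Boolean filter: every term in all_of AND no term in none_of,
    optionally restricted to one party (a second field)."""
    text = record["text"].lower()
    if party is not None and record["party"] != party:
        return False
    if not all(term.lower() in text for term in all_of):
        return False
    return not any(term.lower() in text for term in none_of)


def speeches_per_year(records):
    """Count records per calendar year, assuming ISO dates (YYYY-MM-DD)."""
    return Counter(int(r["date"][:4]) for r in records)


records = load_subcorpus(SAMPLE_EXPORT)

# "einheit" AND NOT "sozial": a combine-and-exclude query on the text field.
hits = [r for r in records if matches(r, all_of=["einheit"], none_of=["sozial"])]

# A simple time series: number of speeches per year in the subcorpus.
counts = speeches_per_year(records)
```

In practice the string `SAMPLE_EXPORT` would be replaced by the contents of a file downloaded from the portal, and the field names adjusted to match the actual export.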