Fine-tuning with Hierarchical Prompting for Robust Propaganda Classification Across Annotation Schemas

📅 2026-05-13
🤖 AI Summary
This study addresses key challenges in detecting propaganda on social media (noisy text, short messages, and low annotation agreement) by introducing a new intent-focused taxonomy of propaganda techniques and comparing it against an established, higher-agreement schema. The authors evaluate four language models (GPT-4.1-nano, Phi-4 14B, Qwen2.5-14B, and Qwen3-14B) along three dimensions: model portfolio, schema effects, and prompting strategy, the last including a hierarchical prompting method (HiPP) that predicts fine-grained techniques before aggregating them. Fine-tuning turns weak zero-shot baselines into competitive systems; across schemas, the Qwen models achieve the strongest overall performance, and Phi-4 14B consistently outperforms GPT-4.1-nano. HiPP helps most after fine-tuning and on the more ambiguous, low-agreement taxonomy, while remaining competitive on the simpler schema. The HQP dataset, annotated with the new intent-based labels, serves as a challenging benchmark for future work in this domain.
📝 Abstract
Propaganda detection in social media is challenging due to noisy, short texts and low annotation agreement. We introduce a new intent-focused taxonomy of propaganda techniques and compare it against an established, higher-agreement schema. Along three dimensions (model portfolio, schema effects, and prompting strategy) we evaluate the taxonomies as a classification task with the help of four language models (GPT-4.1-nano, Phi-4 14B, Qwen2.5-14B, Qwen3-14B). Our results show that fine-tuning is essential, since it transforms weak zero-shot baselines into competitive systems and reveals methodological differences that are hidden when using base models. Across schemas, the Qwen models achieve the strongest overall performance, and Phi-4 14B consistently outperforms GPT-4.1-nano. Our hierarchical prompting method (HiPP), which predicts fine-grained techniques before aggregating them, is especially beneficial after fine-tuning and on the more ambiguous, low-agreement taxonomy, while remaining competitive on the simpler schema. The HQP dataset, annotated with the new intent-based labels, provides a richer lens on propaganda's strategic goals and a challenging benchmark for future work on robust, real-world detection.
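The abstract describes HiPP as predicting fine-grained techniques first and aggregating them afterwards. A minimal Python sketch of that aggregation step is given here for illustration only. The label names, the two-level mapping, and the comma-separated response format are all assumptions made for this example; they are not taken from the paper, which may define its hierarchy and output format differently.

```python
from typing import Dict, Iterable, List

# Hypothetical two-level label hierarchy. These names are illustrative
# placeholders, NOT the taxonomy used in the paper.
FINE_TO_COARSE: Dict[str, str] = {
    "loaded_language": "emotional_appeal",
    "fear_appeal": "emotional_appeal",
    "name_calling": "discrediting",
    "doubt": "discrediting",
}


def parse_fine_labels(model_output: str) -> List[str]:
    """Extract known fine-grained labels from a comma-separated response.

    Assumes (hypothetically) that the model answers with a comma-separated
    list. Unknown labels are dropped; order is kept; duplicates are removed.
    """
    seen: List[str] = []
    for part in model_output.split(","):
        label = part.strip().lower()
        if label in FINE_TO_COARSE and label not in seen:
            seen.append(label)
    return seen


def aggregate(fine_labels: Iterable[str]) -> List[str]:
    """Map fine-grained labels to their parent categories (sorted, unique)."""
    return sorted({FINE_TO_COARSE[label] for label in fine_labels})


if __name__ == "__main__":
    # A made-up model response containing one duplicate and one unknown label.
    fine = parse_fine_labels("Fear_Appeal, doubt, unknown, doubt")
    print(fine)
    print(aggregate(fine))
```

Keeping the mapping in a plain dictionary makes it easy to swap in a different schema, which matters when the same pipeline is compared across annotation schemas.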
Problem

Research questions and friction points this paper is trying to address.

propaganda detection
annotation schemas
low annotation agreement
social media
classification robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

hierarchical prompting
propaganda classification
fine-tuning
annotation schema
intent-based taxonomy
Lukas Stähelin
Technische Universität Berlin, QU Lab, XplaiNLP Group, Berlin, Germany
Veronika Solopova
Technische Universität Berlin
Computational linguistics, Ethics of AI
Max Upravitelev
Technische Universität Berlin, QU Lab, XplaiNLP Group, Berlin, Germany; German Research Center for Artificial Intelligence (DFKI), Berlin, Germany
David Kaplan
Technische Universität Berlin, QU Lab, XplaiNLP Group, Berlin, Germany
Ariana Sahitaj
Technische Universität Berlin, QU Lab, XplaiNLP Group, Berlin, Germany; German Research Center for Artificial Intelligence (DFKI), Berlin, Germany
Premtim Sahitaj
Technische Universität Berlin, QU Lab, XplaiNLP Group, Berlin, Germany; German Research Center for Artificial Intelligence (DFKI), Berlin, Germany
Charlott Jakob
Technische Universität Berlin, QU Lab, XplaiNLP Group, Berlin, Germany; German Research Center for Artificial Intelligence (DFKI), Berlin, Germany
Sebastian Möller
Professor for Quality and Usability, TU Berlin and Scientific Director, DFKI
Quality of Experience, User Experience, Speech, Dialog, Natural Language Processing
Vera Schmitt
Head of XplaiNLP Research Group at TU Berlin
NLP/LLMs, XAI, HCI, Disinformation, Usable Privacy