🤖 AI Summary
This study addresses the challenge of assessing how well linguistic expressions in parliamentary transcripts are grounded in real-world information. We propose the first multidimensional factuality annotation framework designed specifically for parliamentary discourse; it distinguishes factual, possible, and fictional statements at a fine grain by integrating discourse semantics, cognitive modality, and contextual-constraint theories, and although implemented for Hebrew, it is designed for cross-lingual transfer. From nearly 5,000 manually annotated utterances, validated through multiple rounds of inter-annotator agreement (Krippendorff’s α > 0.85), we release the first high-quality benchmark corpus for parliamentary factuality. We further develop a context-aware model that predicts factuality features with high accuracy (F1 ≥ 0.79). Our key contributions are: (1) a theory-driven, interpretable annotation scheme; (2) the first fine-grained factuality corpus for a parliamentary setting; and (3) a linguistically grounded modeling approach for fact-checking signals.
📝 Abstract
Factuality assesses the extent to which a language utterance relates to real-world information: it determines whether an utterance corresponds to a fact, a possibility, or an imaginary situation, and as such it is instrumental for fact-checking. Factuality is a complex notion that relies on multiple linguistic signals and has been studied across several disciplines.
We present a multi-faceted annotation scheme for factuality that combines concepts from a variety of previous works. We developed the scheme for Hebrew, but we trust that it can be adapted to other languages. We also present a set of almost 5,000 sentences in the domain of parliamentary discourse, manually annotated according to this scheme. We report on inter-annotator agreement and experiment with various approaches to automatically predicting (some features of) the scheme, in order to extend the annotation to a large corpus.
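As a concrete illustration of the agreement statistic cited above, here is a minimal sketch of Krippendorff’s α for nominal labels. The three factuality labels and the toy annotation data are hypothetical, invented for illustration and not drawn from the released corpus.

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    `units` is a list of label lists, one per annotated item; each inner
    list holds the labels the coders assigned to that item (missing
    annotations are simply omitted from the inner list).
    """
    # Coincidence matrix: count ordered label pairs within each unit,
    # weighting each pair by 1 / (m - 1), where m is the number of labels
    # in the unit. Units with fewer than two labels are skipped.
    coincidence = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue
        for a, b in permutations(range(m), 2):
            coincidence[(labels[a], labels[b])] += 1.0 / (m - 1)

    # Marginal totals per label value, and the grand total n.
    n_c = Counter()
    for (a, _), w in coincidence.items():
        n_c[a] += w
    n = sum(n_c.values())

    # Observed disagreement: off-diagonal mass of the coincidence matrix.
    d_o = sum(w for (a, b), w in coincidence.items() if a != b) / n
    # Expected disagreement under chance pairing of the same marginals.
    d_e = sum(n_c[a] * n_c[b] for a in n_c for b in n_c if a != b) / (n * (n - 1))
    return 1.0 - d_o / d_e

# Hypothetical example: three coders labelling five utterances as
# "factual", "possible", or "fictional".
data = [
    ["factual", "factual", "factual"],
    ["possible", "possible", "factual"],
    ["fictional", "fictional", "fictional"],
    ["factual", "factual", "possible"],
    ["possible", "possible", "possible"],
]
print(round(krippendorff_alpha_nominal(data), 3))  # → 0.611
```

Because the coincidence matrix skips singly-annotated units, the same function also handles items for which some coders supplied no label, which is the usual situation in multi-round annotation campaigns.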