🤖 AI Summary
To address the positional bias inherent to large language models (LLMs) in document-level event argument extraction (DocEAE), this paper proposes ULTRA, a fine-tuning-free framework that combines hierarchical document reading with self-refinement. ULTRA reads a document chunk by chunk to build a candidate set of event arguments, then drops non-pertinent candidates through self-refinement; its LEAFER component tackles the difficulty LLMs have in locating the exact boundaries of argument spans. Built on open-source LLMs and requiring no parameter updates, ULTRA outperforms strong baselines, including supervised models and ChatGPT, by 9.8% in Exact Match, enabling accurate document-level event understanding at low cost.
📝 Abstract
Structural extraction of events within discourse is critical since it enables a deeper understanding of communication patterns and behavior trends. Event argument extraction (EAE), at the core of event-centric understanding, is the task of identifying role-specific text spans (i.e., arguments) for a given event. Document-level EAE (DocEAE) focuses on arguments that are scattered across an entire document. In this work, we explore open-source Large Language Models (LLMs) for DocEAE and propose ULTRA, a hierarchical framework that extracts event arguments more cost-effectively while alleviating the positional bias intrinsic to LLMs. ULTRA sequentially reads text chunks of a document to generate a candidate argument set, from which non-pertinent candidates are dropped through self-refinement. We further introduce LEAFER to address the challenge LLMs face in locating the exact boundary of an argument. ULTRA outperforms strong baselines, including strong supervised models and ChatGPT, by 9.8% in Exact Match (EM).