🤖 AI Summary
Large language models (LLMs) face three key challenges in event extraction (EE): hallucination under weak constraints, fragile temporal and causal linking across long contexts and documents, and limited long-horizon knowledge management within bounded context windows. This survey argues that EE should be reframed as a cognitive scaffold for LLM-centered systems rather than a standalone extraction task: event schemas and slot constraints serve as interfaces for grounding and verification; event-centric structures act as controlled intermediate representations for stepwise reasoning; event links enable relation-aware retrieval with graph-based RAG; and event stores provide updatable episodic and agent memory beyond the context window. Covering both text and multimodal settings, the survey organizes tasks and taxonomy, traces method evolution from rule-based and neural models to instruction-driven and generative frameworks, and summarizes formulations, decoding strategies, architectures, representations, datasets, and evaluation, with the aim of evolving EE into a structurally reliable, agent-ready perception and memory layer for open-world systems.
📝 Abstract
Large language models (LLMs) and multimodal LLMs are reshaping event extraction (EE): prompting and generation can often produce structured outputs in zero-shot or few-shot settings. Yet LLM-based pipelines face deployment gaps, including hallucination under weak constraints, fragile temporal and causal linking over long contexts and across documents, and limited long-horizon knowledge management within a bounded context window. We argue that EE should be viewed as a system component that provides a cognitive scaffold for LLM-centered solutions. Event schemas and slot constraints create interfaces for grounding and verification; event-centric structures act as controlled intermediate representations for stepwise reasoning; event links support relation-aware retrieval with graph-based RAG; and event stores offer updatable episodic and agent memory beyond the context window. This survey covers EE in text and multimodal settings, organizing tasks and taxonomy, tracing method evolution from rule-based and neural models to instruction-driven and generative frameworks, and summarizing formulations, decoding strategies, architectures, representations, datasets, and evaluation. We also review cross-lingual, low-resource, and domain-specific settings. Finally, we outline open challenges and future directions central to the LLM era, aiming to evolve EE from static extraction into a structurally reliable, agent-ready perception and memory layer for open-world systems.
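The abstract's framing of event stores as verified, link-aware memory can be sketched in a few lines. The snippet below is a minimal illustration, not any system from the survey: the schema, slot names, and event IDs are invented for the example. It shows slot-level verification against a schema at insertion time and a one-hop, graph-RAG-style retrieval that expands along event links rather than matching text alone.

```python
from dataclasses import dataclass, field

# Hypothetical schema table: required slots per event type (illustrative only).
SCHEMAS = {"Attack": {"attacker", "target", "time"}}

@dataclass
class EventStore:
    """Minimal updatable event store with link-aware (graph-based) retrieval."""
    events: dict = field(default_factory=dict)  # event id -> {"type": ..., "slots": ...}
    links: dict = field(default_factory=dict)   # event id -> set of linked event ids

    def add(self, eid, etype, slots):
        # Slot-level verification: reject events missing required schema slots.
        missing = SCHEMAS.get(etype, set()) - slots.keys()
        if missing:
            raise ValueError(f"missing slots for {etype}: {missing}")
        self.events[eid] = {"type": etype, "slots": slots}
        self.links.setdefault(eid, set())

    def link(self, a, b):
        # Event links (e.g. causal or coreference edges) are stored undirected here.
        self.links[a].add(b)
        self.links[b].add(a)

    def retrieve(self, seed, hops=1):
        # Graph-RAG-style retrieval: expand the neighborhood of a seed event
        # along event links, instead of retrieving passages by text similarity.
        frontier, seen = {seed}, {seed}
        for _ in range(hops):
            frontier = {n for e in frontier for n in self.links[e]} - seen
            seen |= frontier
        return [self.events[e] for e in sorted(seen)]
```

In a fuller pipeline, `retrieve` would feed the expanded event neighborhood back into the LLM's context, giving it episodic memory that persists and updates beyond a single context window.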