🤖 AI Summary
Existing causal detection methods face a dual bottleneck: unsupervised approaches exhibit poor cross-domain generalization, while supervised methods are constrained by scarce annotated data. To address this, we propose the first retrieval-augmented generation (RAG)-based dynamic prompting framework tailored for causal mining. Our method jointly integrates large language models (LLMs)—including LLaMA-3, Qwen, and GLM—with causal pattern matching and semantic re-ranking modules, leveraging context-aware causal rule retrieval and adaptive prompt construction. Crucially, it operates without requiring large-scale labeled data, thereby significantly enhancing few-shot robustness and cross-domain generalizability. Evaluated on three standard causal detection benchmarks, our approach achieves an average 12.7% F1-score improvement over static prompting baselines. The gains hold consistently across five mainstream LLMs, demonstrating both the universality and effectiveness of the proposed framework.
📝 Abstract
Causality detection and mining are important tasks in information retrieval due to their enormous use in information extraction, and knowledge graph construction. To solve these tasks, in existing literature there exist several solutions -- both unsupervised and supervised. However, the unsupervised methods suffer from poor performance and they often require significant human intervention for causal rule selection, leading to poor generalization across different domains. On the other hand, supervised methods suffer from the lack of large training datasets. Recently, large language models (LLMs) with effective prompt engineering are found to be effective to overcome the issue of unavailability of large training dataset. Yet, in existing literature, there does not exist comprehensive works on causality detection and mining using LLM prompting. In this paper, we present several retrieval-augmented generation (RAG) based dynamic prompting schemes to enhance LLM performance in causality detection and extraction tasks. Extensive experiments over three datasets and five LLMs validate the superiority of our proposed RAG-based dynamic prompting over other static prompting schemes.