Extractive Fact Decomposition for Interpretable Natural Language Inference in one Forward Pass

📅 2025-09-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing natural language inference (NLI) and fact-checking approaches rely on generative large language models (LLMs) for atomic fact decomposition, incurring high computational cost and offering limited interpretability. This work proposes JEDI, the first end-to-end extractive decomposition-and-inference model built solely on an encoder architecture, which performs both fact extraction and logical judgment in a single forward pass without invoking generative LLMs. To enhance robustness, particularly in out-of-distribution and adversarial settings, JEDI is trained with an auxiliary, synthetically generated rationale corpus, eliminating the need for human-annotated rationales. Evaluated on multiple NLI benchmarks, JEDI achieves competitive in-distribution accuracy while surpassing models trained solely on extractive rationale supervision in interpretability, inference efficiency, and out-of-distribution generalization.
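The single-forward-pass design described above can be illustrated with a minimal sketch: an encoder produces per-token hidden states, a token-level extraction head selects rationale tokens from the premise, and a classification head labels the pair from the pooled rationale representation. Everything below (the toy encoder, weights, threshold `tau`, and function names) is a hypothetical stand-in for illustration, not the paper's actual architecture or parameters.

```python
import math

def mock_encoder(tokens):
    """Stand-in for an encoder (e.g., a BERT-style model): maps each token
    to a fixed-size hidden vector. Here, a toy deterministic embedding."""
    dim = 4
    return [[math.sin(hash(t) % 97 + i) for i in range(dim)] for t in tokens]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def jedi_forward(premise_tokens, hypothesis_tokens, w_extract, w_labels, tau=0.0):
    """One forward pass yielding (a) an extractive rationale: premise tokens
    whose extraction score exceeds tau, and (b) an NLI label computed from
    the pooled representation of the selected tokens."""
    hidden = mock_encoder(premise_tokens + hypothesis_tokens)
    prem_hidden = hidden[:len(premise_tokens)]

    # Token-level extraction head: score each premise token.
    scores = [dot(h, w_extract) for h in prem_hidden]
    selected = [i for i, s in enumerate(scores) if s > tau]
    if not selected:  # fall back to the full premise if nothing passes tau
        selected = list(range(len(prem_hidden)))

    # Pool the selected (rationale) tokens and classify.
    dim = len(prem_hidden[0])
    pooled = [sum(prem_hidden[i][d] for i in selected) / len(selected)
              for d in range(dim)]
    label_scores = {lbl: dot(pooled, w) for lbl, w in w_labels.items()}
    label = max(label_scores, key=label_scores.get)
    rationale = [premise_tokens[i] for i in selected]
    return rationale, label

# Toy weights standing in for learned parameters.
w_extract = [0.5, -0.2, 0.1, 0.3]
w_labels = {
    "entailment":    [0.2, 0.1, -0.1, 0.4],
    "neutral":       [-0.3, 0.2, 0.1, 0.0],
    "contradiction": [0.1, -0.4, 0.3, -0.2],
}
rationale, label = jedi_forward(
    ["the", "cat", "sat"], ["a", "cat", "is", "sitting"], w_extract, w_labels
)
print(rationale, label)
```

The point of the sketch is the cost profile: both heads read the same encoder output, so decomposition and inference share one forward pass, in contrast to pipelines that first call a generative LLM to produce atomic facts and then run a separate verifier.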

📝 Abstract
Recent works in Natural Language Inference (NLI) and related tasks, such as automated fact-checking, employ atomic fact decomposition to enhance interpretability and robustness. For this, existing methods rely on resource-intensive generative large language models (LLMs) to perform decomposition. We propose JEDI, an encoder-only architecture that jointly performs extractive atomic fact decomposition and interpretable inference without requiring generative models during inference. To facilitate training, we produce a large corpus of synthetic rationales covering multiple NLI benchmarks. Experimental results demonstrate that JEDI achieves competitive accuracy in distribution and significantly improves robustness out of distribution and in adversarial settings over models based solely on extractive rationale supervision. Our findings show that interpretability and robust generalization in NLI can be realized using encoder-only architectures and synthetic rationales. Code and data available at https://jedi.nicpopovic.com
Problem

Research questions and friction points this paper is trying to address.

Replacing resource-intensive generative models for fact decomposition
Achieving interpretable natural language inference without generative LLMs
Improving robustness in out-of-distribution and adversarial NLI settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extractive fact decomposition without generative models
Encoder-only architecture for joint decomposition and inference
Synthetic-rationale training for robust generalization