🤖 AI Summary
To meet the dual requirements of accuracy and interpretability in domain-specific question answering (education, healthcare, law), this paper proposes Text-JEPA, a dual-process framework that combines natural language understanding with formal logical reasoning for efficient, transparent end-to-end logical QA. Drawing on dual-system cognitive theory, it uses a lightweight NL2FOL translation mechanism to map textual inputs to first-order logic (FOL) and the Z3 theorem prover for symbolic inference, following a neuro-symbolic collaboration paradigm. The paper also proposes a suite of three evaluation metrics (conversion score, reasoning score, and Spearman rho score) to assess logical translation quality and its effect on downstream reasoning. Experimental results show that Text-JEPA achieves accuracy comparable to large language models on domain-specific benchmarks while reducing inference overhead by over 60%, balancing efficiency, precision, and explainability.
📝 Abstract
Recent advances in large language models (LLMs) have significantly enhanced question-answering (QA) capabilities, particularly in open-domain contexts. However, in closed-domain scenarios such as education, healthcare, and law, users demand not only accurate answers but also transparent reasoning and explainable decision-making processes. While neural-symbolic (NeSy) frameworks have emerged as a promising solution, leveraging LLMs for natural language understanding and symbolic systems for formal reasoning, existing approaches often rely on large-scale models and exhibit inefficiencies in translating natural language into formal logic representations.
To address these limitations, we introduce Text-JEPA (Text-based Joint-Embedding Predictive Architecture), a lightweight yet effective framework for converting natural language into first-order logic (NL2FOL). Drawing inspiration from dual-system cognitive theory, Text-JEPA emulates System 1 by efficiently generating logic representations, while the Z3 solver operates as System 2, enabling robust logical inference. To rigorously evaluate the NL2FOL-to-reasoning pipeline, we propose a comprehensive evaluation framework comprising three custom metrics: conversion score, reasoning score, and Spearman rho score, which collectively capture the quality of logical translation and its downstream impact on reasoning accuracy.
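The abstract names a Spearman rho score but does not define it. As a rough illustration only, the sketch below computes the standard Spearman rank correlation between two per-example score lists, on the assumption that the metric relates conversion scores to reasoning scores. The function names and the sample values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over every entry tied with the one at `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-example scores (assumed data, not results from the paper).
conversion_scores = [0.9, 0.4, 0.7, 0.2, 0.8]
reasoning_scores = [1.0, 0.0, 1.0, 0.0, 1.0]
print(f"Spearman rho: {spearman_rho(conversion_scores, reasoning_scores):.3f}")
```

Under this reading, a rho near 1 would suggest that examples with better logical translations also tend to be reasoned about correctly, while a rho near 0 would suggest the two scores vary independently.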
Empirical results on domain-specific datasets demonstrate that Text-JEPA achieves competitive performance at significantly lower computational overhead than larger LLM-based systems. Our findings highlight the potential of structured, interpretable reasoning frameworks for building efficient and explainable QA systems in specialized domains.