🤖 AI Summary
Automatic extraction and interpretation of implicit analogical knowledge in natural language, particularly cross-modal metaphors, remains challenging due to limited interpretability and weak grounding in embodied experience. Method: The paper proposes Logic-Augmented Generation (LAG), a framework that integrates structured semantic knowledge graphs with generative reasoning via logic-guided, multi-stage prompting. LAG enables traceable, cross-domain modeling of metaphor-level analogies while mitigating large language models' lack of physical-world experience; it comprises knowledge graph construction, multimodal metaphor detection, and interpretability-aware evaluation of metaphor understanding. Results: LAG achieves state-of-the-art performance on three metaphor-related tasks across four benchmarks, exceeds human accuracy on visual metaphor understanding, and yields end-to-end interpretable reasoning paths. A remaining limitation is reduced generalizability to domain-specific metaphors.
📝 Abstract
Recent advances in Large Language Models have demonstrated their capabilities across a variety of tasks. However, automatically extracting implicit knowledge from natural language remains a significant challenge, as machines lack active experience with the physical world. In this setting, semantic knowledge graphs can serve as conceptual spaces that guide the automated text-generation reasoning process toward more efficient and explainable results. In this paper, we apply a Logic-Augmented Generation (LAG) framework that represents a text explicitly as a semantic knowledge graph and combines this representation with prompt heuristics to elicit implicit analogical connections. This method generates extended knowledge graph triples representing implicit meaning, enabling systems to reason over unlabeled multimodal data regardless of domain. We validate our work on three metaphor detection and understanding tasks across four datasets, since these tasks require deep analogical reasoning capabilities. The results show that this integrated approach surpasses current baselines, outperforms humans in understanding visual metaphors, and enables more explainable reasoning processes, though it still has inherent limitations in metaphor understanding, especially for domain-specific metaphors. Furthermore, we present a thorough error analysis, discussing issues with metaphorical annotations and current evaluation methods.
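To make the described pipeline concrete, the sketch below illustrates the general LAG idea: serialize a text's explicit semantic triples into a prompt, then ask a generative model to return the implicit analogical triple the metaphor relies on. This is a minimal illustration only, not the authors' implementation; all names (`Triple`, `stub_llm`, `build_analogy_prompt`, the example metaphor) are hypothetical, and `stub_llm` stands in for a real LLM call.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triple:
    """One (subject, predicate, object) edge of a semantic knowledge graph."""
    subject: str
    predicate: str
    obj: str

def stub_llm(prompt: str) -> str:
    # Stand-in for a real generative model; returns a canned analogical triple.
    return "lawyer | behavesLike | shark"

def build_analogy_prompt(triples: list[Triple]) -> str:
    # Serialize the explicit graph as context, then use a prompt heuristic to
    # elicit the implicit analogical connection as an extended triple.
    facts = "\n".join(f"({t.subject}, {t.predicate}, {t.obj})" for t in triples)
    return (
        "Explicit semantic graph:\n"
        + facts
        + "\nState the implicit analogical triple as 'subject | predicate | object':"
    )

def elicit_implicit_triple(triples: list[Triple], llm=stub_llm) -> Triple:
    # Parse the model's answer back into a structured triple, so the new
    # edge can extend the knowledge graph and remain traceable.
    answer = llm(build_analogy_prompt(triples))
    s, p, o = (part.strip() for part in answer.split("|"))
    return Triple(s, p, o)

# "My lawyer is a shark": the explicit graph only links lawyer and shark;
# the elicited triple makes the implied analogy an explicit, inspectable edge.
explicit = [
    Triple("lawyer", "isComparedTo", "shark"),
    Triple("shark", "hasQuality", "aggressive"),
]
print(elicit_implicit_triple(explicit))
```

Keeping the elicited analogy as a structured triple, rather than free text, is what allows the extended graph to serve as an explainable reasoning trace.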