Information Extraction from Conversation Transcripts: Neuro-Symbolic vs. LLM

📅 2025-10-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Information extraction (IE) research increasingly relies on large language models (LLMs), often overlooking the robustness, interpretability, and efficiency of traditional symbolic and statistical methods. Method: This work compares a neuro-symbolic (NS) system with an LLM-based system on nine agricultural interview transcripts spanning pork, dairy, and crop subdomains, evaluating F1, runtime, and controllability. Contribution/Results: The LLM-based system achieves higher total F1 (69.4 vs. 52.7) and core F1 (63.0 vs. 47.2), where total covers all extracted information and core covers essential details. The NS system runs faster, offers greater control, and is highly accurate on context-free tasks, but it generalizes poorly, struggles with contextual nuance, and is costly to develop and maintain. The LLM-based system is faster to deploy and easier to maintain, but it runs more slowly, offers limited control, depends on the underlying model, and risks hallucination. The study highlights the "hidden cost" of deploying NLP systems in real-world applications and the need to balance performance, efficiency, and control.

📝 Abstract

The current trend in information extraction (IE) is to rely extensively on large language models, effectively discarding decades of experience in building symbolic or statistical IE systems. This paper compares a neuro-symbolic (NS) and an LLM-based IE system in the agricultural domain, evaluating them on nine interviews across pork, dairy, and crop subdomains. The LLM-based system outperforms the NS one (F1 total: 69.4 vs. 52.7; core: 63.0 vs. 47.2), where total includes all extracted information and core focuses on essential details. However, each system has trade-offs: the NS approach offers faster runtime, greater control, and high accuracy in context-free tasks but lacks generalizability, struggles with contextual nuances, and requires significant resources to develop and maintain. The LLM-based system achieves higher performance, faster deployment, and easier maintenance but has slower runtime, limited control, model dependency and hallucination risks. Our findings highlight the "hidden cost" of deploying NLP systems in real-world applications, emphasizing the need to balance performance, efficiency, and control.
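To make the "total" versus "core" F1 distinction concrete, here is a minimal sketch of how such scores could be computed. It assumes, purely for illustration, that each extracted item is an exact-match (field, value) pair and that core items are those whose field belongs to a designated set; the abstract does not describe the paper's actual matching or scoring procedure, and the field names and values below are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical representation: one extracted item is a (field, value) pair.
Item = Tuple[str, str]


def f1_score(predicted: Iterable[Item], gold: Iterable[Item]) -> float:
    """Exact-match F1 between predicted and gold item sets."""
    pred_set: Set[Item] = set(predicted)
    gold_set: Set[Item] = set(gold)
    true_positives = len(pred_set & gold_set)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case.
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


def restrict(items: Iterable[Item], core_fields: Set[str]) -> List[Item]:
    """Keep only items whose field is in the (assumed) core set."""
    return [(field, value) for field, value in items if field in core_fields]


# Made-up example data, not taken from the paper.
gold = [("herd_size", "120"), ("breed", "Holstein"),
        ("feed", "silage"), ("note", "barn built 1998")]
pred = [("herd_size", "120"), ("breed", "Holstein"), ("feed", "hay")]
core_fields = {"herd_size", "breed"}

total_f1 = f1_score(pred, gold)
core_f1 = f1_score(restrict(pred, core_fields), restrict(gold, core_fields))
```

In this toy case the system gets both core items right but misses or mislabels peripheral ones, so the core score exceeds the total score. The reported results show the opposite pattern for both systems (core F1 below total F1), which suggests the essential details were the harder ones to extract.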

Problem

Research questions and friction points this paper is trying to address.

Comparing neuro-symbolic and LLM-based information extraction systems

Evaluating performance trade-offs in agricultural conversation transcripts

Balancing efficiency, control, and accuracy in real-world NLP deployment

Innovation

Methods, ideas, or system contributions that make the work stand out.

Neuro-symbolic system for controlled, fast information extraction

LLM-based system for higher performance and easier maintenance

Comparison highlighting trade-offs between performance and control
