Information Extraction from Conversation Transcripts: Neuro-Symbolic vs. LLM

📅 2025-10-13
🤖 AI Summary
Problem: Information extraction (IE) research increasingly relies on large language models (LLMs), often overlooking the robustness, interpretability, and efficiency of traditional symbolic and statistical methods. Method: This work compares a neuro-symbolic (NS) system and an LLM-based system on nine agricultural interview transcripts spanning pork, dairy, and crop subdomains, evaluating performance (F1), runtime, and controllability. Contribution/Results: The LLM-based system achieves higher total F1 (69.4) and core F1 (63.0) than the NS system (52.7 and 47.2, respectively), while the NS approach offers faster runtime, greater control, and high accuracy in context-free tasks. The study highlights the "hidden cost" of deploying NLP systems in real-world applications and emphasizes the need to balance performance, efficiency, and control.

📝 Abstract
The current trend in information extraction (IE) is to rely extensively on large language models, effectively discarding decades of experience in building symbolic or statistical IE systems. This paper compares a neuro-symbolic (NS) and an LLM-based IE system in the agricultural domain, evaluating them on nine interviews across pork, dairy, and crop subdomains. The LLM-based system outperforms the NS one (F1 total: 69.4 vs. 52.7; core: 63.0 vs. 47.2), where total includes all extracted information and core focuses on essential details. However, each system has trade-offs: the NS approach offers faster runtime, greater control, and high accuracy in context-free tasks but lacks generalizability, struggles with contextual nuances, and requires significant resources to develop and maintain. The LLM-based system achieves higher performance, faster deployment, and easier maintenance but has slower runtime, limited control, model dependency, and hallucination risks. Our findings highlight the "hidden cost" of deploying NLP systems in real-world applications, emphasizing the need to balance performance, efficiency, and control.
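The F1 figures above combine precision and recall over extracted items. As a rough illustration only, the sketch below computes an exact-match F1 between a predicted set and a gold set, then the gaps between the reported scores. The paper's actual scoring procedure (matching criteria, averaging across interviews) is not described here, so the `f1_score` helper and the example field names and values are hypothetical.

```python
def f1_score(predicted: set, gold: set) -> float:
    """Exact-match F1 between predicted and gold item sets (an assumed scoring scheme)."""
    if not predicted or not gold:
        return 0.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical (field, value) pairs; these are NOT data from the paper.
gold = {("herd_size", "120"), ("crop", "corn"), ("feed", "silage"), ("location", "Iowa")}
predicted = {("herd_size", "120"), ("crop", "corn"), ("feed", "hay")}
example_f1 = f1_score(predicted, gold)

# Gaps between the F1 scores reported in the abstract (LLM minus NS).
total_gap = round(69.4 - 52.7, 1)
core_gap = round(63.0 - 47.2, 1)
```

In this toy case two of three predictions are correct and two of four gold items are recovered, so precision is 2/3 and recall is 1/2. The reported scores differ by 16.7 points (total) and 15.8 points (core) in favor of the LLM-based system.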
Problem

Research questions and friction points this paper is trying to address.

Comparing neuro-symbolic and LLM-based information extraction systems
Evaluating performance trade-offs in agricultural conversation transcripts
Balancing efficiency, control, and accuracy in real-world NLP deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neuro-symbolic system for controlled, fast information extraction
LLM-based system for higher performance and easier maintenance
Comparison highlighting trade-offs between performance and control
Alice Saebom Kwak
Department of Linguistics, University of Arizona
Maria Alexeeva
Lum AI
Gus Hahn-Powell
Lum AI
Keith Alcock
Lum AI
Kevin McLaughlin
Lum AI
Doug McCorkle
Eocene Environmental Group
Gabe McNunn
Eocene Environmental Group
Mihai Surdeanu
Professor, Computer Science, University of Arizona
natural language processing, applied machine learning, artificial intelligence