🤖 AI Summary
Scientific literature is vast, highly specialized, and reported in heterogeneous formats, which makes manual extraction of interventions inefficient and error-prone. To address this, we propose a framework that combines the Progressive Ontology Prompting (POP) algorithm with LLM-Duo, a dual-agent system. POP runs a prioritized breadth-first search (BFS) over a predefined ontology to generate structured prompt templates and action orders, and LLM-Duo pairs an explorer agent with an evaluator agent that work both collaboratively and adversarially. Together, large language models (LLMs), ontology engineering, and multi-agent reasoning enable automated, structured knowledge discovery. Applied to speech-language therapy, the method identifies 2,421 interventions from 64,177 research articles, which are curated into a publicly accessible intervention knowledge base. In the reported experiments, it produces more accurate and complete annotations than advanced baselines.
📝 Abstract
To automate knowledge discovery from a vast volume of literature, we introduce a framework based on large language models (LLMs) that combines a progressive ontology prompting (POP) algorithm with a dual-agent system named LLM-Duo. The POP algorithm runs a prioritized breadth-first search (BFS) over a predefined ontology to generate structured prompt templates and action orders, which guide LLMs to discover knowledge automatically. LLM-Duo employs two specialized LLM agents, an explorer and an evaluator, which work collaboratively and adversarially to make the discovery and annotation processes more reliable. Experiments demonstrate that our method outperforms advanced baselines, producing more accurate and complete annotations. To validate the method in a real-world scenario, we apply it in a case study of speech-language intervention discovery, where it identifies 2,421 interventions from 64,177 research articles in the speech-language therapy domain. We curate these findings into a publicly accessible intervention knowledge base that could benefit the speech-language therapy community.
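The two ideas in the abstract, a prioritized BFS over a predefined ontology and an explorer/evaluator pairing, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy ontology, the priority values, the prompt wording, and the names `prioritized_bfs`, `render_prompt`, and `explore_with_evaluation` are all assumptions made for illustration, and the two agents are plain callables standing in for real LLM calls.

```python
import heapq
from typing import Callable, Dict, List, Tuple

# Hypothetical toy ontology: concept -> [(relation, child concept, priority)].
# A lower priority number means the edge is visited earlier within its depth.
ONTOLOGY: Dict[str, List[Tuple[str, str, int]]] = {
    "Intervention": [
        ("delivered_in", "Setting", 3),
        ("targets", "Disorder", 1),
        ("has_outcome", "Outcome", 2),
    ],
    "Disorder": [("affects", "Population", 1)],
    "Outcome": [("measured_by", "Measure", 1)],
}


def prioritized_bfs(root: str) -> List[Tuple[str, str, str]]:
    """Return (parent, relation, child) edges in level order.

    Edges at the same depth are ordered by priority; ties are broken by
    the order in which the edges were pushed onto the frontier.
    """
    order: List[Tuple[str, str, str]] = []
    visited = {root}
    frontier: List[Tuple[int, int, int, str, str, str]] = []
    counter = 0  # tie-breaker so the heap never compares beyond this field

    def push_children(parent: str, depth: int) -> None:
        nonlocal counter
        for relation, child, priority in ONTOLOGY.get(parent, []):
            heapq.heappush(
                frontier, (depth, priority, counter, parent, relation, child)
            )
            counter += 1

    push_children(root, 1)
    while frontier:
        depth, _, _, parent, relation, child = heapq.heappop(frontier)
        if child in visited:
            continue
        visited.add(child)
        order.append((parent, relation, child))
        push_children(child, depth + 1)
    return order


def render_prompt(parent: str, relation: str, child: str) -> str:
    """Fill an assumed prompt template for one ontology edge."""
    return (
        f"Given the {parent} described in the article, "
        f"what {child} is linked to it by '{relation}'?"
    )


def explore_with_evaluation(
    prompt: str,
    explorer: Callable[[str, str], str],
    evaluator: Callable[[str, str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> str:
    """Let the explorer answer, and retry with the evaluator's feedback
    until the answer is accepted or max_rounds is reached."""
    answer, feedback = "", ""
    for _ in range(max_rounds):
        answer = explorer(prompt, feedback)
        accepted, feedback = evaluator(prompt, answer)
        if accepted:
            break
    return answer


if __name__ == "__main__":
    for edge in prioritized_bfs("Intervention"):
        print(render_prompt(*edge))
```

In this sketch the traversal order doubles as the action order: each edge yields one prompt, and each prompt would be passed through `explore_with_evaluation` with two LLM-backed callables. Keying the heap on depth first keeps the traversal level by level, so priority only reorders edges within a level.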