OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System

📅 2024-12-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address poor generalizability, difficult schema adaptation, and high maintenance costs in knowledge extraction from unstructured text (web pages and raw PDFs) across domains such as science and news, this paper proposes OneKE, a schema-guided multi-agent knowledge extraction system. The system is built on large language models: several agents each perform a distinct role, and a configurable knowledge base supports schema configuration as well as the debugging and correction of error cases. The whole system is packaged for containerized (Docker) deployment. Evaluations on benchmark datasets demonstrate its efficacy, and case studies illustrate its adaptability to diverse tasks across multiple domains. The code is open-sourced and a demo video is available.

📝 Abstract
We introduce OneKE, a dockerized schema-guided knowledge extraction system that can extract knowledge from the Web and raw PDF books and supports various domains (science, news, etc.). Specifically, we design OneKE with multiple agents and a configurable knowledge base. Different agents perform their respective roles, enabling support for various extraction scenarios. The configurable knowledge base facilitates schema configuration and error-case debugging and correction, further improving performance. Empirical evaluations on benchmark datasets demonstrate OneKE's efficacy, while case studies further elucidate its adaptability to diverse tasks across multiple domains, highlighting its potential for broad applications. We have open-sourced the code at https://github.com/zjunlp/OneKE and released a video at http://oneke.openkg.cn/demo.mp4.
Problem

Research questions and friction points this paper is trying to address.

Knowledge Extraction
Efficiency
Accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Docker
Large Language Models (LLM)
Intelligent Agents

Authors
Yujie Luo
Zhejiang University, ZJU-Ant Group Joint Research Center for Knowledge Graphs, Hangzhou, China
Xiangyuan Ru
Zhejiang University, ZJU-Ant Group Joint Research Center for Knowledge Graphs, Hangzhou, China
Kangwei Liu
Institute of Information Engineering, Chinese Academy of Sciences
Audio-driven Talking Face Generation, Facial Animation
Lin Yuan
ZJU-Ant Group Joint Research Center for Knowledge Graphs, Ant Group, Hangzhou, China
Mengshu Sun
Beijing University of Technology
Deep Learning, Model Compression and Acceleration
Ningyu Zhang
Ph.D. Student, Vanderbilt University
artificial intelligence, learning analytics, learning environments
Lei Liang
Ant Group
Knowledge Graph, AI
Zhiqiang Zhang
ZJU-Ant Group Joint Research Center for Knowledge Graphs, Ant Group, Hangzhou, China
Jun Zhou
ZJU-Ant Group Joint Research Center for Knowledge Graphs, Ant Group, Hangzhou, China
Lanning Wei
ZJU-Ant Group Joint Research Center for Knowledge Graphs, Ant Group, Hangzhou, China
Da Zheng
Amazon
High-performance computing, Data-intensive computing, Large-scale machine learning, Graph neural networks
Haofen Wang
Tongji University
Knowledge Graph, Natural Language Processing, Retrieval Augmented Generation
Huajun Chen
Zhejiang University, ZJU-Ant Group Joint Research Center for Knowledge Graphs, Hangzhou, China