ProUIE: A Macro-to-Micro Progressive Learning Method for LLM-based Universal Information Extraction

📅 2026-04-12
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
Existing LLM-based universal information extraction (UIE) methods often rely on external resources, which adds training complexity while yielding limited performance gains. This work proposes a macro-meso-micro progressive learning framework that requires no external data and strengthens unified extraction through three stages: Complete Modeling (CM), Streamlined Alignment (SA), and Deep Exploration (DE), the last powered by GRPO with Stepwise Fine-grained Rewards (SFR). The approach introduces, for the first time, a fine-grained reward mechanism defined at the level of structural units. Evaluated across 36 public datasets, it outperforms strong instruction-tuned baselines, achieving state-of-the-art average performance in named entity recognition and relation extraction with a smaller backbone model, and it demonstrates clear gains in large-scale production-oriented information extraction.

📝 Abstract
LLM-based universal information extraction (UIE) methods often rely on additional information beyond the original training data, which increases training complexity yet often yields limited gains. To address this, we propose ProUIE, a Macro-to-Micro progressive learning approach that improves UIE without introducing any external information. ProUIE consists of three stages: (i) macro-level Complete Modeling (CM), which learns NER, RE, and EE along their intrinsic difficulty order on the full training data to build a unified extraction foundation, (ii) meso-level Streamlined Alignment (SA), which operates on sampled data with simplified target formats, streamlining and regularizing structured outputs to make them more concise and controllable, and (iii) micro-level Deep Exploration (DE), which applies GRPO with stepwise fine-grained rewards (SFR) over structural units to guide exploration and improve performance. Experiments on 36 public datasets show that ProUIE consistently improves unified extraction, outperforming strong instruction-tuned baselines on average for NER and RE while using a smaller backbone, and it further demonstrates clear gains in large-scale production-oriented information extraction.
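The abstract does not spell out how the stepwise fine-grained rewards are computed, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes, hypothetically, that an NER output is a mapping from span text to type label, that each structural unit earns half credit for a matching span and the other half for a matching type, and that GRPO-style advantages are obtained by normalizing rewards within a group of sampled outputs. The function names `stepwise_reward` and `group_relative_advantages` and the 0.5/0.5 weighting are illustrative choices, not taken from the paper.

```python
from statistics import mean, pstdev


def stepwise_reward(pred, gold):
    """Hypothetical unit-level reward for one sampled NER output.

    pred, gold: dicts mapping span text -> type label.
    A malformed output (anything that is not a dict) earns 0.
    Each unit earns 0.5 for a matching span and a further 0.5
    when its type label also matches.
    """
    if not isinstance(pred, dict):
        return 0.0
    denom = max(len(pred), len(gold))
    if denom == 0:
        return 1.0  # nothing to extract and nothing predicted
    span_hits = [span for span in pred if span in gold]
    type_hits = [span for span in span_hits if pred[span] == gold[span]]
    return (0.5 * len(span_hits) + 0.5 * len(type_hits)) / denom


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one sampled group (GRPO-style).

    Using the population standard deviation and this eps value
    are assumptions made for the sketch.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy group of four sampled outputs for a single input sentence.
gold = {"Xiaomi": "ORG", "Beijing": "LOC"}
samples = [
    {"Xiaomi": "ORG", "Beijing": "LOC"},  # fully correct
    {"Xiaomi": "ORG", "Beijing": "ORG"},  # one wrong type
    {"Xiaomi": "PER"},                    # one span found, wrong type
    None,                                 # unparsable output
]
rewards = [stepwise_reward(s, gold) for s in samples]
advantages = group_relative_advantages(rewards)
```

With a reward of this shape, partially correct outputs land between the fully correct and the unparsable ones, so the group-relative advantages can still rank samples when none of them is an exact match.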
Problem

Research questions and friction points this paper is trying to address.

Universal Information Extraction
LLM-based UIE
Training Complexity
External Information
Unified Extraction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Progressive Learning
Universal Information Extraction
Macro-to-Micro
GRPO
Stepwise Fine-grained Rewards
🔎 Similar Papers
Wenda Liu
Xiaomi Corporation, Beijing, China
Zhigang Song
Xiaomi Corporation, Beijing, China
Shuai Nie
Xiaomi Corporation, Beijing, China
Guangyao Liu
Huawei
Photonics Integrated Circuits, Transceivers, Long-haul System
Lisung Chen
Xiaomi Corporation, Beijing, China
Binyu Yang
Xiaomi Corporation, Beijing, China
Yaran Chen
Xi'an Jiaotong-Liverpool University, Suzhou, China
Peng Zhou
Xiaomi Corporation, Beijing, China
Hongzhen Wang
Xiaomi Corporation, Beijing, China
Yuchen Liu
Xiaomi Corporation, Beijing, China
Wenyue Hu
Xiaomi Corporation, Beijing, China
Jiaming Xu
Xiaomi Corp.; before at CASIA
Speech and Language Processing, Speech Separation, Dialogue System
Runyu Shi
Xiaomi Corporation, Beijing, China
Ying Huang
Xiaomi Corporation, Beijing, China