🤖 AI Summary
Information extraction (IE) on resource-constrained edge devices faces critical challenges with large language models (LLMs), including severe hallucination, limited context length, and high inference latency—especially when dynamically adapting to diverse extraction schemas. This work proposes Dual-LoRA, a dual-module architecture coupled with an incremental schema caching mechanism, enabling the first decoupled design of schema identification and information extraction. It supports real-time, accurate adaptation of edge-deployed LLMs to hundreds of distinct schemas. The method integrates LoRA-based lightweight fine-tuning, two-stage efficient inference, and an edge-optimized inference engine. Experiments across multiple IE benchmarks demonstrate consistent improvements: +3.2–5.8 F1 points, 2.1× faster inference speed, and 37% reduction in memory footprint compared to baseline approaches.
📝 Abstract
Information extraction (IE) plays a crucial role in natural language processing (NLP) by converting unstructured text into structured knowledge. Deploying computationally intensive large language models (LLMs) on resource-constrained devices for information extraction is challenging, particularly due to issues like hallucinations, limited context length, and high latency-especially when handling diverse extraction schemas. To address these challenges, we propose a two-stage information extraction approach adapted for on-device LLMs, called Dual-LoRA with Incremental Schema Caching (DLISC), which enhances both schema identification and schema-aware extraction in terms of effectiveness and efficiency. In particular, DLISC adopts an Identification LoRA module for retrieving the most relevant schemas to a given query, and an Extraction LoRA module for performing information extraction based on the previously selected schemas. To accelerate extraction inference, Incremental Schema Caching is incorporated to reduce redundant computation, substantially improving efficiency. Extensive experiments across multiple information extraction datasets demonstrate notable improvements in both effectiveness and efficiency.