LIFT: Last-Mile Fine-Tuning for Table Explicitation

📅 2026-05-13
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses the problem of extracting structured tables from unstructured clipboard text, which is made difficult by scarce training data and highly variable input formats. The authors propose a “last-mile” fine-tuning pipeline: a pretrained large language model first generates an initial table draft, and a smaller fine-tuned language model (1B–24B parameters) then repairs errors in that draft. Evaluated with Tree-Edit-Distance-based Similarity (TEDS) on a benchmark of 2,596 tables from three datasets, the pipeline matches or exceeds end-to-end fine-tuning and, with as few as 1,000 training examples, outperforms it by up to 0.144 TEDS points. The method is also more robust to input format variation, which makes it attractive when training data is limited.
πŸ“ Abstract
We propose last-mile fine-tuning, or Lift, a pipeline in which a pre-trained large language model extracts an initial table from unstructured clipboard text, and a fine-tuned small language model (SLM, 1B-24B parameters) repairs errors in the extracted table. On a benchmark of 2,596 tables from three datasets, Lift matches or exceeds end-to-end SLM fine-tuning on the tree-edit-distance-based similarity (TEDS) metric while requiring as few as 1,000 training examples, a setting in which it outperforms end-to-end fine-tuning by up to 0.144 TEDS points. We also show that Lift is more robust to input format variability. Comparisons with self-debug and end-to-end fine-tuning approaches show that last-mile fine-tuning is an attractive option when training data is limited or when robustness to input variation is sought without compromising accuracy.
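The two-stage structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: `lift_pipeline`, `toy_draft_model`, and `toy_repair_model` are hypothetical names, the two models are replaced by trivial stand-in functions, and passing the original text to the repair stage is an assumption the abstract does not confirm.

```python
from typing import Callable, List

Table = List[List[str]]


def lift_pipeline(
    clipboard_text: str,
    draft_model: Callable[[str], Table],
    repair_model: Callable[[str, Table], Table],
) -> Table:
    """Draft a table with one model, then repair the draft with another.

    Both callables are hypothetical stand-ins: in the paper's setting the
    first would be a pre-trained LLM and the second a fine-tuned SLM.
    """
    draft = draft_model(clipboard_text)
    # Assumption: the repair stage sees both the raw text and the draft.
    return repair_model(clipboard_text, draft)


def toy_draft_model(text: str) -> Table:
    """Stand-in for the LLM: split non-empty lines on whitespace."""
    return [line.split() for line in text.splitlines() if line.strip()]


def toy_repair_model(text: str, draft: Table) -> Table:
    """Stand-in for the SLM: pad ragged rows so every row has equal width."""
    width = max((len(row) for row in draft), default=0)
    return [row + [""] * (width - len(row)) for row in draft]


if __name__ == "__main__":
    table = lift_pipeline("name age\nAda 36\nLinus", toy_draft_model, toy_repair_model)
    print(table)
```

Keeping the two stages behind plain callables means that only the second stage would need training data in this arrangement, which is consistent with the low-data setting the abstract emphasizes.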
Problem

Research questions and friction points this paper is trying to address.

table explicitation
last-mile fine-tuning
structured data extraction
input format variability
limited training data
Innovation

Methods, ideas, or system contributions that make the work stand out.

last-mile fine-tuning
table extraction
small language model
robustness
TEDS