LIFT: Last-Mile Fine-Tuning for Table Explicitation

📅 2026-05-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of extracting structured tables from unstructured clipboard text, a task hindered by scarce training data and highly variable input formats. The authors propose a “last-mile” fine-tuning paradigm: a pretrained large language model first generates an initial table draft, and a smaller fine-tuned language model (1B–24B parameters) then repairs errors in that draft. On a benchmark of 2,596 tables, evaluated with Tree-Edit-Distance-based Similarity (TEDS), this cascaded pipeline matches or exceeds end-to-end fine-tuning while needing as few as 1,000 training examples, where it leads by up to 0.144 TEDS points. The method is also more robust to input format variation, which makes it attractive when training data is limited.
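For reference, TEDS is commonly defined in the table-recognition literature as shown below. This is the standard definition, given here as an assumption; this page does not state which variant the authors use. $T_a$ and $T_b$ are the tree representations (for example, HTML trees) of the predicted and ground-truth tables, $\mathrm{EditDist}$ is the tree edit distance between them, and $|T|$ is the number of nodes in $T$.

```latex
\[
\mathrm{TEDS}(T_a, T_b) \;=\; 1 - \frac{\mathrm{EditDist}(T_a, T_b)}{\max\left(|T_a|,\, |T_b|\right)}
\]
```

Under this definition scores lie in $[0, 1]$ with higher being better, so a gap of 0.144 TEDS points corresponds to 14.4% of the full scale.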

📝 Abstract

We propose last-mile fine-tuning, or Lift, a pipeline in which a pre-trained large language model extracts an initial table from unstructured clipboard text, and a fine-tuned small language model (SLM, 1B-24B parameters) repairs errors in the extracted table. On a benchmark of 2,596 tables from three datasets, Lift matches or exceeds end-to-end SLM fine-tuning on the tree-edit-distance-based similarity (TEDS) metric while requiring as few as 1,000 training examples, where it outperforms end-to-end fine-tuning by up to 0.144 TEDS points. We show that last-mile fine-tuning is also more robust to input format variability. Comparisons with self-debug and end-to-end fine-tuning approaches show that last-mile fine-tuning provides an attractive option when training data is limited or when robustness to input variation is sought without compromising on accuracy.
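The two-stage flow described in the abstract can be sketched as follows. This is a minimal illustration of the control flow only, not the authors' implementation: `toy_extract` and `toy_repair` are hypothetical stand-ins for the pre-trained LLM and the fine-tuned SLM, and passing the original text to the repair stage is an assumption that the abstract does not confirm.

```python
from typing import Callable, List

Table = List[List[str]]  # a table as rows of cell strings


def toy_extract(text: str) -> Table:
    """Hypothetical stand-in for the pre-trained LLM: naive comma splitting."""
    return [
        [cell.strip() for cell in line.split(",")]
        for line in text.splitlines()
        if line.strip()
    ]


def toy_repair(text: str, draft: Table) -> Table:
    """Hypothetical stand-in for the fine-tuned SLM: pad ragged rows."""
    width = max((len(row) for row in draft), default=0)
    return [row + [""] * (width - len(row)) for row in draft]


def lift_pipeline(
    text: str,
    extract: Callable[[str], Table],
    repair: Callable[[str, Table], Table],
) -> Table:
    """Stage 1 drafts a table; stage 2 repairs errors in that draft."""
    draft = extract(text)
    # Assumption: the repair model also sees the original clipboard text.
    return repair(text, draft)


if __name__ == "__main__":
    clipboard = "name,age\nAda,36\nLinus"
    print(lift_pipeline(clipboard, toy_extract, toy_repair))
```

Keeping the two stages behind plain callables means either one can be swapped without touching the other, which mirrors the idea that only the small repair model is fine-tuned.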

Problem

Research questions and friction points this paper is trying to address.

table explicitation

last-mile fine-tuning

structured data extraction

input format variability

limited training data

Innovation

Methods, ideas, or system contributions that make the work stand out.

last-mile fine-tuning

table extraction

small language model