Efficient Inference Using Large Language Models with Limited Human Data: Fine-Tuning then Rectification

📅 2025-11-23

📈 Citations: 0

✨ Influential: 0

career value

197K/year

🤖 AI Summary

To address the insufficient social science reasoning performance of large language models (LLMs) under limited human-annotated data, this paper proposes a two-stage “fine-tune-then-calibrate” optimization framework. Methodologically, it innovatively formulates fine-tuning as minimization of prediction error variance and dynamically allocates training versus calibration samples via empirically grounded scaling laws, enabling data-driven optimal resource allocation. The framework jointly leverages parameter fine-tuning and output bias calibration to suppress reasoning biases while preserving model generalizability. Experimental results demonstrate that, compared to standalone fine-tuning or calibration, our approach achieves average improvements of 12.7% in estimation accuracy and 23.4% in inference stability across downstream tasks—including causal inference and policy effect estimation—under low-data regimes. This work establishes an interpretable, reproducible paradigm for small-sample AI modeling in social science research.

Technology Category

Application Category

📝 Abstract

Driven by recent advances in artificial intelligence (AI), a growing body of work demonstrates the potential of using large language models (LLMs) to generate human-like responses in market research and social science applications. Two primary approaches can be applied to improve the performance of LLMs: fine-tuning, which aligns LLM predictions more closely with human responses, and rectification, which corrects biases in LLM outputs. In this paper, we develop a framework that combines fine-tuning and rectification, and optimally allocates limited labeled samples across the two stages. Unlike the conventional objective that minimizes the mean squared prediction errors, we propose to minimize the variance of the prediction errors as the fine-tuning objective, which is optimal for the downstream rectification stage. Building on this insight, we leverage empirical scaling laws to develop a data-driven method for optimally splitting samples between the fine-tuning and rectification stages. Empirical analysis validates our framework, demonstrating improved estimation and inference performance compared to using either fine-tuning or rectification alone.

Problem

Research questions and friction points this paper is trying to address.

Optimizing sample allocation between fine-tuning and rectification stages

Minimizing prediction error variance for improved downstream rectification

Enhancing LLM inference with limited human data via combined framework

Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines fine-tuning and rectification stages

Minimizes variance of prediction errors

Optimally splits samples between stages

🔎 Similar Papers

No similar papers found.