Composable Cross-prompt Essay Scoring by Merging Models

📅 2025-05-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In cross-prompt automated essay scoring (AES), existing joint-training approaches require repeated access to source-domain data, raising privacy concerns and incurring computational inefficiency. To address this, we propose a source-data-free parameter-fusion framework: prompt-specific models are first trained independently, and cross-prompt adaptation is then achieved via a linear combination of their task vectors. We introduce Prior-encoded Information Maximization (PIM), an unsupervised objective that encourages preservation of prompt-invariant linguistic knowledge, and employ Bayesian optimization to efficiently determine the fusion coefficients. Crucially, our method eliminates the need to re-access source data, enhancing both privacy guarantees and deployment flexibility. Experiments across multiple AES benchmarks demonstrate consistent gains over joint training on all sources; under severe distribution shifts, our approach outperforms state-of-the-art cross-prompt methods while retaining robustness and computational efficiency.

📝 Abstract
Recent advances in cross-prompt automated essay scoring (AES) typically train models jointly on all source prompts, often requiring additional access to unlabeled target prompt essays simultaneously. However, using all sources is suboptimal in our pilot study, and re-accessing source datasets during adaptation raises privacy concerns. We propose a source-free adaptation approach that selectively merges individually trained source models' parameters instead of datasets. In particular, we simulate joint training through linear combinations of task vectors -- the parameter updates from fine-tuning. To optimize the combination's coefficients, we propose Prior-encoded Information Maximization (PIM), an unsupervised objective which promotes the model's score discriminability regularized by priors pre-computed from the sources. We employ Bayesian optimization as an efficient optimizer of PIM. Experimental results with LLMs on in-dataset and cross-dataset adaptation show that our method (1) consistently outperforms training jointly on all sources, (2) maintains superior robustness compared to other merging methods, (3) excels under severe distribution shifts where recent leading cross-prompt methods struggle, all while retaining computational efficiency.
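The task-vector merging the abstract describes can be sketched in a few lines. This is a minimal illustration, not the authors' code: `task_vector` and `merge` are hypothetical names, and small NumPy arrays stand in for LLM parameters. Each task vector is the parameter delta from fine-tuning on one source prompt, and the merged model is the base model plus a coefficient-weighted sum of those deltas.

```python
import numpy as np

def task_vector(finetuned, base):
    """Task vector: the parameter update from fine-tuning on one source prompt."""
    return {k: finetuned[k] - base[k] for k in base}

def merge(base, task_vectors, coeffs):
    """Merged model: base parameters plus a linear combination of task vectors."""
    merged = dict(base)
    for tv, c in zip(task_vectors, coeffs):
        for k in merged:
            merged[k] = merged[k] + c * tv[k]
    return merged

# Toy example: two "source prompt" models sharing one weight tensor.
base = {"w": np.zeros(3)}
src1 = {"w": np.array([1.0, 0.0, 0.0])}  # fine-tuned on prompt 1
src2 = {"w": np.array([0.0, 2.0, 0.0])}  # fine-tuned on prompt 2
tvs = [task_vector(src1, base), task_vector(src2, base)]
merged = merge(base, tvs, coeffs=[0.5, 0.25])
# merged["w"] is now [0.5, 0.5, 0.0]
```

Because merging operates only on parameters, adaptation never touches the source essays themselves, which is the source-free property the paper emphasizes.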
Problem

Research questions and friction points this paper is trying to address.

Improves cross-prompt essay scoring without joint training
Addresses privacy by merging models rather than datasets
Enhances robustness under severe distribution shifts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Source-free adaptation via model parameter merging
Prior-encoded Information Maximization optimization
Bayesian optimization for efficient coefficient tuning
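The coefficient search the bullets above describe can be sketched as follows. This is a stand-in, not the paper's implementation: the exact PIM objective with its source-derived priors is not reproduced here, so a generic information-maximization score (confident per-essay predictions, diverse use of score levels) serves as a placeholder, and plain random search substitutes for Bayesian optimization. All names (`info_max`, `score_coeffs`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def info_max(probs, eps=1e-12):
    """Generic information-maximization score: low conditional entropy
    (confident predictions per essay) plus high marginal entropy
    (diverse score usage). PIM additionally regularizes with priors
    pre-computed from the sources, omitted here."""
    cond = -(probs * np.log(probs + eps)).sum(axis=1).mean()
    marg = probs.mean(axis=0)
    marg_ent = -(marg * np.log(marg + eps)).sum()
    return marg_ent - cond

def score_coeffs(coeffs, logits_per_source):
    # Proxy: combine the source models' logits linearly; the paper
    # combines parameters instead, but the unsupervised objective is
    # evaluated the same way, on unlabeled target-prompt essays.
    combined = sum(c * l for c, l in zip(coeffs, logits_per_source))
    return info_max(softmax(combined))

# Random search as a stand-in for the paper's Bayesian optimization.
logits = [rng.normal(size=(8, 5)) for _ in range(3)]  # 3 sources, 8 essays, 5 score levels
best = max((rng.uniform(0, 1, size=3) for _ in range(50)),
           key=lambda c: score_coeffs(c, logits))
```

Since only a handful of coefficients are searched and the objective needs no labels, Bayesian optimization (e.g., a Gaussian-process surrogate) can find good weights in few evaluations, which is where the method's computational efficiency comes from.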