🤖 AI Summary
In cross-prompt automated essay scoring (AES), existing joint-training approaches require repeated access to source-domain data, raising privacy concerns and incurring computational overhead. To address this, we propose a source-data-free parameter fusion framework: individual prompt-specific models are first trained independently; cross-prompt adaptation is then achieved via a linear combination of their task vectors. We introduce Prior-encoded Information Maximization (PIM), an unsupervised objective that encourages preservation of prompt-invariant linguistic knowledge, and employ Bayesian optimization to efficiently determine the fusion coefficients. Crucially, our method eliminates the need to re-access source data, improving both privacy guarantees and deployment flexibility. Experiments across multiple AES benchmarks show consistent gains over full-source joint-training baselines; under strong distributional shifts, our approach outperforms state-of-the-art cross-prompt methods while remaining robust and computationally efficient.
📝 Abstract
Recent advances in cross-prompt automated essay scoring (AES) typically train models jointly on all source prompts, often additionally requiring simultaneous access to unlabeled target-prompt essays. However, our pilot study shows that using all sources is suboptimal, and re-accessing source datasets during adaptation raises privacy concerns. We propose a source-free adaptation approach that selectively merges individually trained source models' parameters instead of their datasets. In particular, we simulate joint training through linear combinations of task vectors -- the parameter updates from fine-tuning. To optimize the combination's coefficients, we propose Prior-encoded Information Maximization (PIM), an unsupervised objective that promotes the model's score discriminability, regularized by priors pre-computed from the sources. We employ Bayesian optimization as an efficient optimizer of PIM. Experimental results with LLMs on in-dataset and cross-dataset adaptation show that our method (1) consistently outperforms training jointly on all sources, (2) maintains superior robustness compared to other merging methods, and (3) excels under severe distribution shifts where recent leading cross-prompt methods struggle, all while retaining computational efficiency.
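The core mechanism described above -- forming task vectors from fine-tuned source models and searching for fusion coefficients against an unsupervised objective -- can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the parameter dictionaries, the `pim_objective` placeholder, and the random-search loop (a lightweight stand-in for the Bayesian optimization the paper uses) are all hypothetical.

```python
import numpy as np

def task_vector(theta_ft, theta_base):
    """Task vector: the parameter delta from fine-tuning on one source prompt."""
    return {k: theta_ft[k] - theta_base[k] for k in theta_base}

def merge(theta_base, task_vectors, coeffs):
    """Simulate joint training: base parameters plus a linear
    combination of the source models' task vectors."""
    merged = {k: v.copy() for k, v in theta_base.items()}
    for lam, tau in zip(coeffs, task_vectors):
        for k in merged:
            merged[k] += lam * tau[k]
    return merged

# Toy setup: a base model and two "fine-tuned" source-prompt models.
rng = np.random.default_rng(0)
theta0 = {"w": rng.normal(size=4)}
theta1 = {"w": theta0["w"] + 0.5}   # source model A
theta2 = {"w": theta0["w"] - 0.2}   # source model B
taus = [task_vector(theta1, theta0), task_vector(theta2, theta0)]

def pim_objective(theta):
    # Placeholder for PIM: any unsupervised score of the merged model
    # (the real objective rewards score discriminability under source priors).
    return -abs(float(theta["w"].mean() - theta0["w"].mean()) - 0.1)

# Random search over coefficients as a stand-in for Bayesian optimization.
best_coeffs, best_val = None, -np.inf
for _ in range(200):
    lams = rng.uniform(0.0, 1.0, size=2)
    val = pim_objective(merge(theta0, taus, lams))
    if val > best_val:
        best_coeffs, best_val = lams, val

merged = merge(theta0, taus, best_coeffs)
```

Note that with coefficients `[1.0, 0.0]` the merge exactly recovers source model A, so joint training on any subset of sources is a special case of the coefficient search; the optimizer is free to down-weight unhelpful sources, matching the pilot-study observation that using all sources is suboptimal.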