N2C2: Nearest Neighbor Enhanced Confidence Calibration for Cross-Lingual In-Context Learning

📅 2025-03-12
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In cross-lingual in-context learning (ICL), model prediction confidence is severely miscalibrated, resulting in low accuracy and high calibration error (e.g., Expected Calibration Error, ECE). To address this, we propose the first nearest-neighbor–enhanced confidence calibration framework for cross-lingual sentiment classification. Our method constructs a semantically aligned few-shot datastore, integrates semantic-consistent retrieval, confidence-aware distribution modeling, and adaptive neighbor-weighted aggregation—enabling high-fidelity calibration under few-shot settings. Crucially, it requires no fine-tuning or gradient updates; calibration is performed solely via inference-time dynamic retrieval and fusion of semantically proximal examples. Evaluated on two multilingual sentiment datasets, our approach substantially outperforms standard ICL, full fine-tuning, prompt tuning, and state-of-the-art methods—achieving up to 2.1% higher accuracy and reducing ECE by up to 47.3%. This work establishes the first efficient, lightweight, and strongly calibrated inference paradigm for cross-lingual ICL.
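The summary measures miscalibration with Expected Calibration Error (ECE). As a reference point, the standard binned ECE (weighted average gap between per-bin accuracy and per-bin confidence) can be computed as follows; this is the textbook definition, not code from the paper:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sum over bins of (bin weight) * |accuracy - confidence|."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = correct[mask].mean()    # empirical accuracy in this bin
            conf = confidences[mask].mean()  # mean predicted confidence
            ece += mask.mean() * abs(acc - conf)
    return ece

# Example: a model that is 95% confident but only 50% accurate
# contributes a large calibration gap.
ece = expected_calibration_error([0.95, 0.95, 0.95, 0.95], [1, 0, 1, 0])
```

A well-calibrated model drives this value toward zero; the summary's reported "reducing ECE by up to 47.3%" refers to a relative drop in this quantity.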

📝 Abstract
Recent advances in in-context learning (ICL) show that language models can significantly improve their performance when demonstrations are provided. However, little attention has been paid to model calibration and prediction confidence of ICL in cross-lingual scenarios. To bridge this gap, we conduct a thorough analysis of ICL for cross-lingual sentiment classification. Our findings suggest that ICL performs poorly in cross-lingual scenarios, exhibiting low accuracy and high calibration errors. In response, we propose a novel approach, N2C2, which employs a k-nearest-neighbors augmented classifier for prediction confidence calibration. N2C2 narrows the prediction gap by leveraging a datastore of cached few-shot instances. Specifically, N2C2 integrates the predictions from the datastore and incorporates confidence-aware distribution, semantically consistent retrieval representation, and adaptive neighbor combination modules to effectively utilize the limited number of supporting instances. Evaluation on two multilingual sentiment classification datasets demonstrates that N2C2 outperforms traditional ICL. It surpasses fine-tuning, prompt tuning, and recent state-of-the-art methods in terms of accuracy and calibration errors.
Problem

Research questions and friction points this paper is trying to address.

Improves cross-lingual sentiment classification accuracy
Reduces calibration errors in in-context learning
Enhances prediction confidence with nearest neighbor approach
Innovation

Methods, ideas, or system contributions that make the work stand out.

Nearest neighbor augmented classifier for calibration
Datastore of cached few-shot instances integration
Confidence-aware distribution and adaptive neighbor combination
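The components above (few-shot datastore, semantically consistent retrieval, adaptive neighbor weighting, and fusion with the ICL prediction) follow the general shape of kNN-augmented classification. The sketch below is an illustrative reconstruction under that assumption, not the authors' implementation; all names, the softmax-over-distance weighting, and the interpolation weight `lam` are hypothetical:

```python
import numpy as np

def knn_calibrate(query_emb, datastore_embs, datastore_labels,
                  icl_probs, n_classes, k=4, temperature=1.0, lam=0.5):
    """Fuse a kNN label distribution from a few-shot datastore with the
    model's own ICL probabilities (illustrative sketch)."""
    # Cosine distance between the query and every cached instance.
    q = query_emb / np.linalg.norm(query_emb)
    d = datastore_embs / np.linalg.norm(datastore_embs, axis=1, keepdims=True)
    dists = 1.0 - d @ q
    # Keep the k nearest neighbors; weight them by softmax over
    # negative distance (closer neighbors count more).
    idx = np.argsort(dists)[:k]
    w = np.exp(-dists[idx] / temperature)
    w /= w.sum()
    # Aggregate neighbor labels into a distribution over classes.
    knn_probs = np.zeros(n_classes)
    for weight, label in zip(w, datastore_labels[idx]):
        knn_probs[label] += weight
    # Interpolate the retrieval distribution with the ICL prediction.
    return lam * knn_probs + (1.0 - lam) * np.asarray(icl_probs)

# Tiny usage example with a 2-instance-per-class toy datastore.
embs = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
labels = np.array([0, 0, 1])
probs = knn_calibrate(np.array([1.0, 0.0]), embs, labels,
                      icl_probs=[0.5, 0.5], n_classes=2, k=2)
```

Because the two retrieved neighbors both carry label 0, the fused distribution shifts the flat ICL prediction toward class 0. Notably, nothing in this pipeline requires gradient updates, which matches the summary's claim that calibration happens entirely at inference time.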