Multi-LLM Adaptive Conformal Inference for Reliable LLM Responses

📅 2026-02-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the critical challenge of ensuring factual reliability in large language models (LLMs) within high-stakes applications, where existing conformal inference methods often prove either overly conservative or ill-equipped to handle complex grouping structures, leading to excessive rejection of valid statements. To overcome these limitations, the authors propose a novel multi-LLM adaptive conformal inference framework that models factuality as a product of statement-level scores and leverages ensemble scoring across multiple LLMs to enhance accuracy. The approach further incorporates grouped conditional calibration and an adaptive filtering mechanism, which jointly maximize the retention of true statements while strictly adhering to user-specified coverage guarantees and reducing computational overhead. Experimental results demonstrate that the method achieves superior performance over current baselines without compromising statistical validity.

📝 Abstract
Ensuring factuality is essential for the safe use of Large Language Models (LLMs) in high-stakes domains such as medicine and law. Conformal inference provides distribution-free guarantees, but existing approaches are either overly conservative, discarding many true claims, or rely on adaptive error rates and simple linear models that fail to capture complex group structures. To address these challenges, we reformulate conformal inference in a multiplicative filtering setting, modeling factuality as a product of claim-level scores. Our method, Multi-LLM Adaptive Conformal Inference (MACI), leverages ensembles to produce more accurate factuality scores, which in our experiments led to higher retention, while validity is preserved through group-conditional calibration. Experiments show that MACI consistently achieves user-specified coverage with substantially higher retention and lower time cost than baselines. Our repository is available at https://github.com/MLAI-Yonsei/MACI
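The abstract describes calibrating a threshold on claim-level factuality scores so that filtered responses meet a user-specified coverage level. Below is a minimal sketch of generic split-conformal claim filtering in that spirit; it is not the authors' MACI algorithm (it omits the ensemble scoring, grouped calibration, and multiplicative model), and the nonconformity choice and function names are illustrative assumptions.

```python
import numpy as np

def calibrate_threshold(cal_scores, cal_labels, alpha=0.1):
    """Split-conformal threshold for claim filtering (illustrative sketch).

    cal_scores: list of arrays, claim-level factuality scores per response
    cal_labels: list of boolean arrays, True where the claim is factual
    Retaining test claims with score > tau then gives a marginal guarantee:
    with probability >= 1 - alpha, no false claim survives the filter.
    """
    # Nonconformity per response: the highest score any FALSE claim received
    # (filtering strictly above this value removes every false claim).
    noncon = []
    for s, y in zip(cal_scores, cal_labels):
        false_scores = s[~y]
        noncon.append(false_scores.max() if false_scores.size else -np.inf)
    noncon = np.array(noncon)
    n = len(noncon)
    # Conformal quantile with the usual finite-sample correction.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(noncon, level, method="higher")

def filter_claims(test_scores, tau):
    """Keep only claims whose score exceeds the calibrated threshold."""
    return [s > tau for s in test_scores]
```

In MACI, per the abstract, the scores fed into such a calibration step would come from an ensemble of LLMs, and the threshold would be computed within groups rather than marginally.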
Problem

Research questions and friction points this paper is trying to address.

factuality
conformal inference
Large Language Models
adaptive error rates
group structures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Conformal Inference
Large Language Models
Factuality Calibration
Ensemble Methods
Group-Conditional Coverage
Kangjun Noh
Department of Applied Statistics and Data Science, Yonsei University
Seongchan Lee
Department of Mathematical Sciences, KAIST
Ilmun Kim
Yonsei University
Statistics
Kyungwoo Song
Yonsei University
Machine Learning, Deep Learning, Neural Networks