🤖 AI Summary
This work addresses conformal risk control (CRC) under non-monotone loss functions, where existing methods may fail to guarantee that the expected loss stays below a user-specified threshold. In a distribution-free setting where the tuning parameter is selected from a finite grid, the authors show that validity hinges on the relationship between the calibration sample size and the grid resolution. They derive a finite-sample upper bound on the risk for bounded losses and prove that the excess risk converges at the minimax-optimal rate $\sqrt{\log(m)/n}$, with refined guarantees under additional structural conditions such as Lipschitz continuity and monotonicity. The analysis further extends to distribution shift via importance weighting. Empirical evaluations demonstrate that the proposed method achieves more stable and reliable risk control in multi-label classification and object detection tasks.
📝 Abstract
Conformal risk control (CRC) provides distribution-free guarantees for controlling the expected loss at a user-specified level. Existing theory typically assumes that the loss decreases monotonically with a tuning parameter that governs the size of the prediction set. This assumption is often violated in practice, where losses may behave non-monotonically due to competing objectives such as coverage and efficiency.
We study CRC under non-monotone loss functions when the tuning parameter is selected from a finite grid, a common scenario in thresholding or discretized decision rules. Revisiting a known counterexample, we show that the validity of CRC without monotonicity depends on the relationship between the calibration sample size and the grid resolution. In particular, risk control can still be achieved when the calibration sample is sufficiently large relative to the grid.
We provide a finite-sample guarantee for bounded losses over a grid of size $m$, showing that the excess risk above the target level $\alpha$ is of order $\sqrt{\log(m)/n}$, where $n$ is the calibration sample size. A matching lower bound shows that this rate is minimax optimal. We also derive refined guarantees under additional structural conditions, including Lipschitz continuity and monotonicity, and extend the analysis to settings with distribution shift via importance weighting.
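To make the grid-based selection concrete, here is a minimal sketch (not the authors' exact procedure) of how a $\sqrt{\log(m)/n}$-style deviation term can certify a grid value: the empirical risk at each of the $m$ grid points is inflated by a Hoeffding-plus-union-bound correction before comparison with the target level $\alpha$. The function name `select_lambda` and the confidence parameter `delta` are illustrative choices, not from the paper.

```python
import numpy as np

def select_lambda(losses, alpha, delta=0.05):
    """Pick a tuning-parameter index from a finite grid so that the
    empirical risk, inflated by a Hoeffding-style deviation term with a
    union bound over the grid, stays below the target level alpha.

    losses : (n, m) array; losses[i, j] is the bounded loss in [0, 1]
             of calibration point i at grid value j. The loss need not
             be monotone in j.
    Returns the first qualifying grid index, or None if no grid value
    can be certified at level alpha.
    """
    n, m = losses.shape
    # Deviation of order sqrt(log(m)/n): Hoeffding's inequality for a
    # [0, 1]-bounded mean, union-bounded over the m grid points.
    dev = np.sqrt(np.log(m / delta) / (2 * n))
    emp_risk = losses.mean(axis=0)  # empirical risk at each grid value
    ok = np.flatnonzero(emp_risk + dev <= alpha)
    return int(ok[0]) if ok.size else None

# Illustrative use: constant per-point losses so the empirical risks
# are exactly [0.05, 0.2, 0.5, 0.3, 0.1] -- deliberately non-monotone.
losses = np.tile([0.05, 0.2, 0.5, 0.3, 0.1], (10000, 1))
idx = select_lambda(losses, alpha=0.1)
```

With $n = 10{,}000$ and $m = 5$ the deviation term is about $0.015$, so only the first grid value (empirical risk $0.05$) clears the threshold $\alpha = 0.1$; shrinking $n$ or enlarging $m$ inflates the correction, which is exactly the sample-size/grid-resolution trade-off the abstract describes.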
Numerical experiments on synthetic multi-label classification and real object detection data illustrate the practical impact of non-monotonicity. Methods that account for finite-sample deviations achieve more stable risk control than approaches based on monotonicity transformations, while maintaining competitive prediction-set sizes.