🤖 AI Summary
This work addresses the problem of learning probabilistic predictive models from supervisory signals given as possibility distributions in multiclass classification tasks. The authors propose transforming each input possibility distribution into a set of probability distributions that satisfy compatibility and linear shape constraints, thereby introducing, for the first time, structured probabilistic set modeling for possibility-based supervision. Model outputs are projected onto these sets via Kullback–Leibler projection, computed with Dykstra’s algorithm using Bregman projections, which yields the smallest adjustment that preserves the ordering among class possibilities. Empirical evaluations on both synthetic data and the ChaosNLI natural language inference benchmark indicate that the projection is efficient enough for practical use and that the projection-based training objective can improve predictive performance.
📝 Abstract
We consider learning with possibilistic supervision for multi-class classification. For each training instance, the supervision is a normalized possibility distribution that expresses graded plausibility over the classes. From this possibility distribution, we construct a non-empty closed convex set of admissible probability distributions by combining two requirements: probabilistic compatibility with the possibility and necessity measures induced by the possibility distribution, and linear shape constraints that must be satisfied to preserve the qualitative structure of the possibility distribution. Thus, classes with the same possibility degree receive equal probabilities, and if a class has a strictly larger possibility degree than another class, then it receives a strictly larger probability. Given a strictly positive probability vector output by a model for an instance, we compute its Kullback-Leibler projection onto the admissible set. This projection yields the closest admissible probability distribution in the Kullback-Leibler sense. We can then train the model by minimizing the divergence between the prediction and its projection, which quantifies the smallest adjustment needed to satisfy the induced dominance and shape constraints. The projection is computed with Dykstra's algorithm using Bregman projections associated with the negative entropy, and we provide explicit formulas for the projections onto each constraint set. Experiments conducted on synthetic data and on a real-world natural language inference task, based on the ChaosNLI dataset, show that the proposed projection algorithm is efficient enough for practical use, and that the resulting projection-based learning objective can improve predictive performance.
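To make the projection step concrete, here is a minimal Python sketch of one building block: the Kullback-Leibler projection onto a single dominance constraint. It is not the authors' implementation. It assumes the standard characterization that a probability measure is compatible with a possibility distribution exactly when P(A) <= a on every lower cut A = {i : poss[i] <= a} with a < 1, and it uses the known closed form for projecting onto one such constraint (rescale the mass inside and outside A). The shape constraints and the Dykstra correction terms needed to project onto the full intersection are deliberately left out, and all function names and numeric values are hypothetical.

```python
import math

def kl(q, p):
    """KL(q || p); p must be strictly positive, terms with q_i = 0 count as 0."""
    return sum(qi * math.log(qi / xi) for qi, xi in zip(q, p) if qi > 0)

def lower_cut_constraints(poss):
    """One (index set, cap) pair per distinct possibility degree a < 1.

    Each pair encodes P(A) <= a for A = {i : poss[i] <= a}, whose possibility
    measure is max_{i in A} poss[i] = a.
    """
    degrees = sorted({d for d in poss if d < 1.0})
    return [({i for i, d in enumerate(poss) if d <= a}, a) for a in degrees]

def project_one(p, idx, cap):
    """KL projection of a strictly positive probability vector p onto
    {q : sum_{i in idx} q_i <= cap}, by rescaling inside and outside idx."""
    mass = sum(p[i] for i in idx)
    if mass <= cap:
        return list(p)  # constraint already satisfied, nothing to adjust
    inside = cap / mass
    outside = (1.0 - cap) / (1.0 - mass)
    return [x * (inside if i in idx else outside) for i, x in enumerate(p)]

poss = [1.0, 0.6, 0.2]   # hypothetical normalized possibility distribution
pred = [0.2, 0.5, 0.3]   # hypothetical strictly positive model output
idx, cap = lower_cut_constraints(poss)[-1]   # classes {1, 2}, capped at 0.6
proj = project_one(pred, idx, cap)
loss = kl(proj, pred)    # divergence between the prediction and its projection
```

Here the prediction puts mass 0.8 on classes 1 and 2, which exceeds their joint possibility 0.6, so the projection shrinks that mass to 0.6 and moves the remainder to class 0. Cycling such projections over all constraints without correction terms would in general reach a feasible point but not necessarily the KL-closest one, which is the reason the abstract relies on Dykstra's algorithm.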