🤖 AI Summary
In hierarchical classification, conventional heuristic decoding rules (e.g., maximum probability, nearest ancestor) are misaligned with hierarchical evaluation metrics such as hFβ, leading to suboptimal predictions.
Method: This paper proposes a target-metric-oriented posterior-optimal decision framework that unifies node-level and subset-level prediction. It derives, for the first time, the theoretically optimal decoding rule for hFβ and related hierarchical metrics, and designs a general algorithm combining dynamic programming and set optimization. The algorithm is guaranteed to be optimal at the node level and significantly improves reliability in underdetermined settings, where the model's posterior is spread over several candidates. The method jointly incorporates hierarchical structural constraints and metric-driven loss minimization.
Results: Experiments on multiple benchmark datasets demonstrate consistent superiority over heuristic decoders, particularly for low-confidence samples, with substantial gains in hFβ and other hierarchical metrics.
📝 Abstract
Hierarchical classification offers an approach to incorporate the concept of mistake severity by leveraging a structured, labeled hierarchy. However, decoding in such settings frequently relies on heuristic decision rules, which may not align with task-specific evaluation metrics. In this work, we propose a framework for the optimal decoding of an output probability distribution with respect to a target metric. We derive optimal decision rules for increasingly complex prediction settings, providing universal algorithms when candidates are limited to the set of nodes. In the most general case of predicting a subset of nodes, we focus on rules dedicated to the hierarchical $hF_{\beta}$ scores, tailored to hierarchical settings. To demonstrate the practical utility of our approach, we conduct extensive empirical evaluations, showcasing the superiority of our proposed optimal strategies, particularly in underdetermined scenarios. These results highlight the potential of our methods to enhance the performance and reliability of hierarchical classifiers in real-world applications. The code is available at https://github.com/RomanPlaud/hierarchical_decision_rules
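
To make the idea of metric-aligned decoding concrete, the following is a minimal sketch of the node-level case, not the authors' implementation. It assumes a toy hierarchy, a posterior over leaves, and the standard ancestor-augmented definition of $hF_{\beta}$ (root excluded); the names `PARENT`, `ancestors`, `hf_beta`, and `decode_node` are hypothetical. It selects the node that maximizes the expected $hF_{\beta}$ under the posterior by brute force, where the paper describes more efficient algorithms.

```python
# Toy hierarchy (assumption for illustration): root -> {A, B},
# A -> {a1, a2, a3, a4}, B -> {b1}. The root is left out of all sets.
PARENT = {"A": None, "B": None,
          "a1": "A", "a2": "A", "a3": "A", "a4": "A",
          "b1": "B"}


def ancestors(node):
    """Return the node together with all of its ancestors (root excluded)."""
    out = set()
    while node is not None:
        out.add(node)
        node = PARENT[node]
    return out


def hf_beta(pred, true, beta=1.0):
    """Hierarchical F-beta between one predicted node and one true leaf,
    computed on their ancestor-augmented sets."""
    p, t = ancestors(pred), ancestors(true)
    overlap = len(p & t)
    if overlap == 0:
        return 0.0
    h_prec = overlap / len(p)
    h_rec = overlap / len(t)
    b2 = beta ** 2
    return (1 + b2) * h_prec * h_rec / (b2 * h_prec + h_rec)


def decode_node(leaf_probs, beta=1.0):
    """Brute-force node-level decoding: pick the node with the highest
    expected hF-beta under the posterior over leaves."""
    def expected_score(candidate):
        return sum(prob * hf_beta(candidate, leaf, beta)
                   for leaf, prob in leaf_probs.items())
    return max(PARENT, key=expected_score)


# An underdetermined posterior: mass is spread over the children of A.
uncertain = {"a1": 0.25, "a2": 0.20, "a3": 0.20, "a4": 0.20, "b1": 0.15}
# A confident posterior: almost all mass on a single leaf.
confident = {"a1": 0.90, "a2": 0.04, "a3": 0.03, "a4": 0.02, "b1": 0.01}

argmax_leaf = max(uncertain, key=uncertain.get)   # heuristic: most probable leaf
optimal_uncertain = decode_node(uncertain)        # expected-hF1 maximizer
optimal_confident = decode_node(confident)
```

On the uncertain posterior, the most probable leaf is `a1` (expected $hF_1$ of 0.55), while the expected-score maximizer is the internal node `A` (about 0.567). On the confident posterior both rules agree on `a1`. This is the kind of gap between heuristic and metric-aligned decoding that the abstract refers to in underdetermined scenarios.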