Top-$k$ Feature Importance Ranking

📅 2025-09-18
🤖 AI Summary
Accurately identifying and ranking the top-$k$ most important features remains a fundamental challenge in interpretable machine learning. Existing methods typically convert importance scores to ranks as a post-processing step, without directly optimizing for top-$k$ ranking or offering theoretical guarantees. This paper introduces RAMPART, a framework designed specifically for top-$k$ feature ranking that can wrap any existing feature importance measure. It uses adaptive sequential halving with recursive trimming to concentrate computation on promising features, and it improves stability and computational efficiency through MiniPatches ensembling, which subsamples both observations and features. Theoretical analysis shows that RAMPART recovers the correct top-$k$ ranking with high probability under mild conditions. Simulation studies and a high-dimensional genomics case study show that RAMPART consistently outperforms popular feature importance methods at top-$k$ ranking.

📝 Abstract
Accurate ranking of important features is a fundamental challenge in interpretable machine learning with critical applications in scientific discovery and decision-making. Unlike feature selection and feature importance, the specific problem of ranking important features has received considerably less attention. We introduce RAMPART (Ranked Attributions with MiniPatches And Recursive Trimming), a framework that utilizes any existing feature importance measure in a novel algorithm specifically tailored for ranking the top-$k$ features. Our approach combines an adaptive sequential halving strategy that progressively focuses computational resources on promising features with an efficient ensembling technique using both observation and feature subsampling. Unlike existing methods that convert importance scores to ranks as post-processing, our framework explicitly optimizes for ranking accuracy. We provide theoretical guarantees showing that RAMPART achieves the correct top-$k$ ranking with high probability under mild conditions, and demonstrate through extensive simulation studies that RAMPART consistently outperforms popular feature importance methods, concluding with a high-dimensional genomics case study.
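The combination of sequential halving and minipatch ensembling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the default parameters, and the base importance measure (absolute least-squares coefficients fitted on each minipatch) are all assumptions chosen to keep the example short. The abstract states only that any existing importance measure can be plugged in.

```python
import numpy as np

def minipatch_importance(X, y, candidates, n_patches, n_obs, n_feat, rng):
    """Average a base importance score over random minipatches.

    Each minipatch is a random subset of rows and of candidate columns.
    """
    totals = np.zeros(X.shape[1])
    counts = np.zeros(X.shape[1])
    n_obs = min(n_obs, X.shape[0])
    n_feat = min(n_feat, len(candidates))
    for _ in range(n_patches):
        rows = rng.choice(X.shape[0], size=n_obs, replace=False)
        cols = rng.choice(candidates, size=n_feat, replace=False)
        # Assumed base importance: |least-squares coefficient| on the patch.
        coef, *_ = np.linalg.lstsq(X[np.ix_(rows, cols)], y[rows], rcond=None)
        totals[cols] += np.abs(coef)
        counts[cols] += 1
    # Features that were never sampled get -inf so they sort last.
    scores = np.full(X.shape[1], -np.inf)
    seen = counts > 0
    scores[seen] = totals[seen] / counts[seen]
    return scores

def halving_top_k(X, y, k, n_patches=200, n_obs=50, n_feat=5, seed=0):
    """Hypothetical sequential-halving loop: rescore the surviving features,
    drop the lower half, and stop once only k features remain."""
    rng = np.random.default_rng(seed)
    candidates = np.arange(X.shape[1])
    while True:
        scores = minipatch_importance(
            X, y, candidates, n_patches, n_obs, n_feat, rng
        )
        ranked = candidates[np.argsort(-scores[candidates])]
        if len(ranked) <= k:
            return ranked.tolist()  # final ranking, most important first
        # Keep the better half, but never fewer than k features.
        candidates = ranked[: max(k, len(ranked) // 2)]
```

Because every round reuses the same per-round patch budget on a shrinking candidate set, the surviving features are sampled more often in later rounds, which is the sense in which computation is focused on promising features.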
Problem

Research questions and friction points this paper is trying to address.

Accurate ranking of the top-$k$ most important features in machine learning
Optimizing directly for ranking accuracy rather than converting importance scores to ranks as post-processing
Addressing the understudied problem of feature ranking in interpretable ML
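
For contrast, the post-processing approach mentioned above amounts to sorting a vector of importance scores. The sketch below shows that baseline together with one simple way to score a top-$k$ ranking against a known ground truth. Both function names and the position-wise accuracy measure are assumptions for illustration; the evaluation metric used in the paper is not specified in this summary.

```python
import numpy as np

def top_k_from_scores(scores, k):
    """Post-hoc ranking: sort features by descending score, keep the first k."""
    return np.argsort(-np.asarray(scores), kind="stable")[:k].tolist()

def top_k_rank_accuracy(predicted, truth):
    """Fraction of top-k positions where the predicted feature matches the truth."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

scores = [0.9, 0.1, 0.5, 0.7]            # hypothetical importance scores
predicted = top_k_from_scores(scores, k=2)
accuracy = top_k_rank_accuracy(predicted, truth=[0, 2])
```

A position-wise measure like this penalizes both picking the wrong feature and placing a correct feature at the wrong rank, which is why ranking is a stricter target than selection.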
Innovation

Methods, ideas, or system contributions that make the work stand out.

RAMPART framework (Ranked Attributions with MiniPatches And Recursive Trimming)
Adaptive sequential halving that progressively focuses computation on promising features
Ensembling over both observation and feature subsamples
Yuxi Chen
Carnegie Mellon University
Causal Inference, Data-Driven Decision-Making, Explainable AI
Tiffany Tang
University of Notre Dame
Genevera Allen
Columbia University