"A 6 or a 9?": Ensemble Learning Through the Multiplicity of Performant Models and Explanations

📅 2025-09-10
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
In real-world domains (e.g., manufacturing, healthcare), multiple models achieve similar in-distribution performance yet diverge markedly in how they generalize, and conventional model selection fails to identify truly robust solutions. Method: We propose the Rashomon Ensemble, a framework that jointly models predictive performance and explanation diversity. It partitions the solution space by clustering models on their feature importances or local explanations, then constructs an ensemble that optimizes both accuracy and explanation consistency. Leveraging Rashomon-ratio analysis, it selects and combines models to improve robustness under distribution shift. Contribution/Results: Experiments demonstrate substantial out-of-distribution gains, with AUROC improvements of over 0.20 on datasets with a large Rashomon ratio. The method is validated on multiple open-source and production-scale industrial datasets, confirming its effectiveness, robustness, and practical deployability.
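The selection pipeline described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the model pool, the use of impurity-based feature importances as the "explanation" vector, the number of clusters, and the soft-voting combination are all assumptions made for the example.

```python
# Sketch of the Rashomon Ensemble idea: train a pool of similarly performing
# models, describe each by its feature-importance vector, cluster those
# vectors so each cluster covers a distinct region of the solution space,
# then keep the best model per cluster and combine them by soft voting.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_features=12, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# 1. Pool of performant models (different seeds -> different solutions).
pool = [RandomForestClassifier(n_estimators=50, random_state=s).fit(X_tr, y_tr)
        for s in range(10)]
scores = [roc_auc_score(y_val, m.predict_proba(X_val)[:, 1]) for m in pool]

# 2. Explanation vectors: here, impurity-based feature importances.
expl = np.array([m.feature_importances_ for m in pool])

# 3. Partition the solution space by explanation similarity.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(expl)

# 4. Keep the best-scoring model from each cluster.
selected = [pool[max(np.where(labels == c)[0], key=lambda i: scores[i])]
            for c in np.unique(labels)]

# 5. Soft-voting ensemble over the diverse, performant members.
proba = np.mean([m.predict_proba(X_val)[:, 1] for m in selected], axis=0)
ensemble_auc = roc_auc_score(y_val, proba)
print(round(ensemble_auc, 3))
```

Picking one model per explanation cluster, rather than the top-k by score alone, is what forces the ensemble members to cover distinct regions of the solution space.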

πŸ“ Abstract
Creating models from past observations and ensuring their effectiveness on new data is the essence of machine learning. However, selecting models that generalize well remains a challenging task. Related to this topic, the Rashomon Effect refers to cases where multiple models perform similarly well on a given learning problem. This often occurs in real-world scenarios, such as manufacturing processes or medical diagnosis, where diverse patterns in the data lead to multiple high-performing solutions. We propose the Rashomon Ensemble, a method that strategically selects models from these diverse high-performing solutions to improve generalization. By grouping models based on both their performance and their explanations, we construct ensembles that maximize diversity while maintaining predictive accuracy. This selection ensures that each model covers a distinct region of the solution space, making the ensemble more robust to distribution shifts and variations in unseen data. We validate our approach on both open and proprietary collaborative real-world datasets, demonstrating AUROC improvements of up to 0.20+ in scenarios where the Rashomon ratio is large. Additionally, we demonstrate tangible benefits for businesses in various real-world applications, highlighting the robustness, practicality, and effectiveness of our approach.
Problem

Research questions and friction points this paper is trying to address.

Selecting generalizable models from multiple high-performing solutions
Improving ensemble robustness to distribution shifts in data
Maximizing model diversity while maintaining predictive accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ensemble method selects diverse performant models
Groups models by performance and explanations
Maximizes diversity while maintaining predictive accuracy
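Central to when the method pays off is the Rashomon ratio. The sketch below uses an illustrative definition, not necessarily the paper's exact formulation: the fraction of candidate models whose validation score falls within a tolerance ε of the best score. A large ratio signals many near-equivalent solutions, the regime where the paper reports the largest gains.

```python
def rashomon_ratio(scores, epsilon=0.05):
    """Fraction of models scoring within epsilon of the best one
    (an illustrative proxy for the Rashomon ratio)."""
    best = max(scores)
    return sum(s >= best - epsilon for s in scores) / len(scores)

# Example: 4 of 5 candidate AUROCs lie within 0.05 of the best (0.91).
print(rashomon_ratio([0.91, 0.90, 0.89, 0.90, 0.80]))  # -> 0.8
```

When this ratio is high, score alone cannot distinguish the candidates, which is exactly why the method falls back on explanation diversity to choose among them.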
Gianlucca Zuin
Universidade Federal de Minas Gerais, Brazil and Instituto Kunumi, Brazil
Adriano Veloso
Associate Professor of Computer Science, Universidade Federal de Minas Gerais
Machine Learning, Natural Language Processing