"A 6 or a 9?": Ensemble Learning Through the Multiplicity of Performant Models and Explanations

📅 2025-09-10
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
In real-world domains (e.g., manufacturing, healthcare), multiple models achieve similar in-distribution performance yet diverge markedly in how they generalize, and conventional model selection fails to identify truly robust solutions. Method: We propose the Rashomon Ensemble, a framework that jointly models predictive performance and explanation diversity. It partitions the solution space by clustering models on their feature importances or local explanations, then constructs an ensemble that optimizes both accuracy and explanation consistency. Leveraging Rashomon-ratio analysis, it selects and combines models to improve robustness under distribution shift. Contribution/Results: Experiments demonstrate substantial out-of-distribution gains, with AUROC improvements of over 0.20 on datasets with a large Rashomon ratio. The method is validated on multiple open-source and production-scale industrial datasets, confirming its effectiveness, robustness, and practical deployability.
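The selection pipeline described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the model pool, the use of impurity-based feature importances as the "explanation" vector, the number of clusters, and the soft-voting combination are all assumptions made for the example.

```python
# Sketch of the Rashomon Ensemble idea: train a pool of similarly performing
# models, describe each by its feature-importance vector, cluster those
# vectors so each cluster covers a distinct region of the solution space,
# then keep the best model per cluster and combine them by soft voting.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_features=12, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# 1. Pool of performant models (different seeds -> different solutions).
pool = [RandomForestClassifier(n_estimators=50, random_state=s).fit(X_tr, y_tr)
        for s in range(10)]
scores = [roc_auc_score(y_val, m.predict_proba(X_val)[:, 1]) for m in pool]

# 2. Explanation vectors: here, impurity-based feature importances.
expl = np.array([m.feature_importances_ for m in pool])

# 3. Partition the solution space by explanation similarity.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(expl)

# 4. Keep the best-scoring model from each cluster.
selected = [pool[max(np.where(labels == c)[0], key=lambda i: scores[i])]
            for c in np.unique(labels)]

# 5. Soft-voting ensemble over the diverse, performant members.
proba = np.mean([m.predict_proba(X_val)[:, 1] for m in selected], axis=0)
ensemble_auc = roc_auc_score(y_val, proba)
print(round(ensemble_auc, 3))
```

Picking one model per explanation cluster, rather than the top-k by score alone, is what forces the ensemble members to cover distinct regions of the solution space.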

πŸ“ Abstract
Creating models from past observations and ensuring their effectiveness on new data is the essence of machine learning. However, selecting models that generalize well remains a challenging task. Related to this topic, the Rashomon Effect refers to cases where multiple models perform similarly well on a given learning problem. This often occurs in real-world scenarios, such as manufacturing processes or medical diagnosis, where diverse patterns in the data lead to multiple high-performing solutions. We propose the Rashomon Ensemble, a method that strategically selects models from these diverse high-performing solutions to improve generalization. By grouping models based on both their performance and their explanations, we construct ensembles that maximize diversity while maintaining predictive accuracy. This selection ensures that each model covers a distinct region of the solution space, making the ensemble more robust to distribution shifts and variations in unseen data. We validate our approach on both open and proprietary collaborative real-world datasets, demonstrating AUROC improvements of up to 0.20+ in scenarios where the Rashomon ratio is large. Additionally, we demonstrate tangible benefits for businesses in various real-world applications, highlighting the robustness, practicality, and effectiveness of our approach.
Problem

Research questions and friction points this paper is trying to address.

Selecting generalizable models from multiple high-performing solutions
Improving ensemble robustness to distribution shifts in data
Maximizing model diversity while maintaining predictive accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ensemble method selects diverse performant models
Groups models by performance and explanations
Maximizes diversity while maintaining predictive accuracy
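Central to when the method pays off is the Rashomon ratio. The sketch below uses an illustrative definition, not necessarily the paper's exact formulation: the fraction of candidate models whose validation score falls within a tolerance ε of the best score. A large ratio signals many near-equivalent solutions, the regime where the paper reports the largest gains.

```python
def rashomon_ratio(scores, epsilon=0.05):
    """Fraction of models scoring within epsilon of the best one
    (an illustrative proxy for the Rashomon ratio)."""
    best = max(scores)
    return sum(s >= best - epsilon for s in scores) / len(scores)

# Example: 4 of 5 candidate AUROCs lie within 0.05 of the best (0.91).
print(rashomon_ratio([0.91, 0.90, 0.89, 0.90, 0.80]))  # -> 0.8
```

When this ratio is high, score alone cannot distinguish the candidates, which is exactly why the method falls back on explanation diversity to choose among them.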
Gianlucca Zuin
Universidade Federal de Minas Gerais, Brazil and Instituto Kunumi, Brazil
Adriano Veloso
Associate Professor of Computer Science, Universidade Federal de Minas Gerais
Machine Learning, Natural Language Processing