🤖 AI Summary
Federated learning (FL) has long suffered from poor model interpretability. This paper proposes FedNAMs, presented as the first framework to systematically integrate Neural Additive Models (NAMs) into FL, enabling privacy-preserving, client-local interpretable modeling. Methodologically, it introduces feature-decoupled training, a two-tier (local and global) interpretability mechanism, and attribution-aligned cross-device aggregation, supporting both client-specific explanations and the identification of globally salient features. Evaluated on heterogeneous datasets including Wine, Heart Disease, and Iris, FedNAMs achieves accuracy within 1% of federated deep neural networks while delivering fine-grained feature attributions (e.g., volatile acidity, type of chest pain). This substantially enhances model trustworthiness in high-stakes domains such as finance and healthcare, where transparency and accountability are critical.
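The paper does not include reference code, but the core NAM idea behind the feature-decoupled training described above can be sketched as follows: each input feature gets its own tiny subnetwork (a "shape function"), and the model's prediction is simply the sum of per-feature contributions, which is what makes the attributions directly readable. All layer sizes and function names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def init_feature_net(rng, hidden=8):
    """One tiny MLP per input feature (sizes are illustrative, not from the paper)."""
    return {
        "w1": rng.normal(0, 0.1, (1, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0, 0.1, (hidden, 1)),
        "b2": np.zeros(1),
    }

def feature_contribution(net, x_col):
    """Shape-function output f_i(x_i) for a single feature column."""
    h = np.maximum(0.0, x_col[:, None] @ net["w1"] + net["b1"])  # ReLU hidden layer
    return (h @ net["w2"] + net["b2"]).ravel()

def nam_forward(nets, X, bias=0.0):
    """Additive prediction: sum of per-feature contributions plus a global bias.
    The per-feature contributions double as the model's feature attributions."""
    contribs = np.stack(
        [feature_contribution(net, X[:, i]) for i, net in enumerate(nets)],
        axis=1,
    )
    return contribs.sum(axis=1) + bias, contribs

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))                       # 4 samples, 3 features
nets = [init_feature_net(rng) for _ in range(X.shape[1])]
logits, contribs = nam_forward(nets, X)           # contribs has shape (4, 3)
```

Because the output is a plain sum, `contribs[:, i]` is exactly feature `i`'s contribution to each prediction; no post-hoc explainer is needed.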
📝 Abstract
Federated learning continues to evolve but faces challenges in interpretability and explainability. To address these challenges, we introduce Federated Neural Additive Models (FedNAMs), a novel approach that employs Neural Additive Models (NAMs) within a federated learning framework. FedNAMs combine the advantages of NAMs, in which an individual network concentrates on each input feature, with the decentralized training of federated learning, yielding interpretable results. This integration enhances privacy by training on local data across multiple devices, minimizing the risks associated with data centralization and improving model robustness and generalizability. Because FedNAMs retain detailed, feature-specific learning, they are especially valuable in sectors such as finance and healthcare. They enable the training of client-specific models that integrate local updates, preserve privacy, and mitigate concerns related to centralization. Our studies on several classification tasks, using the OpenFetch ML Wine, UCI Heart Disease, and Iris datasets, show that FedNAMs deliver strong interpretability with minimal accuracy loss compared to traditional federated deep neural networks (DNNs). Notable findings include the identification of critical predictive features at both the client and global levels: volatile acidity, sulfates, and chlorides for wine quality; chest pain type, maximum heart rate, and number of vessels for heart disease; and petal length and width for iris classification. This approach strengthens privacy and model efficiency while improving interpretability and robustness across diverse datasets. Finally, FedNAMs offer insight into why certain features are highly interpretable and others are not.
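The federated side of the approach, training client-specific models and aggregating them while deriving global feature importance, can be illustrated with a minimal FedAvg-style sketch. The weighted-averaging scheme and the mean-absolute-contribution importance proxy below are common conventions assumed for illustration; the paper's exact attribution-aligned aggregation may differ.

```python
import numpy as np

def fedavg(client_params, sizes):
    """Weighted average of homologous parameter dicts (FedAvg-style).
    'sizes' are each client's local sample counts."""
    total = sum(sizes)
    return {
        k: sum((n / total) * p[k] for n, p in zip(sizes, client_params))
        for k in client_params[0]
    }

def global_importance(client_contribs):
    """Mean absolute per-feature contribution, averaged over clients:
    one simple proxy for identifying globally salient features."""
    per_client = [np.mean(np.abs(c), axis=0) for c in client_contribs]
    return np.mean(per_client, axis=0)

# Toy example: two clients share one parameter tensor and report
# local (samples x features) NAM contribution matrices.
params_a = {"w": np.zeros(2)}
params_b = {"w": np.ones(2)}
avg = fedavg([params_a, params_b], sizes=[10, 10])   # elementwise mean here

rng = np.random.default_rng(1)
contribs = [rng.normal(size=(5, 3)), rng.normal(size=(6, 3))]
imp = global_importance(contribs)                    # one score per feature
```

Client-level explanations come from each client's own contribution matrix, while `global_importance` gives the cross-client view; in this toy setting the larger entries of `imp` would correspond to features like volatile acidity or chest pain type in the paper's experiments.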