🤖 AI Summary
To address the lack of high-quality multimodal benchmark datasets for recovering critical raw materials from electrolyzers, this work introduces Electrolyzers-HSI, the first co-registered RGB + hyperspectral (400–2500 nm) benchmark dataset tailored for intelligent scrap sorting. It comprises 55 samples and over 4.2 million pixel vectors, of which 424,169 are labeled. Zero-shot detection is combined with majority voting over pixel-level predictions to achieve non-invasive, robust object-level material classification. Systematic evaluation of state-of-the-art architectures, including Vision Transformer, SpectralFormer, and the Multimodal Fusion Transformer, reveals key performance bottlenecks in cross-modal fusion and spectral generalization. All code, the dataset, and FAIR-compliant metadata are publicly released. This resource fills a gap in multimodal benchmarks for electrolyzer material recovery, supporting real-time waste analysis and the deployment of green recycling protocols.
📝 Abstract
The global challenge of sustainable recycling demands automated, fast, and accurate state-of-the-art (SOTA) material detection systems that act as a bedrock for a circular economy. Democratizing access to these cutting-edge solutions, which enable real-time waste analysis, is essential for scaling up recycling efforts and fostering the Green Deal. In response, we introduce Electrolyzers-HSI, a novel multimodal benchmark dataset designed to accelerate the recovery of critical raw materials through accurate classification of electrolyzer materials. The dataset comprises 55 co-registered high-resolution RGB images and hyperspectral imaging (HSI) data cubes spanning the 400–2500 nm spectral range, yielding over 4.2 million pixel vectors, of which 424,169 are labeled. This enables non-invasive spectral analysis of shredded electrolyzer samples, supporting quantitative and qualitative material classification and the investigation of spectral properties. We evaluate a suite of baseline machine learning (ML) methods alongside SOTA transformer-based deep learning (DL) architectures, including Vision Transformer, SpectralFormer, and the Multimodal Fusion Transformer, to investigate architectural bottlenecks and guide further efficiency optimisation when deploying transformers for material identification. We implement zero-shot detection techniques and majority voting across pixel-level predictions to establish object-level classification robustness. In adherence to the FAIR data principles, the Electrolyzers-HSI dataset and accompanying codebase are openly available at https://github.com/hifexplo/Electrolyzers-HSI and https://rodare.hzdr.de/record/3668, supporting reproducible research and facilitating the broader adoption of smart and sustainable e-waste recycling solutions.
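The majority-voting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a per-pixel class map and an object-ID mask (for example, produced by the detection stage) are already available as integer arrays of the same shape. The function name `majority_vote`, the array names, and the convention that ID 0 marks background are hypothetical choices made for this illustration. They are not taken from the released codebase.

```python
import numpy as np


def majority_vote(pixel_pred, object_ids, background=0):
    """Assign each object the most frequent class among its pixels.

    pixel_pred : 2-D int array of per-pixel class predictions.
    object_ids : 2-D int array of the same shape; each object has a
                 unique ID and `background` marks pixels to ignore
                 (an assumption of this sketch).
    Returns a dict {object_id: majority_class}.
    """
    if pixel_pred.shape != object_ids.shape:
        raise ValueError("prediction map and object mask must match in shape")

    votes = {}
    for obj in np.unique(object_ids):
        if obj == background:
            continue
        # Collect the predicted classes of all pixels belonging to this object.
        labels = pixel_pred[object_ids == obj]
        classes, counts = np.unique(labels, return_counts=True)
        # argmax returns the first maximum, so ties go to the smallest class ID.
        votes[int(obj)] = int(classes[np.argmax(counts)])
    return votes


# Toy 3x4 scene with two objects; a few pixels of object 1 are misclassified.
pixel_pred = np.array([[1, 1, 2, 0],
                       [1, 2, 2, 0],
                       [3, 3, 2, 2]])
object_ids = np.array([[1, 1, 2, 0],
                       [1, 1, 2, 0],
                       [1, 1, 2, 2]])

object_labels = majority_vote(pixel_pred, object_ids)
print(object_labels)
```

In this toy scene, object 1 receives three votes for class 1, one for class 2, and two for class 3, so the isolated pixel errors do not change its object-level label. This is the robustness that pixel-wise voting is meant to provide.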