AI Summary
Accurate localization of electrode tab endpoints in X-ray images of lithium-ion batteries is challenging due to high endpoint density, low contrast, multi-scale variations, and imaging artifacts.
Method: We propose a novel point-level segmentation paradigm. First, we introduce PBD5K, the first large-scale, publicly available X-ray dataset for power battery analysis. Second, we design MDCNeXt, a model that fuses multi-dimensional structural cues and integrates prompt filtering with density-aware re-ranking. Third, we incorporate state-space modeling, distance-adaptive mask generation, and an intelligent annotation pipeline.
Results: Our approach achieves high-precision tab endpoint localization across diverse battery types. It demonstrates strong robustness under high-density and low-contrast conditions, significantly outperforming conventional object detection and semantic/instance segmentation methods. The framework provides a reliable, automated solution for internal defect inspection in battery quality control, advancing industrial non-destructive evaluation capabilities.
Abstract
Power batteries are essential components in electric vehicles, where internal structural defects can pose serious safety risks. We conduct a comprehensive study on a new task, power battery detection (PBD), which aims to localize the dense endpoints of cathode and anode plates in industrial X-ray images for quality inspection. Manual inspection is inefficient and error-prone, while traditional vision algorithms struggle with densely packed plates, low contrast, scale variation, and imaging artifacts. To address these issues and draw more attention to this task, we present PBD5K, the first large-scale benchmark for PBD, consisting of 5,000 X-ray images from nine battery types with fine-grained annotations and eight types of real-world visual interference. To support scalable and consistent labeling, we develop an intelligent annotation pipeline that combines image filtering, model-assisted pre-labeling, cross-verification, and layered quality evaluation. We formulate PBD as a point-level segmentation problem and propose MDCNeXt, a model designed to extract and integrate multi-dimensional structural cues, including point, line, and count information, from the plates themselves. To improve discrimination between plates and suppress visual interference, MDCNeXt incorporates two state space modules. The first is a prompt-filtered module that learns contrastive relationships guided by task-specific prompts. The second is a density-aware reordering module that refines segmentation in regions with high plate density. In addition, we propose a distance-adaptive mask generation strategy to provide robust supervision under varying spatial distributions of anode and cathode positions. The source code and datasets will be publicly available at https://github.com/Xiaoqi-Zhao-DLUT/X-ray-PBD (PBD5K).
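The abstract does not spell out how distance-adaptive masks are built, so the following is only an illustrative sketch of one plausible reading: each annotated endpoint is rasterized as a small disk whose radius shrinks when its nearest neighbor is close. The function name `distance_adaptive_masks` and the parameters `scale`, `r_min`, and `r_max` are hypothetical and are not taken from the paper or its code release.

```python
import numpy as np


def distance_adaptive_masks(points, shape, scale=0.4, r_min=1.0, r_max=6.0):
    """Rasterize endpoints as disks sized by nearest-neighbor distance.

    Hypothetical sketch, not the authors' implementation.
    points: iterable of (row, col) endpoint coordinates.
    shape:  (height, width) of the output mask.
    Returns the binary mask and the radius chosen for each point.
    """
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    mask = np.zeros(shape, dtype=np.uint8)
    if len(pts) == 0:
        return mask, np.zeros(0)

    # Pairwise Euclidean distances; the diagonal is ignored via infinity.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)
    nearest = dist.min(axis=1)

    # Radius follows local spacing: dense regions get small disks,
    # sparse regions get larger ones, bounded by [r_min, r_max].
    radii = np.clip(scale * nearest, r_min, r_max)

    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    for (r, c), rad in zip(pts, radii):
        mask[(rows - r) ** 2 + (cols - c) ** 2 <= rad ** 2] = 1
    return mask, radii


if __name__ == "__main__":
    # Two tightly spaced endpoints and one isolated endpoint.
    demo_points = [(10, 10), (10, 14), (30, 40)]
    demo_mask, demo_radii = distance_adaptive_masks(demo_points, (64, 64))
    print(demo_radii, int(demo_mask.sum()))
```

In this sketch, keeping `scale` below 0.5 means the disks of any two points cannot overlap unless the `r_min` floor applies. That matters for point-level segmentation, where each endpoint should remain a separate region even when plates are densely packed.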