AttriBE: Quantifying Attribute Expressivity in Body Embeddings for Recognition and Identification

📅 2026-04-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the susceptibility of existing person re-identification (ReID) systems to attributes such as gender, pose, and body mass index (BMI), which raises concerns about model fairness and generalization. The work extends a mutual-information-based expressivity metric to the ReID embedding space, using an auxiliary neural network to quantify how strongly each attribute is encoded in body embeddings and to trace how that encoding evolves across model layers and training epochs. For three transformer-based ReID models on a visible-spectrum dataset, BMI shows the highest expressivity in deeper layers, with the final representation ranked BMI > Pitch > Gender > Yaw. In a cross-spectral setting spanning short-wave, medium-wave, and long-wave infrared, pitch becomes comparable to BMI, which suggests that morphometric and structural (pose) cues play different roles across spectral domains.

📝 Abstract

Person re-identification (ReID) systems that match individuals across images or video frames are essential in many real-world applications. However, existing methods are often influenced by attributes such as gender, pose, and body mass index (BMI), which vary in unconstrained settings and raise concerns related to fairness and generalization. To address this, we extend the notion of expressivity, defined as the mutual information between learned features and specific attributes, using a secondary neural network to quantify how strongly attributes are encoded. Applying this framework to three transformer-based ReID models on a large-scale visible-spectrum dataset, we find that BMI consistently shows the highest expressivity in deeper layers. Attributes in the final representation are ranked as BMI > Pitch > Gender > Yaw, and expressivity evolves across layers and training epochs, with pose peaking in intermediate layers and BMI strengthening with depth. We further extend the analysis to cross-spectral person identification across infrared modalities including short-wave, medium-wave, and long-wave infrared. In this setting, pitch becomes comparable to BMI and attribute trends increase monotonically across depth, suggesting increased reliance on structural cues when bridging modality gaps. Overall, the results show that transformer-based ReID embeddings encode a hierarchy of implicit attributes, with morphometric information persistently embedded and pose contributing more strongly under cross-spectral conditions.
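To make the definition of expressivity more concrete, the sketch below evaluates a Donsker-Varadhan lower bound on the mutual information between features and one attribute. The abstract does not say which estimator the secondary network uses, so the choice of this bound is an assumption. The names `dv_lower_bound` and `critic` are hypothetical, and the critic here is a fixed function standing in for a trained network.

```python
import math
import random
from typing import Callable, Sequence

# Hypothetical critic type: scores one (feature vector, attribute value) pair.
# In a real estimator this would be a trained secondary network.
Critic = Callable[[Sequence[float], float], float]


def dv_lower_bound(
    critic: Critic,
    feats: Sequence[Sequence[float]],
    attrs: Sequence[float],
    seed: int = 0,
) -> float:
    """Donsker-Varadhan lower bound on I(features; attribute) for a fixed critic.

    Assumption: this bound is used only for illustration; the source does
    not specify the estimator behind its expressivity measure.
    """
    if len(feats) != len(attrs) or len(feats) == 0:
        raise ValueError("feats and attrs must be non-empty and equally long")

    n = len(feats)

    # Joint term: average critic score on matched (feature, attribute) pairs.
    joint = sum(critic(f, a) for f, a in zip(feats, attrs)) / n

    # Marginal term: shuffle the attributes to break the pairing, which
    # approximates samples from the product of the marginals.
    shuffled = list(attrs)
    random.Random(seed).shuffle(shuffled)
    scores = [critic(f, a) for f, a in zip(feats, shuffled)]

    # Numerically stable log-mean-exp of the shuffled scores.
    m = max(scores)
    log_mean_exp = m + math.log(sum(math.exp(s - m) for s in scores) / n)

    return joint - log_mean_exp


# Toy data: a one-dimensional "embedding" that copies a binary attribute.
toy_feats = [[0.0], [1.0]] * 50
toy_attrs = [0.0, 1.0] * 50


def constant_critic(f: Sequence[float], a: float) -> float:
    # Ignores its inputs, so it cannot separate matched from shuffled pairs.
    return 0.7


def matching_critic(f: Sequence[float], a: float) -> float:
    # Scores a pair highly only when the feature agrees with the attribute.
    return 2.0 if f[0] == a else -2.0


uninformative = dv_lower_bound(constant_critic, toy_feats, toy_attrs)
informative = dv_lower_bound(matching_critic, toy_feats, toy_attrs)
```

A critic that ignores its inputs yields a bound of zero, while one that can tell matched pairs from shuffled pairs yields a positive value. In a learned estimator the critic's parameters would be optimized to maximize this bound, and the maximized value would serve as the expressivity score for that attribute.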

Problem

Research questions and friction points this paper is trying to address.

person re-identification

attribute bias

fairness

generalization

cross-spectral identification

Innovation

Methods, ideas, or system contributions that make the work stand out.

attribute expressivity

person re-identification

transformer embeddings