🤖 AI Summary
To address insufficient and weakly discriminative feature selection in video-based person re-identification (re-ID), this paper proposes the Hierarchical and Adaptive Mixture of Biometric Experts (HAMoBE) framework. HAMoBE emulates human multimodal perception by introducing, for the first time, a multi-expert collaborative architecture coupled with a dual-input decision gating mechanism, enabling dynamic selection and hierarchical modeling of appearance, body-shape, and gait features from query-gallery video pairs. Leveraging multi-granularity features extracted by a frozen pre-trained model (e.g., CLIP), HAMoBE employs specialized expert networks to separately model long-term, short-term, and temporal patterns, with adaptive fusion guided by the gating mechanism. Evaluated on benchmarks including MEVID, HAMoBE achieves up to a 13.0% improvement in Rank-1 accuracy, demonstrating markedly improved robustness and generalization in complex, real-world scenarios.
📝 Abstract
Recently, research interest in person re-identification (ReID) has increasingly focused on video-based scenarios, which are essential for robust surveillance and security in varied and dynamic environments. However, existing video-based ReID methods often overlook the necessity of identifying and selecting the most discriminative features from both videos in a query-gallery pair for effective matching. To address this issue, we propose a novel Hierarchical and Adaptive Mixture of Biometric Experts (HAMoBE) framework, which leverages multi-layer features from a pre-trained large model (e.g., CLIP) and is designed to mimic human perceptual mechanisms by independently modeling key biometric features (appearance, static body shape, and dynamic gait) and adaptively integrating them. Specifically, HAMoBE includes two levels: the first level extracts low-level features from multi-layer representations provided by the frozen large model, while the second level consists of specialized experts focusing on long-term, short-term, and temporal features. To ensure robust matching, we introduce a new dual-input decision gating network that dynamically adjusts the contributions of each expert based on their relevance to the input scenarios. Extensive evaluations on benchmarks like MEVID demonstrate that our approach yields significant performance improvements (e.g., +13.0% Rank-1 accuracy).
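The adaptive fusion the abstract describes can be sketched in miniature: three "experts" each produce a descriptor for the query and gallery videos, and a dual-input gate scores each expert from both videos before fusing per-expert similarities. Everything below (the linear gate, the feature dimension, random descriptors, cosine-similarity fusion) is an illustrative assumption, not the paper's actual architecture.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def cosine(a, b):
    """Cosine similarity between two descriptors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

rng = np.random.default_rng(0)
D = 8                                      # toy descriptor dimension
EXPERTS = ["appearance", "shape", "gait"]  # the three biometric experts

# Hypothetical per-expert descriptors for one query-gallery video pair
# (in the paper these would come from the specialized expert networks).
query   = {e: rng.standard_normal(D) for e in EXPERTS}
gallery = {e: rng.standard_normal(D) for e in EXPERTS}

# Dual-input gate: a toy linear layer that scores each expert from the
# pooled features of BOTH videos, so expert weights depend on the pair.
W = rng.standard_normal((len(EXPERTS), 2 * D)) * 0.1
pooled = np.concatenate([
    np.mean([query[e] for e in EXPERTS], axis=0),
    np.mean([gallery[e] for e in EXPERTS], axis=0),
])
weights = softmax(W @ pooled)  # one adaptive weight per expert

# Adaptive fusion: matching score = gate-weighted per-expert similarity.
score = sum(w * cosine(query[e], gallery[e])
            for w, e in zip(weights, EXPERTS))
```

The key design point mirrored here is that the gate sees both inputs of the pair, so an expert (e.g., gait) can be down-weighted for pairs where its cue is unreliable rather than receiving a fixed global weight.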