Lightweight Facial Landmark Detection in Thermal Images via Multi-Level Cross-Modal Knowledge Transfer

2025-10-13
AI Summary
Thermal facial landmark detection suffers from sparse visual cues, and existing cross-modal transfer approaches tend to introduce artifacts or incur excessive computational overhead. To address these issues, we propose a multi-level cross-modal knowledge distillation framework centered on a novel bidirectional, injection-based distillation mechanism. This mechanism enforces semantic consistency between the RGB and thermal modalities via closed-loop supervision, while integrating feature decoupling compression and student-representation feedback to enable efficient knowledge transfer without updating the pre-trained teacher model. Evaluated on public thermal landmark datasets, our method achieves state-of-the-art accuracy with a significantly reduced parameter count and fewer FLOPs. It thus balances high precision against a lightweight model, making it practical to deploy in resource-constrained thermal imaging applications.

Abstract
Facial Landmark Detection (FLD) in thermal imagery is critical for applications in challenging lighting conditions, but it is hampered by the lack of rich visual cues. Conventional cross-modal solutions, like feature fusion or image translation from RGB data, are often computationally expensive or introduce structural artifacts, limiting their practical deployment. To address this, we propose Multi-Level Cross-Modal Knowledge Distillation (MLCM-KD), a novel framework that decouples high-fidelity RGB-to-thermal knowledge transfer from model compression to create both accurate and efficient thermal FLD models. A central challenge during knowledge transfer is the profound modality gap between RGB and thermal data, where traditional unidirectional distillation fails to enforce semantic consistency across disparate feature spaces. To overcome this, we introduce Dual-Injected Knowledge Distillation (DIKD), a bidirectional mechanism designed specifically for this task. DIKD establishes a connection between modalities: it not only guides the thermal student with rich RGB features but also validates the student's learned representations by feeding them back into the frozen teacher's prediction head. This closed-loop supervision forces the student to learn modality-invariant features that are semantically aligned with the teacher, ensuring a robust and profound knowledge transfer. Experiments show that our approach sets a new state-of-the-art on public thermal FLD benchmarks, notably outperforming previous methods while drastically reducing computational overhead.
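The closed-loop idea described in the abstract can be illustrated with a minimal plain-Python sketch. It is not the authors' implementation: the abstract does not give the loss terms, their weights, or the injection points. The sketch assumes that features are flat vectors, that the frozen teacher head is a single linear layer, that RGB guidance is a feature-matching term, and that the fed-back prediction is compared against ground-truth landmarks. The names `bidirectional_kd_loss`, `frozen_head`, `alpha`, and `beta` are hypothetical.

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def frozen_head(features, weights, bias):
    """Apply a fixed linear prediction head; its parameters are never updated."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def bidirectional_kd_loss(teacher_feat, student_feat, student_pred, target,
                          head_w, head_b, alpha=1.0, beta=1.0):
    """Combine three terms (an assumed formulation, not the paper's exact loss).

    task:     the thermal student's own landmark prediction error.
    guidance: teacher -> student, pulling thermal features toward RGB features.
    feedback: student -> teacher, passing student features through the frozen
              teacher head and scoring the resulting landmarks.
    """
    task = mse(student_pred, target)
    guidance = mse(student_feat, teacher_feat)
    feedback = mse(frozen_head(student_feat, head_w, head_b), target)
    total = task + alpha * guidance + beta * feedback
    return {"task": task, "guidance": guidance,
            "feedback": feedback, "total": total}


# Toy example: 2-D features and an identity teacher head (illustrative values).
losses = bidirectional_kd_loss(
    teacher_feat=[1.0, 2.0], student_feat=[0.0, 2.0],
    student_pred=[1.0, 4.0], target=[1.0, 2.0],
    head_w=[[1.0, 0.0], [0.0, 1.0]], head_b=[0.0, 0.0])
print(losses)
```

Because the teacher head is frozen, the feedback term can only be reduced by changing the student's features, which is what pushes them toward a representation the teacher can already decode.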
Problem

Research questions and friction points this paper is trying to address.

Addressing thermal facial landmark detection with limited visual cues
Overcoming modality gap between RGB and thermal image data
Developing computationally efficient cross-modal knowledge transfer framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-level knowledge distillation for thermal landmark detection
Bidirectional distillation mechanism bridges RGB-thermal modality gap
Closed-loop supervision ensures semantic alignment and model efficiency
Qiyi Tong
Human-Robot Interfaces and Interaction Laboratory, Istituto Italiano di Tecnologia, Genoa, Italy
Olivia Nocentini
Human-Robot Interfaces and Interaction Laboratory, Istituto Italiano di Tecnologia, Genoa, Italy
Marta Lagomarsino
Human-Robot Interfaces and Interaction Laboratory, Istituto Italiano di Tecnologia, Genoa, Italy
Kuanqi Cai
Human-Robot Interfaces and Interaction Laboratory, Istituto Italiano di Tecnologia, Genoa, Italy
Marta Lorenzini
Istituto Italiano di Tecnologia
Human-robot interfaces and Interaction
Arash Ajoudani
Tenured Senior Scientist, Istituto Italiano di Tecnologia
Collaborative Robotics, Physical Human-Robot Interaction, Human-Robot Collaboration, Assistive Robotics, Telerobotics