🤖 AI Summary
Multimodal large language models (MLLMs) remain unexplored for fingerprint analysis—a critical domain in biometrics and forensic science—due to the absence of dedicated benchmarks and systematic evaluation protocols. Method: We introduce FPBench, the first comprehensive benchmark for fingerprint analysis, comprising seven real and synthetic datasets and eight fine-grained tasks—including quality assessment, matching reasoning, and forensic interpretation—used to evaluate 20 open- and closed-source MLLMs under zero-shot and chain-of-thought (CoT) inference settings. Contribution/Results: Our study reveals, for the first time, fundamental limitations of current MLLMs in texture perception, causal reasoning, and explanation generation. We publicly release FPBench—including its datasets, evaluation protocols, and baseline results—to establish a standardized foundation for developing fingerprint-aware foundation models.
📝 Abstract
Multimodal LLMs (MLLMs) have gained significant traction in complex data analysis, visual question answering, generation, and reasoning. Recently, they have been used to analyze the biometric utility of iris and face images. However, their capabilities in fingerprint understanding remain unexplored. In this work, we design a comprehensive benchmark, FPBench, that evaluates the performance of 20 MLLMs (open-source and proprietary) across 7 real and synthetic datasets on 8 biometric and forensic tasks using zero-shot and chain-of-thought prompting strategies. We discuss our findings in terms of performance and explainability, and share our insights into the challenges and limitations. We establish FPBench as the first comprehensive benchmark for fingerprint domain understanding with MLLMs, paving the way for foundation models for fingerprints.