DeFloMat: Detection with Flow Matching for Stable and Efficient Generative Object Localization

📅 2025-12-26

🤖 AI Summary
Diffusion-based detectors such as DiffusionDet suffer high latency from multi-step stochastic denoising, which makes them impractical for time-sensitive clinical applications such as Crohn’s disease detection in magnetic resonance enterography (MRE). This paper proposes DeFloMat, a generative object detection framework that integrates conditional flow matching (CFM) and approximates rectified flow. Its core idea is a deterministic flow field grounded in conditional optimal transport theory, which lets inference run through a simple ordinary differential equation (ODE) solver in very few steps. On the MRE dataset, DeFloMat reaches 43.32% AP₁₀:₅₀ in only three solver steps, about 1.4× the best converged AP of DiffusionDet (31.03%, so 43.32 / 31.03 ≈ 1.4), while also improving recall and localization stability in the few-step regime.

📝 Abstract
We propose DeFloMat (Detection with Flow Matching), a novel generative object detection framework that addresses the critical latency bottleneck of diffusion-based detectors, such as DiffusionDet, by integrating Conditional Flow Matching (CFM). Diffusion models achieve high accuracy by formulating detection as a multi-step stochastic denoising process, but their reliance on numerous sampling steps ($T \gg 60$) makes them impractical for time-sensitive clinical applications like Crohn's Disease detection in Magnetic Resonance Enterography (MRE). DeFloMat replaces this slow stochastic path with a highly direct, deterministic flow field derived from Conditional Optimal Transport (OT) theory, specifically approximating the Rectified Flow. This shift enables fast inference via a simple Ordinary Differential Equation (ODE) solver. We demonstrate the superiority of DeFloMat on a challenging MRE clinical dataset. Crucially, DeFloMat achieves state-of-the-art accuracy ($43.32\%\text{ }AP_{10:50}$) in only $3$ inference steps, which represents a $1.4\times$ performance improvement over DiffusionDet's maximum converged performance ($31.03\%\text{ }AP_{10:50}$ at $4$ steps). Furthermore, our deterministic flow significantly enhances localization characteristics, yielding superior Recall and stability in the few-step regime. DeFloMat resolves the trade-off between generative accuracy and clinical efficiency, setting a new standard for stable and rapid object localization.
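
For readers unfamiliar with flow matching, the standard rectified-flow form of the CFM objective may help make the "direct, deterministic flow field" concrete. This is the textbook formulation, not necessarily the exact loss used in the paper. The symbols are assumptions introduced here for illustration: $x_0$ is a noise sample of box coordinates, $x_1$ the ground-truth boxes, $c$ the image features, and $v_\theta$ the learned velocity network.

```latex
% Straight-line (conditional OT) path between a noise sample x_0 and a data sample x_1
x_t = (1 - t)\,x_0 + t\,x_1, \qquad t \in [0, 1]

% Training: regress the constant velocity along that path
\mathcal{L}_{\mathrm{CFM}}(\theta)
  = \mathbb{E}_{t,\,x_0,\,x_1}\,
    \bigl\| v_\theta(x_t, t, c) - (x_1 - x_0) \bigr\|^2

% Inference: integrate the learned field from t = 0 to t = 1
\frac{\mathrm{d}x_t}{\mathrm{d}t} = v_\theta(x_t, t, c)
```

Because the target velocity $x_1 - x_0$ does not depend on $t$, the ideal trajectories are straight lines, which is why a coarse ODE discretization can plausibly suffice.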
Problem

Research questions and friction points this paper is trying to address.

Reduces latency in generative object detection for clinical use
Replaces slow diffusion steps with fast deterministic flow matching
Improves accuracy and stability in few-step inference regimes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates Conditional Flow Matching for object detection
Replaces diffusion with deterministic flow from optimal transport
Enables fast inference using a simple ODE solver
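
The few-step ODE inference summarized above can be sketched with a fixed-step Euler solver. This is a minimal illustration and not the paper's implementation: `euler_sample`, `make_toy_velocity`, and the box values are hypothetical names and data introduced here, and the toy velocity field is an oracle that stands in for a trained network.

```python
import random


def euler_sample(velocity, x0, num_steps=3):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t takes the values 0, dt, ..., 1 - dt and never reaches 1
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_velocity(target_box):
    """Hypothetical stand-in for a trained velocity network.

    On a straight path x_t = (1 - t) * x0 + t * x1 the constant velocity
    x1 - x0 equals (x1 - x_t) / (1 - t), so this oracle reproduces the
    ideal rectified-flow field for a single known target.
    """
    def velocity(x, t):
        return [(b - xi) / (1.0 - t) for b, xi in zip(target_box, x)]
    return velocity


rng = random.Random(0)
noise_box = [rng.gauss(0.0, 1.0) for _ in range(4)]  # random starting box
target_box = [0.30, 0.40, 0.20, 0.10]  # made-up (cx, cy, w, h) values

pred_box = euler_sample(make_toy_velocity(target_box), noise_box, num_steps=3)
```

With a perfectly straight flow, Euler integration lands on the target regardless of the step count, which illustrates why straighter learned trajectories tolerate very few solver steps. A real network only approximates this field, so accuracy would still depend on the number of steps.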