DeltaFlow: An Efficient Multi-frame Scene Flow Estimation Method

📅 2025-08-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing scene flow methods predominantly rely on two-frame inputs, which limits their ability to exploit temporal information. Multi-frame approaches address this, but their computational cost escalates rapidly as the frame count grows; scene flow estimation also suffers from imbalanced object class distributions and inconsistent motion within objects. This paper proposes ΔFlow, a lightweight 3D multi-frame scene flow framework. It introduces a Δ scheme that captures motion cues and extracts temporal features at minimal computational cost regardless of the number of frames; a Category-Balanced Loss that improves learning on underrepresented classes; and an Instance Consistency Loss that enforces coherent object motion. Evaluated on Argoverse 2 and Waymo, ΔFlow achieves state-of-the-art performance, with up to 22% lower error and 2× faster inference than the next-best multi-frame supervised method, while also demonstrating strong cross-domain generalization.

📝 Abstract

Previous dominant methods for scene flow estimation focus mainly on input from two consecutive frames, neglecting valuable information in the temporal domain. While recent trends shift towards multi-frame reasoning, they suffer from rapidly escalating computational costs as the number of frames grows. To leverage temporal information more efficiently, we propose DeltaFlow ($Δ$Flow), a lightweight 3D framework that captures motion cues via a $Δ$ scheme, extracting temporal features with minimal computational cost, regardless of the number of frames. Additionally, scene flow estimation faces challenges such as imbalanced object class distributions and motion inconsistency. To tackle these issues, we introduce a Category-Balanced Loss to enhance learning across underrepresented classes and an Instance Consistency Loss to enforce coherent object motion, improving flow accuracy. Extensive evaluations on the Argoverse 2 and Waymo datasets show that $Δ$Flow achieves state-of-the-art performance with up to 22% lower error and $2\times$ faster inference compared to the next-best multi-frame supervised method, while also demonstrating a strong cross-domain generalization ability. The code is open-sourced at https://github.com/Kin-Zhang/DeltaFlow along with trained model weights.
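The abstract does not spell out how the $Δ$ scheme is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each frame is voxelized into a sparse occupancy grid and that the differences between the current frame and every past frame are summed with a temporal decay. The names `voxelize` and `delta_features` and the `decay` parameter are hypothetical. The point it illustrates is that the fused result has the shape of a single frame, so the input passed to a downstream network would not grow with the number of frames.

```python
from collections import defaultdict


def voxelize(points, voxel_size):
    """Map 3D points to a sparse occupancy grid: {voxel index: point count}."""
    grid = defaultdict(float)
    for x, y, z in points:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        grid[key] += 1.0
    return dict(grid)


def delta_features(frames, decay=0.5):
    """Fuse sparse grids (oldest first, current last) into one sparse grid.

    Assumption: each past frame contributes (current - past), weighted by
    decay ** age, where age is the number of frames before the current one.
    """
    current = frames[-1]
    fused = defaultdict(float)
    for age, past in enumerate(reversed(frames[:-1]), start=1):
        weight = decay ** age
        for key in current.keys() | past.keys():
            fused[key] += weight * (current.get(key, 0.0) - past.get(key, 0.0))
    # Voxels that never change cancel out and are dropped.
    return {key: value for key, value in fused.items() if value != 0.0}


# Toy example: one static point and one point moving along x.
sweeps = [
    [(0.5, 0.5, 0.5), (2.5, 0.5, 0.5)],  # oldest frame
    [(0.5, 0.5, 0.5), (3.5, 0.5, 0.5)],
    [(0.5, 0.5, 0.5), (4.5, 0.5, 0.5)],  # current frame
]
grids = [voxelize(points, voxel_size=1.0) for points in sweeps]
delta = delta_features(grids, decay=0.5)
```

In this toy case the static voxel disappears from `delta`, while the moving point leaves a positive value at its current voxel and decayed negative values along its past positions.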

Problem

Research questions and friction points this paper is trying to address.

Efficient multi-frame scene flow estimation with low computational cost

Addressing imbalanced object class distributions in motion estimation

Improving motion consistency and accuracy across diverse datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight 3D framework with delta scheme

Category-balanced loss for underrepresented classes

Instance consistency loss for motion coherence
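The two losses are only named here, so the sketch below shows one simple way such terms could be written and should not be read as the paper's formulation. It assumes per-point predicted and ground-truth flow vectors, a category label per point, and an instance id per point. The function names are hypothetical. The category-balanced term averages the mean error of each category so a rare class counts as much as a frequent one, and the consistency term penalizes how far each point's predicted flow deviates from the mean flow of its instance.

```python
import math
from collections import defaultdict


def endpoint_errors(pred, gt):
    """Euclidean distance between predicted and ground-truth flow per point."""
    return [math.dist(p, g) for p, g in zip(pred, gt)]


def category_balanced_loss(pred, gt, categories):
    """Average the per-category mean error so each category counts equally."""
    per_category = defaultdict(list)
    for error, category in zip(endpoint_errors(pred, gt), categories):
        per_category[category].append(error)
    means = [sum(errors) / len(errors) for errors in per_category.values()]
    return sum(means) / len(means)


def instance_consistency_loss(pred, instance_ids):
    """Mean distance between each point's flow and its instance's mean flow."""
    groups = defaultdict(list)
    for flow, instance in zip(pred, instance_ids):
        groups[instance].append(flow)
    total, count = 0.0, 0
    for flows in groups.values():
        mean_flow = [sum(axis) / len(flows) for axis in zip(*flows)]
        total += sum(math.dist(flow, mean_flow) for flow in flows)
        count += len(flows)
    return total / count


# Toy example: three well-predicted car points and one badly predicted pedestrian point.
pred = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
gt = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
categories = ["car", "car", "car", "pedestrian"]

plain_mean = sum(endpoint_errors(pred, gt)) / len(pred)
balanced = category_balanced_loss(pred, gt, categories)
consistency = instance_consistency_loss([(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)], [7, 7])
```

With these inputs the plain per-point mean is 0.5 while the balanced loss is 1.0, because the single pedestrian point is no longer outvoted by the three car points. Penalizing deviation from the instance mean ignores rotation, so a rigid-motion fit per instance would be a natural refinement of this simplification.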
