AI Summary
This work addresses the gradient imbalance and learning instability caused by moving objects in self-supervised joint training of depth, odometry, and optical flow networks. To mitigate these issues, the authors propose CoopNet, a framework that enforces consistency between depth-odometry and optical flow reconstructions to automatically identify and exclude dynamic regions. Furthermore, a hybrid loss function is designed based on the distribution of photometric reconstruction errors, enabling dynamic reweighting of multi-task gradients to promote balanced optimization. Experiments on the KITTI and Cityscapes datasets demonstrate that CoopNet significantly improves fairness and performance in multi-task co-optimization, achieving depth, odometry, and optical flow estimation results that match or surpass current state-of-the-art methods.
Abstract
We present CoopNet, an approach that improves the cooperation of co-trained networks by dynamically adapting the apportionment of gradients to ensure equitable learning progress. It is applied to motion-aware self-supervised prediction of depth maps by introducing a new hybrid loss, based on a distribution model of the photometric reconstruction errors made by the paired depth and odometry networks on the one hand, and by the optical flow network on the other. This model essentially assumes that the pixels belonging to moving objects (which must be discarded when training depth and odometry) correspond to those where the two reconstructions strongly disagree. We justify this model with theoretical considerations and experimental evidence. A comparative evaluation on the KITTI and Cityscapes datasets shows that CoopNet improves on or is comparable to the state of the art in depth, odometry, and optical flow prediction.
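The core assumption, that moving-object pixels are those where the two reconstructions strongly disagree, can be illustrated with a small NumPy sketch. This is not the CoopNet hybrid loss: the abstract describes a distribution model of the reconstruction errors, whereas the sketch below substitutes a fixed disagreement threshold purely for illustration. The function names (`photometric_error`, `static_mask`, `masked_mean`), the `threshold` value, and the toy images are all hypothetical.

```python
import numpy as np

def photometric_error(target, reconstruction):
    """Per-pixel mean absolute error over color channels; result has shape (H, W)."""
    return np.abs(target - reconstruction).mean(axis=-1)

def static_mask(recon_rigid, recon_flow, threshold=0.1):
    """True where the two reconstructions agree (pixels assumed static).

    Assumption: a fixed threshold stands in for the paper's distribution model.
    """
    disagreement = np.abs(recon_rigid - recon_flow).mean(axis=-1)
    return disagreement < threshold

def masked_mean(error, mask):
    """Mean error over the masked-in pixels; 0.0 if the mask is empty."""
    count = mask.sum()
    return float((error * mask).sum() / count) if count > 0 else 0.0

# Toy 4x4 RGB frame: the flow-based reconstruction is perfect everywhere,
# while the depth + odometry (rigid) reconstruction fails on a 2x2 "moving object".
target = np.full((4, 4, 3), 0.5)
recon_flow = target.copy()
recon_rigid = target.copy()
recon_rigid[1:3, 1:3] = 0.9

mask = static_mask(recon_rigid, recon_flow)

# Depth and odometry are trained only on pixels assumed static;
# the optical flow loss keeps every pixel.
rigid_loss = masked_mean(photometric_error(target, recon_rigid), mask)
flow_loss = float(photometric_error(target, recon_flow).mean())
```

In this toy case the mask removes exactly the four pixels of the simulated moving object, so the rigid reconstruction error they would have contributed no longer reaches the depth and odometry loss.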