🤖 AI Summary
This work addresses high-precision tracking of a single particle in noisy video, with or without labels, by proposing an end-to-end framework that integrates a neural network with a differentiable physics model. A split-bottleneck autoencoder generates particle heatmaps, and a differentiable physics module constrains the resulting trajectory to known dynamics. The key contribution is the Physics-Informed Landmark Loss (PILL), which enforces physically consistent trajectories without ground-truth annotations, and its supervised variant (PILLS), which trains against ground truth from simulation. Across 64 experimental configurations, PILLS consistently achieves sub-pixel accuracy under both clean and noisy conditions.
📝 Abstract
We propose Physics-Informed Tracking (PIT), a framework for tracking a single particle in video. A neural network autoencoder localizes the particle as a heatmap peak (landmark), and a differentiable physics module embedded in the autoencoder constrains several landmarks over time (a trajectory) to satisfy known dynamics. The novel Physics-Informed Landmark Loss (PILL) compares the trajectory predicted by the physics module against the landmarks, enforcing physical consistency without labels. Its supervised variant (PILLS) instead compares the prediction against ground-truth position, velocity, and bounce from simulation, enabling end-to-end backpropagation. To support both supervised and unsupervised learning, we use an autoencoder with a split bottleneck that separates (A) tracking-related structure, encoded as landmark heatmaps, from (B) background noise and subsequent image reconstruction. We evaluate a replicated $2^6$ factorial design (n = 4 replicates, 64 configurations), showing that PILLS consistently achieves sub-pixel tracking accuracy for the bilinear and physics-refined decoder outputs under both clean and noisy conditions.
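The idea of reading a landmark from a heatmap and checking a landmark trajectory against known dynamics can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the soft-argmax readout, the constant-acceleration dynamics, the function names (`soft_argmax`, `rollout`, `physics_consistency_loss`), and the choice to estimate the initial velocity from the first two landmarks. It does not model bounces, and it is written in plain Python, so it is not differentiable as written; the same arithmetic inside an autodiff framework would be.

```python
import math


def soft_argmax(heatmap, beta=10.0):
    """Softmax-weighted mean (x, y) coordinate of a 2D heatmap given as a list of rows.

    Hypothetical landmark readout; `beta` sharpens the softmax around the peak.
    """
    peak = max(max(row) for row in heatmap)  # subtract the peak for numerical stability
    total = x_acc = y_acc = 0.0
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            weight = math.exp(beta * (value - peak))
            total += weight
            x_acc += weight * x
            y_acc += weight * y
    return x_acc / total, y_acc / total


def rollout(p0, v0, accel, dt, steps):
    """Closed-form constant-acceleration trajectory: p(t) = p0 + v0*t + 0.5*a*t^2."""
    return [
        tuple(p + v * (k * dt) + 0.5 * a * (k * dt) ** 2
              for p, v, a in zip(p0, v0, accel))
        for k in range(steps)
    ]


def physics_consistency_loss(landmarks, accel, dt):
    """Mean squared distance between landmarks and a physics rollout (no labels used).

    Assumption: the initial state is taken from the first two landmarks, so the
    loss penalizes only how far the remaining landmarks drift from the dynamics.
    """
    p0, p1 = landmarks[0], landmarks[1]
    # Solve p1 = p0 + v0*dt + 0.5*a*dt^2 for the initial velocity v0.
    v0 = tuple((q1 - q0) / dt - 0.5 * a * dt for q0, q1, a in zip(p0, p1, accel))
    predicted = rollout(p0, v0, accel, dt, len(landmarks))
    squared = [
        (px - lx) ** 2 + (py - ly) ** 2
        for (px, py), (lx, ly) in zip(predicted, landmarks)
    ]
    return sum(squared) / len(squared)
```

In this sketch, a trajectory that obeys the assumed dynamics exactly yields a loss of zero, while any landmark that deviates from the rollout raises it. A supervised variant in the spirit of PILLS would compare the rollout against simulated ground truth rather than against the landmarks themselves.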