🤖 AI Summary
This study addresses the challenge of non-contact liquid-level sensing within opaque, sealed containers. Conventional imaging methods capture only surface information and cannot infer internal liquid levels. To overcome this limitation, we propose the first speckle-vibration-based cross-container liquid-level estimation method: ambient acoustic excitation induces minute vibrations in container walls; a high-speed camera captures dynamic speckle patterns across multiple container surfaces, yielding two-dimensional gridded vibration time-series data. We then design a vibration-source-agnostic Transformer model that generalizes across diverse container materials and geometries. The approach enables batch, remote, non-contact measurement without weighing or opening containers, achieving high accuracy under both natural and controlled acoustic excitation. By leveraging vibration-induced visual cues rather than direct optical access to the liquid surface, our method significantly extends the applicability of vision-based perception for state monitoring in sealed systems.
📝 Abstract
Computer vision seeks to infer a wide range of information about objects and events. However, vision systems based on conventional imaging are limited to extracting information only from the visible surfaces of scene objects. For instance, a vision system can detect and identify a Coke can in the scene, but it cannot determine whether the can is full or empty. In this paper, we aim to expand the scope of computer vision to include the novel task of inferring the hidden liquid levels of opaque containers by sensing the tiny vibrations on their surfaces. Our method provides a first-of-its-kind way to inspect the fill level of multiple sealed containers remotely, at once, without physical manipulation or manual weighing. First, we propose a novel speckle-based vibration sensing system for simultaneously capturing scene vibrations on a 2D grid of points. We use our system to efficiently and remotely capture a dataset of vibration responses for a variety of everyday liquid containers. Then, we develop a transformer-based approach for analyzing the captured vibrations and classifying the container type and its hidden liquid level at the time of measurement. Our architecture is invariant to the vibration source, yielding correct liquid-level estimates for both controlled and ambient scene sound sources. Moreover, our model generalizes to unseen container instances within known classes (e.g., training on five Coke cans of a six-pack, testing on the sixth) and to unseen liquid levels. We demonstrate our method by recovering liquid levels from various everyday containers.
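The paper's actual architecture is not specified in this abstract; as a rough illustration of the pipeline it describes (per-point spectral features from gridded vibration time series, attention across grid points, a classification head over liquid levels), here is a toy numpy sketch. All dimensions and the random placeholder weights are assumptions for demonstration only; normalizing each spectrum is one simple way to approximate invariance to the excitation strength:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def vibration_features(signals, n_bins=32):
    """signals: (P, T) vibration time series for P grid points.
    Returns (P, n_bins) log-magnitude spectra; per-point normalization
    crudely discounts the unknown excitation strength (a stand-in for
    the paper's vibration-source invariance)."""
    spec = np.abs(np.fft.rfft(signals, axis=-1))[:, 1:n_bins + 1]
    spec = np.log1p(spec)
    return spec / (np.linalg.norm(spec, axis=-1, keepdims=True) + 1e-8)

def attention_pool_classify(feats, Wq, Wk, Wv, Wc):
    """One self-attention layer over grid points, mean-pooled into a
    single descriptor, then a linear head with softmax. Weights here
    are random placeholders, not trained parameters."""
    Q, K, V = feats @ Wq, feats @ Wk, feats @ Wv
    A = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))   # (P, P) point-to-point attention
    pooled = (A @ V).mean(axis=0)                 # aggregate the grid
    return softmax(pooled @ Wc)                   # liquid-level probabilities

P, T, D, C = 16, 256, 32, 5   # grid points, time samples, feature dim, level classes
signals = rng.standard_normal((P, T))
Wq, Wk, Wv = (rng.standard_normal((D, D)) * 0.1 for _ in range(3))
Wc = rng.standard_normal((D, C)) * 0.1

probs = attention_pool_classify(vibration_features(signals, D), Wq, Wk, Wv, Wc)
print(probs.shape)  # (5,): a distribution over the 5 hypothetical fill levels
```

A real system would of course train these weights on the captured dataset and use a full transformer encoder; this sketch only shows how a 2D grid of vibration signals can be reduced to a per-container class distribution.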