Foveated Instance Segmentation

📅 2025-03-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high latency and poor real-time performance of instance segmentation on resource-constrained AR/VR devices, this paper proposes a gaze-driven sparse instance segmentation paradigm, described as the first to incorporate human visual attention into this task. The method integrates a real-time eye-tracking interface, a lightweight network (FSNet), and a gaze-guided ROI cropping and feature reweighting mechanism, so that fine-grained segmentation runs only on instances within the user’s foveal region. On ADE20K and LVIS, the method achieves IoU scores of 0.56 and 0.54, respectively, substantially outperforming baseline methods while reducing inference latency by 63%. Computation is thereby matched to perceptual demand rather than spent on the whole frame. The core contribution is a “gaze-centric” efficient segmentation paradigm that offers a scalable approach to visual understanding under strict hardware constraints.
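
The gaze-guided ROI cropping step mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `gaze_crop_box` and `crop_image`, the fixed crop size, and the choice to shift the window so it stays inside the frame are all assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def gaze_crop_box(width: int, height: int,
                  gaze_x: float, gaze_y: float,
                  crop_w: int, crop_h: int) -> Box:
    """Return a crop window centered on the gaze point.

    Hypothetical helper: the window is shifted (not shrunk) when the gaze
    point is near an image border, so the crop keeps a constant size.
    """
    # A crop can never be larger than the frame itself.
    crop_w = min(crop_w, width)
    crop_h = min(crop_h, height)
    # Center the window on the gaze point.
    left = int(round(gaze_x - crop_w / 2))
    top = int(round(gaze_y - crop_h / 2))
    # Clamp so the whole window lies inside the frame.
    left = max(0, min(left, width - crop_w))
    top = max(0, min(top, height - crop_h))
    return left, top, left + crop_w, top + crop_h


def crop_image(image: List[list], box: Box) -> List[list]:
    """Crop a row-major image (a list of rows) to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


if __name__ == "__main__":
    # A 640x480 frame with the gaze at its center and a 128x128 crop.
    print(gaze_crop_box(640, 480, 320, 240, 128, 128))
```

Only the cropped region would then be passed to the segmentation network, which is where the computational savings in a scheme like this would come from.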

📝 Abstract
Instance segmentation is essential for augmented reality and virtual reality (AR/VR) as it enables precise object recognition and interaction, enhancing the integration of virtual and real-world elements for an immersive experience. However, the high computational overhead of segmentation limits its application on resource-constrained AR/VR devices, causing large processing latency and degrading user experience. In contrast to conventional scenarios, AR/VR users typically focus on only a few regions within their field of view before shifting perspective, allowing segmentation to be concentrated on gaze-specific areas. This insight drives the need for efficient segmentation methods that prioritize processing the instances of interest, reducing computational load and enhancing real-time performance. In this paper, we present a foveated instance segmentation (FovealSeg) framework that leverages real-time user gaze data to perform instance segmentation exclusively on the instance of interest, resulting in substantial computational savings. Evaluation results show that FSNet achieves an IoU of 0.56 on ADE20K and 0.54 on LVIS, notably outperforming the baseline. The code is available at https://github.com/SAI-
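
The reported scores use intersection over union (IoU), the ratio of overlapping pixels to the pixels covered by either mask. The sketch below shows the standard per-mask computation on binary masks stored as nested lists. The function name `mask_iou` and the convention of returning 1.0 when both masks are empty are assumptions, and the paper's exact evaluation protocol (for example, how scores are averaged across gaze points) is not specified here.

```python
from typing import Sequence


def mask_iou(pred: Sequence[Sequence[int]],
             target: Sequence[Sequence[int]]) -> float:
    """Intersection over union of two binary masks of equal shape."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p, t = bool(p), bool(t)
            intersection += int(p and t)  # pixel is set in both masks
            union += int(p or t)          # pixel is set in at least one mask
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0


if __name__ == "__main__":
    pred = [[1, 1],
            [0, 0]]
    target = [[1, 0],
              [1, 0]]
    # One shared pixel out of three covered pixels.
    print(mask_iou(pred, target))
```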

Problem

Research questions and friction points this paper is trying to address.

Reducing computational overhead in AR/VR instance segmentation
Focusing segmentation on gaze-specific areas for efficiency
Enhancing real-time performance with foveated instance segmentation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Foveated segmentation using real-time gaze data
Focuses segmentation on user interest areas
Reduces computational load for AR/VR