The Interplay of Attention and Memory in Visual Enumeration

📅 2025-10-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Traditional planar-screen experiments inadequately capture the cognitive dynamics of visual enumeration in ecologically valid, large-field-of-view environments, and in particular overlook interactions between attention and working memory in complex scenes. Method: Leveraging immersive virtual reality and high-precision eye-tracking, we employed a two-phase enumeration paradigm, orthogonally manipulating task intention (counting vs. identity recognition) and spatial layout across two stimulus types: real-world objects and simple geometric shapes. Contribution/Results: Semantic processing imposes substantial cognitive load that significantly suppresses memory retrieval, making it a primary bottleneck in real-world enumeration. Moreover, selective counting itself incurs a substantial cognitive cost, which escalates with stimulus complexity. This study goes beyond the conventional focus on visual search alone, providing empirical evidence of how semantic–memory interactions constrain enumeration efficiency. These findings offer constraints for developing embodied models of numerical cognition.

📝 Abstract

Humans navigate and understand complex visual environments by subconsciously quantifying what they see, a process known as visual enumeration. However, traditional studies using flat screens fail to capture the cognitive dynamics of this process over the large visual fields of real-world scenes. To address this gap, we developed an immersive virtual reality system with integrated eye-tracking to investigate the interplay between attention and memory during complex enumeration. We conducted a two-phase experiment in which participants enumerated scenes of either simple abstract shapes or complex real-world objects, systematically varying the task intent (e.g., selective vs. exhaustive counting) and the spatial layout of items. Our results reveal that task intent is the dominant factor driving performance, with selective counting imposing a significant cognitive cost that was dramatically amplified by stimulus complexity. The semantic processing required for real-world objects reduced accuracy and suppressed memory recall, while the influence of spatial layout was secondary and statistically non-significant when a higher-order task intent was driving behaviour. We conclude that real-world enumeration is fundamentally constrained by the cognitive load of semantic processing, not just the mechanics of visual search. Our findings demonstrate that under high cognitive demand, the effort to understand what we are seeing directly limits our capacity to remember it.

Problem

Research questions and friction points this paper is trying to address.

Investigating attention-memory interplay during visual enumeration in immersive VR

Examining how task intent dominates enumeration performance over spatial layout

Demonstrating semantic processing constraints on real-world enumeration accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Immersive VR system with eye-tracking for enumeration

Two-phase experiment with varying task intent conditions

Analysis of how cognitive load affects memory during enumeration

🔎 Similar Papers

Mind the GAP: Glimpse-based Active Perception improves generalization and sample efficiency of visual reasoning

Citations: 0

What is the Visual Cognition Gap between Humans and Multimodal LLMs?

2024-06-14 · arXiv.org · Citations: 15

Visual Enumeration is Challenging for Large-scale Generative AI

2024-01-09 · Citations: 2

Hierarchical Working Memory and a New Magic Number

2024-08-14 · bioRxiv · Citations: 1