A Rigorous Behavior Assessment of CNNs Using a Data-Domain Sampling Regime

📅 2025-07-04

📈 Citations: 0

✨ Influential: 0

career value

225K/year

🤖 AI Summary

This work investigates the behavioral mechanisms of convolutional neural networks (CNNs) in bar chart ratio estimation—a fundamental graphical perception task—focusing on their sensitivity to train-test distribution shifts, stability under low-data regimes, and comparative performance against human observers. We introduce a data-domain sampling framework that systematically controls chart generation distributions, enabling rigorous analysis across controlled visual variations. Our methodology integrates large-scale CNN evaluation (800 models, 16 million trials), human behavioral experiments (113 participants, 6,825 trials), human-AI comparative testing, and statistical modeling. Key findings: CNNs consistently outperform humans in accuracy and exhibit monotonic bias growth with increasing train-test distribution distance—revealing a simple, quantifiable regularity in visual reasoning. This study provides the first controlled empirical demonstration that CNNs can achieve both robustness and interpretability in chart perception, bridging gaps between machine vision and human graphical cognition.

Technology Category

Application Category

📝 Abstract

We present a data-domain sampling regime for quantifying CNNs' graphic perception behaviors. This regime lets us evaluate CNNs' ratio estimation ability in bar charts from three perspectives: sensitivity to training-test distribution discrepancies, stability to limited samples, and relative expertise to human observers. After analyzing 16 million trials from 800 CNNs models and 6,825 trials from 113 human participants, we arrived at a simple and actionable conclusion: CNNs can outperform humans and their biases simply depend on the training-test distance. We show evidence of this simple, elegant behavior of the machines when they interpret visualization images. osf.io/gfqc3 provides registration, the code for our sampling regime, and experimental results.

Problem

Research questions and friction points this paper is trying to address.

Assess CNNs' graphic perception in bar charts

Compare CNNs' ratio estimation to human performance

Analyze CNNs' sensitivity to training-test distribution gaps

Innovation

Methods, ideas, or system contributions that make the work stand out.

Data-domain sampling regime for CNN assessment

Evaluates CNNs' ratio estimation in bar charts

Compares CNNs' performance to human observers

🔎 Similar Papers

No similar papers found.

💼 Related Jobs

Researcher, Interpretability

OpenAI

$295K – $445K • Offers Equity

San Francisco

Research Scientist Intern, Applied Perception Science (PhD)