Learning to Pay Attention: Unsupervised Modeling of Attentive and Inattentive Respondents in Survey Data

📅 2026-03-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study proposes an unsupervised framework to automatically identify inattentive respondents who provide random or low-effort answers in behavioral and social science surveys. The approach jointly leverages geometric reconstruction via autoencoders and probabilistic dependency modeling through Chow–Liu trees to assess response consistency, enhanced by a novel “percentile loss” to improve robustness against outliers. The work reveals a “psychometric–machine learning alignment” phenomenon: questionnaire structures exhibiting high internal consistency inherently facilitate effective algorithmic detection of data quality issues. Experiments across nine real-world, heterogeneous survey datasets demonstrate that detection performance is primarily governed by questionnaire structure rather than model complexity, with linear models already achieving strong discriminative power when applied to high-quality scales.
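As a rough illustration of the claim that linear models can already separate respondents on coherent scales, the sketch below scores each respondent by the reconstruction error of a rank-1 linear projection (PCA, which behaves like a linear autoencoder). Everything here is an assumption made for illustration: the synthetic 5-point data, the single latent trait, the function name `linear_reconstruction_error`, and the choice of rank. It is not the paper's model, and the "percentile loss" is not shown because the summary does not define it.

```python
import numpy as np

# Synthetic data (an assumption for illustration, not from the paper):
# attentive answers follow one latent trait, careless answers are uniform.
rng = np.random.default_rng(0)
n_att, n_care, n_items = 300, 30, 8
trait = rng.normal(size=(n_att, 1))
noise = 0.5 * rng.normal(size=(n_att, n_items))
attentive = np.clip(np.rint(3 + trait + noise), 1, 5)
careless = rng.integers(1, 6, size=(n_care, n_items)).astype(float)
X = np.vstack([attentive, careless])


def linear_reconstruction_error(X, rank=1):
    """Per-respondent mean squared error after projecting onto the top principal axes."""
    Xc = X - X.mean(axis=0)                      # center each item
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:rank].T                              # (n_items, rank) principal axes
    recon = Xc @ V @ V.T                         # project, then map back
    return ((Xc - recon) ** 2).mean(axis=1)


errors = linear_reconstruction_error(X)
print("mean error, attentive:", errors[:n_att].mean())
print("mean error, careless: ", errors[n_att:].mean())
```

A higher error means the response pattern fits the shared covariance structure less well. On an instrument with weakly related items, the gap between the two groups would be expected to shrink, which is the structural dependence the summary describes.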

📝 Abstract

The integrity of behavioral and social-science surveys depends on detecting inattentive respondents who provide random or low-effort answers. Traditional safeguards, such as attention checks, are often costly, reactive, and inconsistent. We propose a unified, label-free framework for inattentiveness detection that scores response coherence using complementary unsupervised views: geometric reconstruction (Autoencoders) and probabilistic dependency modeling (Chow-Liu trees). While we introduce a "Percentile Loss" objective to improve Autoencoder robustness against anomalies, our primary contribution is identifying the structural conditions that enable unsupervised quality control. Across nine heterogeneous real-world datasets, we find that detection effectiveness is driven less by model complexity than by survey structure: instruments with coherent, overlapping item batteries exhibit strong covariance patterns that allow even linear models to reliably separate attentive from inattentive respondents. This reveals a critical "Psychometric-ML Alignment": the same design principles that maximize measurement reliability (e.g., internal consistency) also maximize algorithmic detectability. The framework provides survey platforms with a scalable, domain-agnostic diagnostic tool that links data quality directly to instrument design, enabling auditing without additional respondent burden.
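The probabilistic view can be sketched with a small Chow-Liu tree: build a maximum spanning tree over items weighted by pairwise mutual information, then score each respondent by negative log-likelihood under the resulting tree model. This is a minimal sketch under stated assumptions, not the authors' implementation: the synthetic data, the helper names (`mutual_information`, `chow_liu_edges`, `fit`), rooting the tree at item 0, and the Laplace smoothing constant `alpha` are all hypothetical choices.

```python
import math
import random
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information (nats) between two discrete columns."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def chow_liu_edges(rows):
    """Maximum spanning tree over items by mutual information (Prim's algorithm)."""
    d = len(rows[0])
    cols = list(zip(*rows))
    mi = {(i, j): mutual_information(cols[i], cols[j])
          for i, j in combinations(range(d), 2)}
    in_tree, edges = {0}, []
    while len(in_tree) < d:
        # Best edge with exactly one endpoint already in the tree.
        i, j = max(((i, j) for (i, j) in mi if (i in in_tree) != (j in in_tree)),
                   key=mi.get)
        parent, child = (i, j) if i in in_tree else (j, i)
        edges.append((parent, child))
        in_tree.add(child)
    return edges  # (parent, child) pairs, rooted at item 0


def fit(rows, n_levels, alpha=1.0):
    """Return the tree and a scorer giving negative log-likelihood per respondent."""
    edges = chow_liu_edges(rows)
    n = len(rows)
    root = Counter(r[0] for r in rows)
    pair = {e: Counter((r[e[0]], r[e[1]]) for r in rows) for e in edges}
    parent_counts = {e: Counter(r[e[0]] for r in rows) for e in edges}

    def score(r):
        ll = math.log((root[r[0]] + alpha) / (n + alpha * n_levels))
        for e in edges:
            p, c = e
            ll += math.log((pair[e][(r[p], r[c])] + alpha)
                           / (parent_counts[e][r[p]] + alpha * n_levels))
        return -ll  # higher means less coherent

    return edges, score


# Synthetic 5-point data (assumption): attentive answers cluster around a trait.
rng = random.Random(0)
attentive = [tuple(min(5, max(1, t + rng.choice([-1, 0, 0, 1]))) for _ in range(6))
             for t in (rng.randint(1, 5) for _ in range(300))]
careless = [tuple(rng.randint(1, 5) for _ in range(6)) for _ in range(30)]

edges, score = fit(attentive + careless, n_levels=5)
mean_attentive = sum(map(score, attentive)) / len(attentive)
mean_careless = sum(map(score, careless)) / len(careless)
print(mean_attentive, mean_careless)
```

The model is fitted on the mixed, unlabeled sample, so no attention-check labels are used. Respondents whose answers break the learned item-to-item dependencies receive higher scores.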

Problem

Research questions and friction points this paper is trying to address.

inattentive respondents

survey data quality

attention detection

unsupervised modeling

response coherence

Innovation

Methods, ideas, or system contributions that make the work stand out.

unsupervised learning

response coherence

Autoencoders