🤖 AI Summary
This study investigates the mismatch between data workers’ mental models of complex hierarchical data (e.g., tables with attached hierarchies) and the data model reified in analysis code, and how this mismatch limits their ability to reason about the data. Method: Through interviews, sketching, and reflexive thematic analysis with a team of ten people working with the same reified data model, we identify divergent, coexisting mental models within the team. Contribution/Results: We describe “parallel hazards” that arise when team members hold data models that differ from the reified one. These differences appeared even among team members who had designed the model, and they limited participants’ ability to reason about the data. Based on these findings, we suggest potential design interventions for data analysis processes and tools.
📝 Abstract
Data workers may have a different mental model of their data than the one reified in code. Understanding the organization of their data is necessary for analyzing it, be it through scripting, visualization, or abstract thought. More complicated organizations, such as tables with attached hierarchies, may tax people's ability to think about and interact with data. To better understand and ultimately design for these situations, we conducted a study across a team of ten people working with the same reified data model. Through interviews and sketching, we probed their conception of the data model and developed themes through reflexive data analysis. Participants had diverse data models that differed from the reified data model, even among team members who had designed the model, resulting in parallel hazards limiting their ability to reason about the data. From these observations, we suggest potential design interventions for data analysis processes and tools.