More Than "Means to an End": Supporting Reasoning with Transparently Designed AI Data Science Processes

📅 2026-03-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing end-to-end generative AI systems struggle to support users in evaluating alternatives or reframing problems in high-stakes domains, limiting effective reasoning in open-ended data science tasks. To address this, this work proposes a transparent AI workflow centered on human-readable intermediate artifacts—such as query languages, structured concept definitions, and input-output examples—which serve as "thinking tools" enabling user reflection and knowledge injection within otherwise opaque AI pipelines. Empirical evaluation in a healthcare setting demonstrates that this approach significantly enhances non-expert users' understanding of analytical decisions, empowering them to refine their initial queries and effectively integrate domain knowledge. Consequently, the method improves both the collaborative quality and the reliability of data science workflows in high-risk contexts.

📝 Abstract
Generative artificial intelligence (AI) tools can now help people perform complex data science tasks regardless of their expertise. While these tools have great potential to help more people work with data, their end-to-end approach does not support users in evaluating alternative approaches and reformulating problems, both critical to solving open-ended tasks in high-stakes domains. In this paper, we reflect on two AI data science systems designed for the medical setting and how they function as tools for thought. We find that success in these systems was driven by constructing AI workflows around intentionally-designed intermediate artifacts, such as readable query languages, concept definitions, or input-output examples. Despite opaqueness in other parts of the AI process, these intermediates helped users reason about important analytical choices, refine their initial questions, and contribute their unique knowledge. We invite the HCI community to consider when and how intermediate artifacts should be designed to promote effective data science thinking.
Problem

Research questions and friction points this paper is trying to address.

generative AI
data science
reasoning
intermediate artifacts
open-ended tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

intermediate artifacts
transparent AI design
AI-assisted data science
reasoning support
human-AI collaboration