🤖 AI Summary
Organizations have widely adopted generative AI, yet productivity gains remain uneven, underscoring the critical role of human–AI collaboration strategies. This study presents the first field randomized controlled trial in a real-world workplace comparing behavioral scaffolding (structured protocols for paired AI use) with cognitive scaffolding (training that reframes AI as a thinking partner), measuring effects on employees' document quality and output. Using large language model–based automated scoring, surveys, and sensitivity analyses, the study finds that cognitive scaffolding is associated with higher document quality at the top of the distribution, while behavioral scaffolding is associated with lower document quality and reduced output. Participants who received cognitive scaffolding also reported more positive beliefs about AI, though sensitivity analyses suggest this may reflect recovery from carry-over effects rather than training itself. Overall, the results point to cognitive reframing as central to fostering high-quality human–AI collaboration, subject to the design limitations noted in the abstract.
📝 Abstract
Organizations have widely deployed generative AI tools, yet productivity gains remain uneven, suggesting that how people use AI matters as much as whether they have access. We conducted a field experiment with 388 employees at a Fortune 500 retailer to test two scaffolding interventions for human–AI collaboration. All participants had access to the same AI tool; we varied only the structure surrounding its use. A behavioral scaffolding intervention (a structured protocol requiring joint AI use within pairs) was associated with lower document quality relative to unstructured use and with substantially lower document production. A cognitive scaffolding intervention (partnership training that reframed AI as a thought partner) was associated with higher individual document quality at the top of the distribution. Treatment participants also showed greater positive belief change across the session, though sensitivity analyses suggest this likely reflects recovery from carry-over effects rather than genuine training-induced shifts. Both findings are subject to design limitations, including an AM/PM session confound, differential attrition, and LLM grading sensitivity to document length.