Shaping Credibility Judgments in Human-GenAI Partnership via Weaker LLMs: A Transactive Memory Perspective on AI Literacy

📅 2026-03-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses how to cultivate students' ability to evaluate the credibility of generative AI (GenAI) outputs and sustain human accountability during everyday collaboration with AI systems. Drawing on transactive memory theory, it frames student-GenAI interaction as a partnership in which credibility regulates reliance and verification. To make credibility judgments visible during coursework, it pairs a deliberately weaker large language model, small enough to run locally on students' computers, with two interaction protocols: reflection-first (think first, then consult AI) and verification-required (use AI, then evaluate the output). In an undergraduate STEM course (N = 42), students were randomly assigned to one of these protocols or to an unrestricted-use control condition. Controlling for baseline credibility, condition effects were significant at mid-semester and stronger at post-intervention: adjusted credibility means were lowest in reflection-first, intermediate in verification-required, and highest in control, while specialization and coordination showed no significant differences. The results suggest that workflow sequencing, combined with a weaker LLM and accountability cues in assignment instructions, can recalibrate students' credibility judgments, with reflection-first producing the strongest downward shift in reliance.

📝 Abstract

Generative AI (GenAI) is increasingly used as a knowledge partner in higher education, raising the need for instructional designs that emphasize AI literacy practices such as evaluating output credibility and maintaining human accountability. Existing AI literacy frameworks focus more on what learners should do than on how these practices are enacted in routine student-GenAI collaboration. We address this gap by framing student-GenAI interaction as a transactive memory partnership, where credibility regulates reliance and verification. To make this process visible during coursework, we used a weaker large language model (LLM): small enough to run on most students' computers during class, helpful enough to support learning, but not so capable that it removes the need for verification. In an undergraduate STEM course, students were randomly assigned to one of three conditions across repeated activities: reflection-first (think first, then consult AI), verification-required (use AI, then evaluate the output), or control (unrestricted use). Students completed a transactive memory survey at three time points (N = 42). Weighted credibility diverged by condition over time. ANCOVA controlling for baseline credibility showed a condition effect at mid-semester, F(2, 38) = 4.02, p = .026, partial eta squared = .175, and a stronger effect at post-intervention, F(2, 38) = 5.48, p = .008, partial eta squared = .224; adjusted means were lowest in reflection-first, intermediate in verification-required, and highest in control. Parallel analyses of specialization and coordination were not significant. These findings suggest that workflow sequencing, deliberate use of weaker LLMs, and accountability cues embedded in assignment instructions can recalibrate students' credibility judgments in GenAI use, with reflection-first producing the strongest downward shift in reliance.
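The reported effect sizes can be cross-checked against the F statistics and degrees of freedom given in the abstract. The sketch below is not the authors' analysis code; it applies only the standard identity partial eta squared = (F × df_effect) / (F × df_effect + df_error) and assumes a one-way ANCOVA with three conditions and a single covariate (baseline credibility), which gives N − 3 − 1 = 38 error degrees of freedom for N = 42. The function names are illustrative.

```python
def partial_eta_squared(f_stat: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)


def ancova_error_df(n: int, n_groups: int, n_covariates: int = 1) -> int:
    """Error degrees of freedom for a one-way ANCOVA (assumed design)."""
    return n - n_groups - n_covariates


# F statistics as reported in the abstract.
reported_f = {"mid-semester": 4.02, "post-intervention": 5.48}

df_effect = 3 - 1                   # three conditions
df_error = ancova_error_df(42, 3)   # N = 42, one covariate (baseline)

for label, f_stat in reported_f.items():
    eta = partial_eta_squared(f_stat, df_effect, df_error)
    print(f"{label}: F({df_effect}, {df_error}) = {f_stat}, "
          f"partial eta squared = {eta:.3f}")
```

Under these assumptions the recomputed values round to .175 and .224, matching the figures in the abstract.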

Problem

Research questions and friction points this paper is trying to address.

AI literacy

credibility judgment

human-GenAI collaboration

transactive memory

generative AI

Innovation

Methods, ideas, or system contributions that make the work stand out.

transactive memory

weaker LLM

AI literacy