Access Timing as Scaffolding: A Reinforcement Learning Approach to GenAI in Education

📅 2026-05-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses when generative artificial intelligence (GenAI) should be accessible in educational settings, with the aim of mitigating risks such as overreliance, metacognitive disengagement, and diminished learning outcomes. GenAI access timing is conceptualized as an implicit scaffold, and a reinforcement learning agent decides when students may access GenAI, guided by a reward function grounded in metacognitive theory, cognitive load theory, and productive failure. The approach requires no explicit metacognitive prompts or structured scaffolding and is presented as a scalable paradigm for human-AI collaborative learning. In a controlled lab study (N=105), strategically timed access improved post-test performance and metacognitive accuracy compared with unrestricted access, and reduced task errors and time on task compared with complete withholding, while remaining compatible with off-the-shelf GenAI tools. No between-condition differences emerged on self-reported metacognitive awareness.

📝 Abstract

In recent years, generative AI (GenAI) has become ubiquitous in students' daily lives, including in educational settings, despite its potential to induce over-reliance, metacognitive disengagement, and diminished learning when used without restriction. While most prior research has therefore focused on how to pedagogically scaffold its usage, the question of when to allow off-the-shelf GenAI remains understudied and lacks pedagogically grounded empirical investigation. We treat access timing itself as a form of implicit scaffolding and operationalize it through a reinforcement learning (RL) agent that decides when students should access GenAI, with a reward function grounded in metacognitive theory, cognitive load theory, and productive failure. In a mixed-methods controlled lab study with N=105 participants, we compared the agent's effect on learning gains and metacognitive engagement to unrestricted and fully restricted use. Results show that strategically timed GenAI access under the reinforcement learning condition improved objective post-test performance and metacognitive accuracy compared with unrestricted access, while reducing task errors and time on task relative to complete withholding, all without the need for explicit metacognitive prompts or structured scaffolding. However, no between-condition differences emerged on self-reported metacognitive awareness. Overall, the timing of GenAI access is therefore a tractable, theoretically grounded, and scalable pedagogical paradigm that improves on both completely unrestricted and fully withheld access, is compatible with off-the-shelf tools, and potentially carries a low adoption barrier. This opens up a new research area exploring how access timing can be facilitated by educators and implemented in the design of human-AI learning systems.
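To make the idea of an RL agent gating GenAI access more concrete, the following is a minimal illustrative sketch using tabular Q-learning. It is not the authors' implementation: the abstract does not specify the algorithm, the state representation, or the reward terms. The names `TimingAgent`, `reward`, the two actions, and all weights are hypothetical choices made only to show how a reward might combine a learning signal with terms loosely inspired by productive failure, cognitive load, and metacognitive calibration.

```python
import random
from collections import defaultdict

# Hypothetical action set: at each decision point the agent either
# keeps GenAI locked or unlocks it for the student.
ACTIONS = ("withhold", "grant")


def reward(learning_gain, struggle_steps, overloaded, calibration_error):
    """Hypothetical reward; all weights are illustrative assumptions."""
    # Productive failure: a small bonus for early struggle, capped at 3 steps.
    struggle_bonus = 0.1 * min(struggle_steps, 3)
    # Cognitive load: a fixed penalty when the student appears overloaded.
    load_penalty = 0.5 if overloaded else 0.0
    # Metacognition: penalize the gap between confidence and correctness.
    return learning_gain + struggle_bonus - load_penalty - calibration_error


class TimingAgent:
    """Epsilon-greedy tabular Q-learning over (state, action) pairs."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # unseen pairs default to 0.0
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor
        self.epsilon = epsilon       # exploration probability
        self.rng = random.Random(seed)

    def act(self, state):
        # Explore with probability epsilon, otherwise pick the best-known action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, r, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = r + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```

In such a sketch, a state could be any hashable summary of the learner's recent activity (for example, a tuple of attempt count and error count), and `update` would be called after each decision point once the outcome is observed. How the study actually encoded states and rewards is not stated in the abstract.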

Problem

Research questions and friction points this paper is trying to address.

generative AI

access timing

scaffolding

metacognitive engagement

educational technology

Innovation

Methods, ideas, or system contributions that make the work stand out.

reinforcement learning

access timing

generative AI in education