Learning to Play Blackjack: A Curriculum Learning Perspective

📅 2026-03-31
🤖 AI Summary
This work addresses the low training efficiency and suboptimal performance that reinforcement learning agents often show in complex environments. It proposes a framework in which a large language model dynamically generates a curriculum over the available actions, producing a multi-stage training path for both Tabular Q-Learning and Deep Q-Network (DQN) agents in the game of Blackjack. Each stage introduces more complex actions, so the agent incorporates them one at a time. Evaluated in an 8-deck Blackjack simulation over 10 independent runs, the curriculum raises the DQN agent's average win rate from 43.97% to 47.41% and lowers its average bust rate from 32.9% to 28.0%. It also speeds up the overall workflow by over 74%, with the agent's full training completing faster than the baseline's evaluation phase alone.
📝 Abstract
Reinforcement Learning (RL) agents often struggle with efficiency and performance in complex environments. We propose a novel framework that uses a Large Language Model (LLM) to dynamically generate a curriculum over available actions, enabling the agent to incorporate each action individually. We apply this framework to the game of Blackjack, where the LLM creates a multi-stage training path that progressively introduces complex actions to a Tabular Q-Learning and a Deep Q-Network (DQN) agent. Our evaluation in a realistic 8-deck simulation over 10 independent runs demonstrates significant performance gains over standard training methods. The curriculum-based approach increases the DQN agent's average win rate from 43.97% to 47.41%, reduces the average bust rate from 32.9% to 28.0%, and accelerates the overall workflow by over 74%, with the agent's full training completing faster than the baseline's evaluation phase alone. These results validate that LLM-guided curricula can build more effective, robust, and efficient RL agents.
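To make the idea of an action-level curriculum concrete, here is a minimal Python sketch of staged tabular Q-learning. It is an illustration under stated assumptions, not the authors' implementation: the action names, the hand-written `CURRICULUM` list (standing in for the LLM-generated stages), the simplified infinite-deck rules, and all hyperparameters are hypothetical. The one Q-table is carried across stages, so each new action is learned on top of what earlier stages already found.

```python
import random
from collections import defaultdict

# Hypothetical action set; the paper's exact actions are not listed on this page.
STAND, HIT, DOUBLE = 0, 1, 2

# Hand-written stand-in for an LLM-generated curriculum (assumption):
# each stage unlocks one more action on top of the previous stage.
CURRICULUM = [
    {"actions": [STAND, HIT], "episodes": 2000},
    {"actions": [STAND, HIT, DOUBLE], "episodes": 2000},
]


def draw(rng):
    # Simplified infinite-deck draw: face cards count 10, aces count 1.
    return min(rng.randint(1, 13), 10)


def play_episode(q, allowed, rng, eps=0.1, alpha=0.1, gamma=1.0):
    """Play one hand, updating q in place using only the allowed actions."""
    player = draw(rng) + draw(rng)
    dealer_up = draw(rng)
    first_move = True
    while True:
        state = (player, dealer_up)
        # Doubling is only legal on the first decision of a hand.
        legal = [a for a in allowed if a != DOUBLE or first_move]
        if rng.random() < eps:
            action = rng.choice(legal)
        else:
            action = max(legal, key=lambda a: q[(state, a)])
        stake = 2.0 if action == DOUBLE else 1.0
        if action in (HIT, DOUBLE):
            player += draw(rng)
        first_move = False

        if player > 21:  # bust
            reward, done = -stake, True
        elif action == HIT:  # hand continues
            reward, done = 0.0, False
        else:  # STAND or DOUBLE: dealer draws to 17 or more
            dealer = dealer_up + draw(rng)
            while dealer < 17:
                dealer += draw(rng)
            if dealer > 21 or player > dealer:
                reward = stake
            elif player < dealer:
                reward = -stake
            else:
                reward = 0.0
            done = True

        if done:
            target = reward
        else:
            # Bootstrap only over actions legal in the next state.
            next_legal = [a for a in allowed if a != DOUBLE]
            target = reward + gamma * max(q[((player, dealer_up), a)] for a in next_legal)
        q[(state, action)] += alpha * (target - q[(state, action)])
        if done:
            return reward


def train(curriculum, seed=0):
    """Run the stages in order, sharing one Q-table across all of them."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for stage in curriculum:
        for _ in range(stage["episodes"]):
            play_episode(q, stage["actions"], rng)
    return q
```

Calling `train(CURRICULUM)` runs both stages in sequence, while `train(CURRICULUM[:1])` trains on the basic actions only. A baseline without a curriculum would correspond to a single stage containing every action from the start.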
Problem

Research questions and friction points this paper is trying to address.

Reinforcement Learning
Curriculum Learning
Efficiency
Performance
Complex Environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Curriculum Learning
Large Language Model
Reinforcement Learning
Action-level Curriculum
Deep Q-Network
Amirreza Alasti
Leibniz University Hannover
Efe Erdal
Leibniz University Hannover
Yücel Celik
Leibniz University Hannover
Theresa Eimer
RL Team Lead, Leibniz Universität Hannover
Reinforcement Learning, AutoRL, Generalization