Vejde: A Framework for Inductive Deep Reinforcement Learning Based on Factor Graph Color Refinement

📅 2025-09-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses decision-making over structured state spaces, comprising object classes and relational dependencies, by proposing an inductive policy learning framework. Methodologically, it represents MDP states as bipartite graphs, introduces factor graph color refinement to guide message passing in graph neural networks, and trains the policy network with both supervised and reinforcement learning. The key contribution is the first integration of factor graph color refinement into deep reinforcement learning, enabling policies to generalize across problem instances of varying scale and topology. Empirical evaluation on 80 task instances from 8 RDDL domains shows that the Vejde agent achieves average test performance on unseen instances comparable to a per-instance trained MLP baseline, and substantially outperforms the online planning algorithm Prost. These results demonstrate strong inductive transfer capability.
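The state-to-graph conversion described above can be illustrated with a minimal sketch. The fact schema, predicate names, and helper below are hypothetical illustrations, not Vejde's actual API: each fact (a predicate applied to a tuple of entities) becomes one node on the fact side of a bipartite graph, connected to the entity nodes it mentions.

```python
from collections import defaultdict

def facts_to_bipartite(facts):
    """Convert a database of facts, given as (predicate, entity-tuple)
    pairs, into a bipartite graph: entity nodes on one side, fact nodes
    on the other. Returns an adjacency map keyed by node id."""
    adj = defaultdict(set)
    for i, (pred, entities) in enumerate(facts):
        fact_node = f"fact:{i}:{pred}"
        for e in entities:
            ent_node = f"ent:{e}"
            adj[fact_node].add(ent_node)
            adj[ent_node].add(fact_node)
    return dict(adj)

# Hypothetical example state: two relations over three entities
facts = [
    ("connected", ("n1", "n2")),
    ("connected", ("n2", "n3")),
    ("compromised", ("n1",)),
]
graph = facts_to_bipartite(facts)
```

Because the graph is built from the facts themselves rather than from a fixed-size feature vector, the same conversion applies unchanged to instances with any number of entities, which is what allows a single policy network to operate across problem sizes.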

📝 Abstract
We present and evaluate Vejde, a framework which combines data abstraction, graph neural networks, and reinforcement learning to produce inductive policy functions for decision problems with richly structured states, such as object classes and relations. MDP states are represented as databases of facts about entities, and Vejde converts each state to a bipartite graph, which is mapped to latent states through neural message passing. The factored representation of both states and actions allows Vejde agents to handle problems of varying size and structure. We tested Vejde agents on eight problem domains defined in RDDL, with ten problem instances each, where policies were trained using both supervised and reinforcement learning. To test policy generalization, we separated the problem instances into two sets, one for training and the other solely for testing. Test results on unseen instances for the Vejde agents were compared to MLP agents trained on each problem instance, as well as to the online planning algorithm Prost. Our results show that Vejde policies on average generalize to the test instances without a significant loss in score. Additionally, the inductive agents received scores on unseen test instances that on average were close to those of the instance-specific MLP agents.
Problem

Research questions and friction points this paper is trying to address.

Inductive policy functions for structured decision problems
Generalizing reinforcement learning to varying problem sizes
Handling richly structured states with object relations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines data abstraction with graph neural networks
Uses neural message passing for latent states
Handles varying problem sizes via factored representation
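The color refinement named in the title can be sketched with a standard Weisfeler-Leman-style iteration on the bipartite graph. This is a generic illustration of the technique, not the paper's exact procedure: nodes repeatedly receive a new color computed from their own color and the multiset of their neighbors' colors, so nodes that end up same-colored are structurally indistinguishable.

```python
def color_refinement(adj, init_colors, rounds=10):
    """Iteratively refine node colors on a graph given as an adjacency
    map. Each round, a node's signature is its current color plus the
    sorted multiset of neighbor colors; signatures are compressed back
    to small integer color ids. Stops early at a fixed point."""
    colors = dict(init_colors)
    for _ in range(rounds):
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        # Compress distinct signatures into a compact integer palette
        palette = {s: i for i, s in enumerate(sorted(set(signatures.values())))}
        new_colors = {v: palette[signatures[v]] for v in adj}
        if new_colors == colors:  # fixed point reached
            break
        colors = new_colors
    return colors

# Hypothetical example: a path a - b - c, all nodes starting identical.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
colors = color_refinement(adj, {"a": 0, "b": 0, "c": 0})
```

In the example, the endpoints `a` and `c` converge to the same color while the middle node `b` gets a distinct one; in a message-passing setting such color classes can be used to guide which nodes share messages or parameters.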
Jakob Nyberg
Division of Network and Systems Engineering, KTH Royal Institute of Technology, Stockholm, Sweden
Pontus Johnson
Professor, KTH Royal Institute of Technology
cyber security, enterprise architecture