AI Summary
Problem: Specifying discrete choice models conventionally relies on manual trial and error and subjective assumptions, which makes the process inefficient and hard to reproduce; existing metaheuristic approaches treat model specification as a static optimization problem and ignore information from previously estimated models, so they cannot adapt the search dynamically or transfer knowledge across tasks. Method: We propose the first deep reinforcement learning-based (DQN) automated model search framework, formalizing model specification as a sequential decision-making process. We design a reward function that jointly optimizes goodness-of-fit and model parsimony, and employ a serialized structural encoding scheme. Contribution/Results: The method requires no domain-specific prior knowledge, supports dynamic exploration control and cross-scenario knowledge transfer, and consistently converges to high-quality models under diverse data-generating processes. It significantly improves search efficiency, robustness, and generalization capability compared to state-of-the-art alternatives.
Abstract
Discrete choice modelling is a theory-driven modelling framework for understanding and forecasting choice behaviour. To obtain behavioural insights, modellers test several competing model specifications in their attempts to discover the 'true' data generation process. This trial-and-error process requires expertise, is time-consuming, and relies on subjective theoretical assumptions. Although metaheuristics have been proposed to assist choice modellers, they treat model specification as a classic optimisation problem, relying on static strategies, applying predefined rules, and neglecting outcomes from previously estimated models. As a result, current metaheuristics struggle to prioritise promising search regions, adapt exploration dynamically, and transfer knowledge to other modelling tasks. To address these limitations, we introduce a deep reinforcement learning-based framework where an 'agent' specifies models by estimating them and receiving rewards based on goodness-of-fit and parsimony. Results demonstrate that the agent dynamically adapts its strategies to identify promising specifications across data generation processes, showing robustness and potential transferability, without prior domain knowledge.
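The abstract does not give the reward formula, so the following is only an illustrative sketch of how a reward might trade off goodness-of-fit against parsimony. It assumes McFadden's adjusted rho-squared, 1 - (LL(β) - K) / LL(0), as the signal; the function names, the `converged` flag, and the fixed penalty of -1.0 for a failed estimation are hypothetical choices made for this example, not the authors' design.

```python
def adjusted_rho_squared(ll_model: float, ll_null: float, n_params: int) -> float:
    """McFadden's adjusted rho-squared: 1 - (LL(beta) - K) / LL(0).

    ll_model: final log-likelihood of the estimated specification
    ll_null:  log-likelihood of the null (equal-probability) model
    n_params: number of estimated parameters K
    """
    if ll_null >= 0:
        raise ValueError("null log-likelihood must be negative")
    return 1.0 - (ll_model - n_params) / ll_null


def reward(ll_model: float, ll_null: float, n_params: int,
           converged: bool = True) -> float:
    """Hypothetical reward: adjusted fit if estimation converged, else a penalty."""
    if not converged:
        # Assumption: a specification that fails to estimate is discouraged
        # with a fixed negative reward.
        return -1.0
    return adjusted_rho_squared(ll_model, ll_null, n_params)


# Two specifications with identical fit but different numbers of parameters
r_small = reward(ll_model=-800.0, ll_null=-1000.0, n_params=5)
r_large = reward(ll_model=-800.0, ll_null=-1000.0, n_params=10)
print(r_small > r_large)  # the more parsimonious specification scores higher
```

Under this assumed reward, an extra parameter pays off only if it raises the log-likelihood by more than one unit, which is one simple way to express the fit-versus-parsimony trade-off the agent is rewarded on.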