Combine and conquer: model averaging for out-of-distribution forecasting

📅 2025-06-04

🤖 AI Summary
Problem: Existing travel demand forecasting models predict poorly for out-of-distribution (OOD) trips, particularly long-distance trips, because the trips seen in deployment differ in distribution from those used in training. Method: We propose a distance-aware model averaging framework that combines econometric models, behavioral models from mathematical psychology, and data-driven machine learning models. Instead of the static weights used in conventional ensembles, the weights depend on how far a trip's distance lies from the range covered in training: the further outside that range, the more weight goes to models with stronger econometric or behavioral underpinnings. Contribution/Results: Two empirical case studies show improved predictive performance on the training and validation data and, crucially, for trips at distances outside the training range. The approach is also presented as more robust to distributional shift and easier to interpret after the fact.

📝 Abstract
Travel behaviour modellers have an increasingly diverse set of models at their disposal, ranging from traditional econometric structures to models from mathematical psychology and data-driven approaches from machine learning. A key question arises as to how well these different models perform in prediction, especially when considering trips of different characteristics from those used in estimation, i.e. out-of-distribution prediction, and whether better predictions can be obtained by combining insights from the different models. Across two case studies, we show that while data-driven approaches excel in predicting mode choice for trips within the distance bands used in estimation, beyond that range, the picture is fuzzy. To leverage the relative advantages of the different model families and capitalise on the notion that multiple 'weak' models can result in more robust models, we put forward the use of a model averaging approach that allocates weights to different model families as a function of the distance between the characteristics of the trip for which predictions are made, and those used in model estimation. Overall, we see that the model averaging approach gives larger weight to models with stronger behavioural or econometric underpinnings the more we move outside the interval of trip distances covered in estimation. Across both case studies, we show that our model averaging approach obtains improved performance both on the estimation and validation data, and crucially also when predicting mode choices for trips of distances outside the range used in estimation.
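
One way to read the weighting idea in the abstract is sketched below. The abstract only states that weights are a function of the distance between a trip's characteristics and those used in estimation; the softmax form, the symbols $\alpha_m$ and $\beta_m$, and the interval bounds $\ell$ and $u$ are assumptions made here for illustration, not the paper's specification.

```latex
% Assumed notation: x_n is the distance of trip n, [\ell, u] the distance
% interval covered in estimation, P_{m,n}(i) the probability that model m
% assigns to mode i for trip n.
\begin{align}
  d_n &= \max\bigl(0,\; \ell - x_n,\; x_n - u\bigr), \\
  w_m(d_n) &= \frac{\exp(\alpha_m + \beta_m d_n)}{\sum_{m'} \exp(\alpha_{m'} + \beta_{m'} d_n)}, \\
  \bar{P}_n(i) &= \sum_{m} w_m(d_n)\, P_{m,n}(i).
\end{align}
```

Under this reading, $d_n = 0$ for trips inside the estimation interval, so the weights there are fixed by the $\alpha_m$ alone, and a model with a larger $\beta_m$ gains weight as a trip moves further outside the interval.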
Problem

Research questions and friction points this paper is trying to address.

Evaluate model performance for out-of-distribution travel behavior prediction
Combine diverse models to improve robustness in forecasting mode choices
Propose distance-based model averaging to enhance prediction accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Model averaging combines diverse model families
Weights adjust based on trip distance characteristics
Improves out-of-distribution mode choice prediction
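
The three points above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the softmax weighting, the function names, the two-model setup, and every numeric value are hypothetical choices made only to show weights shifting between model families as a trip moves outside the estimation distance band.

```python
import numpy as np

def ood_distance(trip_km, lo_km, hi_km):
    """Distance (km) by which each trip falls outside [lo_km, hi_km]; 0 inside."""
    trip_km = np.asarray(trip_km, dtype=float)
    return np.maximum(lo_km - trip_km, 0.0) + np.maximum(trip_km - hi_km, 0.0)

def model_weights(d, intercepts, slopes):
    """Softmax weights per trip and model, shape (n_trips, n_models).

    The softmax form is an assumption for illustration only.
    """
    d = np.asarray(d, dtype=float)
    intercepts = np.asarray(intercepts, dtype=float)
    slopes = np.asarray(slopes, dtype=float)
    scores = intercepts[None, :] + slopes[None, :] * d[:, None]
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=1, keepdims=True)

def average_predictions(probs, weights):
    """Weighted average of mode-choice probabilities.

    probs:   (n_models, n_trips, n_modes)
    weights: (n_trips, n_models)
    returns: (n_trips, n_modes)
    """
    return np.einsum("tm,mtk->tk", weights, np.asarray(probs, dtype=float))

# Hypothetical setup: model 0 is data-driven, model 1 is econometric.
# Estimation data covered trips of 1 to 20 km; we predict a 5 km and a 40 km trip.
d = ood_distance([5.0, 40.0], lo_km=1.0, hi_km=20.0)
weights = model_weights(d, intercepts=[1.0, 0.0], slopes=[0.0, 0.2])

# Made-up predicted probabilities over two modes for the two trips.
probs = [
    [[0.7, 0.3], [0.9, 0.1]],  # data-driven model
    [[0.6, 0.4], [0.4, 0.6]],  # econometric model
]
averaged = average_predictions(probs, weights)
```

With these made-up parameters the data-driven model receives the larger weight for the 5 km trip and the econometric model the larger weight for the 40 km trip. In practice the intercepts and slopes would be estimated from data rather than set by hand.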