🤖 AI Summary
This paper addresses the insufficient multi-scale representation learning in time-series forecasting, specifically for markdown pricing prediction. Methodologically: (1) it introduces a novel hierarchical time-series patching mechanism to enable multi-granularity local modeling; (2) it designs a multi-scale embedding and fusion strategy for time-varying known covariates; (3) it incorporates a cross-sequence Mixer module to enhance global dependency modeling; and (4) it constructs a scalable hierarchical output head. Evaluated on a large-scale real-world markdown forecasting task from a major retailer, the proposed Multi-Resolution Transformer significantly outperforms both the production baseline and state-of-the-art deep models—including Informer and Autoformer—demonstrating the effectiveness of multi-resolution representations in capturing long-horizon, non-stationary markdown signals. The work establishes a scalable, multi-scale modeling paradigm for time-series forecasting.
📝 Abstract
We propose a transformer architecture for time series forecasting with a focus on time series tokenisation, and apply it to a real-world prediction problem from the pricing domain. Our architecture aims to learn effective representations at many scales across all available data simultaneously. The model contains a number of novel modules: a differentiated form of time series patching that employs multiple resolutions, a multiple-resolution module for time-varying known variables, a mixer-based module for capturing cross-series information, and a novel output head with favourable scaling to account for the increased number of tokens. We present an application of this model to a real-world prediction problem faced by the markdown team at a very large retailer. In the experiments conducted, our model outperforms the in-house models and the selected existing deep learning architectures.
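To make the multi-resolution patching idea concrete, here is a minimal sketch of how a series might be tokenised at several patch lengths at once, with each resolution getting its own linear patch embedding and the resulting token streams concatenated. The class name, patch sizes, and embedding dimension are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class MultiResolutionPatching(nn.Module):
    """Hypothetical sketch: tokenise a series with patches of several lengths."""

    def __init__(self, patch_sizes=(4, 8, 16), d_model=64):
        super().__init__()
        self.patch_sizes = patch_sizes
        # one linear patch-embedding per resolution (assumed design)
        self.embed = nn.ModuleList(nn.Linear(p, d_model) for p in patch_sizes)

    def forward(self, x):  # x: (batch, seq_len)
        tokens = []
        for p, emb in zip(self.patch_sizes, self.embed):
            n = x.shape[1] // p  # drop any remainder for simplicity
            patches = x[:, : n * p].reshape(x.shape[0], n, p)
            tokens.append(emb(patches))  # (batch, n, d_model)
        # concatenate token streams from all resolutions into one sequence
        return torch.cat(tokens, dim=1)

# usage: a 96-step series yields 96//4 + 96//8 + 96//16 = 42 tokens
x = torch.randn(2, 96)
tok = MultiResolutionPatching()(x)
print(tok.shape)  # torch.Size([2, 42, 64])
```

Note that the longer token sequence produced by stacking resolutions is exactly why the abstract motivates an output head with favourable scaling in the number of tokens.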