🤖 AI Summary
This work addresses the performance degradation of time series foundation models (TSFMs) in zero-shot forecasting due to domain shifts. To mitigate this issue, the authors propose MixFT, a novel approach that leverages a Bayesian mixture model to partition source data into more homogeneous latent subdomains. For each inferred subdomain, a lightweight LoRA module is independently fine-tuned, enabling fine-grained adaptation of the TSFM. Unlike conventional strategies that apply uniform fine-tuning across the entire dataset or rely on predefined dataset splits, MixFT more accurately captures the distinct characteristics of individual subdomains. Extensive experiments demonstrate that MixFT significantly outperforms existing methods across multiple zero-shot forecasting tasks, thereby validating the efficacy and novelty of subdomain-specialized fine-tuning.
📝 Abstract
Time series foundation models (TSFMs) have become increasingly popular for zero-shot forecasting. However, for a new time series domain not fully covered by the pretraining set, performance can suffer. Therefore, when a practitioner cares about a new domain and has access to a set of related datasets, the question arises: how best to fine-tune a TSFM to improve zero-shot forecasting? A typical approach to this type of problem is to fine-tune a LoRA module on all datasets or separately on each dataset. Tuning a separate module on each dataset allows for the specialisation of the TSFM to different types of data distribution, by selecting differing combinations of per-dataset modules for different time series contexts. However, we find that using per-dataset modules might not be optimal, since a single time series dataset can contain data from several types of distributions, i.e., sub-domains. This can arise from the distribution shifting over time, or from different dimensions of the time series following different distributions. Hence, we propose MixFT, which re-divides the data using Bayesian mixtures into sets that best represent the sub-domains present in the data, and fine-tunes separately on each of these sets. This re-division of the data ensures that each set is more homogeneous, leading to fine-tuned modules focused on specific sub-domains. Our experiments show that MixFT performs better than both per-dataset methods and fine-tuning a single module on all the data. This suggests that by re-partitioning the data to represent sub-domains, we can better specialise TSFMs to improve zero-shot forecasting.
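The abstract does not give implementation details, but the partitioning idea can be illustrated with a minimal sketch: summarise each time series window with a simple feature, fit a mixture model to discover latent sub-domains, and assign each window to its most likely component. Here a plain EM-fit Gaussian mixture stands in for the paper's Bayesian mixture, and `finetune_lora` is a hypothetical helper, not part of any stated API; the synthetic "dataset" mixing two variance regimes is likewise an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical source dataset that mixes two latent sub-domains:
# low-variance and high-variance series in the same dataset.
windows = np.concatenate([
    rng.normal(0.0, 0.3, size=(50, 64)),   # sub-domain A
    rng.normal(0.0, 2.0, size=(50, 64)),   # sub-domain B
])

# Simple per-window feature: log of the sample standard deviation.
feats = np.log(windows.std(axis=1))

def fit_gmm_1d(x, k=2, iters=100):
    """Plain EM for a 1-D Gaussian mixture — a simplification of the
    Bayesian mixture used in the paper for sub-domain discovery."""
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))  # spread initial means
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        logp = (-0.5 * ((x[:, None] - mu) ** 2) / var
                - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, variances.
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return r.argmax(axis=1)

labels = fit_gmm_1d(feats, k=2)

# Each inferred sub-domain would then get its own lightweight LoRA module:
# for sd in np.unique(labels):
#     finetune_lora(tsfm, windows[labels == sd])   # hypothetical helper
```

The key point of the sketch is that the partition is inferred from the data rather than taken from dataset boundaries, so each resulting set is more homogeneous than the original dataset split.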