On Creating a Causally Grounded Usable Rating Method for Assessing the Robustness of Foundation Models Supporting Time Series

📅 2025-02-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the lack of robustness evaluation for Foundation Models for Time Series (FMTS) under input perturbations. It proposes the first interpretable, multi-dimensional rating framework integrating causal inference. Methodologically, it designs causal sensitivity analysis and perturbation robustness quantification metrics to systematically compare multi-modal versus uni-modal FMTS, and task-specific versus general-purpose pre-trained FMTS. Practical utility is validated via user studies and interactive visualizations. Key contributions include: (1) the first application of causal inference to FMTS robustness assessment; (2) a standardized, actionable, and interpretable rating system; (3) empirical evidence that multi-modal and task-specific FMTS exhibit superior robustness and accuracy; and (4) a substantial reduction in cross-model comparison effort, which supports trustworthiness in high-stakes domains such as finance.

📝 Abstract

Foundation Models (FMs) have improved time series forecasting in various sectors, such as finance, but their vulnerability to input disturbances can hinder their adoption by stakeholders, such as investors and analysts. To address this, we propose a causally grounded rating framework to study the robustness of Foundation Models for Time Series (FMTS) with respect to input perturbations. We evaluate our approach on the stock price prediction problem, a well-studied problem with easily accessible public data, testing six state-of-the-art (some multi-modal) FMTS across six prominent stocks spanning three industries. The ratings proposed by our framework effectively assess the robustness of FMTS and also offer actionable insights for model selection and deployment. Within the scope of our study, we find that (1) multi-modal FMTS exhibit better robustness and accuracy than their uni-modal versions, and (2) FMTS pre-trained on the time series forecasting task exhibit better robustness and forecasting accuracy than general-purpose FMTS pre-trained across diverse settings. Further, to validate our framework's usability, we conduct a user study showcasing FMTS prediction errors along with our computed ratings. The study confirmed that our ratings made it easier for users to compare the robustness of different systems.
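
To make the idea of robustness "with respect to input perturbations" concrete, the following is a minimal sketch of one way to compare forecast error on clean and perturbed inputs. It is not the paper's causal rating method. The two forecasters, the Gaussian-noise perturbation, the synthetic random-walk "prices", and all function names are assumptions chosen for illustration only.

```python
import random
import statistics


def naive_forecast(history):
    """Hypothetical stand-in model: predict the last observed value."""
    return history[-1]


def mean_forecast(history, window=5):
    """Hypothetical stand-in model: predict the mean of the last `window` values."""
    return statistics.fmean(history[-window:])


def perturb(history, scale, rng):
    """Add zero-mean Gaussian noise to every input value (one assumed perturbation type)."""
    return [x + rng.gauss(0.0, scale) for x in history]


def robustness_gap(model, series, context=20, scale=1.0, seed=0):
    """Return (clean MAE, perturbed MAE, gap) for rolling one-step-ahead forecasts."""
    rng = random.Random(seed)
    clean_err, pert_err = [], []
    for t in range(context, len(series)):
        history, target = series[t - context:t], series[t]
        clean_err.append(abs(model(history) - target))
        pert_err.append(abs(model(perturb(history, scale, rng)) - target))
    clean = statistics.fmean(clean_err)
    pert = statistics.fmean(pert_err)
    return clean, pert, pert - clean


# Synthetic random-walk series standing in for a stock price history (not real data).
_rng = random.Random(42)
prices = [100.0]
for _ in range(199):
    prices.append(prices[-1] + _rng.gauss(0.0, 1.0))

for name, model in [("naive", naive_forecast), ("mean", mean_forecast)]:
    clean, pert, gap = robustness_gap(model, prices, scale=2.0)
    print(f"{name}: clean MAE={clean:.3f} perturbed MAE={pert:.3f} gap={gap:.3f}")
```

A smaller gap between perturbed and clean error suggests a model that is less sensitive to this particular perturbation. A rating framework such as the one described in the abstract would go further, for example by accounting for confounders, which this sketch does not attempt.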

Problem

Research questions and friction points this paper is trying to address.

Assess robustness of Foundation Models for Time Series

Propose causally grounded rating framework for FMTS

Evaluate FMTS robustness in stock price prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Causally grounded rating framework

Robustness assessment for FMTS

Multi-modal FMTS outperform uni-modal

🔎 Similar Papers

A Reliable Framework for Human-in-the-Loop Anomaly Detection in Time Series