pyFAST: A Modular PyTorch Framework for Time Series Modeling with Multi-source and Sparse Data

📅 2025-08-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing Python time-series frameworks lack native support for irregular, multi-source, and sparse temporal data, which limits modeling flexibility and scalability. To address this, the authors propose pyFAST, a modular time-series modeling framework that decouples data processing from model computation, enabling loading of irregularly sampled data, dynamic normalization, mask-based modeling, and exogenous covariate integration. It introduces an alignment-free fusion mechanism for sparse data sources with efficient sequence- or patch-level padding, alongside dedicated sparsity-aware metrics and loss functions. Built on PyTorch, the framework unifies linear models, CNNs, RNNs, Transformers, GNNs, and LLM-inspired architectures, and incorporates batch-based streaming aggregation for evaluation and hardware-aware optimization strategies. Experiments demonstrate superior scalability and computational efficiency on multi-source heterogeneous and sparse time-series forecasting and imputation tasks.
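As an illustration of how dynamic normalization can interact with masks on sparse data, here is a minimal sketch. It assumes that "dynamic" means per-window statistics computed from observed values only; the function name `masked_standardize` and its signature are hypothetical and are not taken from pyFAST. Plain Python lists stand in for PyTorch tensors so the example runs without dependencies.

```python
import math


def masked_standardize(values, mask, eps=1e-8):
    """Standardize one window using statistics of observed entries only.

    Hypothetical helper, not part of pyFAST. `mask[i]` is 1 when
    `values[i]` was actually observed and 0 when it is missing.
    """
    observed = [v for v, m in zip(values, mask) if m]
    if not observed:
        # Nothing observed: return zeros and identity statistics.
        return [0.0] * len(values), 0.0, 1.0
    mean = sum(observed) / len(observed)
    var = sum((v - mean) ** 2 for v in observed) / len(observed)
    std = math.sqrt(var) + eps
    # Missing positions stay at 0.0 so they do not carry placeholder values.
    normed = [(v - mean) / std if m else 0.0 for v, m in zip(values, mask)]
    return normed, mean, std


window = [2.0, 0.0, 4.0, 0.0]   # zeros at masked positions are placeholders
mask = [1, 0, 1, 0]
normed, mean, std = masked_standardize(window, mask)
```

Returning `mean` and `std` alongside the normalized window lets a caller invert the transform on model outputs.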

📝 Abstract

Modern time series analysis demands frameworks that are flexible, efficient, and extensible. However, many existing Python libraries exhibit limitations in modularity and in their native support for irregular, multi-source, or sparse data. We introduce pyFAST, a research-oriented PyTorch framework that explicitly decouples data processing from model computation, fostering a cleaner separation of concerns and facilitating rapid experimentation. Its data engine is engineered for complex scenarios, supporting multi-source loading, protein sequence handling, efficient sequence- and patch-level padding, dynamic normalization, and mask-based modeling for both imputation and forecasting. pyFAST integrates LLM-inspired architectures for the alignment-free fusion of sparse data sources and offers native sparse metrics, specialized loss functions, and flexible exogenous data fusion. Training utilities include batch-based streaming aggregation for evaluation and device synergy to maximize computational efficiency. A comprehensive suite of classical and deep learning models (Linears, CNNs, RNNs, Transformers, and GNNs) is provided within a modular architecture that encourages extension. Released under the MIT license on GitHub, pyFAST provides a compact yet powerful platform for advancing time series research and applications.
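To make sequence-level padding and batch-based streaming aggregation more concrete, the following sketch pads variable-length series to a common length, builds a matching mask, and accumulates a masked mean absolute error across batches without storing all predictions. The names `pad_sequences` and `StreamingMaskedMAE` are hypothetical illustrations, not pyFAST's API, and nested Python lists stand in for PyTorch tensors.

```python
def pad_sequences(seqs, pad_value=0.0):
    """Right-pad each sequence to the batch maximum and return a 0/1 mask."""
    max_len = max(len(s) for s in seqs)
    padded = [list(s) + [pad_value] * (max_len - len(s)) for s in seqs]
    masks = [[1] * len(s) + [0] * (max_len - len(s)) for s in seqs]
    return padded, masks


class StreamingMaskedMAE:
    """Accumulate absolute error over valid positions, one batch at a time."""

    def __init__(self):
        self.abs_err_sum = 0.0
        self.count = 0

    def update(self, preds, targets, masks):
        for p_row, t_row, m_row in zip(preds, targets, masks):
            for p, t, m in zip(p_row, t_row, m_row):
                if m:  # skip padded or missing positions
                    self.abs_err_sum += abs(p - t)
                    self.count += 1

    def compute(self):
        return self.abs_err_sum / self.count if self.count else 0.0


# Two batches of variable-length targets; the stand-in "model" predicts 2.0.
batches = [[[1.0, 2.0, 3.0], [4.0]], [[2.0, 5.0]]]
metric = StreamingMaskedMAE()
for raw_targets in batches:
    targets, masks = pad_sequences(raw_targets)
    preds = [[2.0] * len(row) for row in targets]
    metric.update(preds, targets, masks)
result = metric.compute()
```

Because only a running sum and a count are kept, the final value matches what a single pass over all valid positions would give, regardless of how the data is split into batches.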

Problem

Research questions and friction points this paper is trying to address.

Handling irregular, multi-source, and sparse time series data

Decoupling data processing from model computation

Supporting alignment-free fusion of sparse data sources

Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular PyTorch framework for multi-source sparse data

Decouples data processing from model computation

Integrates LLM-inspired architectures for alignment-free fusion
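The separation of data processing from model computation listed above can be sketched as follows. The data side builds a mask-based imputation sample by hiding some observed entries, and the model side only consumes inputs and masks, so any model with the same call signature can be swapped in. Both `make_imputation_sample` and `mean_imputer` are hypothetical names chosen for this illustration and are not pyFAST functions; the hiding strategy is an assumption.

```python
import random


def make_imputation_sample(values, observed_mask, hide_ratio=0.5, seed=0):
    """Data side: hide a share of the observed entries as imputation targets."""
    rng = random.Random(seed)
    observed_idx = [i for i, m in enumerate(observed_mask) if m]
    n_hide = int(len(observed_idx) * hide_ratio)
    hidden = set(rng.sample(observed_idx, n_hide))
    # Visible to the model: observed and not hidden.
    input_mask = [1 if (m and i not in hidden) else 0
                  for i, m in enumerate(observed_mask)]
    # Scored by the loss: entries that were observed but deliberately hidden.
    target_mask = [1 if i in hidden else 0 for i in range(len(values))]
    inputs = [v if m else 0.0 for v, m in zip(values, input_mask)]
    return inputs, input_mask, target_mask


def mean_imputer(inputs, input_mask):
    """Model side: a trivial stand-in that fills gaps with the visible mean."""
    visible = [x for x, m in zip(inputs, input_mask) if m]
    fill = sum(visible) / len(visible) if visible else 0.0
    return [x if m else fill for x, m in zip(inputs, input_mask)]


values = [1.0, 2.0, 0.0, 4.0, 5.0]   # index 2 was never observed
observed_mask = [1, 1, 0, 1, 1]
inputs, input_mask, target_mask = make_imputation_sample(values, observed_mask)
output = mean_imputer(inputs, input_mask)
```

The imputer never sees how the masks were produced, which is the point of the separation: the masking policy can change without touching the model code.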

🔎 Similar Papers

Deep Time Series Models: A Comprehensive Survey and Benchmark