🤖 AI Summary
Existing Python time-series frameworks lack native support for irregular, multi-source, and sparse temporal data, limiting modeling flexibility and scalability. To address this, we propose a modular time-series modeling framework that decouples data preprocessing from model computation, enabling irregular-sampling data loading, dynamic normalization, mask-driven modeling, and external covariate integration. We introduce an alignment-agnostic sparse data fusion mechanism supporting efficient sequence- or block-level padding, alongside dedicated sparsity-aware metrics and loss functions. Built on PyTorch, the framework unifies linear models, CNNs, RNNs, Transformers, GNNs, and LLM-inspired architectures, and incorporates batch-stream co-training and hardware-aware optimization strategies. Experiments demonstrate superior scalability and computational efficiency on multi-source heterogeneous and sparse time-series forecasting and imputation tasks.
📝 Abstract
Modern time series analysis demands frameworks that are flexible, efficient, and extensible. However, many existing Python libraries exhibit limitations in modularity and in their native support for irregular, multi-source, or sparse data. We introduce pyFAST, a research-oriented PyTorch framework that explicitly decouples data processing from model computation, fostering a cleaner separation of concerns and facilitating rapid experimentation. Its data engine is engineered for complex scenarios, supporting multi-source loading, protein sequence handling, efficient sequence- and patch-level padding, dynamic normalization, and mask-based modeling for both imputation and forecasting. pyFAST integrates LLM-inspired architectures for the alignment-free fusion of sparse data sources and offers native sparse metrics, specialized loss functions, and flexible exogenous data fusion. Training utilities include batch-based streaming aggregation for evaluation and device synergy to maximize computational efficiency. A comprehensive suite of classical and deep learning models (Linears, CNNs, RNNs, Transformers, and GNNs) is provided within a modular architecture that encourages extension. Released under the MIT license at GitHub, pyFAST provides a compact yet powerful platform for advancing time series research and applications.