🤖 AI Summary
Scientific instruments generate data at increasingly high rates, far exceeding local computational capacity and file-transfer bandwidth, rendering file-based remote HPC analysis inadequate for time-critical experiments. To address this, we propose the first quantitative feasibility assessment framework for scientific stream processing. Our core contribution is the Streaming Speed Score (SSS), a unified metric that jointly models data generation rate, network transmission efficiency, end-to-end processing latency, and I/O overhead to identify the regimes where streaming outperforms batch file transfer. The framework integrates streaming dataflow execution, analytical latency modeling, and empirical measurements of network and storage performance, and is validated on real-world datasets from large-scale facilities such as LCLS-II. Experiments demonstrate that streaming reduces end-to-end completion time by up to 97% under high data rates, while under extreme network congestion file-transfer latency increases more than tenfold, highlighting the need for dynamic, context-aware feasibility evaluation.
📝 Abstract
Modern scientific instruments generate data at rates that increasingly exceed local compute capacity and, when combined with the staging and I/O overheads of file-based transfers, render file-based use of remote HPC resources impractical for time-sensitive analysis and experimental steering. Real-time streaming frameworks promise to reduce latency and improve system efficiency, but there is no principled way to assess when they are feasible. In this work, we introduce a quantitative framework and an accompanying Streaming Speed Score to evaluate whether remote high-performance computing (HPC) resources can provide timely data processing compared to local alternatives. Our model incorporates key parameters, including data generation rate, transfer efficiency, remote processing power, and file input/output overhead, to compute total processing completion time and identify operational regimes where streaming is beneficial. We motivate our methodology with use cases from facilities such as APS, FRIB, LCLS-II, and the LHC, and validate our approach through an illustrative case study based on LCLS-II data. Our measurements show that streaming can achieve up to 97% lower end-to-end completion time than file-based methods under high data rates, while worst-case congestion can increase transfer times by over an order of magnitude, underscoring the importance of tail latency in streaming feasibility decisions.
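The trade-off the abstract describes, sequential file staging and transfer versus pipelined streaming, can be sketched with a back-of-the-envelope completion-time model. All function names, parameter names, and the simple linear cost model below are illustrative assumptions, not the paper's actual Streaming Speed Score formulation:

```python
# Hypothetical completion-time model comparing file-based and streaming
# analysis paths. Data volumes are in Gbit, rates in Gbit/s, overheads in
# seconds; all numbers and names are assumptions for illustration only.

def file_based_time(data_gbit, gen_rate, transfer_rate, proc_rate, io_overhead_s):
    """Sequential pipeline: acquire, write, transfer, read, then process."""
    return (data_gbit / gen_rate            # data acquisition
            + io_overhead_s                 # staging/write to files
            + data_gbit / transfer_rate     # bulk file transfer
            + io_overhead_s                 # read back on the HPC side
            + data_gbit / proc_rate)        # remote processing

def streaming_time(data_gbit, gen_rate, transfer_rate, proc_rate, startup_s=0.0):
    """Pipelined stages overlap, so the slowest stage dominates."""
    bottleneck = min(gen_rate, transfer_rate, proc_rate)
    return startup_s + data_gbit / bottleneck

if __name__ == "__main__":
    # Hypothetical high-rate scenario: 1000 Gbit of data, 10 Gbit/s
    # generation and transfer, 20 Gbit/s remote processing, 300 s I/O staging.
    t_file = file_based_time(1000, 10, 10, 20, io_overhead_s=300)
    t_stream = streaming_time(1000, 10, 10, 20)
    print(f"file-based: {t_file:.0f} s, streaming: {t_stream:.0f} s")
```

Under these made-up numbers the sequential path pays the I/O overhead twice and serializes transfer after acquisition, while the streaming path is limited only by its slowest stage; the gap widens as data rates rise, which is the regime where the abstract reports the largest streaming advantage.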