🤖 AI Summary
Seismic stream data arrive at high velocity, in massive volume, and with strong temporal dependence, yet data reduction methods for streaming autoregressive time series, and their statistical properties, remain little studied. Method: This paper proposes Sequential Leveraging Sampling (SLS), an online sequential sampling framework based on streaming leverage scores. SLS selects a single consecutively recorded block from the stream: the starting point is drawn by a random mechanism driven by the streaming leverage scores, and the block length is set by a sequential stopping rule. The normalized least squares estimator computed on this block is shown to be asymptotically normal in both linear and nonlinear autoregressive settings, which supports statistical inference on streaming time series from a reduced sample. Contribution/Results: Applied to the 2023 Turkey–Syria earthquake doublet (macroseismic scale) and Oklahoma seismic data (microseismic scale), the method identifies seismic events efficiently and characterizes their temporal dependence structure. Simulation studies evaluate its empirical performance.
📝 Abstract
Seismic data contain complex temporal information, arrive at high speed, and have a large, potentially unbounded volume. The explosion of temporally correlated streaming data from advanced seismic sensors poses analytical challenges because of its sheer volume and real-time nature. Sampling, or data reduction, is a natural yet powerful tool for handling large streaming data while balancing estimation accuracy and computational cost. Currently, data reduction methods for streaming data, especially streaming autoregressive time series, and their statistical properties are not well studied in the literature. In this article, we propose an online leverage-based sequential data reduction algorithm for streaming autoregressive time series, with application to seismic data. The proposed Sequential Leveraging Sampling (SLS) method selects only one consecutively recorded block from the data stream for inference. The starting point of the SLS block is chosen by a random mechanism based on the streaming leverage scores of the data, while the block size is determined by a sequential stopping rule. The SLS block uses samples efficiently: we establish asymptotic normality of the normalized least squares estimator in both linear and nonlinear autoregressive settings. The SLS method is applied to two seismic datasets: the 2023 Turkey-Syria earthquake doublet data on the macroseismic scale and the Oklahoma seismic data on the microseismic scale. We demonstrate the ability of the SLS method to efficiently identify seismic events and elucidate their intricate temporal dependence structure. Simulation studies are presented to evaluate the empirical performance of the SLS method.
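To make the block-selection idea concrete, the following is a minimal sketch for the simplest case, a linear AR(1) stream. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not give the exact form of the streaming leverage score, the random start mechanism, or the stopping rule. Here the score of the current predictor is assumed to be its square divided by the running sum of squares, a block is opened with probability proportional to that score, and the block is closed once its accumulated sum of squared predictors reaches a threshold. The names `sls_ar1`, `info_threshold`, `start_scale`, and `burn_in` are hypothetical.

```python
import numpy as np


def simulate_ar1(phi, n, rng):
    """Simulate y_t = phi * y_{t-1} + eps_t with standard normal noise."""
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = phi * y[t - 1] + rng.standard_normal()
    return y


def sls_ar1(stream, info_threshold, start_scale, burn_in, rng):
    """One-pass selection of a single consecutive block from an AR(1) stream.

    Assumptions (not taken from the paper):
      * streaming leverage score of predictor x at time t is
        x**2 / (sum of squared predictors observed up to time t);
      * after `burn_in` observations, a block starts at time t with
        probability min(1, start_scale * score);
      * the block stops once its own sum of squared predictors
        reaches `info_threshold`.
    Returns (estimate, start, stop), or None if the stream ends first.
    """
    running_ss = 0.0          # sum of squared predictors seen so far
    block_xx = block_xy = 0.0  # sufficient statistics of the block
    start = None
    prev = None
    for t, y in enumerate(stream):
        if prev is not None:
            x = prev
            running_ss += x * x
            if start is None and t >= burn_in and running_ss > 0.0:
                score = x * x / running_ss
                if rng.random() < min(1.0, start_scale * score):
                    start = t
            if start is not None:
                block_xx += x * x
                block_xy += x * y
                if block_xx >= info_threshold:
                    # least squares estimate of phi from the block only
                    return block_xy / block_xx, start, t
        prev = y
    return None


rng = np.random.default_rng(0)
y = simulate_ar1(0.5, 20_000, rng)
result = sls_ar1(y, info_threshold=500.0, start_scale=5.0, burn_in=200, rng=rng)
```

With unit noise variance, stopping at a fixed sum of squared predictors gives the block estimate a standard error of roughly `1 / sqrt(info_threshold)` regardless of where the block starts, which is the sense in which a stopping rule, rather than a fixed block length, controls accuracy in this sketch. Only running sums are kept, so memory use does not grow with the length of the stream.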