🤖 AI Summary
Existing Computation-Enabled Object Storage (COS) systems face three key bottlenecks when running SQL analytics over large-scale scientific tabular data in HPC environments: rigid output formats, limited operator pushdown, and poor adaptation to deep storage hierarchies. To address these, the paper proposes OASIS, a COS system for near-data SQL analytics in HPC. It features: (1) flexible output formats, including the Arrow columnar layout; (2) offloading of complex operators (e.g., aggregate, sort) and array-aware expressions; and (3) dynamic execution path selection across internal storage layers, guided by operator characteristics and data movement costs. OASIS adopts object-level storage organization and is integrated with Apache Spark. Evaluated on real-world scientific queries from HPC workflows, it achieves up to a 32.7% performance improvement over Spark configured with existing COS-based storage systems.
📝 Abstract
Computation-Enabled Object Storage (COS) systems, such as MinIO and Ceph, have recently emerged as promising storage solutions for post hoc, SQL-based analysis on large-scale datasets in High-Performance Computing (HPC) environments. By supporting object-granular layouts, COS facilitates column-oriented access and enables in-storage execution of data reduction operators, such as filters, close to where the data resides. Despite growing interest and adoption, existing COS systems exhibit several fundamental limitations that hinder their effectiveness. First, they impose rigid constraints on output data formats, limiting flexibility and interoperability. Second, they support offloading for only a narrow set of operators and expressions, restricting their applicability to more complex analytical tasks. Third, and perhaps most critically, they lack design strategies for optimizing compute offloading to the characteristics of deep storage hierarchies. To address these challenges, this paper proposes OASIS, a novel COS system that features: (i) flexible and interoperable output delivery through diverse formats, including columnar layouts such as Arrow; (ii) broad support for complex operators (e.g., aggregate, sort) and array-aware expressions, including element-wise predicates over array structures; and (iii) dynamic selection of optimal execution paths across internal storage layers, guided by operator characteristics and data movement costs. We implemented a prototype of OASIS and integrated it into the Spark analytics framework. Through extensive evaluation using real-world scientific queries from HPC workflows, OASIS achieves up to a 32.7% performance improvement over Spark configured with existing COS-based storage systems.
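To make feature (iii) more concrete, the following is a minimal sketch of cost-based placement of an operator across storage layers. It is not the OASIS implementation: the layer names, hop counts, slowdown factors, selectivities, and the cost formula are all hypothetical values chosen for illustration. The only idea taken from the abstract is that the choice depends on operator characteristics and data movement costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """A place where an operator could run (hypothetical model)."""
    name: str
    hops_from_data: int   # hops the input must travel to reach this layer
    hops_to_client: int   # hops the output must travel to reach the client
    slowdown: float       # relative compute slowdown (1.0 = fastest layer)


@dataclass(frozen=True)
class Operator:
    """An operator described by two assumed characteristics."""
    name: str
    selectivity: float    # output bytes / input bytes
    cpu_per_byte: float   # relative compute work per input byte


def path_cost(op: Operator, layer: Layer, input_bytes: float,
              transfer_cost: float = 1.0) -> float:
    """Toy cost: move input in, run the operator, move output out."""
    output_bytes = input_bytes * op.selectivity
    move_in = input_bytes * layer.hops_from_data * transfer_cost
    move_out = output_bytes * layer.hops_to_client * transfer_cost
    compute = input_bytes * op.cpu_per_byte * layer.slowdown
    return move_in + compute + move_out


def choose_layer(op: Operator, layers: list[Layer], input_bytes: float) -> Layer:
    """Pick the layer with the lowest estimated cost for this operator."""
    return min(layers, key=lambda layer: path_cost(op, layer, input_bytes))


# Hypothetical three-level hierarchy, from closest to the data to farthest.
LAYERS = [
    Layer("storage_device", hops_from_data=0, hops_to_client=2, slowdown=4.0),
    Layer("storage_server", hops_from_data=1, hops_to_client=1, slowdown=2.0),
    Layer("compute_node", hops_from_data=2, hops_to_client=0, slowdown=1.0),
]

# Hypothetical operator profiles: a selective, cheap filter and a costly sort.
FILTER_OP = Operator("filter", selectivity=0.01, cpu_per_byte=0.1)
SORT_OP = Operator("sort", selectivity=1.0, cpu_per_byte=2.0)

if __name__ == "__main__":
    for op in (FILTER_OP, SORT_OP):
        best = choose_layer(op, LAYERS, input_bytes=1000)
        print(f"{op.name}: run at {best.name}")
```

Under these assumed numbers, the highly selective filter is cheapest to run next to the data because it shrinks the output before any transfer, while the compute-heavy sort, which does not reduce data volume, is cheapest on the fastest layer. A real system would need measured costs rather than fixed constants.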