🤖 AI Summary
To address the poor scalability and low query efficiency of traditional databases in large-scale heterogeneous data environments, this paper proposes IDSS—a decentralized data storage service leveraging peer-to-peer (P2P) networks and embedded relational databases. IDSS integrates lightweight relational engines (e.g., SQLite) directly into P2P nodes, enabling a distributed relational data layer under a unified logical schema. It introduces a distributed query processor supporting cross-node joins, aggregations, and predicate pushdown—without requiring centralized coordination. This design preserves relational semantics while achieving linear scalability. Experimental evaluation on TB-scale heterogeneous datasets demonstrates that IDSS improves complex query throughput by 1.8–3.2× over conventional distributed databases and reduces failure recovery time by 60%. Overall, IDSS significantly enhances efficiency, robustness, and scalability for massive data management.
📝 Abstract
Data is being generated at a rapidly increasing rate, which raises challenges for its management. Traditional database management systems scale poorly and are usually inefficient when dealing with large-scale, heterogeneous data. This paper introduces IDSS (InnoCyPES Data Storage Service), a novel large-scale data storage tool that leverages peer-to-peer networks and embedded relational databases. We present the IDSS architecture and design, and provide implementation details. The peer-to-peer framework supports distributed queries over a relational database architecture based on a common schema. Furthermore, we present methods that support complex distributed query processing, enabling robust and efficient management of vast amounts of data.
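The idea of peers that each embed a relational engine and share a common schema can be illustrated with a minimal sketch. This is not the IDSS implementation: the `readings` table, the `make_peer` and `distributed_avg` helpers, and the use of a plain Python list in place of a real peer-to-peer network are all assumptions made for illustration. Each "peer" is an in-memory SQLite database; the filter and a partial aggregate run locally on every peer, and only the small partial results are merged by the caller.

```python
import sqlite3

# Hypothetical common schema shared by every peer (not taken from the paper).
SCHEMA = "CREATE TABLE readings (sensor TEXT, value REAL)"


def make_peer(rows):
    """Create one simulated peer backed by its own in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    return conn


def distributed_avg(peers, min_value):
    """Per-sensor average of values >= min_value across all peers.

    The WHERE filter and the partial SUM/COUNT are evaluated inside each
    peer's SQLite engine; the caller only merges the partial aggregates.
    """
    partials = {}
    for conn in peers:  # stands in for sending the sub-query over the network
        cur = conn.execute(
            "SELECT sensor, SUM(value), COUNT(*) FROM readings "
            "WHERE value >= ? GROUP BY sensor",
            (min_value,),
        )
        for sensor, total, count in cur:
            s, c = partials.get(sensor, (0.0, 0))
            partials[sensor] = (s + total, c + count)
    # AVG cannot be merged directly, so it is rebuilt from SUM and COUNT.
    return {sensor: s / c for sensor, (s, c) in partials.items()}


peers = [
    make_peer([("a", 1.0), ("a", 3.0), ("b", 10.0)]),
    make_peer([("a", 5.0), ("b", 20.0), ("b", 0.5)]),
]
result = distributed_avg(peers, 1.0)
print(result)
```

Shipping `SUM` and `COUNT` rather than a per-peer `AVG` keeps the merged result exact even when peers hold different numbers of rows; a real deployment would also need peer discovery, failure handling, and a network transport, none of which this sketch attempts.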