🤖 AI Summary
This work addresses the lack of a unified, verifiable, data-centric paradigm for analyzing machine learning models. We propose, for the first time, treating trained models as *first-class data objects* within the relational database paradigm. To that end, we design a declarative query framework built on SQL extensions that integrates model structural metadata, behavioral interfaces, and interpretability modules, and that embeds directly in standard relational query engines. Our core contribution is to express critical model-assurance tasks, including fairness auditing, robustness verification, and data provenance tracing, as unified, composable, and formally verifiable SQL queries, thereby bridging the longstanding analytical divide between models and data. Experimental evaluation shows that the framework is both efficient and formally verifiable across diverse diagnostic tasks, establishing a foundation for rigorous, database-backed analysis of ML models.
📝 Abstract
We consider machine learning models, learned from data, to be an important, intensional kind of data in themselves. As such, various analysis tasks on models can be thought of as queries over this intensional data, often combined with extensional data such as training or validation data. We demonstrate that relational database systems and SQL can be well suited to many such tasks.
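To make the idea of querying a model as data concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is an illustration only, not the encoding or query language used in this work. The toy linear classifier, the table names (`weights`, `samples`, `labels`), and the data are all hypothetical. The model's parameters are stored as rows (intensional data), the validation set is stored alongside them (extensional data), and a single SQL query joins the two to find misclassified samples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Intensional data: a hypothetical linear classifier stored as rows.
    CREATE TABLE weights (feature TEXT PRIMARY KEY, weight REAL);
    INSERT INTO weights VALUES ('bias', -1.0), ('x1', 2.0), ('x2', -1.0);

    -- Extensional data: validation samples in long (id, feature, value) form.
    CREATE TABLE samples (id INTEGER, feature TEXT, value REAL);
    INSERT INTO samples VALUES
        (1, 'bias', 1.0), (1, 'x1', 1.0), (1, 'x2', 0.0),
        (2, 'bias', 1.0), (2, 'x1', 0.0), (2, 'x2', 1.0),
        (3, 'bias', 1.0), (3, 'x1', 1.0), (3, 'x2', 2.0);

    CREATE TABLE labels (id INTEGER PRIMARY KEY, label INTEGER);
    INSERT INTO labels VALUES (1, 1), (2, 0), (3, 1);
""")

# Model evaluation as a join plus aggregate: score = sum(value * weight).
PREDICTIONS = """
    SELECT s.id AS id,
           SUM(s.value * w.weight) AS score,
           CASE WHEN SUM(s.value * w.weight) > 0 THEN 1 ELSE 0 END AS predicted
    FROM samples AS s
    JOIN weights AS w ON w.feature = s.feature
    GROUP BY s.id
"""

predictions = conn.execute(PREDICTIONS + " ORDER BY id").fetchall()

# An analysis task phrased as a query over model and data together:
# which validation samples does the model misclassify?
misclassified = conn.execute(f"""
    SELECT p.id
    FROM ({PREDICTIONS}) AS p
    JOIN labels AS l ON l.id = p.id
    WHERE p.predicted <> l.label
    ORDER BY p.id
""").fetchall()

print(predictions)
print(misclassified)
```

In this toy setup, changing the model is an ordinary `UPDATE` on `weights`, and the same query then reports the new set of misclassified samples. That is the sense in which the model is treated as queryable data in this sketch.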