🤖 AI Summary
The widespread deployment of foundation models raises critical challenges concerning trustworthiness, safety, and human alignment. Method: This paper introduces "Model Science" as a new paradigm that places post-trained models at the center of analysis, systematically investigating how models can be verified, explained, controlled, and interacted with across diverse environments. It establishes a conceptual framework built on four pillars (verification, explanation, control, and interface) and proposes context-aware evaluation protocols, methods for probing internal mechanisms, model alignment techniques, and interactive visualization tools. Contribution/Results: The resulting integrated analysis framework enables rigorous assessment, transparent interpretation, and reliable control of foundation models. It provides both theoretical foundations and practical pathways for AI governance and engineering deployment, facilitating a paradigm shift from data-centric to model-centric science.
📝 Abstract
The growing adoption of foundation models calls for a paradigm shift from Data Science to Model Science. Unlike data-centric approaches, Model Science places the trained model at the core of analysis, aiming to interact with, verify, explain, and control its behavior across diverse operational contexts. This paper introduces a conceptual framework for a new discipline called Model Science, along with a proposal for its four key pillars: Verification, which requires strict, context-aware evaluation protocols; Explanation, understood as a range of approaches for exploring internal model operations; Control, which integrates alignment techniques to steer model behavior; and Interface, which develops interactive and visual explanation tools to improve human calibration and decision-making. The proposed framework aims to guide the development of credible, safe, and human-aligned AI systems.