AI Summary
The widespread deployment of generative foundation models (GenFMs) necessitates systematic, trustworthy evaluation and enhancement mechanisms.
Method: We propose the first full-stack framework for GenFM trustworthiness, integrating multidimensional governance guidelines, a dynamic evaluation platform (TrustGen), and an evolution roadmap. Our approach unifies technical, ethical, legal, and societal perspectives to formulate trustworthiness principles; introduces TrustGen, a modular, multimodal, iteratively updatable platform supporting metadata curation, test case generation, and contextual variation for fine-grained trust assessment; and grounds its design in global AI policy analysis and interdisciplinary co-modeling.
Contribution/Results: We systematically characterize advances and bottlenecks across robustness, fairness, and explainability dimensions; open-source the TrustGen toolkit; and deliver a reusable, extensible infrastructure for dynamic, evidence-based GenFM trust evaluation.
Abstract
Generative Foundation Models (GenFMs) have emerged as transformative tools. However, their widespread adoption raises critical concerns regarding trustworthiness across multiple dimensions. This paper presents a comprehensive framework to address these challenges through three key contributions. First, we systematically review global AI governance laws and policies from governments and regulatory bodies, as well as industry practices and standards. Based on this analysis, we propose a set of guiding principles for GenFMs, developed through extensive multidisciplinary collaboration that integrates technical, ethical, legal, and societal perspectives. Second, we introduce TrustGen, the first dynamic benchmarking platform designed to evaluate trustworthiness across multiple dimensions and model types, including text-to-image, large language, and vision-language models. TrustGen leverages modular components (metadata curation, test case generation, and contextual variation) to enable adaptive and iterative assessments, overcoming the limitations of static evaluation methods. Using TrustGen, we reveal significant progress in trustworthiness while identifying persistent challenges. Finally, we provide an in-depth discussion of the challenges and future directions for trustworthy GenFMs, highlighting the complex, evolving nature of trustworthiness, the nuanced trade-offs between utility and trustworthiness, and the requirements of various downstream applications, and we offer a strategic roadmap for future research. This work establishes a holistic framework for advancing trustworthiness in GenAI, paving the way for safer and more responsible integration of GenFMs into critical applications. To facilitate advancement in the community, we release the toolkit for dynamic evaluation.
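To make the modular pipeline concrete, the sketch below shows how metadata curation, test case generation, and contextual variation could compose into a dynamic evaluation loop. This is a minimal, hypothetical illustration: every function and class name here (`curate_metadata`, `generate_test_cases`, `contextual_variation`, `evaluate`) is invented for exposition and is not the actual TrustGen API.

```python
# Hypothetical sketch of a TrustGen-style dynamic evaluation pipeline.
# All names are illustrative assumptions, not the released toolkit's interface.
from dataclasses import dataclass
import random


@dataclass
class TestCase:
    prompt: str
    dimension: str  # e.g. "safety", "fairness", "robustness"


def curate_metadata(topic: str) -> list[str]:
    # Stand-in for metadata curation: gather seed attributes for a topic.
    return [f"{topic} attribute {i}" for i in range(3)]


def generate_test_cases(metadata: list[str], dimension: str) -> list[TestCase]:
    # Stand-in for test case generation: turn curated metadata into prompts.
    return [TestCase(prompt=f"Evaluate: {m}", dimension=dimension) for m in metadata]


def contextual_variation(case: TestCase, rng: random.Random) -> TestCase:
    # Stand-in for contextual variation: perturb prompts so models cannot
    # rely on having memorized a static benchmark.
    prefixes = ["In a formal tone, ", "As a domain expert, ", "Briefly, "]
    return TestCase(prompt=rng.choice(prefixes) + case.prompt,
                    dimension=case.dimension)


def evaluate(model_fn, topic: str, dimension: str, seed: int = 0) -> float:
    # Compose the three modules, then average per-case trust scores.
    rng = random.Random(seed)
    cases = [contextual_variation(c, rng)
             for c in generate_test_cases(curate_metadata(topic), dimension)]
    scores = [model_fn(c.prompt) for c in cases]
    return sum(scores) / len(scores)


# Toy "model" that assigns a fixed trust score to every prompt.
score = evaluate(lambda prompt: 1.0, topic="privacy", dimension="safety")
```

Because each stage is a plain function, the test set is regenerated (and perturbed) on every run, which is the key property distinguishing this design from a static benchmark.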