🤖 AI Summary
Problem: Machine learning model metadata lacks standardization, machine-readability, and explicit support for quantifying environmental impact. Method: We propose a structured, JSON-LD-compatible schema for ML model metadata designed for integration into a knowledge graph (KG), incorporating sustainability metrics such as energy consumption and carbon footprint into model card semantics. We construct a Neo4j-based KG that unifies metadata extraction, cross-source ontology mapping, and empirical wireless localization data (22 models across 4 datasets). Contribution/Results: The resulting KG comprises 113 nodes and 199 relations and is intended to support model retrieval, cross-model comparison of environmental impact, and automated reasoning. This infrastructure advances sustainable AI by enabling model reuse, discovery, and interoperability across platforms.
📝 Abstract
As the complexity and number of machine learning (ML) models grow, well-documented ML models are essential for developers and companies to use or adapt them to their specific use cases. Model metadata, already present in unstructured format as model cards in online repositories such as Hugging Face, could be more structured and machine-readable while also incorporating environmental impact metrics such as energy consumption and carbon footprint. Our work extends the existing state of the art by defining a structured schema for ML model metadata, focusing on a machine-readable format and support for integration into a knowledge graph (KG) for better organization and querying, enabling a wider set of use cases. Furthermore, we present an example wireless localization model metadata dataset consisting of 22 models trained on 4 datasets, integrated into a Neo4j-based KG with 113 nodes and 199 relations.
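To make the idea of a structured, machine-readable metadata record more concrete, the following is a minimal sketch rather than the schema defined in the paper. The field names, the `https://example.org/ml-schema#` vocabulary, the `TRAINED_ON` relation, and all numeric values are hypothetical placeholders chosen for illustration. It shows a record carrying energy and carbon fields, a JSON-LD-style serialization, and a flattening into nodes and relations of the kind a graph database could ingest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    name: str
    task: str
    dataset: str
    energy_kwh: float  # energy consumption (placeholder value)
    co2_kg: float      # carbon footprint (placeholder value)


def to_jsonld(m: ModelMetadata) -> dict:
    """Serialize one record as a JSON-LD-style dictionary.

    The @vocab URL is an assumed placeholder, not a real vocabulary.
    """
    return {
        "@context": {"@vocab": "https://example.org/ml-schema#"},
        "@type": "Model",
        "@id": f"model/{m.name}",
        "name": m.name,
        "task": m.task,
        "trainedOn": {"@type": "Dataset", "@id": f"dataset/{m.dataset}"},
        "energyConsumptionKWh": m.energy_kwh,
        "carbonFootprintKg": m.co2_kg,
    }


def to_graph(models: list[ModelMetadata]) -> tuple[set, set]:
    """Flatten records into (label, id) nodes and (source, type, target) relations."""
    nodes, relations = set(), set()
    for m in models:
        nodes.add(("Model", m.name))
        nodes.add(("Dataset", m.dataset))
        relations.add((m.name, "TRAINED_ON", m.dataset))
    return nodes, relations


# Made-up example records; these are not results from the paper.
models = [
    ModelMetadata("model_a", "wireless-localization", "dataset_1", 1.5, 0.6),
    ModelMetadata("model_b", "wireless-localization", "dataset_1", 0.5, 0.2),
]

nodes, relations = to_graph(models)

# With structured fields, comparing environmental impact is a simple query.
lowest_footprint = min(models, key=lambda m: m.co2_kg)
```

Because the sustainability metrics are typed fields rather than free text in a model card, selecting the model with the lowest carbon footprint for a given dataset reduces to a one-line comparison, which is the kind of query a KG is meant to serve at larger scale.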