🤖 AI Summary
This work addresses the lack of standardized methods for classifying and managing complex structural features in materials science, such as surfaces, interfaces, defects, and variations in dimensionality. To this end, we propose M-CODE, a compact ontological framework that combines structure dimensionality, complexity (from pristine to processed states), and evolutionary pathways in a single classification system. Reusable conceptual modules and provenance-aware transformations allow semantically rich, extensible material descriptions. The accompanying open-source toolchain, built on JSON Schema and Python/TypeScript type systems, supports data generation, validation, and community contributions, with the aim of improving the reproducibility and interoperability of materials data.
📝 Abstract
The rapid advancement of artificial intelligence in materials science requires data standards and data management practices that can capture the complexity of real-world structures, including surfaces, interfaces, defects, and reduced dimensionality. We present M-CODE (Materials Categorization via Ontology, Dimensionality and Evolution), a compact categorization system that links materials-science-specific terminology to a set of reusable concepts that serve as building blocks and to provenance-aware transformations. M-CODE classifies structures by dimensionality, structural complexity (from pristine to compound pristine, defective, and processed), and variants that capture common approaches to structure creation and evolution. A practical implementation of the categorization is provided in an open-source codebase that includes JSON schemas, examples, and Python and TypeScript types/interfaces, designed to support reproducible dataset generation, validation, and community contributions.
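As a rough illustration of the three classification axes and the provenance-aware transformations described above, the following Python sketch models a category as dimensionality plus complexity plus variant, and records each transformation step on the derived structure. It is not the M-CODE codebase: the class names, the `0D` to `3D` labels, the variant strings, the slash-separated `code()` format, and the `derive` helper are all assumptions made for illustration. Only the four complexity levels are taken from the abstract.

```python
from dataclasses import dataclass, field
from enum import Enum


class Dimensionality(Enum):
    # Assumed labels; the abstract does not enumerate them.
    ZERO_D = "0D"
    ONE_D = "1D"
    TWO_D = "2D"
    THREE_D = "3D"


class Complexity(Enum):
    # The four levels named in the abstract.
    PRISTINE = "pristine"
    COMPOUND_PRISTINE = "compound pristine"
    DEFECTIVE = "defective"
    PROCESSED = "processed"


@dataclass(frozen=True)
class Category:
    dimensionality: Dimensionality
    complexity: Complexity
    variant: str  # hypothetical free-text label, e.g. "slab"

    def code(self) -> str:
        """Join the three axes into one slash-separated label (assumed format)."""
        return "/".join(
            [self.dimensionality.value, self.complexity.value, self.variant]
        )


@dataclass
class Structure:
    name: str
    category: Category
    # Ordered list of the transformation steps that produced this structure.
    provenance: list = field(default_factory=list)


def derive(parent: Structure, step: str, category: Category) -> Structure:
    """Return a new structure whose provenance extends the parent's by one step."""
    return Structure(
        name=parent.name,
        category=category,
        provenance=[*parent.provenance, step],
    )


# Hypothetical example: bulk crystal -> slab -> slab with a vacancy.
bulk = Structure(
    "Si", Category(Dimensionality.THREE_D, Complexity.PRISTINE, "bulk crystal")
)
slab = derive(
    bulk,
    "cut slab along (001)",
    Category(Dimensionality.TWO_D, Complexity.PRISTINE, "slab"),
)
defective = derive(
    slab,
    "remove one atom",
    Category(Dimensionality.TWO_D, Complexity.DEFECTIVE, "vacancy"),
)
```

In this sketch, `derive` copies the parent's provenance rather than mutating it, so the history of the bulk structure stays unchanged while each derived structure carries the full chain of steps that led to it.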