A Python workflow definition for computational materials design

📅 2025-05-26
🤖 AI Summary
In computational materials science, workflow management systems such as AiiDA, jobflow, and pyiron use different workflow formats, which hinders workflow interoperability and FAIR reproducibility. To address this, the authors propose the Python Workflow Definition (PWD), a workflow exchange format for Python-based workflow management systems in materials design. PWD separates scientific logic from the execution environment by describing a workflow with three portable components: a conda environment, a Python module containing the workflow functions, and a directed acyclic graph (DAG) stored as JSON. A DAG-based workflow can be exported from any of the three systems and imported into another, after which input parameters can be adjusted and computing resources assigned before execution. The exchange is implemented by a PWD Python library for AiiDA, jobflow, and pyiron.

📝 Abstract
Numerous Workflow Management Systems (WfMS) with different workflow formats have been developed in the field of computational materials science, hindering the interoperability and reproducibility of workflows in the field. To address this challenge, we introduce the Python Workflow Definition (PWD) as a workflow exchange format for sharing workflows between Python-based WfMS, currently AiiDA, jobflow, and pyiron. This development is motivated by the similarity of these three Python-based WfMS, which represent the workflow steps and the data transferred between them as nodes and edges in a graph. With the PWD, we aim to foster interoperability and reproducibility between the different WfMS in the context of Findable, Accessible, Interoperable, Reusable (FAIR) workflows. To separate the scientific from the technical complexity, the PWD consists of three components: (1) a conda environment that specifies the software dependencies, (2) a Python module that contains the Python functions represented as nodes in the workflow graph, and (3) a workflow graph stored in JavaScript Object Notation (JSON). The first version of the PWD supports directed acyclic graph (DAG)-based workflows. Thus, any DAG-based workflow defined in one of the three WfMS can be exported to the PWD and afterwards imported from the PWD into one of the other WfMS. After the import, the input parameters of the workflow can be adjusted and computing resources can be assigned to the workflow before it is executed with the selected WfMS. This import from and export to the PWD is enabled by the PWD Python library, which implements the PWD in AiiDA, jobflow, and pyiron.
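The idea of a JSON workflow graph whose nodes refer to Python functions can be illustrated with a minimal sketch. The JSON layout used here (`nodes` with `id`, `type`, `value`; `edges` with `source`, `target`, `targetPort`), the functions `add` and `square`, and the helper `run_workflow` are all assumptions made for illustration; they are not the actual PWD schema or the PWD library API. The sketch only shows how a DAG stored as JSON could be executed in dependency order, with input parameters adjusted after loading.

```python
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Stand-in for the "Python module" component: plain functions used as graph nodes.
def add(x, y):
    return x + y

def square(x):
    return x * x

FUNCTIONS = {"add": add, "square": square}

# Stand-in for the "workflow graph" component.
# NOTE: this schema is hypothetical, not the real PWD JSON format.
WORKFLOW_JSON = """
{
  "nodes": [
    {"id": 0, "type": "input", "value": 2},
    {"id": 1, "type": "input", "value": 3},
    {"id": 2, "type": "function", "value": "add"},
    {"id": 3, "type": "function", "value": "square"}
  ],
  "edges": [
    {"source": 0, "target": 2, "targetPort": "x"},
    {"source": 1, "target": 2, "targetPort": "y"},
    {"source": 2, "target": 3, "targetPort": "x"}
  ]
}
"""

def run_workflow(graph, functions, overrides=None):
    """Execute a DAG in topological order; overrides replace input node values."""
    overrides = overrides or {}
    nodes = {node["id"]: node for node in graph["nodes"]}
    # Map every node to the set of nodes it depends on.
    deps = {node_id: set() for node_id in nodes}
    for edge in graph["edges"]:
        deps[edge["target"]].add(edge["source"])
    results = {}
    for node_id in TopologicalSorter(deps).static_order():
        node = nodes[node_id]
        if node["type"] == "input":
            results[node_id] = overrides.get(node_id, node["value"])
        else:
            # Collect keyword arguments from the incoming edges of this node.
            kwargs = {
                edge["targetPort"]: results[edge["source"]]
                for edge in graph["edges"]
                if edge["target"] == node_id
            }
            results[node_id] = functions[node["value"]](**kwargs)
    return results

graph = json.loads(WORKFLOW_JSON)
results = run_workflow(graph, FUNCTIONS)
adjusted = run_workflow(graph, FUNCTIONS, overrides={0: 4})
```

Here `results[3]` holds the output of the final `square` node, and `adjusted` reruns the same graph with a different value for input node 0, loosely mirroring the step of adjusting input parameters after import.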
Problem

Research questions and friction points this paper is trying to address.

Lack of interoperability among computational materials science workflows
Need for reproducible workflow exchange between Python-based WfMS
Separation of scientific and technical complexity in workflow design
Innovation

Methods, ideas, or system contributions that make the work stand out.

Python Workflow Definition for interoperability
Three components: conda environment, Python module, JSON workflow graph
Supports DAG-based workflows across WfMS
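Because the first version of the PWD is limited to DAG-based workflows, a tool that exports a graph may want to confirm that it is acyclic first. The following is a minimal sketch of such a check using the Python standard library; the function name `is_dag` and the representation of edges as `(source, target)` pairs are assumptions for illustration and are not taken from the PWD library.

```python
from graphlib import CycleError, TopologicalSorter  # standard library, Python 3.9+

def is_dag(edges):
    """Return True if the directed graph given as (source, target) pairs has no cycle."""
    deps = {}
    for source, target in edges:
        # Record that `target` depends on `source`; make sure both nodes exist.
        deps.setdefault(target, set()).add(source)
        deps.setdefault(source, set())
    try:
        # prepare() raises CycleError if the graph contains a cycle.
        TopologicalSorter(deps).prepare()
    except CycleError:
        return False
    return True

linear = is_dag([("a", "b"), ("b", "c")])
cyclic = is_dag([("a", "b"), ("b", "a")])
```

In this sketch a linear chain of steps passes the check, while two steps that feed each other do not and would fall outside what a DAG-only exchange format can represent.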
Authors

Jan Janssen
Max Planck Institute for Sustainable Materials
Janine George
Bundesanstalt für Materialforschung und -prüfung, 12205 Berlin, Germany; Friedrich-Schiller Universität Jena, 07743 Jena, Germany
Julian Geiger
PSI Center for Scientific Computing, Theory and Data, 5232 Villigen PSI, Switzerland
Marnik Bercx
PSI Center for Scientific Computing, Theory and Data, 5232 Villigen PSI, Switzerland
Xing Wang
PSI Center for Scientific Computing, Theory and Data, 5232 Villigen PSI, Switzerland
Christina Ertural
Bundesanstalt für Materialforschung und -prüfung, 12205 Berlin, Germany
J. Schaarschmidt
Karlsruhe Institute of Technology (KIT), 76344 Eggenstein-Leopoldshafen, Germany
Alex M. Ganose
Imperial College London, 80 Wood Lane, W12 7TA London, UK
Giovanni Pizzi
Laboratory for Materials Simulations, Paul Scherrer Institute (PSI), Villigen PSI, Switzerland
Research areas: Solid-state Physics, Materials Science, Materials simulations
Tilmann Hickel
BAM Federal Institute for Materials Research and Testing
Research areas: Materials Informatics
Joerg Neugebauer
Max Planck Institute for Sustainable Materials, 40237 Düsseldorf, Germany