Modular Machine Learning with Applications to Genetic Circuit Composition

📅 2025-09-23
🤖 AI Summary
This paper addresses multi-module systems whose composition architecture is known but whose module input/output functions are unknown. Method: The authors propose a modular machine learning framework that incorporates prior knowledge of the system's compositional structure. Central to the approach is the notion of modular identifiability, which allows the modules' input/output functions to be recovered from a subset of the system's input/output data, with theoretical guarantees for a class of systems motivated by genetic circuits. The model is a neural network whose architecture accounts for the compositional structure. Contribution/Results: Compared with a structure-agnostic neural network, the framework reduces the amount of training data required. In computational studies, the structure-aware network learns the composing modules' input/output functions and predicts the system's output on inputs outside the training set distribution, whereas the structure-agnostic network fails to predict on such inputs.

📝 Abstract
In several applications, including in synthetic biology, one often has input/output data on a system composed of many modules, and although the modules' input/output functions and signals may be unknown, knowledge of the composition architecture can significantly reduce the amount of training data required to learn the system's input/output mapping. Learning the modules' input/output functions is also necessary for designing new systems from different composition architectures. Here, we propose a modular learning framework, which incorporates prior knowledge of the system's compositional structure to (a) identify the composing modules' input/output functions from the system's input/output data and (b) achieve this by using a reduced amount of data compared to what would be required without knowledge of the compositional structure. To achieve this, we introduce the notion of modular identifiability, which allows recovery of modules' input/output functions from a subset of the system's input/output data, and provide theoretical guarantees on a class of systems motivated by genetic circuits. We demonstrate the theory on computational studies showing that a neural network (NNET) that accounts for the compositional structure can learn the composing modules' input/output functions and predict the system's output on inputs outside of the training set distribution. By contrast, a neural network that is agnostic of the structure is unable to predict on inputs that fall outside of the training set distribution. By reducing the need for experimental data and allowing module identification, this framework offers the potential to ease the design of synthetic biological circuits and of multi-module systems more generally.
Problem

Research questions and friction points this paper is trying to address.

Learning input/output functions of modules in compositional systems
Reducing training data requirements using compositional structure knowledge
Enabling prediction on inputs outside training distribution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular learning framework using compositional structure knowledge
Modular identifiability concept with theoretical guarantees
Neural network leveraging architecture for module identification
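
The idea of recovering module functions from system-level data and reusing them in a different architecture can be illustrated with a toy sketch. This is not the authors' method: it swaps the paper's neural networks for Hill functions with one unknown parameter each, and fits them by grid search on noise-free data. The function names, the two-module cascade, the parameter grid, and the "reversed" architecture are all assumptions made for illustration.

```python
# Toy sketch (assumption): two single-parameter Hill modules stand in for
# the unknown module input/output functions discussed in the abstract.

def hill_activation(x, k, n=2.0):
    """Hypothetical module whose output rises with its input."""
    return x**n / (k**n + x**n)

def hill_repression(x, k, n=2.0):
    """Hypothetical module whose output falls with its input."""
    return k**n / (k**n + x**n)

def cascade(u, k_a, k_r):
    """Known architecture: input -> activation module -> repression module."""
    return hill_repression(hill_activation(u, k_a), k_r)

def reversed_system(u, k_a, k_r):
    """A different architecture built from the same two modules."""
    return hill_activation(hill_repression(u, k_r), k_a)

def fit_cascade(data, grid):
    """Grid search over module parameters using only system-level I/O pairs."""
    best = None
    for k_a in grid:
        for k_r in grid:
            err = sum((cascade(u, k_a, k_r) - y) ** 2 for u, y in data)
            if best is None or err < best[0]:
                best = (err, k_a, k_r)
    return best[1], best[2]

# Candidate parameter values; the "true" values are taken from the grid so
# that a noise-free fit can recover them exactly.
GRID = [0.1 * i for i in range(1, 21)]
TRUE_K_A, TRUE_K_R = GRID[4], GRID[11]

# System-level data only: the intermediate signal between modules is never observed.
inputs = [0.05 * i for i in range(1, 41)]
data = [(u, cascade(u, TRUE_K_A, TRUE_K_R)) for u in inputs]

k_a_hat, k_r_hat = fit_cascade(data, GRID)

# Reuse the recovered modules to predict the output of the other architecture.
prediction = reversed_system(1.0, k_a_hat, k_r_hat)
```

In this toy setting the module parameters are recoverable because the saturation level of the cascade depends only on the repression parameter, and the shape of the curve then fixes the activation parameter. Whether and when such recovery is possible for general module functions is the identifiability question the paper addresses.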
Jichi Wang
Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
Eduardo D. Sontag
Department of Electrical and Computer Engineering and Department of Bioengineering, Northeastern University, Boston, Massachusetts 02115, USA
Domitilla Del Vecchio
Mechanical Engineering, MIT
control and dynamical systems theory, systems and synthetic biology