🤖 AI Summary
This work addresses the challenge of generating structurally diverse ensembles of three-dimensional genome conformations for *Escherichia coli* from Hi-C contact maps, rather than a single consensus structure. The authors formulate genome reconstruction as a conditional generation task and propose an approach that combines diffusion models with Transformers for prokaryotic 3D genome modeling, marking the first such application in this domain. The method uses a replication-aware representation and a one-way constraint from Hi-C to structure, which keeps the conditioning physically interpretable. Built on a latent diffusion framework, it combines a variational autoencoder, a Transformer encoder for the Hi-C input, and cross-attention, and is trained with a flow-matching objective. On held-out ensembles, the generated conformations reproduce the distance decay and structural correlation metrics of the input Hi-C data while retaining substantial, biologically plausible structural diversity.
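To make the flow-matching objective concrete, here is a minimal NumPy sketch of one common variant: a straight-line path between Gaussian noise and a data latent, with the network regressing the constant velocity along that path. The summary does not say which flow-matching formulation the authors use, so the linear path, the function name `flow_matching_loss`, the `velocity_fn` callable, and the latent shape `(batch, n_bins, d)` are all assumptions made for illustration.

```python
import numpy as np


def flow_matching_loss(velocity_fn, x1, cond, rng):
    """Monte Carlo estimate of a linear-path flow-matching loss.

    Assumed (hypothetical) interface, not taken from the paper:
      x1          -- data latents, shape (batch, ...), e.g. (batch, n_bins, d)
      cond        -- conditioning features (e.g. encoded Hi-C), passed through
      velocity_fn -- callable (x_t, t, cond) -> array shaped like x_t
      rng         -- numpy.random.Generator
    """
    batch = x1.shape[0]
    # Noise endpoint of the path, drawn from a standard normal.
    x0 = rng.standard_normal(x1.shape)
    # One time value in [0, 1) per batch item, broadcast over other axes.
    t = rng.uniform(size=(batch,) + (1,) * (x1.ndim - 1))
    # Point on the straight line from noise (t=0) to data (t=1).
    x_t = (1.0 - t) * x0 + t * x1
    # The velocity of a straight-line path is constant: x1 - x0.
    target = x1 - x0
    pred = velocity_fn(x_t, t, cond)
    return float(np.mean((pred - target) ** 2))
```

In a real model, `velocity_fn` would be the conditional network (here, presumably the Transformer with cross-attention to the Hi-C encoding), and the returned value would be minimized by gradient descent in an autodiff framework rather than only evaluated in NumPy.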
📝 Abstract
In this study, we present a conditional diffusion-transformer framework for generating ensembles of three-dimensional *Escherichia coli* genome conformations guided by Hi-C contact maps. Instead of producing a single deterministic structure, we formulate genome reconstruction as a conditional generative modeling problem that samples heterogeneous conformations whose ensemble-averaged contacts are consistent with the input Hi-C data. A synthetic dataset is constructed using coarse-grained molecular dynamics simulations to generate chromatin ensembles and corresponding Hi-C maps under circular topology. Our model operates in a latent diffusion setting with a variational autoencoder that preserves per-bin alignment and supports replication-aware representations. Hi-C information is injected through a transformer-based encoder and cross-attention, enforcing a physically interpretable one-way constraint from Hi-C to structure. The model is trained using a flow-matching objective for stable optimization. On held-out ensembles, generated structures reproduce the input Hi-C distance-decay and structural correlation metrics while maintaining substantial conformational diversity, demonstrating the effectiveness of diffusion-based generative modeling for ensemble-level 3D genome reconstruction.
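The evaluation idea in the abstract, comparing ensemble-averaged contacts and their distance decay under circular topology, can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' pipeline: the hard distance cutoff used to define a contact, the function names, and the `(n_structures, n_bins, 3)` coordinate layout are assumptions.

```python
import numpy as np


def ensemble_contact_map(coords, cutoff):
    """Contact frequency per bin pair, averaged over an ensemble.

    coords -- array of shape (n_structures, n_bins, 3)
    cutoff -- distance below which two bins count as in contact (assumed rule)
    Returns an (n_bins, n_bins) array of frequencies in [0, 1].
    """
    # Pairwise displacement vectors within each structure.
    diff = coords[:, :, None, :] - coords[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Fraction of structures in which each pair is closer than the cutoff.
    return (dist < cutoff).mean(axis=0)


def circular_distance_decay(contact_map):
    """Mean contact frequency P(s) for circular separations s = 1 .. n_bins // 2."""
    n = contact_map.shape[0]
    idx = np.arange(n)
    sep = np.abs(idx[:, None] - idx[None, :])
    # On a circular chromosome, take the shorter way around the ring.
    sep = np.minimum(sep, n - sep)
    return np.array([contact_map[sep == s].mean() for s in range(1, n // 2 + 1)])
```

Applying `circular_distance_decay` to both the input Hi-C map and the map computed from generated structures gives two curves that can be compared directly. The pairwise step allocates an array of size `n_structures * n_bins**2 * 3`, so large ensembles would need to be processed in chunks.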