🤖 AI Summary
Current 3D genomics research lacks a unified multimodal representation that connects chromatin structure (e.g., Hi-C contact maps) with epigenetic functional signals. Method: We propose MIX-HIC, the first multimodal foundation model that integrates Hi-C–derived 3D chromatin architecture with multi-omics epigenetic signals. It introduces cross-modal interaction and mapping modules, is pre-trained on the first large-scale dataset of over one million paired Hi-C–epigenomic samples, and jointly models contact map topology with temporal/spatial patterns of epigenetic signals. Contributions/Results: (1) It learns an end-to-end unified representation of 3D genome structure and functional semantics; (2) it significantly outperforms state-of-the-art methods on downstream tasks, including chromatin state prediction and enhancer–promoter link inference; (3) we publicly release both the model and the dataset, establishing scalable infrastructure for the functional interpretation of 3D genome organization.
📝 Abstract
Deep learning techniques have driven significant progress on a range of analytical tasks in 3D genomics. However, a holistic model of 3D genomics knowledge remains underexplored. Here, we propose MIX-HIC, the first multimodal foundation model of the 3D genome, which integrates 3D genome structure with epigenomic tracks to obtain unified and comprehensive semantics. To fuse these heterogeneous semantics accurately, we design cross-modal interaction and mapping blocks that produce a robust unified representation and aggregate 3D genome knowledge across modalities. In addition, we introduce the first large-scale dataset of over 1 million paired samples of Hi-C contact maps and epigenomic tracks for high-quality pre-training, enabling the exploration of functional implications in 3D genomics. Extensive experiments show that MIX-HIC significantly outperforms existing state-of-the-art methods on diverse downstream tasks. This work provides a valuable resource for advancing 3D genomics research.