🤖 AI Summary
Conventional BioTree construction methods struggle with the scale and complexity of modern multimodal, multiscale biological data, particularly when integrating prior biological knowledge and achieving interpretable multimodal fusion. Method: This review systematically categorizes six types of biological priors, including evolutionary constraints and developmental trajectories, and identifies four key bottlenecks in multimodal fusion. It surveys interpretable, DL-driven fusion approaches that draw on Bayesian prior modeling, graph neural networks, contrastive learning, and multi-omics alignment, and examines databases such as NCBI, CellxGene, and TreeBASE as resources for developing and evaluating such models. Contribution/Results: The work establishes a taxonomy of BioTree fusion methods and provides a theoretical framework and practical pipeline for interpretable inference of evolutionary and developmental trees, advancing both methodological rigor and biological interpretability in phylogenomic and developmental analyses.
📝 Abstract
Biological tree (BioTree) analysis is a foundational tool in biology, enabling the exploration of evolutionary and differentiation relationships among organisms, genes, and cells. Traditional tree construction methods, while instrumental in early research, face significant challenges in handling the growing complexity and scale of modern biological data, particularly in integrating multimodal datasets. Advances in deep learning (DL) offer transformative opportunities by enabling the fusion of biological prior knowledge with data-driven models. These approaches address key limitations of traditional methods, facilitating the construction of more accurate and interpretable BioTrees. This review highlights critical biological priors essential for phylogenetic and differentiation tree analyses and explores strategies for integrating these priors into DL models to enhance accuracy and interpretability. Additionally, the review systematically examines commonly used data modalities and databases, offering a valuable resource for developing and evaluating multimodal fusion models. Traditional tree construction methods are critically assessed, focusing on their biological assumptions, technical limitations, and scalability issues. Recent advancements in DL-based tree generation methods are reviewed, emphasizing their innovative approaches to multimodal integration and prior knowledge incorporation. Finally, the review discusses diverse applications of BioTrees in various biological disciplines, from phylogenetics to developmental biology, and outlines future trends in leveraging DL to advance BioTree research. By addressing the challenges of data complexity and prior knowledge integration, this review aims to inspire interdisciplinary innovation at the intersection of biology and DL.